Impenetrable Thoughts
Keyboard Layout Optimization

It is well known that the standard keyboard layout, QWERTY, is sub-optimal: it not only requires more effort for an average piece of typing, but also makes greater than necessary use of the weaker fingers. The best-known improvement, the DVORAK layout, was created before the existence of computers. DVORAK is a good start, but it is necessarily imperfect, both because little text was available for analysis at the time and because the analysis itself was so labour intensive.

I am not the first to consider using the power of a modern computer to create a more efficient keyboard layout; many have come before me. More modern layouts and optimizers include Colemak, Asset, Carpalx, Programmers DVORAK and a few others which I cannot find links to at the moment. Worth special mention is the MTGAP page, which nicely analyzes and summarizes much of this earlier work.

Just as I am not the first to attempt this, I am also not the first to release keyboard optimization software to the world. However, none of the software I could find fit my particular requirements. The first reason is that I am not using a standard slab keyboard and would like my layout to be optimized for the keyboard I actually use, a Kinesis Advantage. Unlike most ergonomic keyboards, it is not simply a slab keyboard split in two.

The second reason is that many of the layouts seem to believe that it is worthwhile to constrain the locations of the ZXCV keys so that the common Windows/Mac shortcuts remain easy to use. This assumes that the control key is kept in the PC location, which I do not do (I swap control and capslock). It also assumes that these shortcuts are used a lot. This might be true of other people, but as a VIM user I use them infrequently and consequently don't want my layout made less optimal because of them.

I also wanted to make use of a couple of interesting ideas I found while researching better keyboard layouts, such as moving the symbols anywhere on the keyboard if that improves the layout, and using a dead key to produce programming symbols (the link I have is now dead). As a programmer I use symbols much more often than would be normal, so at the very least I would consider it a good thing if the symbols were moved to the unshifted key positions. In all, none of the software I found made it trivial to change the definition of the keyboard, or appeared simple to extend with the dead key options. So I wrote my own.

The first major difference in my software, other than taking the above features into account, is that I chose a different search method. Whereas others used genetic (evolutionary) searches or Monte Carlo methods, I went with a plain k-beam hill climb, periodically culling the lowest few percent and replacing them with copies of the best handful. The other major change is that, while others seem to have decided for reasons of speed to operate on statistical approximations of actual text (they tended to operate on either words or character triples), I chose to compute the actual cost of typing a given text. This is significantly slower, but I feel that it will better compute the cost. I believe this to be especially true as I dabble with more interesting options such as the aforementioned dead keys, which produce a wider range of character combinations, such as "+= " or "){\n", that change the spacing of words.
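To make the search idea concrete, here is a minimal Python sketch of a k-beam hill climb with periodic culling. It is not the released C code: the function name, the parameters and their defaults, and the choice of a simple two-key swap as the move are all assumptions made for illustration.

```python
import random

def hill_climb_beam(initial, cost, k=8, steps=2000, cull_every=100,
                    cull_fraction=0.05, elite=2, seed=0):
    """Run k hill climbers side by side, culling the worst periodically.

    `initial` is a list of symbols (one per key position) and `cost` maps a
    layout to a number, lower being better. All names and defaults here are
    illustrative assumptions, not values taken from the original tool.
    """
    rng = random.Random(seed)
    beam = []
    for _ in range(k):
        layout = initial[:]
        rng.shuffle(layout)  # every climber starts from a random layout
        beam.append((cost(layout), layout))

    for step in range(1, steps + 1):
        for i, (score, layout) in enumerate(beam):
            # Hill-climb move: swap two keys, keep the swap only if it helps.
            a, b = rng.sample(range(len(layout)), 2)
            candidate = layout[:]
            candidate[a], candidate[b] = candidate[b], candidate[a]
            new_score = cost(candidate)
            if new_score < score:
                beam[i] = (new_score, candidate)

        if step % cull_every == 0:
            # Culling: overwrite the worst climbers with copies of the best.
            beam.sort(key=lambda pair: pair[0])
            n_cull = max(1, int(k * cull_fraction))
            for j in range(n_cull):
                best_score, best_layout = beam[j % elite]
                beam[k - 1 - j] = (best_score, best_layout[:])

    return min(beam, key=lambda pair: pair[0])
```

The function returns a `(score, layout)` pair. Because `cost` is called on a whole layout each time, it can be as slow and as exact as you like, which is the trade-off described above.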

I currently use the dead keys to produce most of the common two-character programming combinations, including +=, ++, --, >>, ==, != and others. I haven't yet finished fine-tuning the parameters of the optimization and so have no results to show yet, but I am releasing the software as it stands anyway.
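As a rough illustration of how dead-key combinations change what is actually typed, the Python sketch below turns a piece of text into a keystroke sequence, replacing each known combination with a dead key press followed by a single key. The `DEAD` marker, the function name and the example mapping are hypothetical, and the greedy longest-match rule is an assumption rather than a description of the released software.

```python
DEAD = "<dead>"  # hypothetical marker standing for one press of the dead key

def to_keystrokes(text, combos):
    """Greedily replace known combinations with dead key + one key.

    `combos` maps a multi-character string such as "+= " to the single key
    pressed after the dead key. The mappings used here are invented.
    """
    longest = max(map(len, combos), default=0)
    strokes = []
    i = 0
    while i < len(text):
        # Try the longest combination first, down to two characters.
        for size in range(longest, 1, -1):
            chunk = text[i:i + size]
            if chunk in combos:
                strokes += [DEAD, combos[chunk]]
                i += len(chunk)
                break
        else:
            # No combination starts here: type the character itself.
            strokes.append(text[i])
            i += 1
    return strokes
```

With the made-up mapping `{"+= ": "e"}`, the text `x+= 1` becomes the four keystrokes `x`, dead key, `e`, `1`. The space is absorbed into the combination, which is why working from the real text rather than word statistics matters here.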

My first long-run attempt to produce a layout for my Kinesis resulted in the following:

'[/}' 'Q/q' ''/4' '*/<' 'V/v' '%/&' '$/?' '6/;' 'K/k' '9/:' '`/~' '+/\'
  \t  '2/5' 'Y/y' 'U/u' 'C/c' 'G/g' 'D/d' 'H/h' 'L/l' 'W/w' '//7' 'X/x'
 -=-  'I/i' 'A/a' 'E/e' 'O/o' 'P/p' 'M/m' 'N/n' 'T/t' 'S/s' 'R/r' '@/]'
 -=-  '=/>' '-/_' '"/0' ')/3' '#/1' 'B/b' 'F/f' ',/.' '(/8' 'Z/z'  -=- 
'^/|'  -=-   -=-   -=-   -=-  '!/{' 'J/j' '   '  -=-

Here -=- marks a special key which can't be placed by the optimizer, such as shift. The bottom row is also special. The four special keys in a row are, by default, the arrow keys. I have elected not to move them because I find them well placed for how often I use them. I have also added a key which doesn't actually exist on the bottom row to stand in for the space key, so that it is counted. The space key is on a thumb button and costs nothing because of that.

Another important point to keep in mind is that the dead key symbols, such as +=, are not placed on the layout, because not only are there far more keys than symbols, but there are also likely too few examples within the corpus for an accurate optimization effort. Placing these symbols is a manual step to be taken afterwards. The output shows two faces for each key. There is no association determining which is shifted and which is unshifted. When one is typing correctly and using both shift keys, the shift motion is mostly free. The only case where it is not free is when the pinky on the opposite hand is also used for the previous or next key. I estimated this effect to be minor and haven't implemented it.
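A small Python sketch may help show what treating shift as free means for scoring. It looks each character up by key position only, so both faces of a key cost the same. The function name and the effort numbers are invented for the example, and a real scoring function would also weigh things this sketch ignores, such as hand alternation.

```python
def typing_cost(text, layout, effort):
    """Sum per-key effort over every character of `text`.

    `layout` is a list of (face_a, face_b) pairs and `effort` is a list of
    per-position costs of the same length. Both faces of a key map to the
    same position, which models shift as free. Characters that are not on
    the layout are skipped. All numbers used with this are made up.
    """
    position = {}
    for index, faces in enumerate(layout):
        for face in faces:
            position[face] = index
    return sum(effort[position[ch]] for ch in text if ch in position)
```

For example, with the keys `('E', 'e')`, `('=', '>')` and a space key given efforts of 1.0, 3.0 and 0.0, typing `Ee >=` costs 1 + 1 + 0 + 3 + 3 = 8. The zero entry is how a thumb-operated space key can be counted without adding cost.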

Finally, you will see that keys which do not have letters were allowed to have their faces swapped arbitrarily. Thus you can end up with a key which produces both 2 and 5, which obviously was not the case on the original keyboard layout.
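One way such a face swap could be expressed as a search move is sketched below in Python. The representation of a key as a pair of faces and the function name are assumptions for illustration only; the released software may do this quite differently.

```python
import random

def swap_faces(layout, rng):
    """Exchange one face between two randomly chosen non-letter keys.

    `layout` is a list of (face_a, face_b) pairs. Letter keys such as
    ('Q', 'q') are never touched, so upper and lower case stay together.
    At least two non-letter keys are required.
    """
    symbol_keys = [i for i, (a, b) in enumerate(layout)
                   if not (a.isalpha() or b.isalpha())]
    i, j = rng.sample(symbol_keys, 2)
    fi, fj = rng.randrange(2), rng.randrange(2)  # which face on each key
    new_layout = [list(key) for key in layout]
    new_layout[i][fi], new_layout[j][fj] = new_layout[j][fj], new_layout[i][fi]
    return [tuple(key) for key in new_layout]
```

Starting from keys like `('2', '@')` and `('5', '%')`, a single call can produce a key such as `('2', '5')`, which is how a key carrying two digits can arise.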

This layout certainly looks odd, but according to my scoring function it is approximately 30% better than DVORAK. DVORAK was used as the base keyboard layout to vary from. Notable features include the home row containing eight of the nine most frequent letters in English. According to MTGAP this accounts for more than half the letters typed. This is obviously good, because if you don't leave the home row at all then you are barely moving your fingers. It is also useful to note that in the home row the vowels are on the left and the consonants are on the right. This may be an artifact of starting with DVORAK, but it could also be that this arrangement is hard to beat, especially when I am aiming for hand alternation in many cases.
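The home row claim is easy to check against your own corpus. The Python sketch below computes the share of letters in a text that fall on the home row of the layout shown earlier (I A E O P M N T S R). The function name is made up, and the sketch counts letters only, ignoring symbols and digits.

```python
from collections import Counter

HOME_ROW = set("iaeopmntsr")  # home row letters of the computed layout

def home_row_share(text):
    """Return the fraction of letters in `text` typed on the home row."""
    counts = Counter(ch for ch in text.lower() if ch.isalpha())
    total = sum(counts.values())
    if total == 0:
        return 0.0
    on_home = sum(n for ch, n in counts.items() if ch in HOME_ROW)
    return on_home / total
```

For the phrase "the rain", six of the seven letters are on the home row, since only h is not. Running the function over a full corpus gives a quick sanity check on how much typing stays on that row.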

As of this time I haven't yet made use of this layout. This is mostly because one of the secondary goals of this project was to produce a keyboard layout which I could implement entirely using the programmability features of my keyboard, and thus be able to use on any OS without any configuration. Unfortunately, splitting up and moving symbols and numbers around requires a large number of macros on the keyboard, and while the keyboard has enough of them for the layout, it only barely does. In attempting to program this layout I also discovered that the functionality I was going to use to program dead keys is beyond the abilities of the keyboard. I have an alternative plan in mind, but because I had intended to use the comma as a dead key for this purpose, the layouts have been optimized for a higher than normal number of comma key presses. I do not know how much of an issue this is, as the comma symbol was not moved from the QWERTY position. I do find it odd, though, that the comma and the period share a key.

Update: 2009-10-06

I tried this first computed layout for about a week and have decided to give up on it. Though it was becoming more comfortable, I found the awkward placement of certain keys, namely k, j, q and the numbers, to be too difficult. I have reverted to QWERTY for the moment while I wait for another layout to be computed.

In either case, I have decided that putting the symbols in the unshifted position on their keys is a most worthwhile concept.

The Software

It is mostly a quick hack which is known to run only on Linux. It is written in C and should be mostly portable. A corpus is not included; yours should be representative, in both content and proportion, of the typing you actually do. In my case this involved a mix of code in several programming languages, written in English, and a large amount of English text from Project Gutenberg.

The latest release of my keyboard optimizer is available here.