FYI: some new changes and coordination between XKB and input methods in
the upstream and Linux communities.

    -Alan Coopersmith-          alan.coopersmith at sun.com
     Sun Microsystems, Inc. - X Window System Engineering

-------- Original Message --------

Peter Hutterer: Re-designing input methods with XKB
<http://who-t.blogspot.com/2009/08/re-designing-input-methods-with-xkb.html>

via planet.freedesktop.org <http://planet.freedesktop.org> on 8/20/09

I had an interesting meeting with Jens Petersen yesterday about input
methods. Jens is one of the i18n guys working for Red Hat.

Input methods are a way of merging several typed symbols into one actual
symbol. Western languages rarely use them (the compose key isn't quite
the same), but many eastern languages rely on them. To give one (made
up) example, an IM setup allows you to type "qqq" and converts it into
the Chinese symbol for tree.

Unfortunately, IM implementations are somewhat broken and rely on a
multitude of hacks. Right now, IM implementations often need to hook
onto keycodes instead of keysyms. A keycode is a numerical value that is
usually the same for a given key (except when it isn't). So "q" will always be
the same keycode (except when it isn't). In X, a keycode has no meaning
other than being an index into the keysym table.

Keysyms are the actual symbols that are to be displayed. So while the
"q" key may have a keycode of 24, it will have the keysym for "q" in
qwerty and the keysym for "a" in azerty.
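
To make the keycode/keysym split concrete, here is a minimal sketch in
Python. The table contents and the keysym_for() helper are made up for
illustration and are not real XKB data; the only point is that the same
keycode resolves to a different keysym once the layout changes.

```python
# Toy model: a keycode is only an index into a per-layout keysym table.
# These tables are illustrative assumptions, not data read from XKB.
KEYSYM_TABLES = {
    "qwerty": {24: "q", 25: "w", 38: "a"},
    "azerty": {24: "a", 25: "z", 38: "q"},
}


def keysym_for(layout, keycode):
    """Look up the keysym that a keycode produces under a given layout."""
    return KEYSYM_TABLES[layout][keycode]


if __name__ == "__main__":
    for layout in ("qwerty", "azerty"):
        print(layout, 24, keysym_for(layout, 24))
```

An IM that listens for keycode 24 breaks when the keycodes change, and
one that listens for keysym "q" follows the symbol to a different
physical key when the layout changes.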

And here's where everything goes wrong for IM. If you listen for
keycodes, and you switch drivers, then keycode 24 isn't the same key
anymore. If you listen for keysyms and you switch layout, keysym "q"
isn't the same key anymore. Oops.

During a previous meeting and the one yesterday, we came up with a
solution to fix them properly.

Let's take a step back and look at keyboard input. The user hits a
physical key, usually because of what is printed on that key. That key
generates a keycode, which represents a keysym. That keysym is usually
the same symbol as what is printed on the keyboard. (Of course, there
are exceptions, the prime example being a dvorak layout on a physical
qwerty keyboard.)

In the end, IM should aim to provide the same functionality, with the
added step of combining multiple symbols into one.

For IM implementations, we can distinguish between two approaches:
In the first approach, a set of keysyms should combine to a final
symbol. For example, typing "tree" should result in a tree symbol. This
case can be fixed easily by the IM implementation only ever dealing with
keysyms. Where the key is located doesn't matter and it works equally
well with us(qwerty) and fr(dvorak). As a mental bridge: if the symbols
come in via morse code and you can convert to the correct final symbol,
then your IM is in this category. This approach is easy to deal with, so
we can close the case on it.
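
A minimal sketch of this first approach, in Python. The KeysymComposer
class and its sequence table are hypothetical and not taken from any
real IM framework; the sketch only assumes that keysyms arrive one at a
time as strings, from whatever source.

```python
class KeysymComposer:
    """Combine sequences of keysyms into final symbols.

    Hypothetical helper: it only ever sees keysyms, so key location,
    keycode and driver play no role.
    """

    def __init__(self, sequences):
        # sequences maps a tuple of keysyms to the final symbol.
        self.sequences = sequences
        self.buffer = ()

    def feed(self, keysym):
        """Consume one keysym; return a final symbol or None."""
        self.buffer += (keysym,)
        if self.buffer in self.sequences:
            result = self.sequences[self.buffer]
            self.buffer = ()
            return result
        # Simplification: if no sequence starts with the buffered
        # keysyms, drop the whole buffer, including this keysym.
        n = len(self.buffer)
        if not any(seq[:n] == self.buffer for seq in self.sequences):
            self.buffer = ()
        return None


if __name__ == "__main__":
    composer = KeysymComposer({("t", "r", "e", "e"): "木"})
    for keysym in "tree":
        print(keysym, composer.feed(keysym))
```

Because feed() takes keysyms and nothing else, it behaves the same no
matter which layout or device produced them.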

In the second approach, a set of key presses should combine to a final
symbol. For example, typing the top left key four times should result in
a tree symbol. In this case, we can't hook onto keysyms because they may
change with the layout. But we can't hook onto keycodes either because
they are essentially random.

Wait. What? Why does the keysym change with the layout?

*Because we have the wrong layout selected*. If you're trying to type
Chinese, you shouldn't have a US layout. If you're trying to type
Japanese, you shouldn't have a French layout, because these keysyms
don't represent what the key is supposed to do. The keysyms are supposed
to represent what is printed on the keyboard, and those symbols are
Chinese, Japanese, Indic, etc. So the solution is to fix up the keysyms.
Instead of trying to listen for a "q", the keyboard layout should
generate a "tree" keysym. The IM implementation can then listen for this
symbol and combine to the final symbol as required.

This essentially means that for each keyboard with intermediate symbols
there should be an appropriate keyboard layout - just as there is for
western languages. And once these keysyms are available, the second
approach becomes identical to the first approach and it doesn't matter
anymore where the physical key is located.

The good thing about this approach is that users and developers can
leverage existing tools for selecting and changing between different
layouts. (bonus points for using the word "leverage") It also means that
a more unified configuration between standard DE tools and IM tools is
possible.

For the IM implementation, this simplifies things a bit. First of
all, it can listen to the XKB group state to decide automatically
whether IM is needed or not. For example, if us(qwerty) and Traditional
Chinese are configured as layouts, the IM implementation can kick in
whenever the group toggles to Chinese. As long as it is on us(qwerty),
it can slumber in the background.
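
The group-driven activation could look roughly like the Python sketch
below. The layout names, the InputMethod class and the on_group_change()
callback are all assumptions made for illustration; a real
implementation would receive the group index from XKB state change
notifications rather than from a direct call.

```python
# Hypothetical configuration: XKB group index -> layout name.
CONFIGURED_GROUPS = ["us", "zh_traditional"]
# Assumption: the layouts whose keysyms need IM post-processing.
LAYOUTS_NEEDING_IM = {"zh_traditional"}


class InputMethod:
    """Toy IM that wakes up only for layouts that need it."""

    def __init__(self, groups, layouts_needing_im):
        self.groups = groups
        self.layouts_needing_im = layouts_needing_im
        self.active = False

    def on_group_change(self, group_index):
        """Update and return whether the IM should be active."""
        layout = self.groups[group_index]
        self.active = layout in self.layouts_needing_im
        return self.active


if __name__ == "__main__":
    im = InputMethod(CONFIGURED_GROUPS, LAYOUTS_NEEDING_IM)
    for group in (0, 1, 0):
        print(CONFIGURED_GROUPS[group], im.on_group_change(group))
```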

Second, no layout-specific hacks are required. The physical location of
the key and the driver no longer matter. Even morse code is
supported now ;)

Talking to Jens, his main concern is that XKB is limited to 4 groups at a
time. This restriction is built into the protocol and won't disappear
completely anytime soon. Though XI2 and XKB2 address this issue, it will
take a while to get a meaningful adoption rate. Nonetheless, the
approach above should make IM for the large majority of users more
robust and predictable, without the issues coming up whenever hacks are
involved.

I think this is the right approach. Jens agrees, and so does Sergey
Udaltsov, the xkeyboard-config maintainer. So now we just need to get
this implemented, but it will take a while to sort out all the details
and move all languages over.
