Markus Kuhn wrote:
> Wasn't Xutf8LookupString supposed to be guaranteed to be locale
> encoding *independent*? So why does it have to be implemented on top of
> the (apparently) locale-dependent XLookupString? Sounds not entirely
> kosher ...

  Good question. Some points in X{mb|wc|utf8}LookupString are unclear to
me too.  When I was changing something there I tried to keep the existing
logic.

  When those subroutines deal with Input Methods (including a 'local' IM
which uses the Compose file) they pass on the string received from the IM
without any additional recoding.  But when X{..}LookupString has to convert
a single key event, it calls XLookupString to get a keysym according to the
keycode and the modifier set delivered in the event.  There is nothing wrong
there.
  But then it checks not only the keysym but the returned string too.
  When checking the string it distinguishes four cases (the order of the
checks can differ but it doesn't matter):

1. XLookupString returned a zero-length string.  In this case the calling
procedure takes the keysym and converts it in its own way.  That is quite
right and clear.
2. XLookupString returned a one-byte string but the keysym isn't an ASCII
keysym.  This is possible because even in the 'Latin1 only' mode there are
keysyms (namely from the right half of the Latin1 set) which XLookupString
can convert to a string, but which have to be converted differently
depending on the current locale encoding.  X{..}LookupString treats this
case like the previous one and does the same conversion.
3. XLookupString returned one byte and the keysym is an ASCII keysym
(and so the char is an ASCII char too).  In this case X{..}LookupString
simply returns this char without conversion.  It seems like a bug.  Although
in most locales (UTF-8 included) ASCII chars don't need additional
conversion, there can be encodings where such chars need at least to be
prefixed with some 'shift sequence'.  Am I right?
4. The most unclear case is when XLookupString returns more than one byte.
In this case the calling procedure considers it a 'compound text' encoded
string and tries to convert it from CT to the current locale encoding.
I found only one case where XLookupString operating in the 'Latin1 only'
mode can return a string longer than one byte.  It's possible if some
keysyms are rebound to user-defined strings by XRebindKeysym.
XLookupString passes such a string to the output without any changes.  And
X{..}LookupString, as I said above, treats it as Compound Text
and tries to convert it using the 'from CT' converter.
  I wonder whether anybody uses this mechanism (I have not found any users).
But if such an API exists we need to keep it anyway, don't we?
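
  To make the four cases easier to discuss, here is a rough model of the
dispatch as I described it above.  It is only a sketch, not the Xlib source:
the names (KeySymModel, lookup_action, classify) are invented for this mail,
and the only fact it relies on is that the printable ASCII keysyms have the
values 0x20..0x7e.

```c
#include <assert.h>

/* Hypothetical stand-in for the Xlib KeySym type (assumption). */
typedef unsigned long KeySymModel;

/* What X{mb|wc|utf8}LookupString does with the XLookupString result. */
enum lookup_action {
    CONVERT_FROM_KEYSYM,  /* cases 1 and 2: ignore the string, use the keysym */
    PASS_ASCII_THROUGH,   /* case 3: return the single byte unchanged */
    CONVERT_FROM_CT       /* case 4: treat the string as Compound Text */
};

/* Printable ASCII keysyms equal their ASCII codes (space .. tilde). */
static int is_ascii_keysym(KeySymModel ks)
{
    return ks >= 0x20 && ks <= 0x7e;
}

/* nbytes is the length of the string returned by XLookupString. */
static enum lookup_action classify(KeySymModel ks, int nbytes)
{
    assert(nbytes >= 0);
    if (nbytes == 0)
        return CONVERT_FROM_KEYSYM;              /* case 1 */
    if (nbytes == 1)
        return is_ascii_keysym(ks)
            ? PASS_ASCII_THROUGH                 /* case 3 */
            : CONVERT_FROM_KEYSYM;               /* case 2 */
    return CONVERT_FROM_CT;                      /* case 4 */
}
```

  For example, classify(0xe9, 1) (eacute, one byte returned) falls into
case 2, while classify(0x61, 1) (the 'a' key) falls into case 3 and is
passed through untouched.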

> > the X{mb|wc|utf8}LookupString family has a checking which discards
> > non-ascii char outputed by XLookupString if it is only one
> 
> Why is it necessary to distinguish between ASCII and non-ASCII
> characters?

  Because non-ASCII characters must be converted, of course.  And I am still
not sure whether ASCII characters are a special case or should be processed
in the same way.
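
  A small illustration of why the single byte cannot simply be passed
through for the right half of Latin1: in a UTF-8 locale such a byte becomes
two bytes, while an ASCII byte stays as it is.  This is only a sketch with a
helper name I made up (latin1_byte_to_utf8); it is not what Xlib does
internally, which goes through the keysym and the locale converters.

```c
#include <stddef.h>

/*
 * Encode one Latin1 byte as UTF-8.  Latin1 code points equal the first
 * 256 Unicode code points, so at most two output bytes are needed.
 * Returns the number of bytes written to out.
 */
static size_t latin1_byte_to_utf8(unsigned char c, unsigned char out[2])
{
    if (c < 0x80) {
        out[0] = c;                                  /* ASCII: unchanged */
        return 1;
    }
    out[0] = (unsigned char)(0xC0 | (c >> 6));       /* lead byte */
    out[1] = (unsigned char)(0x80 | (c & 0x3F));     /* continuation byte */
    return 2;
}
```

  So the byte 0xE9 (eacute) comes out as 0xC3 0xA9, whereas 'a' comes out
as the same single byte.  Whether the ASCII shortcut is also safe in every
stateful encoding is exactly the part I am not sure about.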

-- 
 Ivan U. Pascal         |   e-mail: [EMAIL PROTECTED]
   Administrator of     |   Tomsk State University
     University Network |       Tomsk, Russia
_______________________________________________
I18n mailing list
[EMAIL PROTECTED]
http://XFree86.Org/mailman/listinfo/i18n
