Markus Kuhn wrote:
> Wasn't Xutf8LookupString supposed to be guaranteed to be locale
> encoding *independent*? So why does it have to be implemented on top of
> the (apparently) locale-dependent XLookupString? Sounds not entirely
> kosher ...

Good question. Some points in X{mb|wc|utf8}LookupString are unclear to me too. When I was changing something there, I tried to keep the existing logic.

When those routines deal with Input Methods (including the 'local' IM, which uses the Compose file), they pass on the string received from the IM without any additional recoding. But when X{..}LookupString has to convert a single key event, it calls XLookupString to get a keysym from the keycode and the modifier set delivered in the event. There is nothing wrong with that. But then it checks not only the keysym but the returned string too. Checking the string, it distinguishes four cases (the order of the checks may differ, but that doesn't matter):

1. XLookupString returned a zero-length string. In this case the calling procedure takes the keysym and converts it in its own way. That is quite right and clear.

2. XLookupString returned a one-byte string, but the keysym isn't an ASCII keysym. This is possible because even in 'Latin1 only' mode there are keysyms (namely those from the right-hand half of the Latin1 set) which XLookupString can convert to a string, but which have to be converted differently depending on the current locale encoding. X{..}LookupString treats this case like the previous one and does the same conversion.

3. XLookupString returned one byte and the keysym is an ASCII keysym (so the char is an ASCII char too). In this case X{..}LookupString simply returns the char without conversion. That looks like a bug. Although in most locales (UTF-8 included) ASCII chars need no additional conversion, there can be encodings where such chars at least need to be prefixed with some 'shift sequence'. Am I right?

4. The most unclear case is when XLookupString returns more than one byte. In this case the calling procedure treats the result as a 'compound text' encoded string and tries to convert it from CT to the current locale encoding.

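To make the four cases above easier to follow, here is a minimal sketch of the dispatch as I understand it. It is not taken from the Xlib sources: the names lookup_action, is_ascii_keysym and classify_lookup are hypothetical, and I am assuming that "ASCII keysym" means a keysym in the range 0x20..0x7e, where keysym values coincide with ASCII character codes. The real code may order the checks differently.

```c
#include <assert.h>

/* Hypothetical names for illustration; not the actual Xlib internals. */
typedef enum {
    CONVERT_KEYSYM,   /* cases 1 and 2: convert the keysym for the locale */
    PASS_ASCII_BYTE,  /* case 3: hand the single byte back unchanged */
    CONVERT_FROM_CT   /* case 4: treat the bytes as Compound Text */
} lookup_action;

/* Assumption: ASCII keysyms are 0x20..0x7e (same values as ASCII codes). */
int is_ascii_keysym(unsigned long keysym)
{
    return keysym >= 0x20 && keysym <= 0x7e;
}

/*
 * keysym and nbytes stand for what XLookupString gave back:
 * the keysym it resolved and the length of the string it produced.
 */
lookup_action classify_lookup(unsigned long keysym, int nbytes)
{
    assert(nbytes >= 0);

    if (nbytes == 0)                      /* case 1 */
        return CONVERT_KEYSYM;
    if (nbytes == 1)                      /* case 3 if ASCII, else case 2 */
        return is_ascii_keysym(keysym) ? PASS_ASCII_BYTE : CONVERT_KEYSYM;
    return CONVERT_FROM_CT;               /* case 4 */
}
```

With this sketch, a keysym such as 0xe9 (eacute, from the right-hand half of Latin1) with a one-byte string falls into the same branch as a zero-length string, while 0x61 ('a') with a one-byte string is passed through untouched, which is exactly the special treatment of ASCII that I doubt in point 3.
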
I found only one case where XLookupString, operating in 'Latin1 only' mode, can return a string longer than one byte: when some keysyms have been rebound to user-defined strings with XRebindKeysym. XLookupString passes such a string to the output unchanged, and X{..}LookupString, as I said above, treats it as Compound Text and tries to convert it with the 'from CT' converter. I wonder whether anybody actually uses this mechanism (I have not found any users). But if such an API exists, we need to keep it anyway, don't we?

> > the X{mb|wc|utf8}LookupString family has a checking which discards
> > non-ascii char outputed by XLookupString if it is only one
>
> Why is it necessary to distinguish between ASCII and non-ASCII
> characters?

Because non-ASCII characters must be converted, of course. And I still doubt whether ASCII characters are a special case or should be processed in the same way.

-- 
 Ivan U. Pascal         |   e-mail: [EMAIL PROTECTED]
   Administrator of     |   Tomsk State University
   University Network   |   Tomsk, Russia
_______________________________________________
I18n mailing list
[EMAIL PROTECTED]
http://XFree86.Org/mailman/listinfo/i18n