Hi,

> As to the idea of rewrite the xlib using iconv, is there a real working plan,
> or is it just an idea?

I would say there are some thoughts about what can be done and how,
but not one line of code yet.

First of all, I would like to keep the existing i18n modules as a
fallback (they are separate loadable modules anyway and can coexist
with a new iconv-based module). That means the changes to the rest of
Xlib's code should be as small as possible. The thing is that Xlib's
converters differ from iconv and can't simply be replaced; instead we
need to wrap iconv calls inside Xlib's converters.

Xlib's converters operate on impersonal WideChar, MultiByte, etc.,
whose meaning depends on the current locale, whereas iconv operates on
exact encoding names. But MultiByte everywhere in Xlib means an
'encoding_name', which can be obtained from the locale name or from
some simple table like locale.dir and used when the converter is
created. For WideChar we can take UCS2 or UCS4. So this is the
simplest problem.

The second problem is the CompoundText conversion. Standard iconv
doesn't support even iso2022 for all charsets. But actually we have
three cases for non-Unicode encodings.

The non-standard encodings (charsets) are the simplest case. Their
strings are packed into 'extended segments' without any changes, but
we need to know a multibyte length or a complete esc-sequence for
them.

The second case is the set of standard single-byte encodings that are
covered by one (strictly speaking, two) charset. They don't need any
changes to the strings but require a table of designators
(esc-sequences).

And the third case is the CJK encodings, which are represented in
CTEXT with a few different charsets. Fortunately, for them iconv has
the iso2022* encodings, which are almost the same as CTEXT and can
easily be converted (or rather corrected) to and from CTEXT.

But the worst case is the conversion from Unicode to CTEXT. The same
Unicode codepoint can often be converted into many different charsets,
but whether such CTEXT will be accepted by a Unicode-unaware
application depends on the locale used in that application. This is
not only a CJK problem.
For example, Russians still use four different charsets for the same
alphabet (the ISO standard for Cyrillic isn't widely used in Russia
except on commercial Unixes; on free Unixes the Cyrillic standard is
'koi8', which comes from Soviet Union standards; but there is also
'microsoft codepage 1251', which is often used on Unixes too).
Therefore for this case we need some ordered list of 'preferred
charsets' (or encodings that can be converted into CTEXT) that should
be configurable per locale.

So in any case we need some 'locale object' that is not the same as
XLCD but keeps the 'encoding_name', an esc-sequence for non-standard
encodings, the name of the iso2022* iconv encoding for locales that
need it, and the list of preferred charsets for UTF-8 locales.

> What I want to do with the conversion part is to make related functions get
> rid of XLCD binding. It's easy to achieve either by using iconv or by
> reusing code from lcCT.c and lcUTF8.c.

I don't see how lcCT.c and lcUTF8.c could help here. The lcCT.c
functions are strongly tied to the existing XLCD data, and lcCT.c has
an unpleasant restriction: it can convert CTEXT to multibyte only if
the charset used in the CTEXT is present in the current locale's XLCD.
If there are two applications using Cyrillic, but one of them uses
'koi8' whereas the other uses 'microsoft-cp1251', then for lcCT they
are speaking absolutely different languages. The lcUTF8.c module is
actually iconv, but with a reduced set of supported encodings.

> From CTEXT conversion has no problem.
> To CTEXT conversion need charset selection, which can be solve by either
> always use ISO10646-1

I don't think that is the way. If an application is Unicode-aware, it
should use UTF8_STRING for inter-client communication. If some
application doesn't know UTF8_STRING, it most likely doesn't accept
ISO10646 encapsulated in CTEXT either.

> or read preferred charset from localeDB without
> trigger XLCD object constructed.

Right.
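
To make the 'preferred charsets' idea a bit more concrete, here is a
minimal sketch in Python. It is only an illustration, not Xlib code and
not a proposal for the real data format: the table PREFERRED_CHARSETS,
the function pick_charset and the locale names in it are hypothetical,
and Python codec names stand in for the charsets that a real
implementation would pass to iconv or map to CTEXT designators. It does
not build the CTEXT escape sequences themselves.

```python
# Sketch only: Python codec names stand in for the charsets a real
# implementation would hand to iconv or map to CTEXT designators.

# Hypothetical per-locale preference lists (not taken from any localeDB).
PREFERRED_CHARSETS = {
    "ru_RU.UTF-8": ["koi8_r", "cp1251", "iso8859_5"],
    "ja_JP.UTF-8": ["iso2022_jp", "euc_jp"],
}


def pick_charset(text, locale_name):
    """Return (charset, encoded bytes) for the first preferred charset
    that can represent all of text, or None if none of them can."""
    for charset in PREFERRED_CHARSETS.get(locale_name, []):
        try:
            return charset, text.encode(charset)
        except UnicodeEncodeError:
            continue  # this charset lacks some character; try the next one
    return None


if __name__ == "__main__":
    privet = "\u041f\u0440\u0438\u0432\u0435\u0442"  # Russian "Privet"
    print(pick_charset(privet, "ru_RU.UTF-8"))
    # The same text is a different byte string in each Cyrillic charset,
    # which is why the receiving application's locale matters.
    print(privet.encode("koi8_r") == privet.encode("cp1251"))
```

The order of the list is the whole point: a character that exists in
several charsets goes out in the first one the locale prefers, and a
later entry is used only when an earlier one cannot represent the text.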
But somewhere we have to keep such a list of preferred charsets per
locale.

> Input related part is most complicated part. merely considering XIM part,
> protocol, imdkit and client side library all should be enhanced. But this
> make things out of control. Changing client side library only and cheating
> IM server at some point may be a temporary resolution. Maybe IIIMF should
> became mainstream, but I think more research should be done on this point.

Agreed.

> In fact, I want more here. I think three kind of input methods, namely
> keyboard mapping, composing and IM server, should have a common point
> to manage. A consistent switch method among different input methods should
> be offered, like Windows does.

I have never used complex input methods in Windows. Where can I read
about them, or what do I have to install on a Windows box to see this
'consistent switch method'?

> I noticed the recent discussion on composing
> method in this maillist. What Kent Karlsson proposed is obviously coming
> from Windows. But his suggestion can't be fulfilled within current mechanism.
> I remember you worked to make keyboard mapping and composing be synthesized on
> X server side some time ago.

No. Probably you mean that I said somewhere it would be good if Compose
rules were part of the keyboard map and were kept on the server with
the other XKB tables. :) But I didn't mean that all mapping and
composing should be performed on the server side. And I didn't work on
it, because it requires protocol changes that I don't dare to make yet.

> But I think the right point should be on client
> side. X server does too much on keyboard mapping. Mapping info and group
> switching process should be put in client side, and be load and set per
> client.

What do you mean by "too much"? Right now the mapping itself is
performed on the client side, let alone the composing.
The server sends notifications (events) about key presses and
releases, but reports only a keycode (scan-code) and the state of the
modifiers, not keysyms or Unicode characters as the result of mapping.
The symbols map kept in the server is not used by the server itself;
an application can obtain it from this 'centralized repository'. In
theory any application is free to use any other symbols map obtained
from any other source. But Xlib doesn't provide an API for that. It
silently loads the map from the server at the first call of
XLookupString. What is worse, if some application wants another
symbols map, there is no way except to load the new map into the
server, and then all other applications get a notification and reload
their own maps too.

Anyway, the side (client or server) where the mapping is processed is
only an implementation detail that affects the protocol but not the
API. The more important thing is what we want from this API. I guess
you want flexibility that allows different applications to have
different symbols maps and use different 'input language engines'. But
if a server could keep as many different maps as needed, remember
which map is used for which client connection or window, and switch
them internally on focus changes, I think it would be no worse than
client-side mapping.

And don't forget about the opposite side. Many people would like to
choose a language (or a country) once at system installation and never
think about different keymaps, locales, etc., even if they run client
applications from different machines. I remember someone complained
that when he ran clients from other computers they used the same
keymap, but the compose rules were different or didn't even work for
some clients. I suppose that is a serious reason to have a
'centralized repository' for keymaps and compose rules (and IMEs if
possible). On the other hand, it doesn't mean applications should not
be able to load keymaps or connect to IMEs from other sources.
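
To illustrate the division of labour described above (the server
reports only a keycode and a modifier state, and the client turns them
into a keysym), here is a toy sketch in Python. It is not Xlib's
XLookupString and not a real keymap: TOY_SYMBOLS, lookup_keysym and the
keycodes are hypothetical, and the handling of groups and levels is
heavily simplified. The two constants follow the core protocol's Shift
bit and the XKB convention of carrying the group in bits 13-14 of the
state.

```python
# Sketch only: a toy client-side lookup, not Xlib's XLookupString.

SHIFT_MASK = 1 << 0  # same bit as ShiftMask in the core protocol
GROUP_SHIFT = 13     # XKB carries the keyboard group in state bits 13-14
GROUP_MASK = 0x3

# Hypothetical symbols map: keycode -> one (level 1, level 2) pair per group.
# Group 0 is a Latin layout, group 1 a Russian one; keycodes are illustrative.
TOY_SYMBOLS = {
    38: [("a", "A"), ("Cyrillic_ef", "Cyrillic_EF")],
    24: [("q", "Q"), ("Cyrillic_shorti", "Cyrillic_SHORTI")],
}


def lookup_keysym(keycode, state, symbols=TOY_SYMBOLS):
    """Map the (keycode, state) pair from a key event to a keysym name."""
    groups = symbols.get(keycode)
    if groups is None:
        return None  # this key has no entry in the map
    group = (state >> GROUP_SHIFT) & GROUP_MASK
    if group >= len(groups):
        group = 0  # simplification of XKB's out-of-range group handling
    level = 1 if state & SHIFT_MASK else 0
    return groups[group][level]


if __name__ == "__main__":
    print(lookup_keysym(38, 0))
    print(lookup_keysym(38, SHIFT_MASK | (1 << GROUP_SHIFT)))
    # A client with its own map passes it in; the shared map is untouched.
    private_map = {38: [("at", "A")]}
    print(lookup_keysym(38, 0, symbols=private_map))
```

Since the map is just an argument here, a client could in principle
take it from any source. The missing piece in Xlib, as said above, is an
API for doing that without reloading the map in the server.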

As for group switching, I think there are different opinions here too.
I wrote a small utility that is a keyboard group switcher and
indicator. One of its main features is that it can remember an XKB
group for each window separately and change them automatically as the
focus moves. This feature is switched on by default. But I have had
questions from some users about how to switch it off, because they
don't like to remember the group in each window; they find it easier
to remember the current 'global' group state and want only a simple
indicator. :)

> In my ideal situation, an application written with new API only (I mean i18n
> related) will use Unicode internal, and doesn't have XLCD object created.
> However, old applications work just as they used to be.

As I said, if a group of applications uses Unicode internally,
UTF8_STRING for communication, and Xft for output, and doesn't use
complex IM methods (servers), they don't need any additional i18n API
(except, maybe, a keysym to UCS table).

--
 Ivan U. Pascal         |   e-mail: [EMAIL PROTECTED]
   Administrator of     |   Tomsk State University
     University Network |   Tomsk, Russia
_______________________________________________
I18n mailing list
[EMAIL PROTECTED]
http://XFree86.Org/mailman/listinfo/i18n