I've recently been looking at using UTF-8 locales with Xlib's i18n code, and the conclusion I seem to be coming to is that the contents of nls/XLC_LOCALE/en_US.utf8 are entirely and completely bogus.

What is in there is something like:

    XCOMM 	fs0 class
    fs0	{
    	charset	{
    		name	ISO10646-1
    	}
    	font	{
    		primary	ISO10646-1
    	}
    }
    XCOMM We leave the legacy encodings in for the moment, because we don't
    XCOMM have that many ISO10646 fonts yet.
    XCOMM 	fs1 class (7 bit ASCII)
    fs1	{
    	charset	{
    		name	ISO8859-1:GL
    	}
    	font	{
    		primary		ISO8859-1:GL
    		vertical_rotate	all
    	}
    }
    [...]
    XCOMM 	fs6 class (Half Kana)
    fs6	{
    	charset	{
    		name	JISX0201.1976-0:GR
    	}
    	font	{
    		primary		JISX0201.1976-0:GR
    		vertical_rotate	all
    	}
    }
    END XLC_FONTSET

So, we first list iso10646-1, followed by various legacy encodings to act as fallbacks.

I've long known that having iso10646-1 first in this list means that the fallbacks won't be used if any iso10646-1 font matches the fontset, since Xlib i18n considers all fonts to be encoded "solid". So, the iso10646-1 font is used even if there are no

But I realized today that in fact, with this XLC_LOCALE file, the fallback fonts do no good in any case whatsoever. The reason is that in the actual rendering process, the set of fonts present does not affect which character sets are chosen for conversion. So, even if there is no iso10646-1 font present, the UTF-8 string will still be converted into Unicode-based glyphs, and these glyphs will then be rendered with whatever font is present, producing junk on the screen.

There are various bugs in the 'omGeneric' code that make this junk worse than it needs to be, but that doesn't particularly matter. If we list only iso10646-1 in the locale, then if it fails to load, the fontset as a whole will fail to load, which provides maximal information to the application.

My recommendation, then, for the UTF-8 locale files is:

 - For locales where iso10646-1 is a reasonable font encoding, we should point to an en_US.UTF-8 locale that has only iso10646-1 and nothing else.

 - For other locales (CJK languages), we should have separate UTF-8 XLC_LOCALE files that list the language's encoding first, followed by iso10646-1 afterwards.

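
As a rough, untested sketch of the first case: the fontset section of an iso10646-1-only en_US.UTF-8 file could presumably be reduced to the fs0 class from the excerpt above, with the legacy fs1 through fs6 classes dropped. This assumes nothing else in the file refers to those classes, and the rest of the file is left as it is.

    XCOMM 	fs0 class (the only font class; no legacy fallbacks)
    fs0	{
    	charset	{
    		name	ISO10646-1
    	}
    	font	{
    		primary	ISO10646-1
    	}
    }
    END XLC_FONTSET

With only this class listed, a system without any iso10646-1 font should see the fontset fail to load rather than get junk rendered through a mismatched legacy font.
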
Aside from some major reworking of the Xlc code, this seems to be the shortest approach to getting reasonable results.

Regards,
Owen