Hi,

The original i18n part of Xlib is heavily iso-2022 oriented, and the recently
added Unicode support doesn't fit its design well.  It assumes that in every
locale the whole range of multibyte codes can be divided into non-overlapping
charsets (strictly speaking, it usually deals with the GL/GR halves of
charsets) and that there is only one way to make such a division.  Each
charset is fully covered by one font (in the sense of the font's encoding).

The list of those charsets, and the scheme that divides the range of mb/wc
codes among them, is described in the locale description file.  Therefore
the task of XCreateFontSet is to find one font for each charset, and the task
of the drawing routine, for each single char, is to find which charset
the mb/wc code belongs to, convert the code to a glyph index and draw it using
the font prepared by XCreateFontSet.  Even if XCreateFontSet can find more
than one font for some charset, only the first one is used because there is no
information about why one font would be better than the others for a single
char.
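
As a rough illustration of the scheme just described, here is a toy model in C.
It is only a sketch: the struct, the table, the function names and the font
name are invented for this example and are not Xlib's real data structures,
and the glyph index calculation is deliberately simplified.

```c
#include <stddef.h>

/* Hypothetical model of one entry in a locale's charset table.
 * These are not Xlib's real data structures. */
struct charset_range {
    unsigned int first;    /* first code of the range             */
    unsigned int last;     /* last code of the range (inclusive)  */
    const char  *charset;  /* charset name, GL or GR half         */
    const char  *font;     /* the one font chosen for the charset */
};

/* Toy Latin-1 style table with two non-overlapping halves.
 * The ranges are approximate and the font name is a placeholder. */
const struct charset_range latin1_table[] = {
    { 0x20, 0x7F, "ISO8859-1:GL", "placeholder-font-iso8859-1" },
    { 0xA0, 0xFF, "ISO8859-1:GR", "placeholder-font-iso8859-1" },
};
const size_t latin1_table_len = sizeof latin1_table / sizeof latin1_table[0];

/* Ranges never overlap, so at most one entry can match a code. */
const struct charset_range *
find_charset(const struct charset_range *table, size_t len, unsigned int code)
{
    for (size_t i = 0; i < len; i++) {
        if (code >= table[i].first && code <= table[i].last)
            return &table[i];
    }
    return NULL;  /* the code belongs to no charset of this locale */
}

/* Simplification: the glyph index is the offset inside the range. */
unsigned int glyph_index(const struct charset_range *cs, unsigned int code)
{
    return code - cs->first;
}
```

Because every code falls into at most one range, the lookup never has to
choose between charsets, and one font per charset is enough.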

But with UTF-8 locales this scheme doesn't work because
* a Unicode (iso-10646) encoded font overlaps with any iso-2022 based font
* in a UTF-8 locale a single mb char can be converted into more than one
  iso-2022 charset.
To work around this problem the UTF-* module uses a list of 'preferred
charsets'.  One such list is built into Xlib (it is used in the Xutf8*
routines), and the module also builds an additional list by gathering charsets
from the locale description file in the order they are mentioned in that file.
The 'superset' iso-10646 can be used in this list too.  If it is the first one
in the list, all other charsets are never tried.
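
The effect of the list order can be shown with a small model.  This is a
sketch under assumptions: the struct, the two lists and the coverage tests
below are invented for illustration (the coverage ranges are rough
approximations), and none of it is taken from the Xlib sources.

```c
#include <stddef.h>

/* Hypothetical model of an entry in a 'preferred charsets' list.
 * Not Xlib's real types; the coverage tests are rough approximations. */
struct pref_charset {
    const char *name;
    int (*covers)(unsigned long ucs);  /* nonzero if ucs is convertible */
};

int covers_iso8859_1(unsigned long ucs)  { return ucs <= 0xFF; }
int covers_iso8859_5(unsigned long ucs)
{
    return ucs <= 0x7F || (ucs >= 0x0401 && ucs <= 0x045F);
}
int covers_iso10646_1(unsigned long ucs) { return ucs <= 0xFFFF; }

/* iso-10646 first: it covers every char the others do, so it always wins. */
const struct pref_charset superset_first[] = {
    { "ISO10646-1", covers_iso10646_1 },
    { "ISO8859-1",  covers_iso8859_1  },
    { "ISO8859-5",  covers_iso8859_5  },
};

/* iso-10646 last: it is reached only when nothing earlier matches. */
const struct pref_charset superset_last[] = {
    { "ISO8859-1",  covers_iso8859_1  },
    { "ISO8859-5",  covers_iso8859_5  },
    { "ISO10646-1", covers_iso10646_1 },
};

/* Return the name of the first charset in the list that covers ucs. */
const char *
pick_charset(const struct pref_charset *list, size_t len, unsigned long ucs)
{
    for (size_t i = 0; i < len; i++) {
        if (list[i].covers(ucs))
            return list[i].name;
    }
    return NULL;  /* no charset in the list can represent ucs */
}
```

With superset_first every char in the model resolves to ISO10646-1, whereas
with superset_last U+0416 resolves to ISO8859-5 and only chars such as U+20AC
fall through to ISO10646-1.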

> Problem 1
... 
> The Xmb routines never use any other fonts but the iso10646-1 helvetica 
> that does not contain all characters, because fonts are selected based on
> first matching character set and iso10646 seems to be preferred.

At present many UTF-8 locales actually share one description file.  Therefore
this common file is more a template than a proper locale description file.
In this file the iso-10646 encoding comes first, and all the other charsets
are left there as examples of charsets that can be used.
If you want the iso-10646 font to be used as a 'last resort', you need to
change the order of the fs entries in the locale description file (XLC_LOCALE).

> Studying the Xlib source code, apparently Xmb in UTF8 locale selects fonts
> by the order of fonts in the fontset, that does not match the order of 
> fonts in the request pattern to XCreateFontSet. Shouldn't the pattern 
> matter? Shouldn't the glyphs actually implemented in a font matter, in
> particular in case of iso10646 fonts?

The thing is that fontset creation and string drawing are relatively
independent.  The fontset creation procedure uses the base_font_name_list
(the pattern) only for finding fonts and doesn't store it for any other
use.  The string drawing routine gets the 'preferred charsets' list from
the locale description, not from the base_font_name_list.  I would not like
to change that because
* the pattern may not carry any order information (it can be something like
  "-adobe-helvetica-*"), or this information can be incomplete.  Hence it isn't
  clear how to merge the order taken from the pattern with the one from the
  locale description.
* some locales (ja_JP, ko_KR, zh_TW, etc.) already have their own description
  files where the 'preferred order' is already defined.  If applications begin
  to ignore that order and use the order from the pattern, it will be seen
  as unexpected and unwanted behavior.

> Example 2. Same as above, Xutf8 routines. These routines seem to prefer
> iso8859-1 encoding and indeed a fixed-width font with that encoding is 
> in the fontset. But the user wanted helvetica!

It may be a bug, but I don't know all the conditions.  Can you give some
example code, or just tell me what pattern and what locale were used in that
case?
 
> It seems Xutf8 routines select fonts from the fontset by a fixed list of
> encodings, filtered by the locale.
Yes. 

> Again, font select based on character set guess is wrong.
Maybe.
 
> Example 3. Same as above, remove iso10646-1 helvetica from system.
> Situation is reversed. Xmb routines use a fixed-width font, Xutf8
> routines helvetica.

Give more details, please.  At least the pattern and the locale under which
it happens.

> Problem 2
> 
> All the drawing routines also have the problem of stopping processing the
> string when an encoding is selected for which there is no font in the 
> fontset, even if some font with other encoding might implement the wanted
> glyph.

Yes.  As I tried to explain, this is the i18n module's design.  In an iso-2022
based scheme it is impossible that, in a single locale, "some font with other
encoding might implement the wanted glyph".
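
The stopping behaviour under discussion can be pictured with the toy loop
below.  It is a sketch only: the function name and its array arguments are
invented for this example and do not correspond to Xlib's drawing code.

```c
#include <stddef.h>

/* Hypothetical model: charset_of[i] is the index of the charset selected
 * for character i, and has_font[c] is nonzero when the font set holds a
 * font for charset c.  Returns the number of characters processed. */
size_t
draw_until_missing_font(const int *charset_of, size_t len, const int *has_font)
{
    size_t drawn = 0;

    while (drawn < len && has_font[charset_of[drawn]])
        drawn++;       /* a real routine would render the glyph here */

    return drawn;      /* the rest of the string is not processed */
}
```

In this model no other charset is consulted once one has been selected, which
is consistent with a scheme where each code belongs to exactly one charset.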

> Once again, font selection should be based on the available fonts,
> not the first matching encoding and there's no need to skip the rest of 
> the string even if a font can't be found.

It would require a deeper redesign of the i18n part of Xlib.
But I think most modern applications don't use Xlib's output methods for
i18n text, so such a redesign doesn't seem useful.

> Problem 3
> 
> This is unrelated to the above two problems, may not necessarily be a bug
> in Xlib and I can only say of the situation under Debian/unstable, gnu libc
> 5.4.46. The problem is that if there are multiple encodings listed for 
> otherwise the same locale in /etc/locale.gen and locale-gen has been used
> to generate whatever it does, libc mb routines always use the last encoding
> listed in that file while Xlib mb routines respect the LC_CTYPE setting. 
> XSupportsLocale succeeds. While libc may not do the right thing here, it 
> sets the environment and nl_langinfo(CODESET) can be used to obtain the 
> encoding expected by the libc mb routines.

The nl_langinfo call isn't quite portable.  For example, it is absent in
FreeBSD versions older than 4.6.
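
A guarded use of the call might look like the sketch below.  Note the
assumptions: NO_LANGINFO_CODESET is a hypothetical macro that the build system
would have to define on platforms lacking the call, and environment_codeset
is a name made up for this example.

```c
#include <locale.h>
#include <stddef.h>

#ifndef NO_LANGINFO_CODESET   /* hypothetical macro set by the build system */
#include <langinfo.h>
#endif

/* Returns the codeset name of the LC_CTYPE locale taken from the
 * environment, or NULL when nl_langinfo(CODESET) is not available.
 * Side effect: switches the process-wide LC_CTYPE locale. */
const char *environment_codeset(void)
{
    /* If the environment names an unknown locale, setlocale fails and
     * the previous locale (initially "C") stays in effect. */
    setlocale(LC_CTYPE, "");

#ifndef NO_LANGINFO_CODESET
    return nl_langinfo(CODESET);
#else
    return NULL;
#endif
}
```

A caller has to handle the NULL case itself, for example by falling back to
parsing the encoding suffix of the locale name.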

-- 
 Ivan U. Pascal         |   e-mail: [EMAIL PROTECTED]
   Administrator of     |   Tomsk State University
     University Network |       Tomsk, Russia
_______________________________________________
I18n mailing list
[EMAIL PROTECTED]
http://XFree86.Org/mailman/listinfo/i18n
