iso10646 in the font name means Unicode. It doesn't mean utf8, since that's
just a way of encoding a sequence of 32-bit Unicode code points as a
sequence of 8-bit bytes.

What we see happening in xboard is that something (maybe Xaw, but I'm not
100% sure) does not understand that the string it is trying to display is in
utf8. So whenever there is a character in the utf8 stream that takes up more
than one byte, instead of repacking the bytes to get a single value > 127
and pulling that code point out of the font, it takes each byte separately,
pulls two code points in the range [128..255] out of the font and displays
those.
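
To make the difference concrete, here is a minimal sketch in C of what
"repacking the bytes" means. The function name utf8_decode is made up for
this illustration; it is not taken from xboard or Xaw, and it skips some
validation (overlong forms and surrogates are not rejected).

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper, not part of xboard or Xaw.
 * Decode one UTF-8 sequence starting at s, with at most len bytes available.
 * Stores the code point in *cp and returns the number of bytes consumed,
 * or 0 if the input is empty, truncated or malformed.
 * Simplified: overlong encodings and surrogates are not rejected. */
size_t utf8_decode(const unsigned char *s, size_t len, uint32_t *cp)
{
    size_t n, i;
    uint32_t v;

    if (len == 0)
        return 0;

    /* The lead byte tells us how long the sequence is and carries
     * the most significant payload bits. */
    if (s[0] < 0x80)                { n = 1; v = s[0]; }
    else if ((s[0] & 0xE0) == 0xC0) { n = 2; v = s[0] & 0x1F; }
    else if ((s[0] & 0xF0) == 0xE0) { n = 3; v = s[0] & 0x0F; }
    else if ((s[0] & 0xF8) == 0xF0) { n = 4; v = s[0] & 0x07; }
    else
        return 0;                   /* stray continuation or invalid byte */

    if (len < n)
        return 0;                   /* sequence is cut off */

    /* Each continuation byte has the form 10xxxxxx and adds 6 bits. */
    for (i = 1; i < n; i++) {
        if ((s[i] & 0xC0) != 0x80)
            return 0;
        v = (v << 6) | (s[i] & 0x3F);
    }

    *cp = v;
    return n;
}
```

For example, u-umlaut is the two bytes 0xC3 0xBC in utf8. Decoded properly
they give the single code point 0xFC. Taken one byte at a time they select
glyphs 0xC3 and 0xBC, which in a Latin-1 or iso10646 font are A-tilde and the
one-quarter sign.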

It would be better to fix this by handling the utf8 and taking characters
from the iso10646-encoded fonts instead of trying to pick an 8-bit character
set that fits the user's locale and recoding all the messages into that.
Well, except that I don't know exactly where the fix is needed.

I have been googling a bit but didn't find anything that specifically
addresses Unicode in Xaw. Here's a good FAQ on the whole topic of utf8 and
Unicode in Unix/Linux, though:  http://www.cl.cam.ac.uk/~mgk25/unicode.html

On Tue, May 17, 2011 at 2:54 AM, h.g. muller <[email protected]> wrote:

> At 01:13 17-5-2011 -0700, Tim Mann wrote:
>
>> p.s. I just tried, and editing the .xboardrc file to change the fonts from
>> iso8859 to iso10646 did NOT fix the broken umlauts. So there is more to that
>> problem.
>>
>
> I don't really know anything about X-fonts, but apparently iso10646 and
> such stand for encodings, and apparently neither encoding is UTF-8.
>
> To get it working, two conditions must be satisfied:
> *) The font must define the relevant glyphs
> *) The strings must be presented to the rendering agent in the encoding the
> latter expects (presumably defined by the selected font).
>
> I get the impression that the fonts defining the relevant glyphs for some
> languages are only available in encodings different from UTF-8. This does
> not need to be fatal; it merely means that the po files should be adapted to
> use the encodings available in one of the fonts available for the target
> language, and that the user should be advised to specify that font with
> XBoard.
>
> It should be easy enough to obtain a po file in another encoding. There are
> several inconveniences though: different strings in the po files will use
> different fonts, and thus might need different encodings. Window titles, for
> instance, are rendered by the window manager, which apparently is using
> UTF-8 (considering the screenshots of the Russian translation on WinBoard
> forum, http://www.open-aurec.com/wbforum/viewtopic.php?f=19&t=51772 ). So
> we would have to figure out which messages are window titles, and put those
> in different encodings in the po file. The same potentially holds for
> strings rendered in the clock and coords font, but there are only a handful
> of those.
>
> It would be much nicer if we could select a font based on the availability
> of glyphs, let XBoard note what encoding it is in, and then let it recode
> the strings before they are rendered. I.e. redefine the _(s) macro not as
> gettext(s), but as recode(fromEncoding, toEncoding, gettext(s)), where the
> fromEncoding would always be UTF-8, so the po files could be maintained in
> UTF-8.
>
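
For reference, the recode(fromEncoding, toEncoding, gettext(s)) idea quoted
above could be sketched with the POSIX iconv interface roughly as follows.
The name recode_string and its signature are assumptions for illustration
only; nothing like it exists in xboard as far as this thread shows, and the
sketch ignores stateful target encodings, which would need an extra flush
call.

```c
#include <iconv.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper, not part of xboard.
 * Convert the NUL-terminated string `in` from encoding `from` to encoding
 * `to`. Returns a malloc'ed NUL-terminated string that the caller must
 * free, or NULL if the conversion is unsupported or the input cannot be
 * represented in the target encoding. */
char *recode_string(const char *from, const char *to, const char *in)
{
    iconv_t cd;
    size_t in_left, out_left, out_size, rc;
    char *in_ptr, *out, *out_ptr;

    cd = iconv_open(to, from);          /* note: target encoding comes first */
    if (cd == (iconv_t)-1)
        return NULL;

    in_left = strlen(in);
    out_size = in_left * 4 + 1;         /* roomy enough for 8-bit and UTF-8 targets */
    out = malloc(out_size);
    if (out == NULL) {
        iconv_close(cd);
        return NULL;
    }

    in_ptr = (char *)in;                /* iconv takes a non-const pointer */
    out_ptr = out;
    out_left = out_size - 1;            /* keep one byte for the terminator */

    rc = iconv(cd, &in_ptr, &in_left, &out_ptr, &out_left);
    iconv_close(cd);
    if (rc == (size_t)-1) {
        free(out);
        return NULL;
    }

    *out_ptr = '\0';
    return out;
}
```

With something like this, _(s) would amount to
recode_string("UTF-8", fontEncoding, gettext(s)), where fontEncoding is a
placeholder for whatever encoding the selected font uses. The result would
have to be freed or cached by the caller, which the plain gettext(s) form
does not require.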
_______________________________________________
Bug-XBoard mailing list
[email protected]
https://lists.gnu.org/mailman/listinfo/bug-xboard
