iso10646 in the font name means Unicode. It doesn't mean utf8, since that's just a way of encoding a sequence of Unicode code points (which need up to 21 bits each) as a sequence of 8-bit bytes.
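To make the distinction concrete, here is a minimal Python sketch (not xboard code, just an illustration) of the difference between a code point, which is what an iso10646 font is indexed by, and the utf8 bytes that carry it. The character chosen, u-umlaut, is only an example.

```python
# U+00FC (u with umlaut) is a single Unicode code point...
ch = "\u00fc"
code_point = ord(ch)               # 0xFC: the index a Unicode-encoded font is looked up with

# ...but utf8 carries it as two bytes, neither of which is 0xFC.
utf8_bytes = ch.encode("utf-8")    # b'\xc3\xbc'

# Plain ASCII characters are a single byte, identical in both views.
ascii_bytes = "a".encode("utf-8")  # b'a'

print(hex(code_point), utf8_bytes.hex(), ascii_bytes.hex())
# prints: 0xfc c3bc 61
```

So a font can be Unicode-encoded without the strings handed to the renderer being utf8, and the other way around.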
What we see happening in xboard is that something (maybe Xaw, but I'm not 100% sure) does not understand that the string it is trying to display is in utf8. So whenever there is a character in the utf8 stream that takes up more than one byte, instead of combining the bytes to get a single value > 127 and pulling that code point out of the font, it takes each byte separately, pulls one code point in the range [128..255] out of the font for each byte, and displays those.

It would be better to fix this by handling the utf8 and taking characters from the iso10646-encoded fonts instead of trying to pick an 8-bit character set that fits the user's locale and recoding all the messages into that. Well, except that I don't know exactly where the fix is needed.

I have been googling a bit but didn't find anything that specifically addresses Unicode in Xaw. Here's a good FAQ on the whole topic of utf8 and Unicode in Unix/Linux, though:
http://www.cl.cam.ac.uk/~mgk25/unicode.html

On Tue, May 17, 2011 at 2:54 AM, h.g. muller wrote:

> At 01:13 17-5-2011 -0700, Tim Mann wrote:
>
>> p.s. I just tried, and editing the .xboardrc file to change the fonts from
>> iso8859 to iso10646 did NOT fix the broken umlauts. So there is more to that
>> problem.
>>
>
> I don't really know anything about X fonts, but apparently iso10646 and
> such stand for encodings, and apparently neither of them is UTF-8.
>
> To get it working, two conditions must be satisfied:
> *) The font must define the relevant glyphs.
> *) The strings must be presented to the rendering agent in the encoding the
> latter expects (presumably defined by the selected font).
>
> I get the impression that the fonts defining the relevant glyphs for some
> languages are only available in encodings different from UTF-8. This does
> not need to be fatal; it merely means that the po files should be adapted to
> use the encoding of one of the fonts available for the target language, and
> that the user should be advised to specify that font with XBoard.
>
> It should be easy enough to obtain a po file in another encoding. There are
> several inconveniences though: different strings in the po files will use
> different fonts, and thus might need different encodings. Window titles, for
> instance, are rendered by the window manager, which apparently is using
> UTF-8 (considering the screenshots of the Russian translation on WinBoard
> forum, http://www.open-aurec.com/wbforum/viewtopic.php?f=19&t=51772 ). So
> we would have to figure out which messages are window titles, and put those
> in a different encoding in the po file. The same potentially holds for
> strings rendered in the clock and coords fonts, but there are only a handful
> of those.
>
> It would be much nicer if we could select a font based on the availability
> of glyphs, let XBoard note what encoding it is in, and then let it recode
> the string before it is rendered in that font. I.e. redefine the _(s) macro
> not as gettext(s), but as recode(fromEncoding, toEncoding, gettext(s)),
> where the fromEncoding would always be UTF-8, so the po files could be
> maintained in UTF-8.
>
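
For illustration, both the symptom and the two proposed fixes can be sketched in a few lines of Python. This is only a sketch under assumptions, not xboard code: `recode` here is a hypothetical helper named after the wrapper suggested in the quoted message, and the sample string merely stands in for whatever gettext would return from a UTF-8 po file.

```python
def recode(from_encoding: str, to_encoding: str, data: bytes) -> bytes:
    """Hypothetical helper: re-encode a byte string from one charset to another.

    Characters the target charset cannot represent are replaced with '?'.
    """
    return data.decode(from_encoding).encode(to_encoding, errors="replace")


# Stand-in for a translated message as stored in a UTF-8 po file.
message = "Zur\u00fcck".encode("utf-8")          # b'Zur\xc3\xbcck', 7 bytes

# The symptom: every byte is treated as its own character, so the two
# bytes of the umlaut come out as two separate glyphs.
broken = message.decode("latin-1")               # 'ZurÃ¼ck'

# Fix A: decode the utf8 first, then index a Unicode-encoded font with
# the resulting code points (6 of them, one per character).
code_points = [ord(c) for c in message.decode("utf-8")]

# Fix B: recode into the 8-bit charset of the selected font.
latin1 = recode("utf-8", "iso-8859-1", message)  # b'Zur\xfcck'
```

Fix B only works when the chosen 8-bit charset actually contains the characters of the target language, which is why the font (and its encoding) would have to be selected per language.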
_______________________________________________
Bug-XBoard mailing list
[email protected]
https://lists.gnu.org/mailman/listinfo/bug-xboard
