On Tue, 1 Feb 2005, Ivan Pascal wrote:
Marc Aurele La France wrote:
I don't know enough about this stuff to answer. Perhaps someone else will, and deal with http://www.mail-archive.com/devel%40xfree86.org/msg06114.html.
In the meantime I've backed out the change that's causing this.
I investigated this issue. In one of the following messages Barry Scott mentioned "I recall I was seeing code behind XutfTextExtents fail to return any info without the patch - I think it was Xutf8TextExtents." Therefore I guess he has a problem with the Xutf8* output functions in a non-UTF-8 locale. And it is a real issue. These functions don't work without the changes he proposed. But after such changes the Xmb* output functions don't work.
Obviously, there is some difference in the interpretation of XLC_LOCALE data between these two families (Xutf8* and Xmb*).
The traditional ISO 2022 based font system requires that any input string be split into portions in which all chars belong to one charset. The corresponding converters cut such portions out of the text, label each one with a charset name and pass it to the next procedure, which finds an appropriate font for drawing that portion.
The XLC_LOCALE file contains 'codeset' and 'fontset' records. The 'codeset' records describe the rules for distinguishing chars of different charsets and the charset names suitable for each type of char. The 'fontset' records just define pairs 'charset name' <-> 'font encoding name' that tell which font should be used for each charset (i.e. for the string labeled with this charset name).
What do we see in the iso8859-15 locale file (and in many other one-byte encoding locales)? There are two codeset records
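The splitting step described above can be sketched in a few lines of Python. This is only a toy model, not the Xlib converter code: the function names are hypothetical, and the default labels are the ones the 'mb' converter is said to use in this message.

```python
from itertools import groupby

# Labels the 'mb' converter is described as using in the iso8859-15 locale.
# (Assumption for illustration; the real converter reads them from XLC_LOCALE.)
MB_LABELS = {"GL": "ISO8859-1:GL", "GR": "ISO8859-15:GR"}

def side(byte):
    """Return 'GL' for codes below 128 and 'GR' for codes 128 and above."""
    return "GL" if byte < 128 else "GR"

def split_by_charset(data, labels=MB_LABELS):
    """Cut a one-byte-encoded string into runs that share one charset label."""
    return [(labels[s], bytes(run)) for s, run in groupby(data, key=side)]

segments = split_by_charset("café au lait".encode("iso8859-15"))
for label, chunk in segments:
    print(label, chunk)
```

Each (label, chunk) pair is what the font lookup step would receive next.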
    cs0 {
        side         GL:Default
        ......
        ct_encoding  ISO8859-15:GL; ISO8859-1:GL
    }
    cs1 {
        side         GR:Default
        ......
        ct_encoding  ISO8859-15:GR
    }
and two fontset records
    fs0 {
        charset { name ISO8859-1:GL }
        font { primary ISO8859-15:GL .... }
    }
    fs1 {
        charset { name ISO8859-15:GR }
        font { primary ISO8859-15:GR }
    }
The second record in each pair is the simplest case. The codeset record says that all 'right side' chars (codes > 127) belong to the right side of the ISO8859-15 charset, and the fontset record says that a text portion labeled 'right side of ISO8859-15 charset' can be drawn with some ISO8859-15 encoded font.
The first fontset record is simple too. It says that a string labeled 'ISO8859-1 left side' can be written with an ISO8859-15 encoded font.
But the first codeset record is more complex. It contains two charset names for the same kind of chars ('left side', i.e. codes < 128). And Xutf8* and Xmb* functions use different names from this pair. (I know why, and can explain. But here I omit those details.)
Thus if one wants to output a simple ASCII string like 'abcd', the 'mb' converter labels it as 'ISO8859-1:GL' whereas the 'utf8' converter labels it as 'ISO8859-15:GL'. If the fontset record looks like 'charset: ISO8859-1 -> font: ISO8859-15', the second-step procedure successfully finds an appropriate font for the 'mb' converter's output string but can do nothing for the 'utf8' converter's output. After Barry's changes we got the reverse situation: the first fontset record became 'charset: ISO8859-15 -> font: ISO8859-15', which means that the 'utf8' converter's output has an appropriate font but there is nothing useful for the 'mb' converter's results.
The worst thing is that there is a significant difference in the internals of these converters, and even if we force the 'utf8' converter to choose the ISO8859-1 charset we will not get behavior similar to the 'mb' converter's. In that case the 'utf8' converter labels the 'right side' chars as ISO8859-1:GR (except the Euro sign). But that name also has no appropriate charset->font pair among the fontset records.
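To make the last point concrete, here is a rough Python model of a converter that tries ISO8859-1 first and falls back to ISO8859-15. It is an assumption about the general shape of such a converter, not the actual Xlib implementation; label_char and FONTSET are names invented for this sketch, and Python's own codecs stand in for the charset tables.

```python
# Charset labels that have a font in the unmodified fontset section
# (fs0 and fs1 from the locale file quoted above).
FONTSET = {
    "ISO8859-1:GL": "ISO8859-15:GL",
    "ISO8859-15:GR": "ISO8859-15:GR",
}

def label_char(ch, charsets=("ISO8859-1", "ISO8859-15")):
    """Label a char with the first charset that can encode it, plus its side."""
    for name in charsets:
        try:
            code = ch.encode(name)[0]
        except UnicodeEncodeError:
            continue  # this charset has no such char, try the next one
        return f"{name}:{'GL' if code < 128 else 'GR'}"
    return None

labels = {ch: label_char(ch) for ch in "a\u00e9\u20ac"}  # 'a', e-acute, Euro
unmatched = sorted({l for l in labels.values() if l not in FONTSET})
print(labels)
print("labels without a font:", unmatched)
```

In this model only the Euro sign ends up in ISO8859-15:GR; the other 'right side' char gets ISO8859-1:GR, for which the fontset section has no record.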
I would suggest a simple workaround: instead of changing the first fontset record add a similar one but with another charset name. I.e. the fontset section of XLC_LOCALE will look like:
    fs0 {
        charset { name ISO8859-1:GL }
        font { primary ISO8859-15:GL }
    }
    fs1 {
        charset { name ISO8859-15:GR }
        font { primary ISO8859-15:GR }
    }
    fs2 {
        charset { name ISO8859-15:GL }
        font { primary ISO8859-15:GL }
    }

In such a case the first record would satisfy Xmb* functions and the third record would make Xutf8* functions happy. (Of course, I tested this solution. It works.)
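The three fontset variants discussed in this message can be compared with a small Python check. It is only a sketch of the lookup logic as described here: the dictionaries mirror the charset -> font pairs from the message, and the helper name missing is hypothetical.

```python
# Labels each converter family attaches to text in the iso8859-15 locale,
# as described in this message.
MB_LABELS = ["ISO8859-1:GL", "ISO8859-15:GR"]
UTF8_LABELS = ["ISO8859-15:GL", "ISO8859-15:GR"]

# charset label -> font encoding, one dict per fontset variant
ORIGINAL = {"ISO8859-1:GL": "ISO8859-15:GL", "ISO8859-15:GR": "ISO8859-15:GR"}
PATCHED = {"ISO8859-15:GL": "ISO8859-15:GL", "ISO8859-15:GR": "ISO8859-15:GR"}
WORKAROUND = {**ORIGINAL, "ISO8859-15:GL": "ISO8859-15:GL"}  # adds fs2

def missing(fontset, labels):
    """Return the labels for which the fontset has no font record."""
    return [label for label in labels if label not in fontset]

variants = {"original": ORIGINAL, "patched": PATCHED, "workaround": WORKAROUND}
report = {
    name: {"mb": missing(fs, MB_LABELS), "utf8": missing(fs, UTF8_LABELS)}
    for name, fs in variants.items()
}
for name, result in report.items():
    print(name, result)
```

An empty list means every labeled portion finds a font; only the variant with the extra record leaves both lists empty.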
OK. This makes some sense to me. But should something like this be done to all affected locales?
Marc.
+----------------------------------+-----------------------------------+
| Marc Aurele La France            | work: 1-780-492-9310              |
| Computing and Network Services   | fax:  1-780-492-1729              |
| 352 General Services Building    | email: [EMAIL PROTECTED]          |
| University of Alberta            +-----------------------------------+
| Edmonton, Alberta                |                                   |
| T6G 2H1                          | Standard disclaimers apply        |
| CANADA                           |                                   |
+----------------------------------+-----------------------------------+
XFree86 developer and VP.  ATI driver and X server internals.