On Tue, 1 Feb 2005, Ivan Pascal wrote:

Marc Aurele La France wrote:
I don't know enough about this stuff to answer.  Perhaps someone else will,
and deal with http://www.mail-archive.com/devel%40xfree86.org/msg06114.html.

In the meantime I've backed out the change that's causing this.

I investigated this issue.
In one of the following messages Barry Scott mentioned
" I recall I was seeing code behind XutfTextExtents fail to
 return any info without the patch - I think it was Xutf8TextExtents."
Therefore I guess he has a problem with the Xutf8* output functions in a
non-UTF8 locale.  And it is a real issue.  These functions don't work without
the changes he proposed.  But after such changes the Xmb* output functions
don't work.

Obviously, there is some difference in the interpretation of XLC_LOCALE data
between these two families (Xutf8* and Xmb*).

The traditional iso2022 based font system requires that any input string
be separated into portions where all chars belong to one charset.
The corresponding converters cut such portions out of the text, label them
with a charset name and pass them to the next procedure, which finds an
appropriate font for drawing each portion.
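
To illustrate the splitting step described above, here is a minimal sketch in
Python.  It is not the Xlib converter code: the table CODESETS and the names
side() and split_into_runs() are made up for this example, and the charset
labels in the table are just assumed values for a one-byte locale.

```python
from itertools import groupby

# Hypothetical codeset table: which charset label each "side" gets.
# The labels are assumptions chosen for illustration only.
CODESETS = {"GL": "ISO8859-1:GL", "GR": "ISO8859-15:GR"}


def side(byte):
    """Return 'GL' for codes below 128 and 'GR' for the rest."""
    return "GL" if byte < 0x80 else "GR"


def split_into_runs(data, codesets=CODESETS):
    """Cut a byte string into portions whose chars share one charset.

    Each portion is returned together with the charset label that the
    next step would use to look for a font.
    """
    return [(codesets[s], bytes(group)) for s, group in groupby(data, key=side)]


runs = split_into_runs(b"ab\xa4\xe9cd")
for label, portion in runs:
    print(label, portion)
```

Each (label, portion) pair is what the font lookup step receives; if no
fontset record mentions the label, that portion cannot be drawn.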

The XLC_LOCALE file contains 'codeset' and 'fontset' records.  The 'codeset'
records describe rules for distinguishing chars of different charsets and the
charset names suitable for each type of char.  The 'fontset' records just
define pairs 'charset name' <-> 'font encoding name' that tell which font
should be used for each charset (the string labeled with this charset name).

What do we see in the iso8859-15 locale file (and in many other one-byte
encoding locales)?
There are two codeset records

cs0     { side            GL:Default
......
         ct_encoding     ISO8859-15:GL; ISO8859-1:GL }
cs1     { side            GR:Default
......
         ct_encoding     ISO8859-15:GR }

and two fontset records

fs0     { charset { name            ISO8859-1:GL }
         font    { primary         ISO8859-15:GL
                                            .... } }
fs1     { charset { name            ISO8859-15:GR }
         font    { primary         ISO8859-15:GR } }

The second record in each pair is the simplest case.  The codeset record says
that all 'right side' chars (codes > 127) belong to the right side of the
ISO8859-15 charset, and the fontset record says that a text portion labeled
with 'right side of ISO8859-15 charset' can be drawn with some ISO8859-15
encoded font.

The first fontset record is simple too.  It says that a string labeled
'ISO8859-1 left side' can be written with an ISO8859-15 encoded font.

But the first codeset record is more complex.  It contains two charset names
for the same kind of chars ('left side', i.e. codes < 128).  And Xutf8* and
Xmb* functions use different names from this pair.  (I know why, and can
explain.  But here I omit those details.)

Thus if one wants to output a simple ASCII string like 'abcd', the 'mb'
converter labels it as 'ISO8859-1:GL' whereas the 'utf8' converter labels it
as 'ISO8859-15:GL'.
And if the fontset record looks like 'charset: ISO8859-1 -> font: ISO8859-15',
the second-step procedure successfully finds an appropriate font for the 'mb'
converter output string but can do nothing for the 'utf8' converter output.
After Barry's changes we got the reverse situation: the first fontset record
became 'charset: ISO8859-15 -> font: ISO8859-15', which means that the 'utf8'
converter output has an appropriate font but there is nothing useful for the
'mb' converter results.
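
The mismatch can be modelled with a small Python sketch.  This is only a
simplified model of the lookup, not Xlib code: the dictionaries and the
function find_font() are hypothetical names, and only the charset and font
names come from the records quoted above.

```python
# Fontset records as 'charset name' -> 'font encoding name' tables.
ORIGINAL_FONTSETS = {
    "ISO8859-1:GL": "ISO8859-15:GL",
    "ISO8859-15:GR": "ISO8859-15:GR",
}
# After the proposed change the first record names ISO8859-15:GL instead.
CHANGED_FONTSETS = {
    "ISO8859-15:GL": "ISO8859-15:GL",
    "ISO8859-15:GR": "ISO8859-15:GR",
}

# Label each converter family attaches to a plain ASCII string like 'abcd'.
ASCII_LABEL = {"mb": "ISO8859-1:GL", "utf8": "ISO8859-15:GL"}


def find_font(fontsets, converter):
    """Return the font encoding for ASCII text, or None if no record fits."""
    return fontsets.get(ASCII_LABEL[converter])


for name, table in (("original", ORIGINAL_FONTSETS), ("changed", CHANGED_FONTSETS)):
    for converter in ("mb", "utf8"):
        print(name, converter, find_font(table, converter))
```

With either table exactly one of the two converter families ends up without
a font, which matches the two failure reports.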

The worst thing is that there is a significant difference in the internals of
these converters, and even if we force the 'utf8' converter to choose the
ISO8859-1 charset we will not get behavior similar to the 'mb' converter's.
In this case the 'utf8' converter labels the 'right side' chars with
ISO8859-1:GR (except the Euro sign).  But that name has no appropriate
charset->font pair among the fontset records either.

I would suggest a simple workaround: instead of changing the first fontset
record add a similar one but with another charset name.  I.e. the fontset
section of XLC_LOCALE will look like:

fs0     { charset {  name            ISO8859-1:GL  }
         font    {  primary         ISO8859-15:GL   }}
fs1     { charset {  name            ISO8859-15:GR  }
         font    {  primary         ISO8859-15:GR   }}
fs2     { charset {  name            ISO8859-15:GL  }
         font    {  primary         ISO8859-15:GL   }}

In such a case the first record would satisfy the Xmb* functions and the third
record would make the Xutf8* functions happy.
(Of course, I tested this solution.  It works.)
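
As a rough cross-check of the idea, the three records can be read into a
lookup table with a few lines of Python.  This is only a sketch: the regular
expressions handle just the simple 'name'/'primary' layout shown above, not
the full XLC_LOCALE syntax, and parse_fontsets() is a made-up helper name.

```python
import re

FONTSET_SECTION = """
fs0     { charset {  name            ISO8859-1:GL  }
         font    {  primary         ISO8859-15:GL   }}
fs1     { charset {  name            ISO8859-15:GR  }
         font    {  primary         ISO8859-15:GR   }}
fs2     { charset {  name            ISO8859-15:GL  }
         font    {  primary         ISO8859-15:GL   }}
"""


def parse_fontsets(text):
    """Pair each 'name' value with the 'primary' value that follows it."""
    names = re.findall(r"name\s+(\S+)", text)
    fonts = re.findall(r"primary\s+(\S+)", text)
    return dict(zip(names, fonts))


table = parse_fontsets(FONTSET_SECTION)
# Labels used for ASCII text by the Xmb* and Xutf8* converters respectively.
for label in ("ISO8859-1:GL", "ISO8859-15:GL"):
    print(label, "->", table.get(label))
```

Both labels now resolve to an ISO8859-15:GL font, so neither family is left
without a record.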

OK.  This makes some sense to me.  But should something like this be done to
all affected locales?


Marc.

+----------------------------------+-----------------------------------+
|  Marc Aurele La France           |  work:   1-780-492-9310           |
|  Computing and Network Services  |  fax:    1-780-492-1729           |
|  352 General Services Building   |  email:  [EMAIL PROTECTED]          |
|  University of Alberta           +-----------------------------------+
|  Edmonton, Alberta               |                                   |
|  T6G 2H1                         |     Standard disclaimers apply    |
|  CANADA                          |                                   |
+----------------------------------+-----------------------------------+
XFree86 developer and VP.  ATI driver and X server internals.