Re: [fltk.general] Chinese characters

imacarthur Sat, 01 Nov 2008 10:18:56 -0700

On 1 Nov 2008, at 8:43, qiaogang chen wrote:

> OK, the idea of taking UTF8 as the main encoding in future is a good  
> idea. But now in China, ANSI text that mixes 2-byte Chinese characters  
> with 1-byte English characters is widespread.


Yes - everyone struggles with this! It is not only in China that  
people are having problems with this change (although I suspect it is  
worse in places where multi-byte encodings are frequently mixed with  
ASCII-style text) - for example, here in Europe the often used cp1252  
code page puts printable characters at byte values in the 0x80 to 0x9F  
range, which correspond to C1 control code points in Unicode and are  
not valid as stand-alone bytes in UTF-8. That can be tricky to detect  
until it breaks something!

> As you requested, here is the UTF8 Chinese text in hex format:
> ef bb bf 20 20 31 20 e5 b9 bf e8 a5 bf e5 8c 97 e9 83 a8 e6 b9 be  
> e6 b5 b7 e5 9f 9f e6 b5 b7 e4 b8 8a e6 b2 bb e5 ae 89
> e5 9f ba e7 a1 80 e4 bf a1 e6 81 af e7 b3 bb e7 bb 9f 32 20 e7 ae  
> a1 e7 90 86 33 20 e7 ae a1 e7 90 86 e6 b8 af e5 8f a3
> e7 a0 81 e5 a4 b4 34 20 e7 ae a1 e7 90 86 e5 85 bb e6 ae 96 e5 9c  
> ba 35 20 e7 ae a1 e7 90 86 e6 b8 94 e8 88 b9 36 20 e7
> ae a1 e7 90 86 e6 b8 94 e6 b0 91 37 20 e7 b3 bb e7 bb 9f 38 20 e5  
> b8 ae e5 8a a9 39 20 e9 80 80 e5 87 ba ff

I assume the 0xFF at the end is just a typo? (It is not valid UTF-8)  
Replacing that with 0x00 seems to form a valid UTF-8 string, and in  
my tests it seems to work just fine. I don't seem to see the effects  
you report.
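
For what it is worth, a check along these lines would flag the stray 0xFF (and lone cp1252-style bytes such as 0x80) before the text ever reaches a widget. This is only a minimal sketch: the function name is_valid_utf8 is my own invention, not part of fltk, and it simply applies the well-formedness rules from RFC 3629.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical helper (not an fltk function): returns true if
// buf[0..len) is well-formed UTF-8. Rejects stray continuation bytes,
// truncated sequences, overlong forms, UTF-16 surrogates, values above
// U+10FFFF, and bytes that can never occur (0xC0, 0xC1, 0xF5..0xFF).
bool is_valid_utf8(const unsigned char *buf, std::size_t len)
{
    std::size_t i = 0;
    while (i < len) {
        unsigned char b = buf[i];
        std::size_t extra;      // continuation bytes expected
        std::uint32_t cp;       // code point being assembled
        std::uint32_t min_cp;   // smallest value legal for this length

        if (b < 0x80) { i++; continue; }   // plain ASCII (including 0x00)
        else if ((b & 0xE0) == 0xC0) { extra = 1; cp = b & 0x1F; min_cp = 0x80; }
        else if ((b & 0xF0) == 0xE0) { extra = 2; cp = b & 0x0F; min_cp = 0x800; }
        else if ((b & 0xF8) == 0xF0) { extra = 3; cp = b & 0x07; min_cp = 0x10000; }
        else return false;                 // lone 0x80..0xBF, or 0xF8..0xFF

        if (len - i <= extra) return false;          // sequence is cut short
        for (std::size_t k = 1; k <= extra; k++) {
            unsigned char c = buf[i + k];
            if ((c & 0xC0) != 0x80) return false;    // not a continuation byte
            cp = (cp << 6) | (c & 0x3F);
        }
        if (cp < min_cp) return false;                    // overlong encoding
        if (cp > 0x10FFFF) return false;                  // beyond Unicode range
        if (cp >= 0xD800 && cp <= 0xDFFF) return false;   // surrogate
        i += extra + 1;
    }
    return true;
}
```

Running something like this over the buffer right after reading the file would at least tell you whether the bytes are damaged before fltk sees them, or only afterwards.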

This appears as a list of entries numbered 1 to 9 when I test it  
(although I can't read the actual words!)

Also, for what it is worth, the BOM at the start is probably not  
useful in a UTF-8 string (UTF-8 has no byte order issues) so it is  
probably only useful if you are using it to detect that the string is  
in a UTF-8 encoding - is that what you intend?
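
If the BOM is only there as a marker, one option is to test for it and step past it before handing the text to a widget. A minimal sketch, assuming the file contents are already in memory; utf8_bom_length is a name I have made up for illustration, it is not an fltk call.

```cpp
#include <cstddef>

// Hypothetical helper (not an fltk function): returns 3 if the buffer
// starts with the UTF-8 BOM bytes EF BB BF, otherwise 0. The result is
// the number of leading bytes to skip.
std::size_t utf8_bom_length(const unsigned char *buf, std::size_t len)
{
    if (len >= 3 && buf[0] == 0xEF && buf[1] == 0xBB && buf[2] == 0xBF)
        return 3;
    return 0;
}
```

The text proper then starts at buf + utf8_bom_length(buf, len), so the three marker bytes are not treated as part of the first label.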


> I save this text in a UTF8 text file, and read it into memory in my  
> app, then assign it to widget labels and menu items. Some of the  
> text is displayed garbled. What upsets me is that this  
> behaviour is not consistent: for example, I run the app and find a  
> menu item is garbled; when I quit the app and run it  
> again, that item may display correctly, but another  
> item that was fine in the previous run is now wrong.

Hmm - OK. I am not seeing this in my testing.

Things that occur to me, in no particular order:

- make sure all the strings are null terminated correctly, in case  
the string handling is running off the ends of the buffers.

- many fltk widgets assume that the string you pass them is in static  
storage, and refer directly to the pointer you pass. So you must make  
sure the char array you pass in is not modified elsewhere, and is not  
declared in stack-local storage. If you must pass a string in a  
temporary array, then you can use the widget->copy_label() methods  
rather than the widget->label() methods. This will cause the widget  
to explicitly copy the string rather than referring to your original  
string...

It sounds, from your description, as if the strings are maybe being  
created on the stack or heap in temporary storage and are then  
getting corrupted as the program runs - that would account for why  
the observed corruption varies from run to run. Try modifying your  
code to ensure the strings have correct persistence and see if that  
helps or not.
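
To illustrate the difference between the two storage policies without needing fltk to build it, here is a small stand-alone sketch. FakeWidget is a hypothetical stand-in I have written for this mail, NOT an fltk class; it only mimics the behaviour described above (label() keeps your pointer, copy_label() takes a private copy). The scratch buffer plays the role of a temporary array that gets reused for each item read from the file.

```cpp
#include <cstring>
#include <string>

// Hypothetical stand-in, NOT an fltk class: it only mimics the two
// label storage policies described in the text.
class FakeWidget {
public:
    // Like label(): remember the caller's pointer, make no copy.
    void label(const char *text) { owned_.clear(); shown_ = text; }
    // Like copy_label(): take a private copy of the text.
    void copy_label(const char *text) {
        owned_ = text ? text : "";
        shown_ = owned_.c_str();
    }
    // What the widget would draw.
    const char *label() const { return shown_; }
private:
    const char *shown_ = "";
    std::string owned_;
};

// Reuses one scratch buffer for two items, as a file-reading loop
// might. Returns true if the widget still shows "first" afterwards.
bool label_survives_buffer_reuse(bool use_copy)
{
    char scratch[32];
    FakeWidget w;

    std::strcpy(scratch, "first");
    if (use_copy) w.copy_label(scratch);
    else          w.label(scratch);

    std::strcpy(scratch, "second");   // buffer reused for the next item

    return std::strcmp(w.label(), "first") == 0;
}
```

With the pointer-only policy the widget ends up showing whatever was last written into the buffer; with the copying policy it keeps its own text. In a real program the buffer may also go out of scope entirely, in which case what gets drawn depends on whatever happens to occupy that memory, which would fit the run-to-run variation you describe.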

Hope that helps,
-- 
Ian










