On 1 Nov 2008, at 8:43, qiaogang chen wrote:

> OK, The idea to take UTF8 as future's main coding method is a good
> idea. But now in china, 2 byte chinese character mix with 1 byte
> english character in ANSI text is widly spreaded.

Yes - everyone struggles with this! It is not only in China that people
are having problems with this change (although I suspect it is worse in
places where multi-byte encodings are frequently mixed with ASCII-style
text). For example, here in Europe the often-used cp1252 code page puts
printable characters at byte values in the 0x80-0x9F range, which
Unicode reserves for control characters and which are not valid as lone
bytes in UTF-8. That can be tricky to detect until it breaks something!

> As your wish, I provide the UTF8 chinese text in hex format as
> following:
> ef bb bf 20 20 31 20 e5 b9 bf e8 a5 bf e5 8c 97 e9 83 a8 e6 b9 be
> e6 b5 b7 e5 9f 9f e6 b5 b7 e4 b8 8a e6 b2 bb e5 ae 89
> e5 9f ba e7 a1 80 e4 bf a1 e6 81 af e7 b3 bb e7 bb 9f 32 20 e7 ae
> a1 e7 90 86 33 20 e7 ae a1 e7 90 86 e6 b8 af e5 8f a3
> e7 a0 81 e5 a4 b4 34 20 e7 ae a1 e7 90 86 e5 85 bb e6 ae 96 e5 9c
> ba 35 20 e7 ae a1 e7 90 86 e6 b8 94 e8 88 b9 36 20 e7
> ae a1 e7 90 86 e6 b8 94 e6 b0 91 37 20 e7 b3 bb e7 bb 9f 38 20 e5
> b8 ae e5 8a a9 39 20 e9 80 80 e5 87 ba ff

I assume the 0xFF at the end is just a typo? (It is not valid UTF-8.)
Replacing it with 0x00 seems to form a valid UTF-8 string, and in my
tests it works just fine: I do not see the effects you report. The text
appears as a list of entries numbered 1 to 9 when I test it (although I
can't read the actual words!)

Also, for what it is worth, the BOM at the start is probably not useful
in a UTF-8 string (UTF-8 has no byte order issues), so it only really
helps if you are using it to detect that the string is UTF-8 encoded -
is that what you intend?

> I save this text in a UTF8 text file, and read it into memory in my
> app, Then assign them to widget label and menu item. some of the
> text are display in mess.
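
A rough sketch of a check that could be run on the bytes read from the
file, in plain C++ with nothing fltk-specific: it skips a leading BOM and
reports the first byte that cannot be part of well-formed UTF-8, such as
a stray 0xFF or a lone cp1252 byte. The names skip_utf8_bom and
first_invalid_utf8 are made up for this example and are not part of fltk
or any other library.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper: returns 3 if the buffer starts with the UTF-8 BOM
// (EF BB BF), otherwise 0, so the caller can skip it before using the text.
std::size_t skip_utf8_bom(const std::string& bytes) {
    if (bytes.size() >= 3 &&
        static_cast<unsigned char>(bytes[0]) == 0xEF &&
        static_cast<unsigned char>(bytes[1]) == 0xBB &&
        static_cast<unsigned char>(bytes[2]) == 0xBF) {
        return 3;
    }
    return 0;
}

// Hypothetical helper: returns the offset of the first byte sequence that is
// not well-formed UTF-8, or std::string::npos if everything from `start`
// onwards is well-formed.
std::size_t first_invalid_utf8(const std::string& bytes, std::size_t start = 0) {
    const std::size_t n = bytes.size();
    assert(start <= n);  // precondition: start must lie inside the buffer
    std::size_t i = start;
    while (i < n) {
        const unsigned char lead = static_cast<unsigned char>(bytes[i]);
        if (lead < 0x80) { ++i; continue; }  // plain ASCII, including 0x00

        std::size_t len = 0;   // total length of the sequence in bytes
        unsigned long cp = 0;  // code point being assembled
        if (lead >= 0xC2 && lead <= 0xDF)      { len = 2; cp = lead & 0x1F; }
        else if (lead >= 0xE0 && lead <= 0xEF) { len = 3; cp = lead & 0x0F; }
        else if (lead >= 0xF0 && lead <= 0xF4) { len = 4; cp = lead & 0x07; }
        else return i;  // 0x80-0xC1 and 0xF5-0xFF can never start a sequence

        if (len > n - i) return i;  // sequence is cut off by the end of data
        for (std::size_t k = 1; k < len; ++k) {
            const unsigned char cont = static_cast<unsigned char>(bytes[i + k]);
            if ((cont & 0xC0) != 0x80) return i;  // not a continuation byte
            cp = (cp << 6) | (cont & 0x3F);
        }
        // Reject overlong encodings, surrogates and values above U+10FFFF.
        if (len == 3 && cp < 0x800) return i;
        if (len == 4 && (cp < 0x10000 || cp > 0x10FFFF)) return i;
        if (cp >= 0xD800 && cp <= 0xDFFF) return i;
        i += len;
    }
    return std::string::npos;
}
```

Calling first_invalid_utf8(text, skip_utf8_bom(text)) on the bytes quoted
above should point at the trailing ff; with that byte changed to 00 it
should report no problem.
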
> What make me upset is that this
> phenomenon is not consistant, for example, I run the app, I found a
> menu item is display in mess, when I quit the app ,and run
> again ,it maybe display correct on the certain item ,but another
> item is display wrong which works good in previous run.

Hmm - OK. I am not seeing this in my testing. Things that occur to me,
in no particular order:

- Make sure all the strings are null terminated correctly, in case the
  string handling is running off the ends of the buffers.

- Many fltk widgets assume that the string you pass them is in static
  storage, and refer directly to the pointer you pass. So you must make
  sure the char array you pass in is not modified elsewhere, and is not
  declared in stack-local storage. If you must pass a string in a
  temporary array, then you can use the widget->copy_label() methods
  rather than the widget->label() methods. This will cause the widget
  to explicitly copy the string rather than referring to your original
  string...

It sounds, from your description, as if the strings are maybe being
created on the stack or heap in temporary storage and are then getting
corrupted as the program runs - that would account for why the observed
corruption varies from run to run. Try modifying your code to ensure
the strings have correct persistence and see if that helps or not.

Hope that helps,
--
Ian

_______________________________________________
fltk mailing list
http://lists.easysw.com/mailman/listinfo/fltk
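
To illustrate the label storage point in the reply above, here is a
minimal sketch in plain C++. MockWidget is a made-up stand-in, not fltk
code: it only mimics the difference described above, where label() keeps
the caller's pointer and copy_label() takes a private copy. The helper
label_after_buffer_reuse is likewise invented for the example.

```cpp
#include <cassert>
#include <cstdlib>
#include <cstring>
#include <string>

// Hypothetical stand-in for a widget (NOT the fltk implementation).
class MockWidget {
public:
    MockWidget() : label_(nullptr), owned_(nullptr) {}
    ~MockWidget() { std::free(owned_); }
    MockWidget(const MockWidget&) = delete;
    MockWidget& operator=(const MockWidget&) = delete;

    // Keeps the caller's pointer as-is: the caller must keep the text
    // alive and unchanged for as long as the widget may display it.
    void label(const char* text) {
        std::free(owned_);
        owned_ = nullptr;
        label_ = text;
    }

    // Takes a private copy, so the caller's buffer may be reused afterwards.
    void copy_label(const char* text) {
        char* copy = nullptr;
        if (text) {
            const std::size_t n = std::strlen(text) + 1;
            copy = static_cast<char*>(std::malloc(n));
            if (copy) std::memcpy(copy, text, n);
        }
        std::free(owned_);  // freed only after copying, in case text aliased it
        owned_ = copy;
        label_ = copy;
    }

    const char* label() const { return label_; }

private:
    const char* label_;  // what the widget would draw
    char* owned_;        // non-null only when the widget owns a copy
};

// Sets a label from a temporary buffer, then reuses the buffer for other
// text, and returns what the widget would now display.
std::string label_after_buffer_reuse(bool use_copy) {
    char buffer[32];
    // "2 " followed by the six UTF-8 bytes of the second entry quoted above.
    std::strcpy(buffer, "2 " "\xE7\xAE\xA1\xE7\x90\x86");

    MockWidget w;
    if (use_copy) w.copy_label(buffer);
    else          w.label(buffer);

    std::strcpy(buffer, "overwritten");  // buffer reused for something else
    assert(w.label() != nullptr);
    return w.label();
}
```

With label() the widget ends up showing whatever the buffer holds later;
with copy_label() it keeps the original text. In a real program the
buffer may also have gone out of scope by then, in which case what gets
drawn is unpredictable, which would fit corruption that changes from run
to run. With fltk itself the corresponding calls are
widget->label(buffer) and widget->copy_label(buffer).
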

