Re: [fltk.general] Chinese characters

MacArthur, Ian (SELEX GALILEO, UK) Fri, 31 Oct 2008 03:02:14 -0700


>     I have tried compiling FLTK 1.3 with my code. Most
> Chinese characters are displayed well, and editing Chinese
> works well.


At least that is a start!

> But it still has some problems. In window labels
> and menu labels, some Chinese characters are still garbled.

OK - do you know if this is happening simply because the selected font
does not have glyphs for these characters, or is it because the code
point for the characters is encoded incorrectly in some way?

What are the Unicode code points for the failing glyphs? Are they from
the BMP (Basic Multilingual Plane) or from one of the "higher level"
planes?
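
If it helps to answer that, here is a rough, standalone sketch (plain
C++, not FLTK code; the names decode_utf8 and in_bmp are made up for
this mail) that turns a UTF-8 string into code points and says whether
each one lies in the BMP. It assumes the input is mostly well formed and
simply substitutes U+FFFD for anything it cannot decode:

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Decode UTF-8 into code points. Bytes that cannot be decoded are
// reported as U+FFFD so that they show up in the result.
// Note: this sketch does not reject overlong forms or surrogates.
std::vector<std::uint32_t> decode_utf8(const std::string& s) {
    std::vector<std::uint32_t> out;
    std::size_t i = 0;
    while (i < s.size()) {
        unsigned char b = static_cast<unsigned char>(s[i]);
        std::uint32_t cp = 0;
        std::size_t extra = 0;  // number of continuation bytes expected
        bool ok = true;
        if (b < 0x80)                { cp = b;        extra = 0; }
        else if ((b & 0xE0) == 0xC0) { cp = b & 0x1F; extra = 1; }
        else if ((b & 0xF0) == 0xE0) { cp = b & 0x0F; extra = 2; }
        else if ((b & 0xF8) == 0xF0) { cp = b & 0x07; extra = 3; }
        else                         { ok = false; }
        if (ok && i + extra >= s.size() && extra > 0) ok = false;  // truncated
        for (std::size_t k = 1; ok && k <= extra; ++k) {
            unsigned char c = static_cast<unsigned char>(s[i + k]);
            if ((c & 0xC0) != 0x80) ok = false;       // not a continuation byte
            else cp = (cp << 6) | (c & 0x3F);
        }
        if (!ok) { out.push_back(0xFFFD); ++i; continue; }
        out.push_back(cp);
        i += extra + 1;
    }
    return out;
}

// True if the code point lies in the Basic Multilingual Plane.
bool in_bmp(std::uint32_t cp) { return cp <= 0xFFFF; }
```

Feeding it the label text and printing each value in hex should show
quickly whether the failing characters sit above U+FFFF.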

Can you post some example strings that fail? (Ideally, as an array of
the underlying bytes rather than a string...) And some that work, and
maybe some that are a bit of both?
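
Something along these lines would produce the byte array I am asking
for. It is only a sketch, and bytes_as_array is a name I have invented
here, not an FLTK function:

```cpp
#include <cstddef>
#include <cstdio>
#include <string>

// Format the raw bytes of a string as a C array initialiser, e.g.
// "{ 0xE4, 0xB8, 0xAD }", so that the bytes survive mail transport
// regardless of what the mail client does to the text encoding.
std::string bytes_as_array(const std::string& s) {
    std::string out = "{";
    for (std::size_t i = 0; i < s.size(); ++i) {
        char buf[8];  // large enough for ", 0xNN" plus the terminator
        unsigned value = static_cast<unsigned char>(s[i]);
        std::snprintf(buf, sizeof buf, "%s0x%02X", i ? ", " : " ", value);
        out += buf;
    }
    out += " }";
    return out;
}
```

Pass it the exact string you hand to the window or menu label and paste
the result into your reply.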

> In my experience, UTF-8 does not support Chinese well
> enough. Not only in FLTK, but also in some web pages and
> emails.

My experience has been the exact opposite; UTF-8 seems to be more
reliable, and where it has failed, it is because the input strings were
not well formed, not because the UTF-8 mechanism did not work.
That said, I am sure you have considerably more experience with Chinese
text handling than I do, of course!
There are many Chinese, Japanese and Korean contributors to the Unicode
project, so I imagine they must be working to make it a credible
solution.
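
On the point about well-formed input: a check like the following sketch
can tell you whether a string is valid UTF-8 before it ever reaches a
label. The function name is hypothetical and this is standalone C++, not
what FLTK does internally. It rejects truncated sequences, stray
continuation bytes, overlong forms, surrogates and values above
U+10FFFF:

```cpp
#include <cstddef>
#include <string>

// Return true only if every byte of s belongs to a well-formed
// UTF-8 sequence.
bool is_well_formed_utf8(const std::string& s) {
    const std::size_t n = s.size();
    std::size_t i = 0;
    while (i < n) {
        unsigned char b = static_cast<unsigned char>(s[i]);
        if (b < 0x80) { ++i; continue; }           // plain ASCII
        std::size_t extra;                          // continuation bytes
        unsigned long cp, min;                      // value and smallest legal value
        if ((b & 0xE0) == 0xC0)      { extra = 1; cp = b & 0x1F; min = 0x80; }
        else if ((b & 0xF0) == 0xE0) { extra = 2; cp = b & 0x0F; min = 0x800; }
        else if ((b & 0xF8) == 0xF0) { extra = 3; cp = b & 0x07; min = 0x10000; }
        else return false;                          // stray continuation or invalid lead
        if (n - i <= extra) return false;           // sequence is truncated
        for (std::size_t k = 1; k <= extra; ++k) {
            unsigned char c = static_cast<unsigned char>(s[i + k]);
            if ((c & 0xC0) != 0x80) return false;   // expected 10xxxxxx
            cp = (cp << 6) | (c & 0x3F);
        }
        if (cp < min) return false;                 // overlong encoding
        if (cp > 0x10FFFF) return false;            // beyond Unicode range
        if (cp >= 0xD800 && cp <= 0xDFFF) return false;  // surrogate
        i += extra + 1;
    }
    return true;
}
```

As an illustration, the two bytes 0xD6 0xD0 (the GB2312 form of U+4E2D)
fail this check, while 0xE4 0xB8 0xAD (the UTF-8 form of the same
character) pass. If your failing labels do not pass, the text is most
likely still in a legacy encoding somewhere along the way.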

> So I fall back on an old method used in China, that is,
> using ANSI coding: when a byte greater than 127 is
> encountered, treat it as the start of a 2-byte character. But
> be careful, matching the 2-byte characters requires analysing
> the whole line.
>    So my proposal is to provide a code page option in FLTK, so
> that app developers can choose UTF-8 or ANSI or another encoding.
>    Another option is, in character processing, to use
> std::_wstring instead of char*.

I don't think there is any prospect of us switching to a wide-character
mechanism like that. Wide-character schemes are hard to maintain (harder
than UTF-8), suffer from endianness issues, are not very portable, and
there are too many different variants...
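
To make the endianness point concrete, here is a small sketch (the
helper name is invented for this mail) that writes one 16-bit code unit
to bytes in either byte order. The same character comes out as two
different byte sequences, so every reader has to know, or guess, which
order the writer used:

```cpp
#include <cstdint>
#include <string>

// Serialise a single 16-bit code unit as two bytes, most significant
// byte first if big_endian is true, least significant first otherwise.
std::string utf16_unit_bytes(std::uint16_t unit, bool big_endian) {
    char hi = static_cast<char>(unit >> 8);
    char lo = static_cast<char>(unit & 0xFF);
    return big_endian ? std::string{hi, lo} : std::string{lo, hi};
}
```

For U+4E2D this yields 0x4E 0x2D in one order and 0x2D 0x4E in the
other. UTF-8 is defined as a byte sequence, so it has no such variation.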

UTF-8 gives reasonably efficient encoding, does not suffer from
endianness issues, maps easily onto the Unicode code points and is
available on every platform we support, so it really is the only
credible choice. It is also the choice that all the operating systems we
support have made - even MS have embraced UTF-8 as a string encoding.
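
As a rough illustration of how directly code points map onto UTF-8, the
whole encoder fits in a few lines. This is a standalone sketch with a
made-up name, and it assumes the caller passes a valid Unicode scalar
value (no surrogates, nothing above U+10FFFF):

```cpp
#include <cstdint>
#include <string>

// Encode one code point as UTF-8. Assumes cp is a valid scalar value;
// no range or surrogate checking is done in this sketch.
std::string encode_utf8(std::uint32_t cp) {
    std::string out;
    if (cp < 0x80) {                       // 1 byte: 0xxxxxxx
        out += static_cast<char>(cp);
    } else if (cp < 0x800) {               // 2 bytes: 110xxxxx 10xxxxxx
        out += static_cast<char>(0xC0 | (cp >> 6));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else if (cp < 0x10000) {             // 3 bytes: 1110xxxx 10xxxxxx 10xxxxxx
        out += static_cast<char>(0xE0 | (cp >> 12));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else {                               // 4 bytes: 11110xxx then 3 x 10xxxxxx
        out += static_cast<char>(0xF0 | (cp >> 18));
        out += static_cast<char>(0x80 | ((cp >> 12) & 0x3F));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    }
    return out;
}
```

The output bytes are the same on every platform, which is the property
that makes UTF-8 attractive for a cross-platform toolkit.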

The archaic mechanisms of code pages and platform-specific multi-byte
encodings are an absolute nightmare to support, and no really workable
cross-platform standard exists (other than the multi-byte UTF encodings,
and if you are switching to one of them, you might as well go UTF-8
anyway...)







