Re: Byte Order Marks

Markus Scherer Thu, 19 Apr 2001 19:13:46 -0700

Yves Arrouye wrote:
> > If you don't have any clue about the byte order, but you know it is
> > UTF-16, then assume BE.

> Then why is ICU mapping UTF-16 to UTF16_PlatformEndian and not
> UTF16_BigEndian?

ICU does not do Unicode-signature or other encoding detection as part of a converter. 
When you get text from some protocol, you need to instantiate a converter according to 
what you know about the encoding.

Note that guessing big-endian is only the last, desperate step in detecting the
encoding. It is not the first choice. If the text is properly tagged (possibly with
a signature), then you will never have to open a "UTF-16" converter.
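
The order of checks described above can be sketched roughly as follows. This is only an illustration in Python using its standard codec names, not ICU code; the function names pick_utf16_codec and decode_utf16 and the "declared" parameter are hypothetical, standing in for whatever tag the protocol supplies.

```python
import codecs
from typing import Optional, Tuple


def pick_utf16_codec(data: bytes, declared: Optional[str] = None) -> Tuple[str, int]:
    """Return (codec name, number of leading signature bytes to skip)."""
    # 1. An explicit tag from the protocol wins; no guessing is needed.
    if declared is not None:
        return declared, 0
    # 2. Otherwise look for a Unicode signature (byte order mark).
    if data.startswith(codecs.BOM_UTF16_BE):
        return "utf-16-be", len(codecs.BOM_UTF16_BE)
    if data.startswith(codecs.BOM_UTF16_LE):
        return "utf-16-le", len(codecs.BOM_UTF16_LE)
    # 3. Last resort only: known to be UTF-16, no other clue, so assume BE.
    return "utf-16-be", 0


def decode_utf16(data: bytes, declared: Optional[str] = None) -> str:
    codec, skip = pick_utf16_codec(data, declared)
    return data[skip:].decode(codec)
```

With this sketch, the big-endian guess is reached only when both the tag and the signature are missing, which matches the point that it is the fallback and not the default.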

On the other hand, if you get a file from your platform and it is in 16-bit Unicode, 
then you would appreciate the convenience of the auto-endian alias.
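
For comparison, a platform-endian choice might look like the sketch below. Again this is Python, not ICU, and the helper names are hypothetical; it only illustrates the idea of picking the byte order from the machine the code runs on when the data is known to come from that same machine.

```python
import sys


def platform_utf16_codec() -> str:
    # Pick the UTF-16 variant that matches this machine's native byte order.
    return "utf-16-le" if sys.byteorder == "little" else "utf-16-be"


def decode_local_utf16(data: bytes) -> str:
    # Appropriate only for 16-bit Unicode text produced on this platform.
    return data.decode(platform_utf16_codec())
```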

markus
