This code is not helpful in dealing with endianness because I need to
know the endianness before I read in the data, not after. Right now
my code calling ReadAll(Encodings.UTF16) returns a gibberish string
because it assumes big-endian but the file is little-endian. I can
check the endianness using the trick above, but TextInputStream does
not give a way to specify it, hence I thought BinaryStream would help.

ElfData can help you a lot with Unicode encodings. It doesn't handle non-Unicode encodings, except for Latin-1, which isn't Mac OS's encoding!

Anyhow, with ElfData, you'd do something like this:

dim e as ElfData
e = bs.Readall // bs is the stream you already have open on the file
e = e.ConvertToUTF8 // this function reads BOMs

If your data has a BOM, it'll work just fine, whether it's UTF-16 LE or UTF-16 BE. Easy, eh? The BOM is stripped if it exists.
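
For comparison, here is roughly what a BOM check looks like by hand with only the built-in string functions. This is just a sketch, not how ElfData does it internally. DecodeUTF16 is a made-up name, and falling back to big-endian when there is no BOM is my assumption:

Function DecodeUTF16(raw As String) As String
  // Hypothetical helper, not part of ElfData or the framework.
  dim s as String
  dim b0, b1 as Integer
  b0 = AscB(MidB(raw, 1, 1))
  b1 = AscB(MidB(raw, 2, 1))
  if b0 = &hFF and b1 = &hFE then
    // FF FE is the little-endian BOM: drop it and tag the rest
    s = DefineEncoding(MidB(raw, 3), Encodings.UTF16LE)
  elseif b0 = &hFE and b1 = &hFF then
    // FE FF is the big-endian BOM
    s = DefineEncoding(MidB(raw, 3), Encodings.UTF16BE)
  else
    // no BOM found: big-endian here is only an assumption
    s = DefineEncoding(raw, Encodings.UTF16BE)
  end if
  return ConvertEncoding(s, Encodings.UTF8)
End Function

You'd call it with the raw bytes, e.g. DecodeUTF16(bs.Read(bs.Length)), instead of letting TextInputStream pick the byte order for you.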

What if the BOM doesn't exist? Well, if the file is XML or the first 4 characters are guaranteed ASCII, you can do something like this:

dim e as ElfData
e = bs.Readall
e.EncodingXMLGuess // guess the encoding first, since there is no BOM to read
e = e.ConvertToUTF8


That'll convert it to UTF-8, even if it doesn't have a BOM :)
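
The reason the "guaranteed ASCII" condition matters: an ASCII character in UTF-16 is one zero byte plus one non-zero byte, so the position of the zero tells you the byte order. A rough sketch of that idea with the built-in functions follows. GuessUTF16Encoding is a made-up name, and I'm not claiming this is what EncodingXMLGuess does internally:

Function GuessUTF16Encoding(raw As String) As TextEncoding
  // Hypothetical helper. Only valid if the first character is known
  // to be ASCII, such as the "<" that starts an XML file.
  dim b0, b1 as Integer
  b0 = AscB(MidB(raw, 1, 1))
  b1 = AscB(MidB(raw, 2, 1))
  if b0 <> 0 and b1 = 0 then
    return Encodings.UTF16LE // low byte first, e.g. 3C 00
  elseif b0 = 0 and b1 <> 0 then
    return Encodings.UTF16BE // high byte first, e.g. 00 3C
  else
    return nil // can't tell from the first two bytes
  end if
End Function

If it returns nil, the data is probably not UTF-16 at all (or doesn't start with ASCII), so don't trust a guess in that case.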

