I have an app that needs to auto-detect Unicode and plain text, and render them correctly based on that auto-detection.

I have the UTF16 stuff working, but with UTF8 I have a problem: there is no BOM to let me know if it's Unicode, and some plain text files will occasionally have high-ASCII values in them (like the dagger symbol).

What patterns should I be looking for in the binary data of a file to distinguish UTF8 from plain text?

--
 Richard Gaskin
 Fourth World
 LiveCode training and consulting: http://www.fourthworld.com
 Webzine for LiveCode developers: http://www.LiveCodeJournal.com
 LiveCode Journal blog: http://LiveCodejournal.com/blog.irv
_______________________________________________
use-revolution mailing list
use-revolution@lists.runrev.com
Please visit this url to subscribe, unsubscribe and manage your subscription 
preferences:
http://lists.runrev.com/mailman/listinfo/use-revolution

Reply via email to