Distinguishing between ASCII and UTF8

Richard Gaskin Wed, 06 Oct 2010 13:23:29 -0700

I have an app that needs to auto-detect whether a file is Unicode or plain text, and render it correctly based on that auto-detection.

I have the UTF16 stuff working, but with UTF8 I have a problem: there is no BOM to let me know if it's Unicode, and some plain text files will occasionally have high-ASCII values in them (like the dagger symbol).

What patterns should I be looking for in the binary data of a file to distinguish UTF8 from plain text?
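
To make the question concrete, here is a rough sketch of the kind of byte-pattern check I have in mind, written in Python only for illustration. It relies on the fact that in UTF8 every byte above 127 belongs to a multi-byte sequence: a lead byte (0xC2-0xDF, 0xE0-0xEF, or 0xF0-0xF4) followed by one, two, or three continuation bytes of the form 10xxxxxx (0x80-0xBF). A lone high byte, such as a dagger from a single-byte encoding, breaks that pattern. The function name classify_bytes and its three labels are hypothetical, and the check is simplified: it does not reject overlong forms or surrogate code points.

```python
def classify_bytes(data: bytes) -> str:
    """Return 'ascii', 'utf8', or 'other' (hypothetical labels).

    Simplified check: validates lead/continuation byte structure only,
    not overlong encodings or surrogate code points.
    """
    i = 0
    n = len(data)
    saw_multibyte = False
    while i < n:
        b = data[i]
        if b < 0x80:
            # Plain 7-bit ASCII byte; valid in both ASCII and UTF8.
            i += 1
            continue
        # Decide how many continuation bytes the lead byte requires.
        if 0xC2 <= b <= 0xDF:
            need = 1
        elif 0xE0 <= b <= 0xEF:
            need = 2
        elif 0xF0 <= b <= 0xF4:
            need = 3
        else:
            # Stray continuation byte or a byte never used as a UTF8 lead.
            return "other"
        tail = data[i + 1:i + 1 + need]
        # Every continuation byte must look like 10xxxxxx.
        if len(tail) != need or any((t & 0xC0) != 0x80 for t in tail):
            return "other"
        saw_multibyte = True
        i += 1 + need
    return "utf8" if saw_multibyte else "ascii"
```

With this sketch, a file containing only 7-bit bytes comes back as 'ascii', a dagger stored as the three bytes E2 80 A0 comes back as 'utf8', and a dagger stored as a single high byte comes back as 'other', which the app could then treat as plain text in the platform's legacy encoding.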


--
 Richard Gaskin
 Fourth World
 LiveCode training and consulting: http://www.fourthworld.com
 Webzine for LiveCode developers: http://www.LiveCodeJournal.com
 LiveCode Journal blog: http://LiveCodejournal.com/blog.irv
