RE: Detecting encoding in Plain text

Chris Pratley Thu, 08 Jan 2004 15:28:05 -0800

If you are on the Windows platform, look at mlang.dll and at the
IMultiLanguage2 and IMultiLanguage3 APIs, which provide this service. As
others have noted, you will get false detections with too little or
ambiguous data, but you may be surprised at just how accurate this
detection is (sometimes a single character outside the ASCII
repertoire is enough), since it uses language frequency data as well
as encoding rules.
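
For anyone not on Windows, here is a rough Python sketch of the
"encoding rules" half only: check for a BOM, then try strict UTF-8,
then try a short list of legacy encodings. This is not what MLang does
and it uses no language frequency data, so it cannot tell similar
single-byte encodings (Mac Roman vs. Windows ANSI, say) apart. The
function name guess_encoding and the candidate list are my own
assumptions for illustration.

```python
import codecs

# BOMs are checked longest-first so that the UTF-32 LE mark (FF FE 00 00)
# is not mistaken for the UTF-16 LE mark (FF FE).
_BOMS = [
    (codecs.BOM_UTF32_LE, "utf-32"),
    (codecs.BOM_UTF32_BE, "utf-32"),
    (codecs.BOM_UTF8, "utf-8-sig"),
    (codecs.BOM_UTF16_LE, "utf-16"),
    (codecs.BOM_UTF16_BE, "utf-16"),
]


def guess_encoding(data: bytes, candidates=("shift_jis", "cp1252")) -> str:
    """Return a plausible Python codec name for data (hypothetical helper)."""
    # 1. A BOM, when present, is the strongest signal.
    for bom, name in _BOMS:
        if data.startswith(bom):
            return name

    # 2. Pure 7-bit data is valid in almost every encoding; call it ASCII.
    if all(b < 0x80 for b in data):
        return "ascii"

    # 3. Non-ASCII bytes that happen to form valid UTF-8 are rarely
    #    anything else, so a strict decode is a good test.
    try:
        data.decode("utf-8")
        return "utf-8"
    except UnicodeDecodeError:
        pass

    # 4. Try the caller's legacy candidates in order (assumed list).
    #    Stricter multi-byte encodings should come before permissive
    #    single-byte ones, which accept nearly any byte sequence.
    for name in candidates:
        try:
            data.decode(name)
            return name
        except UnicodeDecodeError:
            continue

    # 5. Latin-1 maps every byte to a character, so it never fails.
    return "latin-1"
```

Calling guess_encoding(open(path, "rb").read()) gives a codec name you
can pass to bytes.decode(). The order of the candidate list matters a
great deal, which is exactly the weakness that frequency data
addresses.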

Chris

-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On
Behalf Of Brijesh Sharma
Sent: January 8, 2004 3:08 AM
To: Unicode Mailing List
Subject: Detecting encoding in Plain text

Hi All,
I am new to Unicode.
I am writing a small tool to load text from a txt file into an edit box.
This txt file could be in any encoding, e.g. UTF-8, UTF-16, Mac
Roman, Windows ANSI, Western (ISO-8859-1), JIS, Shift-JIS, etc.
I can identify UTF-8 and UTF-16 using the
BOM.
But how do I auto-detect the others?
Any kind of help will be appreciated.

Regards
Brijesh Sharma 

"You're not obligated to win. You're obligated to keep trying to do the
best you can every day."
