>> Guessing UTF-8 is almost always correct; it's well over 99%
>> reliable. As long as the text is well-formed UTF-8, it very likely is
>> UTF-8. Of course, the longer the text and the higher the fraction of
>> high bytes, the likelier it's UTF-8.
>>
>> This is very easy to do with my ElfData plugin, using
>> the .Verify function.
>>
>> I use this in practice, for my Encoding Master app. It guesses the
>> encoding of text files for you, amongst other things.
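
To make the heuristic above concrete, here is a minimal sketch in Python. It is not the ElfData .Verify function, just an illustration of the same idea; the name guess_utf8 and its three return labels are made up for this example. It relies only on Python's strict UTF-8 decoder, which rejects malformed sequences.

```python
def guess_utf8(data: bytes) -> str:
    """Classify bytes as 'ascii', 'utf-8', or 'not utf-8' (hypothetical labels)."""
    try:
        # The default error mode is strict: any malformed sequence raises.
        data.decode("utf-8")
    except UnicodeDecodeError:
        return "not utf-8"
    # Pure ASCII is valid UTF-8 but carries no evidence either way;
    # only high bytes (0x80 and above) make the guess meaningful.
    if any(b >= 0x80 for b in data):
        return "utf-8"
    return "ascii"


if __name__ == "__main__":
    for sample in ("café".encode("utf-8"), "café".encode("latin-1"), b"hello"):
        print(sample, guess_utf8(sample))
```

The Latin-1 sample fails because a lone 0xE9 byte starts a multi-byte sequence that never completes, which is why well-formed text containing high bytes is such a strong signal.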
>
> A worthwhile suggestion, so I tried Encoding Master, and it also
> guesses, but not correctly.
>
> It comes up with UTF-16, which is closer in some respects.
>
> But this really seems to be ISO Latin-1 data stored 16 bits per
> character.
>
> Very odd, and I'm really not sure how to recognize that.

What? ISO Latin-1 using 16 bits per character *is* UTF-16. ISO Latin-1
uses codes 0-255 only, and those are the same as the first 256 Unicode
code points... Unless we are going to U+0100 (256) or above, it's the
same.
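
A quick way to check this claim, sketched in Python. The helper widen_latin1 is a hypothetical name for this example; it pads each Latin-1 byte out to a 16-bit unit, and the result is compared against Python's own UTF-16 encoder (without a byte-order mark).

```python
def widen_latin1(data: bytes, byteorder: str = "big") -> bytes:
    """Store each Latin-1 byte as one 16-bit unit (hypothetical helper)."""
    return b"".join(b.to_bytes(2, byteorder) for b in data)


if __name__ == "__main__":
    text = "naïve café"
    latin1 = text.encode("latin-1")

    # Codes 0-255 map directly to U+0000..U+00FF, so the widened bytes
    # match the UTF-16 encoding in either byte order.
    print(widen_latin1(latin1, "big") == text.encode("utf-16-be"))
    print(widen_latin1(latin1, "little") == text.encode("utf-16-le"))
```

So a guesser that reports UTF-16 for such data is not wrong; the only remaining question is the byte order.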

--
http://elfdata.com/plugin/
"String processing, done right"

