>> Guessing UTF-8 is almost always correct. It's way over 99%
>> reliability. As long as the UTF-8 is well formed, it's likely to be
>> UTF-8. The longer the text of course, and the higher fraction of high
>> bytes, the likelier it's UTF-8.
>>
>> It can be done using my ElfData plugin, very easily, using
>> the .Verify function.
>>
>> I use this in practice, for my Encoding Master app. It guesses the
>> encoding of text files for you amongst other things.
>
> A worthwhile suggestion so I tried Encoding Master and it also
> guesses; but not correctly.
>
> It comes up with UTF-16 which is closer in some respects
>
> But this really seems to be ISO Latin 1 data stored 16 bits per
> character
>
> Very odd and I'm really not sure how to recognize that

What? ISO Latin-1 using 16 bits per character *is* UTF-16. ISO Latin-1
uses codes 0-255 only, and Unicode assigns those same characters to
U+0000 through U+00FF. Unless we are going to U+0100 (256) or above,
it's the same.

-- 
http://elfdata.com/plugin/
"String processing, done right"
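
For anyone who wants to check both claims without the plugin, here is a rough sketch in Python. It is not how ElfData's .Verify function or Encoding Master works internally; the function names looks_like_utf8 and widen_latin1 are made up for illustration, and the sample string is arbitrary. It assumes big-endian 16-bit units with no byte order mark.

```python
def looks_like_utf8(data: bytes) -> bool:
    """Heuristic: well-formed UTF-8 containing at least one high byte.

    Pure ASCII returns False here because it offers no evidence either
    way (it is valid in UTF-8, Latin-1, and many other encodings).
    """
    try:
        data.decode("utf-8")  # strict decoding rejects malformed sequences
    except UnicodeDecodeError:
        return False
    return any(b >= 0x80 for b in data)


def widen_latin1(data: bytes) -> bytes:
    """Zero-extend each Latin-1 byte to one 16-bit big-endian unit."""
    return b"".join(bytes([0, b]) for b in data)


text = "café déjà vu"  # every character is below U+0100
latin1 = text.encode("latin-1")

# Latin-1 stored 16 bits per character is byte-for-byte UTF-16 (BE, no BOM).
assert widen_latin1(latin1) == text.encode("utf-16-be")

# The UTF-8 guess accepts real UTF-8 and rejects the raw Latin-1 bytes.
assert looks_like_utf8(text.encode("utf-8"))
assert not looks_like_utf8(latin1)
```

So a guesser that reports UTF-16 for such a file is not really wrong: as long as no character is at U+0100 or above, the two descriptions name the same bytes.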
