On Apr 23, 2007, at 7:37 PM, Norman Palardy wrote:

>
> On 23-Apr-07, at 4:56 PM, [EMAIL PROTECTED] wrote:
>
>> On Apr 23, 2007, at 22:48 UTC, Norman Palardy wrote:
>>
>>> It seems to be ISO Latin 1 data represented in 16 bits per
>>> character.
>>> The data is almost exclusively NULL (00) followed by an ISO Latin 1
>>> code point.
>>
>> Sounds like UCS-2 or UTF-16, in, erm, little-endian format.
>>
>>> Using the Guess the Encoding mechanism in String Utils doesn't
>>> suggest ISO Latin 1 or UCS 2 either.
>>
>> I'm surprised -- it should guess UTF-16, but if you're running on a
>> Mac, it may be wrong-endian.  Note that there are (if I've uploaded
>> the latest!) two versions of GuessEncoding, one of which can properly
>> report such wrong-endian cases, and the other which ignores it.  I
>> think you'll also find a function to swap every two bytes to correct
>> a wrong-endian string.
>
> I'm still trying to see what I can do as this is really a very odd
> format and seems, from what I can tell, to be completely counter to
> the defined format for this data.
>
> Those functions are there
>
> I have code like
>
>     t = f.OpenAsTextFile
>     line = t.readall
>     t.close
>
>     t = f.OpenAsTextFile
>     dim orderIsWrong as boolean
>     dim te as TextEncoding = StringUtils.GuessEncoding(line,orderIsWrong)
>
>     while t.eof <> true
>
>         line = t.ReadLine(te)
>
>         if orderIsWrong = false then
>             line = StringUtils.SwapBytePairs(line)
>         end if
>
>     wend
>
> the data ( in bytes) in this file look like
> 00 42 00 45 00 47 00 49 00 4E 00 3A 00 56 00 43 .B.E.G.I.N.:.V.C
> 00 41 00 52 00 44 00 0A 00 0A 00 56 00 45 00 52 .A.R.D.....V.E.R
>
> so it really does appear to look like UCS 2 but other portions make
> me thing its ISO Latin1 in UCS 2
>
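For what it's worth, the 32 bytes quoted above decode cleanly as big-endian UTF-16 (high byte first). ISO Latin 1 code points are the same numbers as U+0000 through U+00FF, so "Latin 1 in 16 bits" and UCS-2 limited to those characters produce identical bytes. Here is a minimal sketch of that check, written in Python rather than REALbasic so it can be run on its own. The helper names `swap_byte_pairs` and `nonzero_high_byte_offsets` are made up for illustration and are not the StringUtils functions.

```python
# The 32 bytes from the hex dump quoted above.
HEX_DUMP = (
    "00 42 00 45 00 47 00 49 00 4E 00 3A 00 56 00 43 "
    "00 41 00 52 00 44 00 0A 00 0A 00 56 00 45 00 52"
)
raw = bytes.fromhex(HEX_DUMP)


def swap_byte_pairs(data: bytes) -> bytes:
    """Swap every two bytes (hypothetical helper, not StringUtils.SwapBytePairs)."""
    if len(data) % 2:
        raise ValueError("expected an even number of bytes")
    swapped = bytearray(len(data))
    swapped[0::2] = data[1::2]
    swapped[1::2] = data[0::2]
    return bytes(swapped)


def nonzero_high_byte_offsets(data: bytes) -> list:
    """Byte offsets of 16-bit units whose high (first) byte is not 00.

    Assumes big-endian units. An empty result means every unit is in the
    Latin 1 range (U+0000 to U+00FF).
    """
    return [i for i in range(0, len(data) - 1, 2) if data[i] != 0]


# Decode directly as big-endian UTF-16.
text_be = raw.decode("utf-16-be")

# Swapping the byte pairs first and decoding as little-endian gives the same text.
text_swapped = swap_byte_pairs(raw).decode("utf-16-le")

print(repr(text_be))
print(nonzero_high_byte_offsets(raw))
```

Running `nonzero_high_byte_offsets` over the whole file, rather than just this sample, would be one way to locate any stretches that do not fit the "00 plus Latin 1 byte" pattern.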
Perhaps the application generating the file is writing data using more than one encoding, either by accident or by design.

Charles Yeomans
