When there is no BOM, the file need not be ANSI or UTF-8; it may just as well
be UTF-16BE, UTF-16LE, or some other codepage. A smarter, auto-detecting ufread
would read the first 20 or 30 bytes to determine the encoding actually in use.
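
To make the idea concrete, here is a rough sketch in Python rather than J. It
is not the actual ufread code. The names sniff_encoding and decode_text, the
80% threshold, and the cp1252 fallback are all my own assumptions. It checks
for a BOM, then looks at where the zero bytes fall in the first 30 bytes
(mostly-ASCII text in UTF-16 has a zero in every other byte, which I take to
be the "ASCII,0" layout described below), then tries UTF-8, and only then
falls back to a codepage.

```python
import codecs

# Only the UTF-8 and UTF-16 BOMs are checked; UTF-32 is ignored for brevity.
BOMS = [
    (codecs.BOM_UTF8, "utf-8"),
    (codecs.BOM_UTF16_LE, "utf-16-le"),
    (codecs.BOM_UTF16_BE, "utf-16-be"),
]


def sniff_encoding(head, fallback="cp1252"):
    """Guess (encoding, bom_length) from the first bytes of a file."""
    # 1. An explicit BOM settles the question.
    for bom, name in BOMS:
        if head.startswith(bom):
            return name, len(bom)

    # 2. No BOM: look at where the zero bytes sit in the first 30 bytes.
    sample = head[:30]
    even, odd = sample[0::2], sample[1::2]
    if odd and odd.count(0) >= 0.8 * len(odd) and even.count(0) == 0:
        return "utf-16-le", 0   # low byte first, zero high byte second
    if even and even.count(0) >= 0.8 * len(even) and odd.count(0) == 0:
        return "utf-16-be", 0   # zero high byte first

    # 3. Try UTF-8. The incremental decoder tolerates a multi-byte
    #    sequence that is cut off at the end of the sample.
    try:
        codecs.getincrementaldecoder("utf-8")().decode(sample, final=False)
        return "utf-8", 0
    except UnicodeDecodeError:
        # 4. Assumed fallback codepage; pick whatever suits your files.
        return fallback, 0


def decode_text(data, fallback="cp1252"):
    """Decode a whole byte string using the guess made from its first bytes."""
    encoding, skip = sniff_encoding(data[:30], fallback)
    return data[skip:].decode(encoding)
```

This is only a heuristic. A file whose first 30 bytes are plain ASCII will be
reported as UTF-8 even if a codepage character appears later, and UTF-16 text
that is not mostly ASCII will not show the zero-byte pattern at all.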
Dan Bron wrote:
Gosi wrote:
Look at
http://www.jsoftware.com/jwiki/Scripts/Ufread
Thanks! Apparently the files are non-conformant. According to that page, if
the file doesn't start with a BOM, it's supposed to be in utf8. Unfortunately,
only one file starts with a BOM, yet none of them are in utf8. They're all in
the ASCII,0 format. I'll modify ufread to assume that format.
Is there a corresponding ufwrite? My goal is to modify these files (read,
transform, write back). I guess I could write one. I wish all this stuff were
100% transparent.
-Dan
----------------------------------------------------------------------
For information about J forums see http://www.jsoftware.com/forums.htm
--
regards,
bill