Re: utf-16 not auto-detected when finding file

Kenichi Handa Wed, 30 Mar 2005 01:27:12 -0800

In article <[EMAIL PROTECTED]>, Jason Rumney <[EMAIL PROTECTED]> writes:


> Dave Love <[EMAIL PROTECTED]> writes:
>>  Yes.  Perhaps someone knows exactly what Windows does (assuming the
>>  only significant use of it is in Windows)?

> I would guess that the presence of a BOM is sufficient
> heuristics. Detecting 0 or other low byte values every second
> byte would work for Latin script based languages, but I don't think
> any heuristic like that would work on Asian text unless you could
> assume a specific language and use a dictionary.

I don't think relying on the BOM is that safe, because many
charsets have ordinary letters at 0xFE and 0xFF.
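
To illustrate the concern, here is a minimal Python sketch (not
Emacs code; the helper name starts_with_utf16_bom is made up for
this example). It shows a two-letter ISO-8859-1 text whose bytes
are exactly the UTF-16 big-endian BOM, so a BOM-only check would
misclassify it.

```python
import codecs


def starts_with_utf16_bom(data: bytes) -> bool:
    """Naive check: does the data begin with either UTF-16 BOM?"""
    return data.startswith((codecs.BOM_UTF16_BE, codecs.BOM_UTF16_LE))


# U+00FE (thorn) and U+00FF (y with diaeresis) are ordinary letters
# in ISO-8859-1 and encode to the single bytes 0xFE and 0xFF.
latin1_text = "\u00fe\u00ff".encode("latin-1")

# The naive check reports a BOM even though this is not UTF-16.
false_positive = starts_with_utf16_bom(latin1_text)
```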

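What I'm thinking of is detecting how LF (0x0A) is encoded,
because Unicode has no character assigned at U+0A00: the byte
pair 00 0A can only be a big-endian LF, and 0A 00 only a
little-endian one.  If there's no LF, we must give up on
detection.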
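
A rough sketch of that idea in Python, only to make it concrete
(this is not what Emacs implements; the function name
guess_utf16_byte_order and the rule of returning None on
conflicting evidence are assumptions of this example):

```python
from typing import Optional


def guess_utf16_byte_order(data: bytes) -> Optional[str]:
    """Guess the UTF-16 byte order from how LF (U+000A) is encoded.

    Scans aligned 16-bit units.  00 0A counts as a big-endian LF and
    0A 00 as a little-endian LF.  Returns None when no LF is found or
    when both forms appear, i.e. when detection has to give up.
    """
    big_endian_lf = 0
    little_endian_lf = 0
    for i in range(0, len(data) - 1, 2):
        first, second = data[i], data[i + 1]
        if first == 0x00 and second == 0x0A:
            big_endian_lf += 1
        elif first == 0x0A and second == 0x00:
            little_endian_lf += 1
    if big_endian_lf and not little_endian_lf:
        return "utf-16-be"
    if little_endian_lf and not big_endian_lf:
        return "utf-16-le"
    return None
```

For example, guess_utf16_byte_order("a\nb".encode("utf-16-le"))
yields "utf-16-le", while input with no LF at all yields None.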

---
Ken'ichi HANDA
[EMAIL PROTECTED]


_______________________________________________
Emacs-pretest-bug mailing list
Emacs-pretest-bug@gnu.org
http://lists.gnu.org/mailman/listinfo/emacs-pretest-bug
