Tex Texin scripsit:

> Interestingly, although I didn't study it in detail, looking at rfc 2376
> for prioritization over charset conflicts, it seems to recommend
> stripping the BOM when converting from utf-16 to other charsets (and
> without considering that ucs-4 would like to keep it). (section 5).

The point is not to try to convert it into a U+FEFF character or some
replacement thereof, such as "?".
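
A rough sketch of the difference, in Python (my choice of language here;
the function names are made up for illustration and are not from the RFC).
It assumes the target charset is ISO-8859-1, which has no way to represent
U+FEFF. Decoding with the generic "utf-16" codec consumes the BOM as a
signature, whereas decoding with an explicit byte order keeps it as a
character that then has to be replaced on output:

```python
import codecs

def transcode_stripping_bom(data: bytes) -> bytes:
    # The generic "utf-16" codec uses a leading BOM to pick the byte
    # order and does not pass it through as a character.
    text = data.decode("utf-16")
    return text.encode("iso-8859-1")

def transcode_keeping_bom(data: bytes) -> bytes:
    # With an explicit byte order the BOM is decoded as U+FEFF, which
    # ISO-8859-1 cannot represent, so it degrades to "?".
    text = data.decode("utf-16-be")
    return text.encode("iso-8859-1", errors="replace")

# Hypothetical sample document: a UTF-16BE BOM followed by "<a/>".
sample = codecs.BOM_UTF16_BE + "<a/>".encode("utf-16-be")

stripped = transcode_stripping_bom(sample)
kept = transcode_keeping_bom(sample)
```

The first result is the bare document; the second has a stray "?" in
front of it, which is the outcome the stripping rule is meant to avoid.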

> Also, in considering charset conflicts, 2376 fails to consider conflicts
> between signature and the encoding declaration. (I have a utf-16BE BOM
> and the encoding declaration is for utf-8...).

The encoding declaration is supposed to trump all.  So it is UTF-8, and
since neither 0xFE nor 0xFF is legal in UTF-8, you blow chunks on the
very first byte...
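
To make that concrete, here is a small Python sketch (the helper name and
the sample document are invented for illustration; a real XML processor
would do its own detection rather than call a codec directly). It takes
bytes that begin with a UTF-16BE BOM and tries to decode them as UTF-8,
as the declaration demands. The bytes 0xFE and 0xFF can never occur in
well-formed UTF-8, so a strict decoder rejects the input at offset 0:

```python
import codecs

def decode_as_declared(data: bytes, declared: str) -> str:
    # Trust the declared encoding and decode strictly; any byte that is
    # invalid in that encoding raises UnicodeDecodeError.
    return data.decode(declared, errors="strict")

# Hypothetical conflicting document: UTF-16BE BOM and UTF-16BE content,
# but the declaration (and therefore the decoder) says UTF-8.
doc = codecs.BOM_UTF16_BE + '<?xml version="1.0" encoding="utf-8"?>'.encode("utf-16-be")

failed = False
bad_offset = None
bad_byte = None
try:
    decode_as_declared(doc, "utf-8")
except UnicodeDecodeError as exc:
    failed = True
    bad_offset = exc.start            # index of the first offending byte
    bad_byte = exc.object[exc.start]  # the offending byte value itself
```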

> I'll have to check for a more up-to-date rfc.

There is none.

-- 
John Cowan <[EMAIL PROTECTED]>     http://www.reutershealth.com
I amar prestar aen, han mathon ne nen,    http://www.ccil.org/~cowan
han mathon ne chae, a han noston ne 'wilith.  --Galadriel, _LOTR:FOTR_
