Can you please expand on your statement that UTF-8 should never have a BOM?
Having one makes it very easy to distinguish a text file that contains UTF-8
from one that contains text in the system default MBCS encoding.
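
A minimal Python sketch of the kind of check meant here. The function names
and the "cp1252" fallback are assumptions for illustration only; the real
system default MBCS encoding depends on the machine.

```python
import codecs

def starts_with_utf8_bom(data: bytes) -> bool:
    # The UTF-8 encoding of U+FEFF is the three bytes EF BB BF.
    return data.startswith(codecs.BOM_UTF8)

def decode_text(data: bytes, fallback: str = "cp1252") -> str:
    # If the BOM is present, treat the rest as UTF-8 and drop the BOM.
    if starts_with_utf8_bom(data):
        return data[len(codecs.BOM_UTF8):].decode("utf-8")
    # Otherwise assume the (hypothetical) system default encoding.
    return data.decode(fallback)
```

Without the BOM, an importer has to guess, for example by testing whether
the bytes happen to be valid UTF-8.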

You may not be surprised to learn that Microsoft (or, at least, one of its
programmers) does not agree with you. When I save a file from Notepad on
Windows XP in UTF-8, the file contains a BOM.

(I have no connection with Microsoft - I'm just a programmer who has to
write code to import text files from time to time!)

Thanks

- rick cameron

-----Original Message-----
From: Asmus Freytag [mailto:[EMAIL PROTECTED]] 
Sent: Thursday, 14 February 2002 17:46
To: Martin Kochanski; [EMAIL PROTECTED]
Subject: Re: Unicode and end users


At 09:22 AM 2/14/02 +0000, Martin Kochanski wrote:
>Are there, in fact, many circumstances in which it is necessary for an end
>user to create files that do *not* have a BOM at the beginning?

In principle this is a requirement for data being labelled *external to the 
data* as being in either UTF-16BE or UTF-16LE (ditto for UTF-32). These 
formats *must not* have a BOM.
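
As a rough illustration (a sketch using Python's standard codecs, not part
of any protocol definition): the explicitly labelled encodings write no BOM,
while the unlabelled "utf-16" codec writes one, and a BOM fed to an
explicitly labelled decoder comes through as an ordinary U+FEFF character.

```python
import codecs

text = "A"

# Explicit byte order: no BOM is written.
be = text.encode("utf-16-be")   # bytes 00 41
le = text.encode("utf-16-le")   # bytes 41 00

# Unlabelled UTF-16: the encoder prepends a BOM in native byte order.
generic = text.encode("utf-16")
has_bom = generic[:2] in (codecs.BOM_UTF16_BE, codecs.BOM_UTF16_LE)

# A BOM in data labelled UTF-16BE is not stripped; it is decoded as
# the character U+FEFF at the start of the text.
stray = (codecs.BOM_UTF16_BE + be).decode("utf-16-be")
```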

However, it may be the case in practice that protocols in which documents 
are labelled that way do not accept separately edited documents, so this 
may be moot.

UTF-8 should *never* contain the BOM.
A./
