Lawrence D'Oliveiro wrote:
In message <mailman.1466.1286556950.29448.python-l...@python.org>, Ethan
Furman wrote:
Lawrence D'Oliveiro wrote:
But they can only recognize it as a BOM if they assume UTF-8 encoding to
begin with. Otherwise it could be interpreted as some other encoding.
Not so. The first three bytes are the flag.
But this is just a text file. All parts of its contents are text; there is
no “flag”.
If you think otherwise, then tell us what these three “flag” bytes are for a
Windows-1252-encoded text file.
MS treats those first three bytes as a flag -- if they equal the UTF-8 BOM
(EF BB BF), MS treats the file as UTF-8; if they equal anything else, MS
does not treat it as UTF-8.
If you think otherwise, hop on an MS machine and test it out.
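
For anyone without an MS machine handy, here is a minimal sketch of the
check described above, in Python. It is not a claim about how any MS
program is actually implemented. The function name sniff_encoding and
the Windows-1252 fallback are assumptions made for illustration only. It
also shows the other side of the argument: the same three bytes are
valid Windows-1252 text.

```python
import codecs

def sniff_encoding(data: bytes, fallback: str = "cp1252") -> str:
    """Hypothetical helper: treat a leading UTF-8 BOM as a flag.

    The cp1252 fallback is an assumption for this sketch, not a
    documented default of any particular program.
    """
    if data.startswith(codecs.BOM_UTF8):  # b'\xef\xbb\xbf'
        return "utf-8-sig"                # this codec strips the BOM on decode
    return fallback

with_bom = codecs.BOM_UTF8 + "café".encode("utf-8")
without_bom = "café".encode("cp1252")

print(sniff_encoding(with_bom))                   # utf-8-sig
print(sniff_encoding(without_bom))                # cp1252
print(with_bom.decode(sniff_encoding(with_bom)))  # café

# The same three bytes are also legal Windows-1252 text:
print(codecs.BOM_UTF8.decode("cp1252"))           # ï»¿
```

So the check is a heuristic: a Windows-1252 file that really does begin
with the characters ï»¿ would be misread as UTF-8, which is unlikely in
practice but not impossible.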
~Ethan~