STINNER Victor <victor.stin...@haypocalc.com> added the comment:

Extract of the Unicode standard: "Use of a BOM is neither required nor 
recommended for UTF-8, but may be encountered in contexts where UTF-8 data is 
converted from other encoding forms that use a BOM or where the BOM is used as 
a UTF-8 signature".

See also the following section explaining issues with the UTF-8 BOM:
http://en.wikipedia.org/wiki/Byte_order_mark#UTF-8

I agree that Python should handle the (UTF-8) BOM when reading a CSV file 
(#7185), because the file format is common on Windows.
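
For the CSV case, one possible approach (only a sketch, not what #7185 
actually proposes) is to decode with the standard "utf-8-sig" codec, which 
drops a leading BOM if there is one and otherwise behaves like UTF-8. The 
helper name read_csv_rows is made up for this example.

```python
import csv
import io


def read_csv_rows(data: bytes) -> list:
    """Decode CSV bytes as UTF-8, tolerating an optional leading BOM.

    Hypothetical helper for illustration only.
    """
    # "utf-8-sig" strips a leading BOM when present and is otherwise
    # identical to plain UTF-8 decoding.
    text = data.decode("utf-8-sig")
    # newline="" lets the csv module handle line endings itself.
    return list(csv.reader(io.StringIO(text, newline="")))
```

With this, a file saved on Windows with a BOM and the same file without one 
give the same rows.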

But msgfmt is a UNIX tool: I would expect Python to behave like the 
original msgfmt tool and fail with a fatal error on the BOM "invisible 
character". How would you explain to a user that msgfmt fails but msgfmt.py 
does not?

About the patch: *ignoring* the BOM is not a good idea. The BOM announces the 
encoding (e.g. UTF-8): if a Content-Type header announces a different 
encoding, you should raise an error.
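
To make the check concrete, here is a rough sketch of what I mean. It is not 
the code from the patch or from msgfmt.py: the function name 
resolve_encoding, the default parameter, and the simplified charset regex 
are all assumptions made for the example.

```python
import codecs
import re

# Simplified pattern for the "charset=..." part of a Content-Type header
# (assumption: good enough for illustration, not a full header parser).
_CHARSET_RE = re.compile(rb"charset=([A-Za-z0-9_.:-]+)")


def resolve_encoding(data: bytes, default: str = "utf-8") -> str:
    """Return the codec name to decode `data` with.

    Raises ValueError if a UTF-8 BOM is present but the Content-Type
    header announces a different (or unknown) encoding.
    """
    has_bom = data.startswith(codecs.BOM_UTF8)
    match = _CHARSET_RE.search(data)
    declared = None
    if match:
        name = match.group(1).decode("ascii")
        try:
            # Normalize aliases such as "UTF8" and "utf-8" to one name.
            declared = codecs.lookup(name).name
        except LookupError:
            raise ValueError("unknown charset in Content-Type: %r" % name)
    if has_bom:
        if declared is not None and declared != "utf-8":
            raise ValueError(
                "UTF-8 BOM found, but Content-Type announces %r" % declared
            )
        # "utf-8-sig" decodes UTF-8 and drops the leading BOM.
        return "utf-8-sig"
    return declared if declared is not None else default
```

So the BOM is accepted only when it agrees with the header (or when there is 
no header); a conflict is reported instead of being silently ignored.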

----------

_______________________________________
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue1697943>
_______________________________________
