Walter Dörwald added the comment:

jgsack wrote:
> If codec utf_8 or utf_8_sig were to accept input with or without the
> 3-byte BOM, and write it as currently specified without/with the BOM
> respectively, then _I_ can reread again with either utf_8 or utf_8_sig.

That's exactly what the utf_8_sig codec does. The decoder accepts input with or without the BOM (a leading BOM, and only the first one, is not returned). The encoder always prepends a BOM.

Or do you want a codec that behaves like utf_8 on reading and like utf_8_sig on writing? Such a codec indeed wouldn't roundtrip.

----------
nosy: +doerwalter

__________________________________
Tracker <[EMAIL PROTECTED]>
<http://bugs.python.org/issue1328>
__________________________________
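
The utf_8_sig behaviour described in the comment can be checked with a short script. This is a minimal sketch, assuming Python 3 (where str.encode returns bytes); the sample string and the helper name mixed_roundtrip are illustrative only and are not part of the original comment or of any library.

```python
import codecs

text = "h\u00e9llo"  # arbitrary sample text (assumption for illustration)

plain = text.encode("utf_8")       # utf_8 encoder: no BOM
signed = text.encode("utf_8_sig")  # utf_8_sig encoder: BOM always prepended

# The utf_8_sig output is the 3-byte BOM followed by the plain UTF-8 bytes.
assert signed == codecs.BOM_UTF8 + plain

# The utf_8_sig decoder accepts input with or without the BOM.
assert plain.decode("utf_8_sig") == text
assert signed.decode("utf_8_sig") == text

# Only the first BOM is dropped; a second one survives as U+FEFF.
double_bom = codecs.BOM_UTF8 + signed
assert double_bom.decode("utf_8_sig") == "\ufeff" + text


def mixed_roundtrip(s):
    """Hypothetical codec: writes like utf_8_sig, reads like utf_8."""
    return s.encode("utf_8_sig").decode("utf_8")


# Such a combination does not roundtrip: the BOM leaks into the text.
assert mixed_roundtrip(text) == "\ufeff" + text
```

Writing and reading with utf_8_sig on both sides roundtrips, and so does utf_8 on both sides; only the mixed combination sketched in mixed_roundtrip leaves a stray U+FEFF at the start of the decoded string.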