On 09.01.10 14:38, Victor Stinner wrote:
> On Saturday 09 January 2010 12:18:33, Walter Dörwald wrote:
>>> Good idea, I chose open(filename, encoding="BOM").
>>
>> On the surface this looks like there's an encoding named "BOM", but
>> looking at your patch I found that the check is still done in
>> TextIOWrapper. IMHO the best approach would be to implement a *real*
>> codec named "BOM" (or "sniff"). This doesn't require *any* changes to
>> the IO library. It could even be developed as a standalone project and
>> published in the Cheeseshop.
>
> Why not, this is another solution to point (2) (Check for a BOM while
> reading or detect it before?). Which encoding would be used if there is no
> BOM? UTF-8 sounds like a good choice.

UTF-8 might be a good choice, or the fallback could be specified in the
encoding name, i.e. open("file.txt", encoding="BOM-UTF-8") falls back to
UTF-8 if there's no BOM at the start.

This could be implemented via a custom codec search function (see
http://docs.python.org/library/codecs.html#codecs.register for more info).

Servus,
   Walter
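To make the search-function idea concrete, here is a minimal sketch of how a "BOM-<fallback>" codec could be registered. It is only an illustration under some assumptions, not the patch under discussion: the names _sniff, _make_codec and bom_search are hypothetical, encoding simply uses the fallback without writing a BOM, and the incremental decoder does not implement getstate()/setstate(), so tell() and seek() on a text file would not be reliable.

```python
import codecs

# Known BOMs. UTF-32 comes first because the UTF-32-LE BOM begins with
# the UTF-16-LE BOM, so the longer signature has to win.
_BOMS = (
    (codecs.BOM_UTF32_LE, "utf-32-le"),
    (codecs.BOM_UTF32_BE, "utf-32-be"),
    (codecs.BOM_UTF8, "utf-8"),
    (codecs.BOM_UTF16_LE, "utf-16-le"),
    (codecs.BOM_UTF16_BE, "utf-16-be"),
)
_MAX_BOM = 4  # length of the longest BOM in bytes


def _sniff(data, fallback):
    """Return (encoding name, number of BOM bytes to skip)."""
    for bom, name in _BOMS:
        if data.startswith(bom):
            return name, len(bom)
    return fallback, 0


def _make_codec(name, fallback):
    """Build a CodecInfo that sniffs a BOM and otherwise uses `fallback`."""

    def decode(data, errors="strict"):
        data = bytes(data)
        enc, skip = _sniff(data, fallback)
        text, _ = codecs.lookup(enc).decode(data[skip:], errors)
        return text, len(data)

    def encode(text, errors="strict"):
        # Assumption: writing uses the fallback encoding and emits no BOM.
        return codecs.lookup(fallback).encode(text, errors)

    class IncrementalDecoder(codecs.IncrementalDecoder):
        def __init__(self, errors="strict"):
            super().__init__(errors)
            self.reset()

        def reset(self):
            self._pending = b""  # bytes held back until the BOM is decided
            self._inner = None   # real decoder, chosen after sniffing

        def decode(self, data, final=False):
            if self._inner is not None:
                return self._inner.decode(data, final)
            self._pending += bytes(data)
            if len(self._pending) < _MAX_BOM and not final:
                return ""  # not enough bytes yet to rule out a BOM
            enc, skip = _sniff(self._pending, fallback)
            self._inner = codecs.getincrementaldecoder(enc)(self.errors)
            data, self._pending = self._pending[skip:], b""
            return self._inner.decode(data, final)

    return codecs.CodecInfo(
        name=name,
        encode=encode,
        decode=decode,
        incrementaldecoder=IncrementalDecoder,
    )


def bom_search(name):
    """Codec search function: handles names of the form BOM-<fallback>."""
    # Normalize ourselves, since the separator handed to search
    # functions differs between Python versions.
    name = name.lower().replace("-", "_").replace(" ", "_")
    if not name.startswith("bom_"):
        return None
    fallback = name[len("bom_"):]
    try:
        codecs.lookup(fallback)
    except LookupError:
        return None  # unknown fallback: let other search functions try
    return _make_codec(name, fallback)


codecs.register(bom_search)
```

With this registered, data.decode("BOM-UTF-8") picks the encoding from the BOM and falls back to UTF-8 otherwise, and open("file.txt", encoding="BOM-UTF-8") should resolve to the same codec for reading. Nothing in the IO library has to change, which is the point of doing it as a real codec.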