Re: [Python-Dev] Quick sum up about open() + BOM

Victor Stinner Sat, 09 Jan 2010 05:51:58 -0800

Hi,

On Saturday 09 January 2010 13:45:58, you wrote:
> > Note: I implemented the BOM check in TextIOWrapper; so it's already
> > usable for any file-like object.
> 
> Yes, but the implementation is limited to just BOM checking
> and thus only supports UTF-8-SIG, UTF-16 and UTF-32.


Sure, but that's already better than no BOM check :-) It looks like many 
people would appreciate UTF-8-SIG detection, since this encoding is common on 
Windows.

> BTW: I haven't looked at your implementation, but what happens
> when your BOM check fails ? Will the implementation add the
> already read bytes back to a buffer ?

My implementation sits between buffer.read() and decoder.decode(data). If 
the data starts with a BOM, it sets the encoding and removes the BOM bytes 
from the byte string. Otherwise, it uses another algorithm to choose the 
encoding and leaves the byte string unchanged, so no bytes need to be pushed 
back into a buffer.
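
To illustrate the step described above, here is a minimal sketch. It is not the actual TextIOWrapper patch: the names detect_bom, _BOMS and the fallback parameter are hypothetical, and the "other algorithm" is reduced to a fixed default encoding. Only the codecs.BOM_* constants come from the standard library.

```python
import codecs

# Check the longest BOMs first: the UTF-32-LE BOM (FF FE 00 00) starts
# with the UTF-16-LE BOM (FF FE), so the order matters.
_BOMS = (
    (codecs.BOM_UTF32_LE, "utf-32-le"),
    (codecs.BOM_UTF32_BE, "utf-32-be"),
    (codecs.BOM_UTF8, "utf-8"),
    (codecs.BOM_UTF16_LE, "utf-16-le"),
    (codecs.BOM_UTF16_BE, "utf-16-be"),
)


def detect_bom(data, fallback="utf-8"):
    """Return (encoding, data_without_bom) for the first chunk of a file.

    'fallback' stands in for whatever algorithm picks the encoding when
    no BOM is present (hypothetical simplification).
    """
    for bom, encoding in _BOMS:
        if data.startswith(bom):
            # BOM found: set the encoding and drop the BOM bytes.
            return encoding, data[len(bom):]
    # No BOM: leave the byte string unchanged.
    return fallback, data


# Usage: run on the first chunk, then hand the rest to the decoder.
encoding, payload = detect_bom(codecs.BOM_UTF8 + b"abc")
text = payload.decode(encoding)
```

Since the check only looks at bytes that were already read, a failed check simply returns the same byte string and nothing has to be "unread".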

It can be seen as a codec: it works like UTF-16 and UTF-32 codecs ;-)
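
For comparison, this is how the existing codecs behave. The short example below only uses standard library codecs; the variable names are mine.

```python
import codecs

# The generic "utf-16" codec reads the BOM, picks the byte order from it,
# and does not include the BOM in the decoded text.
data_le = codecs.BOM_UTF16_LE + "hi".encode("utf-16-le")
data_be = codecs.BOM_UTF16_BE + "hi".encode("utf-16-be")
text_le = data_le.decode("utf-16")
text_be = data_be.decode("utf-16")

# "utf-8-sig" skips a leading UTF-8 BOM if there is one, and decodes
# plain UTF-8 unchanged if there is none.
text_sig = (codecs.BOM_UTF8 + b"hi").decode("utf-8-sig")
text_plain = b"hi".decode("utf-8-sig")
```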

> AFAIK, we currently have a moratorium on changes to Python
> builtins. How does that match up with the proposed changes ?

Oh yes, I forgot the moratorium. Some of the proposed solutions don't change 
the API. E.g. Antoine proposed to leave the API unchanged: open(file) => 
open(file) :-) I don't know whether that is compatible with the moratorium.

-- 
Victor Stinner
http://www.haypocalc.com/
