Hi all,

I just found a problem with the xreadlines method when it is used with 
codecs.open: xreadlines does not seem to take the codec given to codecs.open 
into account, and it returns byte-strings instead of unicode strings.

For example, if a file foo.txt contains some text encoded in latin1:

>>> import codecs
>>> f = codecs.open('foo.txt', 'r', 'utf-8', 'replace')
>>> [l for l in f.xreadlines()]
['\xe9\xe0\xe7\xf9\n']

But:

>>> import codecs
>>> f = codecs.open('foo.txt', 'r', 'utf-8', 'replace')
>>> f.readlines()
[u'\ufffd\ufffd']

With readlines, the latin1 characters (which are not valid UTF-8) are decoded 
and replaced as expected, and the result is a unicode string. With xreadlines, 
they come back untouched, still latin1-encoded, in byte-strings.
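
For comparison, here is a minimal sketch of a lazy, line-by-line read that 
does go through the codec. It relies on the readline method of the object 
returned by codecs.open, and on iter() with a sentinel (Python 2.2+). The 
helper name iter_decoded_lines and the temporary file contents are only 
illustrative assumptions, and the b'' literal is there so the snippet also 
runs on later Python versions. My guess, not verified, is that xreadlines is 
simply passed through to the underlying file object, which would explain the 
raw byte-strings.

```python
import codecs
import os
import tempfile


def iter_decoded_lines(f):
    """Hypothetical helper: lazily yield lines decoded by the codec.

    f is assumed to be the object returned by codecs.open. Its readline()
    decodes each line and returns an empty string at end of file, which
    iter() uses as the stop sentinel.
    """
    return iter(f.readline, u'')


# Build a small latin1-encoded file standing in for foo.txt
# (contents assumed for illustration).
fd, path = tempfile.mkstemp()
os.write(fd, b'\xe9\xe0\xe7\xf9\n')
os.close(fd)

f = codecs.open(path, 'r', 'utf-8', 'replace')
try:
    lines = [line for line in iter_decoded_lines(f)]
finally:
    f.close()
    os.remove(path)
```

Each element of lines should then be a unicode string with the invalid bytes 
replaced, as with readlines, rather than a raw byte-string.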

I tested with Python 2.1 and 2.3 on Linux and Windows, with the same result 
(I don't have Python 2.4 installed here).

Can anybody confirm the problem? Is this a bug? I searched this newsgroup and 
the known Python bugs, but the problem does not seem to have been reported yet.

TIA
-- 
python -c "print ''.join([chr(154 - ord(c)) for c in 
'U(17zX(%,5.zmz5(17;8(%,5.Z65\'*9--56l7+-'])"
-- 
http://mail.python.org/mailman/listinfo/python-list
