Serhiy Storchaka storch...@gmail.com added the comment:
I checked. Files opened with utf-8-sig are seekable.
>>> open('test', 'w', encoding='utf-8-sig').write('qwerty\nйцукен\n')
>>> open('test', 'r', encoding='utf-8').read()
'\ufeffqwerty\nйцукен\n'
>>> open('test', 'r', encoding='utf-8-sig').read()
R. David Murray rdmur...@bitdance.com added the comment:
Serhiy, the bug is about csv in particular. Can you confirm that using
utf-8-sig allows one to process a file with a bom using the csv module?
--
___
Python tracker rep...@bugs.python.org
Serhiy Storchaka storch...@gmail.com added the comment:
I ran the script above (only replaced 'utf-8' with 'utf-8-sig') and did not see
anything strange. I looked at the source (csv.py and _csv.c) and also did not
see anything that could lead to this effect. If the bug exists, it is in utf-8-sig
R. David Murray rdmur...@bitdance.com added the comment:
I wasn't sure which script you were referring to, so I checked it myself and
got the same results as you: after the seek(0) on the file object opened with
utf-8-sig, csv read all the lines in the file, including reading the header
line
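
A minimal sketch of the check described above, assuming a hypothetical
sample file and a hypothetical helper named read_twice (neither is from
the tracker script): the file is opened with utf-8-sig, read through
csv.reader, rewound with seek(0), and read again.

```python
import csv
import os
import tempfile


def read_twice(path, encoding="utf-8-sig"):
    """Read every row with csv.reader, seek back to 0, and read again."""
    with open(path, "r", encoding=encoding, newline="") as f:
        first = list(csv.reader(f))
        f.seek(0)  # rewind the text file object to the start
        second = list(csv.reader(f))
    return first, second


# Hypothetical sample data: a UTF-8 BOM followed by a header and one row.
path = os.path.join(tempfile.mkdtemp(), "sample.csv")
with open(path, "wb") as f:
    f.write(b"\xef\xbb\xbfvalue,vocal,vocal,vocal,vocal\n1,a,b,c,d\n")

first, second = read_twice(path)
print(first == second)  # both passes should yield the same rows
print(second[0])        # header row as seen after seek(0)
```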
Serhiy Storchaka storch...@gmail.com added the comment:
I was referring to the script inlined in the message
http://bugs.python.org/issue7185#msg94340 .
--
___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue7185
Changes by Skip Montanaro s...@pobox.com:
--
nosy: -skip.montanaro
Istvan Szirtes istvan.szir...@gmail.com added the comment:
Hi Everyone,
I tried utf-8-sig and it does not work in this case; or rather, I
think it is not the csv module that is wrong. seek() does not work
correctly on the csv file or object.
With utf-8-sig the file is opened correctly and the
New submission from Istvan Szirtes istvan.szir...@gmail.com:
The csv module tries to read a .csv file that is encoded in UTF-8 with a
UTF-8 BOM.
The first row in the csv file is
[value,vocal,vocal,vocal,vocal]
in hex:
value,vocal,vocal,vocal,vocal
the reader cannot correctly read the first
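
A small sketch of the symptom, assuming a hypothetical sample file whose
first row matches the one quoted above (the helper first_header_field is
made up for illustration): opened with plain utf-8, the BOM is decoded as
U+FEFF and stays attached to the first field.

```python
import csv
import os
import tempfile


def first_header_field(path, encoding):
    """Return the first field of the first row as csv.reader sees it."""
    with open(path, "r", encoding=encoding, newline="") as f:
        return next(csv.reader(f))[0]


# Hypothetical sample file: UTF-8 BOM bytes followed by the header row.
bom_path = os.path.join(tempfile.mkdtemp(), "bom.csv")
with open(bom_path, "wb") as f:
    f.write(b"\xef\xbb\xbfvalue,vocal,vocal,vocal,vocal\n")

# With plain utf-8 the BOM is not stripped: '\ufeffvalue'
print(repr(first_header_field(bom_path, "utf-8")))
```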
Walter Dörwald wal...@livinglogic.de added the comment:
http://docs.python.org/library/csv.html#module-csv states:
This version of the csv module doesn’t support Unicode input. Also,
there are currently some issues regarding ASCII NUL characters.
Accordingly, all input should be UTF-8 or
R. David Murray rdmur...@bitdance.com added the comment:
The restrictions were theoretically removed in 3.1, and the 3.1
documentation has been updated to reflect that. If 3.1 CSV doesn't
handle unicode, then that is a bug.
--
nosy: +r.david.murray
priority:  -> normal
stage:  -> test
Walter Dörwald wal...@livinglogic.de added the comment:
Then the solution should simply be to use utf-8-sig as the encoding,
instead of utf-8.
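
As a rough illustration of that suggestion (the file names and the
read_rows helper are assumptions, not part of the tracker discussion):
utf-8-sig skips a leading BOM when one is present and otherwise decodes
the same as utf-8, so the same call handles both kinds of file.

```python
import csv
import os
import tempfile


def read_rows(path, encoding="utf-8-sig"):
    """Read all rows; utf-8-sig drops a leading BOM if the file has one."""
    with open(path, "r", encoding=encoding, newline="") as f:
        return list(csv.reader(f))


tmpdir = tempfile.mkdtemp()
with_bom = os.path.join(tmpdir, "with_bom.csv")
without_bom = os.path.join(tmpdir, "without_bom.csv")

# Writing with utf-8-sig emits the BOM; writing with utf-8 does not.
with open(with_bom, "w", encoding="utf-8-sig", newline="") as f:
    f.write("value,vocal\n1,a\n")
with open(without_bom, "w", encoding="utf-8", newline="") as f:
    f.write("value,vocal\n1,a\n")

rows_with_bom = read_rows(with_bom)
rows_without_bom = read_rows(without_bom)
print(rows_with_bom[0])                   # ['value', 'vocal']
print(rows_with_bom == rows_without_bom)  # True
```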
R. David Murray rdmur...@bitdance.com added the comment:
In that case we should update the docs. Istvan, can you confirm that
this solves your problem?
--
assignee:  -> georg.brandl
components: +Documentation
nosy: +georg.brandl
stage: test needed -> needs patch
versions: +Python 3.2
Changes by Skip Montanaro s...@pobox.com:
--
nosy: +skip.montanaro