[issue4574] reading UTF16-encoded text file crashes if \r on 64-char boundary

2008-12-14 Thread Antoine Pitrou
Antoine Pitrou added the comment: Backported to trunk and 2.6.2 in r67762 and r67764. -- resolution: -> fixed status: open -> closed

[issue4574] reading UTF16-encoded text file crashes if \r on 64-char boundary

2008-12-14 Thread Antoine Pitrou
Antoine Pitrou added the comment: Committed to py3k and release30-maint in r67760 and r67759. Needs backporting to 2.x. -- priority: release blocker -> normal

[issue4574] reading UTF16-encoded text file crashes if \r on 64-char boundary

2008-12-13 Thread Gregory P. Smith
Gregory P. Smith added the comment: utf16_newlines2.patch looks good to me. This is a data corruption issue. If it is deferred for 3.0.1 it must be fixed in 3.0.2. +1 on putting this in 3.0.1. -- assignee: -> pitrou nosy: +gregory.p.smith priority: -> release blocker

[issue4574] reading UTF16-encoded text file crashes if \r on 64-char boundary

2008-12-13 Thread Antoine Pitrou
Antoine Pitrou added the comment: This new variant also removes the dangerous hack in getstate / setstate. Added file: http://bugs.python.org/file12345/utf16_newlines2.patch

[issue4574] reading UTF16-encoded text file crashes if \r on 64-char boundary

2008-12-13 Thread Antoine Pitrou
Antoine Pitrou added the comment: Here is a simpler patch with a different approach and a lot of tests. The advantage is that it doesn't break the API. Added file: http://bugs.python.org/file12344/utf16_newlines.patch

[issue4574] reading UTF16-encoded text file crashes if \r on 64-char boundary

2008-12-13 Thread Antoine Pitrou
Changes by Antoine Pitrou:

[issue4574] reading UTF16-encoded text file crashes if \r on 64-char boundary

2008-12-13 Thread Antoine Pitrou
Antoine Pitrou added the comment: A couple of suggestions:
- if IncrementalNewlineDecoder gets an encoding argument, it can also instantiate the decoder itself; that way the API is a bit simpler
- to encode '\r' without the BOM, you can e.g. use an incremental encoder and encode it twice: >>>
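
The second suggestion can be sketched as follows. This is only an illustration of the idea, not the code from the comment (which is cut off in this archive); the helper name encode_without_bom is made up for the example. It relies on the stdlib "utf-16" incremental encoder writing the BOM only on its first call.

```python
import codecs

def encode_without_bom(text, encoding="utf-16"):
    """Encode `text` twice with one incremental encoder.

    Hypothetical helper: the first call emits the BOM followed by the
    encoded text, the second call emits the encoded text alone.
    """
    encoder = codecs.getincrementalencoder(encoding)()
    with_bom = encoder.encode(text)      # BOM + payload
    without_bom = encoder.encode(text)   # payload only
    return with_bom, without_bom

first, second = encode_without_bom("\r")
print(first, second)
```

The byte order of the result is the platform's native order, so the exact bytes differ between little-endian and big-endian machines.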

[issue4574] reading UTF16-encoded text file crashes if \r on 64-char boundary

2008-12-08 Thread STINNER Victor
STINNER Victor added the comment: Ugly patch to fix this issue:
- add more regression tests for charsets UTF-16*, UTF-32*
- add mandatory argument "encoding" to io.IncrementalNewlineDecoder constructor => BREAK THE API
- use the encoding to encode "\r"
- most ugly hack:

[issue4574] reading UTF16-encoded text file crashes if \r on 64-char boundary

2008-12-08 Thread STINNER Victor
STINNER Victor added the comment: Here is a patch for test_io.py: check the problem by adding new encodings to TextIOWrapperTest.testNewlines(). -- keywords: +patch Added file: http://bugs.python.org/file12297/test_io.patch
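
A check in the spirit of that test patch might look like the sketch below. It is not the contents of test_io.patch; the encoding list, the sample text and the helper name read_with_universal_newlines are assumptions chosen for illustration. It only verifies that universal-newline reading gives the same result for every encoding on a current Python.

```python
import io

# Assumed set of encodings; the actual patch may cover a different list.
ENCODINGS = ["utf-8", "utf-16", "utf-16-le", "utf-16-be",
             "utf-32", "utf-32-le", "utf-32-be"]

def read_with_universal_newlines(text, encoding):
    """Encode `text`, then read it back through TextIOWrapper.

    newline=None enables universal newlines, so "\r\n" and a lone "\r"
    are both translated to "\n".
    """
    raw = io.BytesIO(text.encode(encoding))
    with io.TextIOWrapper(raw, encoding=encoding, newline=None) as wrapper:
        return wrapper.read()

# 63 characters followed by "\r\n" puts the "\r" at character index 63.
SAMPLE = "x" * 63 + "\r\n" + "tail\r" + "end\n"

for enc in ENCODINGS:
    print(enc, repr(read_with_universal_newlines(SAMPLE, enc)[60:70]))
```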

[issue4574] reading UTF16-encoded text file crashes if \r on 64-char boundary

2008-12-08 Thread STINNER Victor
Changes by STINNER Victor: Added file: http://bugs.python.org/file12296/incremental_newline_decoder_bug.py

[issue4574] reading UTF16-encoded text file crashes if \r on 64-char boundary

2008-12-08 Thread STINNER Victor
Changes by STINNER Victor: Removed file: http://bugs.python.org/file12295/dec.py

[issue4574] reading UTF16-encoded text file crashes if \r on 64-char boundary

2008-12-08 Thread STINNER Victor
STINNER Victor added the comment: Smaller example to demonstrate the problem. Added file: http://bugs.python.org/file12295/dec.py

[issue4574] reading UTF16-encoded text file crashes if \r on 64-char boundary

2008-12-08 Thread STINNER Victor
STINNER Victor added the comment: The bug is in IncrementalNewlineDecoder, not in the codec nor TextIOWrapper. -- nosy: +haypo

[issue4574] reading UTF16-encoded text file crashes if \r on 64-char boundary

2008-12-07 Thread Antoine Pitrou
Changes by Antoine Pitrou: -- nosy: +pitrou

[issue4574] reading UTF16-encoded text file crashes if \r on 64-char boundary

2008-12-07 Thread John Machin
New submission from John Machin: Problem in the newline handling in io.py, class IncrementalNewlineDecoder, method decode. It reads text files in 128-byte chunks. Converting CR LF to \n requires special-case handling when '\r' is detected at the end of the decoded chunk in ca
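
The boundary case in the report can be illustrated with a short sketch. It is not the reporter's script; the helper name decode_in_chunks and the sample data are assumptions. It feeds UTF-16 bytes to io.IncrementalNewlineDecoder in fixed-size chunks, so that for some chunk sizes the '\r' is the last character of one chunk and the '\n' is the first character of the next. On a Python that includes the fix, every chunk size should give the same text.

```python
import codecs
import io

def decode_in_chunks(data, encoding, chunk_size):
    """Decode `data` chunk by chunk with universal-newline translation.

    Hypothetical helper: wraps the codec's incremental decoder in
    io.IncrementalNewlineDecoder, as TextIOWrapper does internally.
    """
    decoder = codecs.getincrementaldecoder(encoding)()
    newline_decoder = io.IncrementalNewlineDecoder(decoder, translate=True)
    pieces = []
    for start in range(0, len(data), chunk_size):
        pieces.append(newline_decoder.decode(data[start:start + chunk_size]))
    # Flush anything still held back, such as a trailing '\r'.
    pieces.append(newline_decoder.decode(b"", final=True))
    return "".join(pieces)

DATA = "abc\r\ndef".encode("utf-16-le")

# With 8-byte chunks the first chunk is exactly "abc\r".
print(repr(decode_in_chunks(DATA, "utf-16-le", 8)))
```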