Re: Python 3.0 automatic decoding of UTF16

John Machin Sun, 07 Dec 2008 01:50:52 -0800

On Dec 7, 8:15 pm, Terry Reedy <[EMAIL PROTECTED]> wrote:
> John Machin wrote:
> > Here's the scoop: It's a bug in the newline handling (in io.py, class
> > IncrementalNewlineDecoder, method decode). It reads text files in 128-
> > byte chunks. Converting CR LF to \n requires special case handling
> > when '\r' is detected at the end of the decoded chunk n in case
> > there's an LF at the start of chunk n+1. Buggy solution: prepend b'\r'
> > to the chunk n+1 bytes and decode that -- suddenly with a 2-bytes-per-
> > char encoding like UTF-16 we are 1 byte out of whack.


> Please post this on the tracker so it can get included with other io
> work for 3.0.1.

I'm fiddling with a short bug-demo script right now.
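
In the meantime, here is a rough simulation of the failure mode described in the quoted text. It is not the io.py code: the function name chunked_decode, the buggy flag, and the 6-byte chunk size (instead of 128) are made up for illustration, and utf-16-le is assumed so that no BOM is involved. It only shows why prepending b'\r' at the byte level misaligns a 2-bytes-per-char encoding, while prepending '\r' to the decoded text does not.

```python
import codecs


def chunked_decode(data, encoding, chunk_size, buggy):
    """Decode `data` chunk by chunk, translating CR LF to LF.

    Hypothetical helper, not the real IncrementalNewlineDecoder.
    buggy=True  : a held-back CR is prepended as the byte b'\r' to the
                  next chunk before decoding (the approach described above).
    buggy=False : the held-back CR is prepended as the character '\r'
                  after the next chunk has been decoded.
    """
    decoder = codecs.getincrementaldecoder(encoding)()
    pieces = []
    pending_cr = False
    for start in range(0, len(data), chunk_size):
        chunk = data[start:start + chunk_size]
        if buggy and pending_cr:
            # One extra byte shifts every following 2-byte code unit.
            chunk = b"\r" + chunk
        text = decoder.decode(chunk)
        if not buggy and pending_cr:
            text = "\r" + text
        # Hold back a trailing CR in case the next chunk starts with LF.
        pending_cr = text.endswith("\r")
        if pending_cr:
            text = text[:-1]
        pieces.append(text.replace("\r\n", "\n"))
    if pending_cr:
        pieces.append("\r")
    return "".join(pieces)


# 'ab\r' fills the first 6-byte chunk exactly, so the CR ends chunk n
# and the LF starts chunk n+1.
raw = "ab\r\ncd".encode("utf-16-le")

good = chunked_decode(raw, "utf-16-le", 6, buggy=False)
bad = chunked_decode(raw, "utf-16-le", 6, buggy=True)

print(ascii(good))
print(ascii(bad))
```

With buggy=False the text comes back with the CR LF collapsed to a single newline; with buggy=True everything after the chunk boundary is decoded one byte out of step.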
--
http://mail.python.org/mailman/listinfo/python-list
