Re: [Python-Dev] PEP 383 update: utf8b is now the error handler

James Y Knight Wed, 06 May 2009 07:42:18 -0700

On May 6, 2009, at 5:39 AM, Stephen J. Turnbull wrote:

> Now, with Python's file system encoding == UTF-8 or any packed EUC,
> and more than a handful of Shift JIS or Big5 characters in file names,
> one is *almost certain* to encounter ASCII as the second byte of a
> multibyte sequence.  PEP 383 can't handle this

Hm, I haven't tried the implementation, but I thought that what would
happen is:

  '\x85a'.decode('utf-8', 'utf8b/surrogate-replace/whateveritscalled') -> u'\uDC85a'

If that indeed doesn't happen, that's certainly a defect and should be
remedied.
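
For concreteness, here is a minimal sketch of that expected behavior. It assumes Python 3.1 or later, where the PEP 383 handler ended up being named "surrogateescape" (the name was not settled when this message was written). The Shift JIS sample is my own illustration, not taken from the thread: katakana "ソ" encodes to the byte pair 0x83 0x5C, whose second byte is an ASCII backslash.

```python
# Assumption: Python 3.1+, where PEP 383's error handler is "surrogateescape".
raw = b"\x85a"  # 0x85 cannot start a UTF-8 sequence; 0x61 is ASCII "a"

decoded = raw.decode("utf-8", "surrogateescape")
# The undecodable byte maps to the lone surrogate U+DC85; "a" decodes normally.
print(ascii(decoded))

# Encoding with the same handler restores the original bytes exactly.
restored = decoded.encode("utf-8", "surrogateescape")
print(restored == raw)

# Illustrative Shift JIS file name fragment: lead byte 0x83, trailing byte 0x5C.
sjis_name = "ソ".encode("shift_jis")
sjis_decoded = sjis_name.decode("utf-8", "surrogateescape")
# Only the non-ASCII lead byte is escaped; the ASCII trailing byte passes through.
print(ascii(sjis_decoded))
print(sjis_decoded.encode("utf-8", "surrogateescape") == sjis_name)
```

In both cases only the invalid byte is escaped and the ASCII byte that follows it is decoded as itself, so the round trip back to bytes is lossless.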

> , but it is sure to be
> the most common use case for PEP 383 in East Asia.


Yes.

James
