Antoine Pitrou pit...@free.fr added the comment:
Here is a patch.
--
keywords: +patch
stage:  -> patch review
Added file: http://bugs.python.org/file23686/utf7.patch
___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue1
Martin v. Löwis mar...@v.loewis.de added the comment:
Can you please regenerate the patch against default's head?
--
Antoine Pitrou pit...@free.fr added the comment:
It's a patch for 3.2.
--
Martin v. Löwis mar...@v.loewis.de added the comment:
Please don't use git-style diffs then, since otherwise the review can't figure
out what the patch applies to (and neither could I figure that out).
--
Antoine Pitrou pit...@free.fr added the comment:
Here is a non-git diff then :)
--
Added file: http://bugs.python.org/file23688/utf7-nogit.patch
Martin v. Löwis mar...@v.loewis.de added the comment:
LGTM.
--
Roundup Robot devn...@psf.upfronthosting.co.za added the comment:
New changeset ddfcb0de564f by Antoine Pitrou in branch '3.2':
Issue #1: The UTF-7 decoder now accepts lone surrogates
http://hg.python.org/cpython/rev/ddfcb0de564f
New changeset 250091e60f28 by Antoine Pitrou in branch
Antoine Pitrou pit...@free.fr added the comment:
I made a little fix to the patch for wide unicode builds and then committed it.
Thank you!
--
resolution:  -> fixed
stage: patch review -> committed/rejected
status: open -> closed
Ezio Melotti ezio.melo...@gmail.com added the comment:
FWIW Wikipedia says "Other characters must be encoded in UTF-16 (hence U+10000
and higher would be encoded into surrogates) and then in modified Base64."
So one possible interpretation is that while encoding a non-BMP char, it should
be
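For readers following along, the step the Wikipedia quote describes (a non-BMP character becoming a UTF-16 surrogate pair before the Base64 step) can be sketched as below. The helper name `surrogate_pair` and the sample code point U+10400 are illustrative assumptions, not something taken from the patch.

```python
def surrogate_pair(code_point):
    """Split a non-BMP code point into its UTF-16 (high, low) surrogates.

    Hypothetical helper for illustration; not part of the patch.
    """
    if not 0x10000 <= code_point <= 0x10FFFF:
        raise ValueError("not a non-BMP code point")
    offset = code_point - 0x10000
    high = 0xD800 + (offset >> 10)    # top 10 bits of the offset
    low = 0xDC00 + (offset & 0x3FF)   # bottom 10 bits of the offset
    return high, low


high, low = surrogate_pair(0x10400)
print(hex(high), hex(low))

# The built-in UTF-16 codec produces the same two 16-bit units.
print("\U00010400".encode("utf-16-be").hex())
```

The utf-7 codec then Base64-encodes those two 16-bit units, which is why a lone half of such a pair can appear in encoded data at all.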
New submission from Antoine Pitrou pit...@free.fr:
The utf-7 codec happily encodes lone surrogates, but it won't decode them:
>>> "\ud801".encode("utf-7")
b'+2AE-'
>>> "\ud801\ud801".encode("utf-7")
b'+2AHYAQ-'
>>> "\ud801".encode("utf-7").decode("utf-7")
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
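A small check along these lines can tell whether a given interpreter still shows the asymmetry reported above. This is only a sketch; the function name `roundtrips_through_utf7` is made up for illustration.

```python
def roundtrips_through_utf7(text):
    """Return True if text survives encode + decode with the utf-7 codec.

    Hypothetical helper for illustration only.
    """
    try:
        return text.encode("utf-7").decode("utf-7") == text
    except UnicodeError:
        # An interpreter without the fix fails in the decode step.
        return False


lone = "\ud801"
print(lone.encode("utf-7"))           # encoding works in either case
print(roundtrips_through_utf7(lone))  # False on an affected build
```

On an interpreter that includes the committed fix, the second call should report True.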
Changes by Petri Lehtinen pe...@digip.org:
--
nosy: +petri.lehtinen
Martin v. Löwis mar...@v.loewis.de added the comment:
RFC 2152 talks about encoding 16-bit unicode, and clarifies
"Surrogate pairs (UTF-16) are converted by treating each half
of the pair as a separate 16 bit quantity (i.e., no special
treatment)."
So lone surrogates clearly should be
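To make the RFC wording concrete, here is a rough sketch of the "no special treatment" rule: each 16-bit unit is packed big-endian and the bytes are Base64-encoded without padding, whether or not the unit is a surrogate. The function `utf7_shift_sequence` is a hypothetical illustration, not CPython's implementation, and it only covers the shifted `+...-` form.

```python
import base64
import struct


def utf7_shift_sequence(code_units):
    """Encode 16-bit code units as one UTF-7 shift sequence.

    Hypothetical sketch: every unit is handled the same way, so a lone
    surrogate needs no special case.
    """
    raw = struct.pack(">%dH" % len(code_units), *code_units)
    # Modified Base64 uses the standard alphabet but drops '=' padding.
    b64 = base64.b64encode(raw).rstrip(b"=")
    return b"+" + b64 + b"-"


print(utf7_shift_sequence([0xD801]))
print(utf7_shift_sequence([0xD801, 0xD801]))
```

Both results match what the codec's encoder produces in the original report, which suggests the encoder already follows this reading of the RFC and only the decoder was stricter.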