Roundup Robot added the comment:
New changeset b11507395ce4 by Serhiy Storchaka in branch '3.3':
Add tests for issue #18183.
http://hg.python.org/cpython/rev/b11507395ce4
New changeset 17c9f1627baf by Serhiy Storchaka in branch 'default':
Add tests for issue #18183.
Changes by Serhiy Storchaka storch...@gmail.com:
--
stage: patch review -> committed/rejected
status: open -> closed
___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue18183
___
New submission from Dave Challis:
This occurred when attempting to decode invalid UTF-8 bytes using
errors='replace', then attempting to lowercase the produced unicode string.
This was also tested in python 2.7, but it doesn't occur there.
Code to reproduce:
x =
Changes by Serhiy Storchaka storch...@gmail.com:
--
components: +Interpreter Core
nosy: +serhiy.storchaka
stage: -> needs patch
versions: +Python 3.4
Serhiy Storchaka added the comment:
Minimal example:
'\U00010000\U00100000'.lower()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
SystemError: invalid maximum character passed to PyUnicode_New
--
Changes by Serhiy Storchaka storch...@gmail.com:
--
assignee: -> serhiy.storchaka
Serhiy Storchaka added the comment:
It happens due to the use of the fast MAX_MAXCHAR() macro, which can produce a maxchar out of range
(0x10000 | 0x100000 > MAX_UNICODE).
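
The overflow described here can be illustrated in Python. This is only a sketch: it assumes the macro combined its two arguments with a bitwise OR, as the expression in the comment suggests, and the function names are hypothetical stand-ins rather than CPython identifiers.

```python
MAX_UNICODE = 0x10FFFF  # highest valid Unicode code point


def max_maxchar_fast(a, b):
    # Assumed shape of the removed macro: a bitwise OR of both values.
    return a | b


def max_maxchar_safe(a, b):
    # Plain maximum, which never exceeds the larger input.
    return max(a, b)


# With genuine "maxchar" values, OR and max agree.
for a, b in [(0x7F, 0xFF), (0xFF, 0xFFFF), (0xFFFF, 0x10FFFF)]:
    assert max_maxchar_fast(a, b) == max_maxchar_safe(a, b)

# With arbitrary code points, OR can exceed MAX_UNICODE.
combined = max_maxchar_fast(0x10000, 0x100000)
print(hex(combined), combined > MAX_UNICODE)  # 0x110000 True
```

Both inputs are valid code points, yet the OR of the two is 0x110000, one past the end of the Unicode range.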
--
Amaury Forgeot d'Arc added the comment:
a = chr(0x84b2e)+chr(0x109710)
a.lower()
SystemError: invalid maximum character passed to PyUnicode_New
The MAX_MAXCHAR() macro only works for 'maxchar' values, like 0xff, 0x...
in do_upper_or_lower() it's used with arbitrary UCS4 values.
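
A sketch of one way the OR trick could be kept safe for arbitrary code points: round each value up to its storage class first. The class limits (0x7f, 0xff, 0xffff, 0x10ffff) follow the documented behaviour of PyUnicode_New; the helper names are hypothetical, and this is not the change that was committed for this issue.

```python
# Upper bounds of the four storage classes used by PyUnicode_New.
MAXCHAR_CLASSES = (0x7F, 0xFF, 0xFFFF, 0x10FFFF)


def maxchar_class(code_point):
    """Round an arbitrary code point up to the limit of its class."""
    for limit in MAXCHAR_CLASSES:
        if code_point <= limit:
            return limit
    raise ValueError("code point out of Unicode range")


def combine_maxchar(a, b):
    # OR is safe here because both operands are class limits.
    return maxchar_class(a) | maxchar_class(b)


a, b = 0x84B2E, 0x109710
print(hex(a | b))                  # 0x18df3e, above 0x10ffff
print(hex(combine_maxchar(a, b)))  # 0x10ffff
```

The raw OR of the two code points from the example above is 0x18df3e, which is out of range, while the rounded variant stays at 0x10ffff.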
Roundup Robot added the comment:
New changeset 89b106d298a9 by Benjamin Peterson in branch '3.3':
remove MAX_MAXCHAR because it's unsafe for computing maximum codepoint value
(see #18183)
http://hg.python.org/cpython/rev/89b106d298a9
New changeset 668aba845fb2 by Benjamin Peterson in branch
Benjamin Peterson added the comment:
I simply removed the MAX_MAXCHAR micro-optimization, since it seems fairly
unsafe. Interested parties can restore it safely if they wish.
--
nosy: +benjamin.peterson
resolution: -> fixed
status: open -> closed
STINNER Victor added the comment:
Oops, my MAX_MAXCHAR macro was too optimized :-) (the result is incorrect)
It shows us that the test suite does not have enough tests for non-BMP characters.
--
Serhiy Storchaka added the comment:
Here are additional tests for this issue.
--
keywords: +patch
stage: needs patch -> patch review
status: closed -> open
Added file: http://bugs.python.org/file30533/test_issue18183.patch
STINNER Victor added the comment:
+'\U00010000\U00100000'.lower()
Why not check the result of these calls?
--
Serhiy Storchaka added the comment:
The result is trivial. Wouldn't checking the result distract attention from
the main issue?
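
For comparison, a test that also checks the result could be sketched as follows. It assumes the characters involved have no case mappings, so every case method should return the string unchanged. The helper name and the list of methods are illustrative and are not taken from the attached patch.

```python
CASE_METHODS = ("lower", "upper", "casefold", "swapcase", "title", "capitalize")


def check_case_methods_preserve(s):
    """Call each case method and check the string comes back unchanged."""
    for name in CASE_METHODS:
        result = getattr(s, name)()
        assert result == s, (name, result)


# Pairs of non-BMP code points whose bitwise OR exceeds 0x10ffff.
check_case_methods_preserve(chr(0x10000) + chr(0x100000))
check_case_methods_preserve(chr(0x84B2E) + chr(0x109710))
```

On an affected build the calls would raise SystemError before any comparison is made, so the assertion costs little beyond the bare call.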
--