[issue18183] Calling .lower() on certain unicode string raises SystemError

2013-06-12 Thread Roundup Robot
Roundup Robot added the comment:

New changeset b11507395ce4 by Serhiy Storchaka in branch '3.3':
Add tests for issue #18183.
http://hg.python.org/cpython/rev/b11507395ce4

New changeset 17c9f1627baf by Serhiy Storchaka in branch 'default':
Add tests for issue #18183.

2013-06-12 Thread Serhiy Storchaka
Changes by Serhiy Storchaka storch...@gmail.com:

stage: patch review -> committed/rejected
status: open -> closed

Python tracker rep...@bugs.python.org
http://bugs.python.org/issue18183

2013-06-10 Thread Dave Challis
New submission from Dave Challis:

This occurred when decoding invalid UTF-8 bytes with errors='replace' and then lowercasing the resulting unicode string. The same code was also tested on Python 2.7, where the error does not occur.

Code to reproduce: x =

2013-06-10 Thread Serhiy Storchaka
Changes by Serhiy Storchaka storch...@gmail.com:

components: +Interpreter Core
nosy: +serhiy.storchaka
stage: -> needs patch
versions: +Python 3.4

2013-06-10 Thread Serhiy Storchaka
Serhiy Storchaka added the comment:

Minimal example:

    >>> '\U00010000\U00100000'.lower()
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    SystemError: invalid maximum character passed to PyUnicode_New

2013-06-10 Thread Serhiy Storchaka
Changes by Serhiy Storchaka storch...@gmail.com:

assignee: -> serhiy.storchaka

2013-06-10 Thread Serhiy Storchaka
Serhiy Storchaka added the comment:

This happens because of the fast MAX_MAXCHAR() macro, which can produce a maxchar that is out of range (0x10000 | 0x100000 > MAX_UNICODE).

2013-06-10 Thread Amaury Forgeot d'Arc
Amaury Forgeot d'Arc added the comment:

    >>> a = chr(0x84b2e)+chr(0x109710)
    >>> a.lower()
    SystemError: invalid maximum character passed to PyUnicode_New

The MAX_MAXCHAR() macro only works for 'maxchar' values, like 0xff, 0x... In do_upper_or_lower() it is used with arbitrary UCS4 values.
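
The point about 'maxchar' values can be illustrated in pure Python. This is only a sketch, not the CPython source: it assumes, based on the `0x... | 0x...` expression quoted earlier in the thread, that the macro combined its two arguments with a bitwise OR. The helper names `or_based_maxchar` and `safe_maxchar` are made up for this illustration.

```python
MAX_UNICODE = 0x10FFFF  # highest valid Unicode code point


def or_based_maxchar(a, b):
    """Hypothetical stand-in for the removed macro: combine with bitwise OR."""
    return a | b


def safe_maxchar(a, b):
    """Plain maximum, which is correct for arbitrary code points."""
    return a if a > b else b


# For all-ones bounds such as 0xff and 0xffff, OR and max agree.
assert or_based_maxchar(0xFF, 0xFFFF) == safe_maxchar(0xFF, 0xFFFF)

# For the arbitrary code points from the report they do not:
# the OR result exceeds MAX_UNICODE, while the true maximum stays in range.
a, b = 0x84B2E, 0x109710
combined = or_based_maxchar(a, b)
largest = safe_maxchar(a, b)
print(hex(combined), combined > MAX_UNICODE)
print(hex(largest), largest <= MAX_UNICODE)
```

An out-of-range value like `combined` is consistent with the "invalid maximum character passed to PyUnicode_New" error shown in the report.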

2013-06-10 Thread Roundup Robot
Roundup Robot added the comment:

New changeset 89b106d298a9 by Benjamin Peterson in branch '3.3':
remove MAX_MAXCHAR because it's unsafe for computing maximum codepoint value (see #18183)
http://hg.python.org/cpython/rev/89b106d298a9

New changeset 668aba845fb2 by Benjamin Peterson in branch

2013-06-10 Thread Benjamin Peterson
Benjamin Peterson added the comment:

I simply removed the MAX_MAXCHAR micro-optimization, since it seems fairly unsafe. Interested parties can restore it safely if they wish.

nosy: +benjamin.peterson
resolution: -> fixed
status: open -> closed

2013-06-10 Thread STINNER Victor
STINNER Victor added the comment:

Oops, my MAX_MAXCHAR macro was too optimized :-) (the result is incorrect)

It shows us that the test suite does not have enough tests for non-BMP characters.

2013-06-10 Thread Serhiy Storchaka
Serhiy Storchaka added the comment:

Here are additional tests for this issue.

keywords: +patch
stage: needs patch -> patch review
status: closed -> open
Added file: http://bugs.python.org/file30533/test_issue18183.patch

2013-06-10 Thread STINNER Victor
STINNER Victor added the comment:

    +        '\U00010000\U00100000'.lower()

Why not check the result of these calls?

2013-06-10 Thread Serhiy Storchaka
Serhiy Storchaka added the comment:

The result is trivial. Wouldn't checking the result distract attention from the main issue?
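
For readers weighing the two positions, here is a minimal sketch of a regression test that does check the result. It is not the committed patch: the class and method names are hypothetical, and the inputs are the code points from Amaury's comment plus one cased non-BMP letter (Deseret) added here as an assumption about what a stricter test might cover.

```python
import unittest


class NonBMPLowerTest(unittest.TestCase):
    """Hypothetical test case; names are illustrative only."""

    def test_uncased_non_bmp_is_unchanged(self):
        # Code points from the report; neither has a case mapping,
        # so lower() should return an equal string instead of raising.
        s = chr(0x84B2E) + chr(0x109710)
        self.assertEqual(s.lower(), s)

    def test_cased_non_bmp_is_mapped(self):
        # DESERET CAPITAL LETTER LONG I lowercases to
        # DESERET SMALL LETTER LONG I.
        self.assertEqual('\U00010400'.lower(), '\U00010428')
```

Checking equality costs one extra line per case and also guards against a fix that avoids the SystemError but returns the wrong string.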