New submission from STINNER Victor <victor.stin...@haypocalc.com>:

I added a test in _PyUnicode_CheckConsistency() (in debug mode) to ensure that 
all characters of a string are in the range U+0000-U+10FFFF. Locale tests are 
now failing on Solaris:

-----------------------------------
[ 28/361] test__locale
Assertion failed: maxchar <= 0x10FFFF, file Objects/unicodeobject.c, line 408
Fatal Python error: Aborted

Current thread 0x00000001:
  File "/home2/buildbot/slave/3.x.loewis-sun/build/Lib/test/test__locale.py", line 134 in test_float_parsing
  File "/home2/buildbot/slave/3.x.loewis-sun/build/Lib/unittest/case.py", line 385 in _executeTestPart
  File "/home2/buildbot/slave/3.x.loewis-sun/build/Lib/unittest/case.py", line 440 in run
  File "/home2/buildbot/slave/3.x.loewis-sun/build/Lib/unittest/case.py", line 492 in __call__
  File "/home2/buildbot/slave/3.x.loewis-sun/build/Lib/unittest/suite.py", line 105 in run
  File "/home2/buildbot/slave/3.x.loewis-sun/build/Lib/unittest/suite.py", line 67 in __call__
  File "/home2/buildbot/slave/3.x.loewis-sun/build/Lib/unittest/suite.py", line 105 in run
  File "/home2/buildbot/slave/3.x.loewis-sun/build/Lib/unittest/suite.py", line 67 in __call__
  File "/home2/buildbot/slave/3.x.loewis-sun/build/Lib/unittest/runner.py", line 168 in run
  File "/home2/buildbot/slave/3.x.loewis-sun/build/Lib/test/support.py", line 1368 in _run_suite
  File "/home2/buildbot/slave/3.x.loewis-sun/build/Lib/test/support.py", line 1402 in run_unittest
  File "/home2/buildbot/slave/3.x.loewis-sun/build/Lib/test/test__locale.py", line 139 in test_main
  File "/home2/buildbot/slave/3.x.loewis-sun/build/Lib/test/regrtest.py", line 1203 in runtest_inner
  File "/home2/buildbot/slave/3.x.loewis-sun/build/Lib/test/regrtest.py", line 906 in runtest
  File "/home2/buildbot/slave/3.x.loewis-sun/build/Lib/test/regrtest.py", line 709 in main
  File "/home2/buildbot/slave/3.x.loewis-sun/build/Lib/test/__main__.py", line 13 in <module>
  File "/home2/buildbot/slave/3.x.loewis-sun/build/Lib/runpy.py", line 73 in _run_code
  File "/home2/buildbot/slave/3.x.loewis-sun/build/Lib/runpy.py", line 160 in _run_module_as_main
*** Error code 134
-----------------------------------

The problem is that strxfrm() and wcsxfrm() return strange results for the 
string "a" in an English locale (e.g. en_US.UTF-8).

strxfrm(buffer, "a\0", 100) returns 21 (bytes) but only 2 bytes are written 
("\x01\x00"). The next bytes are unchanged.

wcsxfrm(buffer, L"a\0", 100) returns 7 (characters). All 7 characters are 
written, but they are in the range U+1010101..U+1010163, whereas the maximum 
code point of Unicode 6.0 is U+10FFFF (U+101xxxx vs. U+10xxxx).
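
From Python, the transformed key can be inspected through locale.strxfrm(), 
which as far as I know is built on wcsxfrm() when the platform provides it. 
The following is only a minimal sketch and is not the attached strxfrm.c: the 
helper name strxfrm_code_points is hypothetical, and it defaults to the "C" 
locale because en_US.UTF-8 may not be installed everywhere. On an affected 
Solaris debug build, passing "en_US.UTF-8" would presumably trigger the 
assertion instead of returning.

```python
import locale


def strxfrm_code_points(text, locale_name="C"):
    """Return the code points of locale.strxfrm(text) as hex strings.

    Hypothetical helper: LC_COLLATE is switched temporarily and then
    restored to its previous setting.
    """
    previous = locale.setlocale(locale.LC_COLLATE)  # query current setting
    try:
        locale.setlocale(locale.LC_COLLATE, locale_name)
        return [hex(ord(ch)) for ch in locale.strxfrm(text)]
    finally:
        locale.setlocale(locale.LC_COLLATE, previous)


if __name__ == "__main__":
    print(strxfrm_code_points("a"))
```

Every value printed by this helper should be at most 0x10ffff on a platform 
whose wcsxfrm() stays within the Unicode range.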

Output of the attached program, strxfrm.c, on OpenSolaris:
-----------------------------------
strxfrm: len=21
0x01
0x00
0xff
0xff
0xff
0xff
0xff
0xff
0xff
0xff
0xff
0xff
0xff
0xff
0xff
0xff
0xff
0xff
0xff
0xff
0xff

wcsxfrm: len=7
U+1010163
U+1010101
U+1010103
U+1010101
U+1010103
U+1010101
U+1010101
-----------------------------------
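
To make the comparison explicit, here is a small sketch that checks the 
wcsxfrm values printed above against the U+10FFFF limit, in the spirit of 
the "maxchar <= 0x10FFFF" assertion. The names MAX_CODE_POINT, REPORTED and 
out_of_range are made up for this illustration.

```python
MAX_CODE_POINT = 0x10FFFF  # highest valid Unicode code point

# Values copied from the wcsxfrm output in this report.
REPORTED = [0x1010163, 0x1010101, 0x1010103, 0x1010101,
            0x1010103, 0x1010101, 0x1010101]


def out_of_range(values, limit=MAX_CODE_POINT):
    """Return the values that are greater than the given limit."""
    return [v for v in values if v > limit]


if __name__ == "__main__":
    bad = out_of_range(REPORTED)
    print(len(bad), "of", len(REPORTED), "values exceed U+10FFFF")
```

All seven reported values are above the limit, so any one of them is enough 
to trip the new consistency check.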

I don't know if it's normal that wcsxfrm() writes characters in the range 
U+1010101..U+1010163.

Is Python supposed to support characters outside the U+0000-U+10FFFF range? 
chr(0x10FFFF+1) raises a ValueError.
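
The chr() behaviour can be checked with a short sketch like the following. 
The helper try_chr is hypothetical and only wraps the call so that both 
sides of the boundary can be shown together.

```python
import sys


def try_chr(code_point):
    """Return chr(code_point), or the ValueError raised for it."""
    try:
        return chr(code_point)
    except ValueError as exc:
        return exc


if __name__ == "__main__":
    print(repr(try_chr(0x10FFFF)))      # last valid code point
    print(repr(try_chr(0x10FFFF + 1)))  # out of range: a ValueError instance
    print(hex(sys.maxunicode))          # 0x10ffff on Python 3.3 and later
```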

----------
components: Unicode
files: strxfrm.c
messages: 148017
nosy: ezio.melotti, haypo, loewis, pitrou
priority: normal
severity: normal
status: open
title: TestEnUSCollation.test_strxfrm() fails on Solaris
versions: Python 3.3
Added file: http://bugs.python.org/file23741/strxfrm.c

_______________________________________
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue13441>
_______________________________________