Alexander Belopolsky added the comment:

Martin v. Löwis wrote at #18236 (msg191687):
> int conversion ultimately uses Py_ISSPACE, which conceptually could
> deviate from the Unicode properties (as it is byte-based). This is not
> really an issue, since they indeed match.

Py_ISSPACE matches the Unicode White_Space property in the ASCII range (first 128 code points), but it differs for byte (code point) values from 128 through 255. This leads to the following discrepancy:

>>> int('123\xa0')
123

but

>>> int(b'123\xa0')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xa0 in position 3: invalid start byte
>>> int('123\xa0'.encode())
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ValueError: invalid literal for int() with base 10: '123\xa0'

----------

_______________________________________
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue10581>
_______________________________________
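
A rough way to see which values in the 128 through 255 range are affected is to compare the str and bytes whitespace predicates from Python code. This is only a sketch under assumptions: bytes.isspace() is used here as a stand-in for the byte-based Py_ISSPACE check, and str.isspace() as a stand-in for the Unicode-aware check (its definition is close to, but not identical to, the White_Space property). The helper names whitespace_mismatches and int_rejects are hypothetical and exist only for this illustration.

```python
# Sketch only: bytes.isspace() is assumed to mirror the byte-based
# Py_ISSPACE check, and str.isspace() the Unicode-aware one.
# Both helper names below are hypothetical.

def whitespace_mismatches(start=128, stop=256):
    """Return code points where str.isspace() and bytes.isspace() disagree."""
    return [
        cp
        for cp in range(start, stop)
        if chr(cp).isspace() != bytes([cp]).isspace()
    ]


def int_rejects(value):
    """Return True if int(value) raises, False if it parses."""
    try:
        int(value)
    except ValueError:
        # UnicodeDecodeError is a subclass of ValueError, so this also
        # covers the decoding failure shown in the session above.
        return True
    return False


mismatches = whitespace_mismatches()
print([hex(cp) for cp in mismatches])

# The str path accepts a trailing no-break space; the bytes path does not.
print(int('123\xa0'))
print(int_rejects(b'123\xa0'))
```

The exact exception raised for the bytes input may depend on the Python version, which is why the sketch catches ValueError rather than a more specific type.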