Martin v. Löwis added the comment:

I stand by that comment: IsWhiteSpace should use the Unicode White_Space property. Since FS/GS/RS/US are not in the White_Space property, it's correct that the int conversion fails. It's incorrect that .isspace() gives true.
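To make the claim about FS/GS/RS/US (U+001C to U+001F) concrete, here is a rough sketch that checks characters against the White_Space property. The range table is transcribed by hand from PropList.txt of recent Unicode versions and should be treated as an assumption to verify against the real file. The helper names `has_white_space_property` and `report` are hypothetical and not part of CPython. What `str.isspace()` and `int()` report depends on the interpreter version, so the script prints those values rather than assuming them.

```python
# White_Space code point ranges, transcribed by hand from PropList.txt
# (recent Unicode versions). This table is an assumption of this sketch;
# check it against the actual data file before relying on it.
WHITE_SPACE_RANGES = [
    (0x0009, 0x000D),  # TAB, LF, VT, FF, CR
    (0x0020, 0x0020),  # SPACE
    (0x0085, 0x0085),  # NEXT LINE
    (0x00A0, 0x00A0),  # NO-BREAK SPACE
    (0x1680, 0x1680),  # OGHAM SPACE MARK
    (0x2000, 0x200A),  # EN QUAD .. HAIR SPACE
    (0x2028, 0x2029),  # LINE SEPARATOR, PARAGRAPH SEPARATOR
    (0x202F, 0x202F),  # NARROW NO-BREAK SPACE
    (0x205F, 0x205F),  # MEDIUM MATHEMATICAL SPACE
    (0x3000, 0x3000),  # IDEOGRAPHIC SPACE
]


def has_white_space_property(ch):
    """Return True if the single character ch falls in the table above."""
    cp = ord(ch)
    return any(lo <= cp <= hi for lo, hi in WHITE_SPACE_RANGES)


def report(chars):
    """For each character, collect (code point, White_Space?, isspace(), int() ok?)."""
    rows = []
    for ch in chars:
        try:
            int(ch + "1")  # does int() strip this character as leading whitespace?
            int_ok = True
        except ValueError:
            int_ok = False
        rows.append(
            (f"U+{ord(ch):04X}", has_white_space_property(ch), ch.isspace(), int_ok)
        )
    return rows


if __name__ == "__main__":
    # FS, GS, RS, US, then a few characters that do carry White_Space.
    for row in report("\x1c\x1d\x1e\x1f \t\xa0"):
        print(row)
```

Any row where the second and third columns disagree is a character for which .isspace() deviates from the White_Space property on the interpreter running the script.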
There are really several bugs here:
- .isspace doesn't use the White_Space property
- int conversion ultimately uses Py_ISSPACE, which conceptually could deviate from the Unicode properties (as it is byte-based). This is not really an issue, since they indeed match.

I propose to fix this by parsing PropList.txt, and generating _PyUnicode_IsWhitespace based on the White_Space property. For efficiency, it should also generate a fast-lookup array for the ASCII case, or just use _Py_ctype_table (with a comment that this table needs to match PropList White_Space). _Py_ascii_whitespace should go.

Contributions are welcome.

----------

_______________________________________
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue18236>
_______________________________________