Matthew Barnett <[email protected]> added the comment:
I use Python 3, where len("\U00010337") == 2 on a narrow build.
Yes, wide Unicode on a narrow build is a problem:
>>> regex.findall("\\U00010337", "a\U00010337bc")
[]
>>> regex.findall("(?i)\\U00010337", "a\U00010337bc")
[]
I'm not sure how (or whether!) to handle surrogate pairs. It _would_ make
things more complicated.
I suppose the moral is that if you want to use wide Unicode then you really
should use a wide build.
----------
_______________________________________
Python tracker <[email protected]>
<http://bugs.python.org/issue2636>
_______________________________________
_______________________________________________
Python-bugs-list mailing list
Unsubscribe:
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com