Marc-Andre Lemburg added the comment:
On 12.06.2013 07:32, Alexander Belopolsky wrote:
>
> Alexander Belopolsky added the comment:
>
> It looks like we are approaching consensus on some points:
>
> 1. Mixed script numerals should be disallowed.
> 2. '\N{MINUS SIGN}' should be accepted as an alternative to '\N{HYPHEN-MINUS}'
>
> Open question: should we accept fullwidth + and -, sub/superscript variants
> etc.? I believe rather than debating variant codepoints one by one, we
> should consider applying NFKC (compatibility) normalization to unicode
> strings to be interpreted as numbers. This would allow parsing strings like
> this:
>
> >>> float(normalize('NFKC', '\N{FULLWIDTH HYPHEN-MINUS}\N{DIGIT ONE FULL STOP}\N{FULLWIDTH DIGIT TWO}'))
> -1.2
While that would solve these cases, I think normalizing every
input string would cause a significant performance hit.
Perhaps we could do this in two phases:
1. detect whether the string uses non-ASCII digits and symbols
2. if it does, apply normalization and then use the decimal codec
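
A minimal sketch of the two-phase idea, for illustration only. The
function name parse_number is hypothetical, str.isascii() (Python 3.7+)
stands in for the detection step, and the built-in float() stands in for
the decimal codec mentioned above; the real change would live in the
C-level parsing code, not in Python.

```python
import unicodedata


def parse_number(text):
    """Parse a float, normalizing only when non-ASCII input is present."""
    # Phase 1: cheap detection. Pure-ASCII strings take the fast path
    # and never pay for normalization.
    if text.isascii():
        return float(text)
    # Phase 2: apply NFKC so that compatibility variants (fullwidth
    # signs and digits, "digit full stop" forms, ...) are mapped to
    # their plain equivalents, then parse the result.
    return float(unicodedata.normalize('NFKC', text))


print(parse_number('1.5'))
print(parse_number('\N{FULLWIDTH HYPHEN-MINUS}\N{DIGIT ONE FULL STOP}\N{FULLWIDTH DIGIT TWO}'))
```

With this split, only strings that actually contain non-ASCII
characters incur the cost of normalization.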
----------
_______________________________________
Python tracker <[email protected]>
<http://bugs.python.org/issue10581>
_______________________________________