[Python-Dev] Unicode minus sign in numeric conversions
Expected behaviour:

    float('\N{MINUS SIGN}12.34')
    -12.34

Current behaviour:

    Traceback (most recent call last):
      ...
    ValueError: could not convert string to float: '−12.34'

Please note: '\N{MINUS SIGN}' == '\u2212'

--
Best regards,
Łukasz Langa

WWW: http://lukasz.langa.pl/
Twitter: @llanga
IRC: ambv on #python-dev
___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] Unicode minus sign in numeric conversions
[Diverting to python-ideas, since this isn't as clear-cut as you think.]

Why exactly is that expected behavior? What's the use case? (Surely you don't have a keyboard that generates \u2212 when you hit the minus key? :-)

Is there a Unicode standard for parsing numbers? IIRC there are a variety of other things marked as digits in the Unicode standard -- do we do anything with those? If we do anything we should be consistent. For now, I think we *are* consistent -- we only support the ASCII representation of numbers. (And that's the only representation we generate as output as well -- think about symmetry too.)

This page scares me: http://en.wikipedia.org/wiki/Numerals_in_Unicode

--Guido

On Sat, Jun 8, 2013 at 2:49 PM, Łukasz Langa luk...@langa.pl wrote:
> Expected behaviour:
>
>     float('\N{MINUS SIGN}12.34')
>     -12.34
>
> Current behaviour:
>
>     Traceback (most recent call last):
>       ...
>     ValueError: could not convert string to float: '−12.34'
>
> Please note: '\N{MINUS SIGN}' == '\u2212'

--
--Guido van Rossum (python.org/~guido)
Re: [Python-Dev] Unicode minus sign in numeric conversions
On 08/06/2013 23:30, Guido van Rossum wrote:
> [Diverting to python-ideas, since this isn't as clear-cut as you think.]
>
> Why exactly is that expected behavior? What's the use case? (Surely you don't have a keyboard that generates \u2212 when you hit the minus key? :-)
>
> Is there a Unicode standard for parsing numbers? IIRC there are a variety of other things marked as digits in the Unicode standard -- do we do anything with those? If we do anything we should be consistent. For now, I think we *are* consistent -- we only support the ASCII representation of numbers. (And that's the only representation we generate as output as well -- think about symmetry too.)

We already recognise at least some of the digits:

    float('\N{ARABIC-INDIC DIGIT ONE}')
    1.0

(I haven't checked all of them!)
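The asymmetry described above can be checked directly. The following is only an illustrative sketch, not code from the thread; the variable names are made up for the example. It assumes Python 3 and uses the standard `unicodedata` module to show that the Arabic-Indic digit belongs to the decimal-digit category, while MINUS SIGN is a math symbol that `float()` rejects.

```python
import unicodedata

arabic_one = "\N{ARABIC-INDIC DIGIT ONE}"  # U+0661
minus_sign = "\N{MINUS SIGN}"              # U+2212

# The digit is in general category Nd (decimal digit); float() accepts it.
digit_category = unicodedata.category(arabic_one)
digit_result = float(arabic_one)

# MINUS SIGN is in general category Sm (math symbol); float() rejects it.
minus_category = unicodedata.category(minus_sign)
try:
    float(minus_sign + "12.34")
    minus_accepted = True
except ValueError:
    minus_accepted = False

print(digit_category, digit_result, minus_category, minus_accepted)
```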
Re: [Python-Dev] Unicode minus sign in numeric conversions
On Sun, 09 Jun 2013 01:39:59 +0100, MRAB pyt...@mrabarnett.plus.com wrote:
> On 08/06/2013 23:30, Guido van Rossum wrote:
>> [Diverting to python-ideas, since this isn't as clear-cut as you think.]
>>
>> Why exactly is that expected behavior? What's the use case? (Surely you don't have a keyboard that generates \u2212 when you hit the minus key? :-)
>>
>> Is there a Unicode standard for parsing numbers? IIRC there are a variety of other things marked as digits in the Unicode standard -- do we do anything with those? If we do anything we should be consistent. For now, I think we *are* consistent -- we only support the ASCII representation of numbers. (And that's the only representation we generate as output as well -- think about symmetry too.)
>
> We already recognise at least some of the digits:
>
>     float('\N{ARABIC-INDIC DIGIT ONE}')
>     1.0
>
> (I haven't checked all of them!)
>
>> This page scares me: http://en.wikipedia.org/wiki/Numerals_in_Unicode

http://bugs.python.org/issue6632 contains a bunch of good information relevant to this discussion. It looks like the argument there was that there is no standard for the signs, therefore we should not support them. As Guido said, the issue is non-trivial.

--David
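Since the built-in conversions do not accept U+2212, code that has to read such input can normalise the sign itself. The sketch below is one possible workaround, not something proposed in this thread or in the linked issue; `parse_float` is a hypothetical helper name, and it deliberately handles only MINUS SIGN, not the other sign-like characters in Unicode.

```python
def parse_float(text):
    """Hypothetical helper: accept U+2212 MINUS SIGN as a minus sign.

    Replaces every MINUS SIGN with ASCII hyphen-minus, then defers
    to the built-in float() for the actual parsing and validation.
    """
    return float(text.replace("\N{MINUS SIGN}", "-"))


# The built-in float() would raise ValueError on this input.
value = parse_float("\N{MINUS SIGN}12.34")
print(value)
```

Plain ASCII input passes through unchanged, so the helper can sit in front of `float()` without affecting existing callers.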