[issue10581] Review and document string format accepted in numeric data type constructors

Nick Coghlan Thu, 13 Jun 2013 05:13:32 -0700

Nick Coghlan added the comment:

I think PEP 393 gives us a quick way to get fast parsing: if the max char is 
< 128, just roll straight into normal processing; otherwise do the 
normalisation and "all decimal digits are from the same script" steps.
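
A minimal sketch of that dispatch, assuming Python 3.7+ where str.isascii() is 
available (it postdates this message; as far as I know CPython answers it from 
the string's PEP 393 representation without scanning). The name choose_path is 
hypothetical and only used for illustration:

```python
def choose_path(arg: str) -> str:
    """Report which parsing path a numeric string would take.

    "ascii": every character is below U+0080, so the normal parser can run.
    "unicode": normalisation and the same-script check are needed first.
    """
    return "ascii" if arg.isascii() else "unicode"


plain = choose_path("-12.5")          # ASCII only
arabic_indic = choose_path("\u0663")  # ARABIC-INDIC DIGIT THREE
```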


There are almost certainly better ways to do the script translation, but the 
example below tries to just do the "convert to ASCII" step to avoid duplicating 
the +/- and decimal point processing logic:

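    import unicodedata

    # Non-ASCII input: normalise, then map decimal digits to ASCII.
    if any(ord(c) >= 128 for c in arg):
        arg = unicodedata.normalize("NFKC", arg)
        originals = set()
        converted = []
        for c in arg:
            try:
                d = str(unicodedata.decimal(c))
            except ValueError:
                d = c  # not a decimal digit: keep it unchanged
            else:
                originals.add(ord(c))  # store code points so they can be compared
            converted.append(d)
        # Digits 0-9 of one script are less than 10 code points apart.
        if originals and max(originals) - min(originals) >= 10:
            raise ValueError("%s mixes digits from multiple scripts" % arg)
        arg = "".join(converted)
    # parse_ascii_number stands for the existing ASCII-only parser.
    result = parse_ascii_number(arg)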
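
To illustrate the building blocks the example above relies on, here is a small 
sketch using the standard unicodedata module. The sample strings and the helper 
name digit_spread are assumptions chosen for illustration, not part of the 
proposal:

```python
import unicodedata

arabic_indic = "\u0663\u0664"  # ARABIC-INDIC DIGIT THREE, FOUR
devanagari = "\u0967"          # DEVANAGARI DIGIT ONE
fullwidth = "\uff11\uff12"     # FULLWIDTH DIGIT ONE, TWO


def digit_spread(s: str) -> int:
    """Distance between the highest and lowest decimal-digit code points."""
    points = [ord(c) for c in s if c.isdecimal()]
    return max(points) - min(points) if points else 0


# unicodedata.decimal() maps a decimal digit from any script to its int value.
ascii_form = "".join(str(unicodedata.decimal(c)) for c in arabic_indic)

# NFKC folds compatibility forms such as fullwidth digits to ASCII digits.
nfkc_fullwidth = unicodedata.normalize("NFKC", fullwidth)

# One script stays within a spread of 9; mixing scripts goes well beyond it.
same_script = digit_spread(arabic_indic)
mixed_script = digit_spread(arabic_indic + devanagari)
```

Here same_script is below 10 while mixed_script is not, which is the condition 
the "mixes digits from multiple scripts" check tests for.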


P.S. I don't think the base argument is especially relevant here ('0x' is 
rejected because 'x' is not a base 10 digit, and we allow a base of 0 to 
request "use int literal base markers").
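
A short check of that behaviour with the built-in int(); the wrapper name 
try_int is hypothetical and only turns the ValueError into None so the cases 
can be compared side by side:

```python
def try_int(text, base=10):
    """Return int(text, base), or None if the string is rejected."""
    try:
        return int(text, base)
    except ValueError:
        return None


default_base = try_int("0x10")             # None: 'x' is not a base 10 digit
literal_base = try_int("0x10", 0)          # 16: base 0 honours the '0x' marker
non_ascii_digits = try_int("\u0663\u0664") # 34: Arabic-Indic digits are accepted
```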

----------
nosy: +ncoghlan

_______________________________________
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue10581>
_______________________________________