On Wed, 10 Feb 2016 at 14:21 Georg Brandl <g.bra...@gmx.net> wrote:
> This came up in python-ideas, and has met mostly positive comments,
> although the exact syntax rules are up for discussion.
>
> cheers,
> Georg
>
> --------------------------------------------------------------------------------
>
> PEP: 515
> Title: Underscores in Numeric Literals
> Version: $Revision$
> Last-Modified: $Date$
> Author: Georg Brandl
> Status: Draft
> Type: Standards Track
> Content-Type: text/x-rst
> Created: 10-Feb-2016
> Python-Version: 3.6
>
> Abstract and Rationale
> ======================
>
> This PEP proposes to extend Python's syntax so that underscores can be used in
> integral and floating-point number literals.
>
> This is a common feature of other modern languages, and can aid readability of
> long literals, or literals whose value should clearly separate into parts, such
> as bytes or words in hexadecimal notation.
>
> Examples::
>
>     # grouping decimal numbers by thousands
>     amount = 10_000_000.0
>
>     # grouping hexadecimal addresses by words
>     addr = 0xDEAD_BEEF
>
>     # grouping bits into bytes in a binary literal
>     flags = 0b_0011_1111_0100_1110
>
I assume all of these examples are possible in either the liberal or
restrictive approaches?

>
> Specification
> =============
>
> The current proposal is to allow underscores anywhere in numeric literals, with
> these exceptions:
>
> * Leading underscores cannot be allowed, since they already introduce
>   identifiers.
> * Trailing underscores are not allowed, because they look confusing and don't
>   contribute much to readability.
> * The number base prefixes ``0x``, ``0o``, and ``0b`` cannot be split up,
>   because they are fixed strings and not logically part of the number.
> * No underscore allowed after a sign in an exponent (``1e-_5``), because
>   underscores can also not be used after the signs in front of the number
>   (``-1e5``).
> * No underscore allowed after a decimal point, because this leads to ambiguity
>   with attribute access (the lexer cannot know that there is no number literal
>   in ``foo._5``).
>
> There appears to be no reason to restrict the use of underscores otherwise.
>
> The production list for integer literals would therefore look like this::
>
>     integer: decimalinteger | octinteger | hexinteger | bininteger
>     decimalinteger: nonzerodigit [decimalrest] | "0" [("0" | "_")* "0"]
>     nonzerodigit: "1"..."9"
>     decimalrest: (digit | "_")* digit
>     digit: "0"..."9"
>     octinteger: "0" ("o" | "O") (octdigit | "_")* octdigit
>     hexinteger: "0" ("x" | "X") (hexdigit | "_")* hexdigit
>     bininteger: "0" ("b" | "B") (bindigit | "_")* bindigit
>     octdigit: "0"..."7"
>     hexdigit: digit | "a"..."f" | "A"..."F"
>     bindigit: "0" | "1"
>
> For floating-point literals::
>
>     floatnumber: pointfloat | exponentfloat
>     pointfloat: [intpart] fraction | intpart "."
>     exponentfloat: (intpart | pointfloat) exponent
>     intpart: digit (digit | "_")*
>     fraction: "." intpart
>     exponent: ("e" | "E") "_"* ["+" | "-"] digit [decimalrest]
>
>
> Alternative Syntax
> ==================
>
> Underscore Placement Rules
> --------------------------
>
> Instead of the liberal rule specified above, the use of underscores could be
> limited. Common rules are (see the "other languages" section):
>
> * Only one consecutive underscore allowed, and only between digits.
> * Multiple consecutive underscore allowed, but only between digits.
>
> Different Separators
> --------------------
>
> A proposed alternate syntax was to use whitespace for grouping. Although
> strings are a precedent for combining adjoining literals, the behavior can lead
> to unexpected effects which are not possible with underscores. Also, no other
> language is known to use this rule, except for languages that generally
> disregard any whitespace.
>
> C++14 introduces apostrophes for grouping, which is not considered due to the
> conflict with Python's string literals. [1]_
>
>
> Behavior in Other Languages
> ===========================
>
> Those languages that do allow underscore grouping implement a large variety of
> rules for allowed placement of underscores. This is a listing placing the known
> rules into three major groups. In cases where the language spec contradicts the
> actual behavior, the actual behavior is listed.
>
> **Group 1: liberal (like this PEP)**
>
> * D [2]_
> * Perl 5 (although docs say it's more restricted) [3]_
> * Rust [4]_
> * Swift (although textual description says "between digits") [5]_
>
> **Group 2: only between digits, multiple consecutive underscores**
>
> * C# (open proposal for 7.0) [6]_
> * Java [7]_
>
> **Group 3: only between digits, only one underscore**
>
> * Ada [8]_
> * Julia (but not in the exponent part of floats) [9]_
> * Ruby (docs say "anywhere", in reality only between digits) [10]_
>
>
> Implementation
> ==============
>
> A preliminary patch that implements the specification given above has been
> posted to the issue tracker. [11]_
>

Is the implementation made easier or harder if we went with the Group 2 or 3
approaches? Are there any reasonable examples that the Group 1 approach allows
that Group 3 doesn't that people have used in other languages?

I'm +1 on the idea, but which approach I prefer is going to be partially
dependent on the difficulty of implementing (else I say Group 3 to make it
easier to explain the rules).

-Brett
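
To make the difference between the three groups concrete, here is a minimal sketch that checks integer literals against each rule with regular expressions. It is not the posted patch and says nothing about how the tokenizer would do it. The names `liberal_rule`, `between_digits_rule` and `accepted_by` are made up for illustration. It assumes a strict reading of "only between digits" (so an underscore directly after a base prefix is rejected), covers integers only, and ignores the leading-zero restriction on decimal literals.

```python
import re

# Digit classes for each base.
DEC, HEX, OCT, BIN = "[0-9]", "[0-9a-fA-F]", "[0-7]", "[01]"


def _compile(alternatives):
    """Join the alternatives into one pattern (hypothetical helper)."""
    return re.compile("|".join(f"(?:{a})" for a in alternatives))


def liberal_rule():
    """Group 1: underscores anywhere except leading, trailing, or in the prefix."""
    def prefixed(letters, d):
        # After the prefix: any mix of digits and underscores, ending in a digit.
        return rf"0[{letters}](?:{d}|_)*{d}"
    decimal = rf"{DEC}(?:(?:{DEC}|_)*{DEC})?"
    return _compile([prefixed("xX", HEX), prefixed("oO", OCT),
                     prefixed("bB", BIN), decimal])


def between_digits_rule(sep):
    """Groups 2 and 3: separators only between digits; `sep` is "_+" or "_"."""
    def run(d):
        return rf"{d}+(?:{sep}{d}+)*"
    return _compile([rf"0[xX]{run(HEX)}", rf"0[oO]{run(OCT)}",
                     rf"0[bB]{run(BIN)}", run(DEC)])


RULES = {
    "group 1": liberal_rule(),
    "group 2": between_digits_rule("_+"),  # multiple consecutive underscores
    "group 3": between_digits_rule("_"),   # only one underscore at a time
}


def accepted_by(literal):
    """Return the names of the groups whose rule accepts the whole literal."""
    return [name for name, pattern in RULES.items() if pattern.fullmatch(literal)]


if __name__ == "__main__":
    for text in ["10_000_000", "0xDEAD_BEEF", "0b_0011_1111_0100_1110",
                 "1__000", "1_000_"]:
        print(text, accepted_by(text))
```

Under these assumptions, `0xDEAD_BEEF` and the integer part of `10_000_000.0` pass all three rules, but the `0b_0011_1111_0100_1110` example from the PEP passes only the Group 1 rule, because its first underscore follows the `b` of the prefix rather than a digit. A Group 2 or 3 rule that also permits an underscore right after the prefix would accept it.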
_______________________________________________
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev