On Wed, 10 Feb 2016 at 14:21 Georg Brandl <g.bra...@gmx.net> wrote:

> This came up in python-ideas, and has met mostly positive comments,
> although the exact syntax rules are up for discussion.
>
> cheers,
> Georg
>
>
> --------------------------------------------------------------------------------
>
> PEP: 515
> Title: Underscores in Numeric Literals
> Version: $Revision$
> Last-Modified: $Date$
> Author: Georg Brandl
> Status: Draft
> Type: Standards Track
> Content-Type: text/x-rst
> Created: 10-Feb-2016
> Python-Version: 3.6
>
> Abstract and Rationale
> ======================
>
> This PEP proposes to extend Python's syntax so that underscores can be used in
> integral and floating-point number literals.
>
> This is a common feature of other modern languages, and can aid readability of
> long literals, or literals whose value should clearly separate into parts, such
> as bytes or words in hexadecimal notation.
>
> Examples::
>
>     # grouping decimal numbers by thousands
>     amount = 10_000_000.0
>
>     # grouping hexadecimal addresses by words
>     addr = 0xDEAD_BEEF
>
>     # grouping bits into bytes in a binary literal
>     flags = 0b_0011_1111_0100_1110
>

I assume all of these examples are possible in either the liberal or
restrictive approaches?


>
>
> Specification
> =============
>
> The current proposal is to allow underscores anywhere in numeric literals, with
> these exceptions:
>
> * Leading underscores cannot be allowed, since they already introduce
>   identifiers.
> * Trailing underscores are not allowed, because they look confusing and don't
>   contribute much to readability.
> * The number base prefixes ``0x``, ``0o``, and ``0b`` cannot be split up,
>   because they are fixed strings and not logically part of the number.
> * No underscore is allowed after a sign in an exponent (``1e-_5``), because
>   underscores also cannot be used after the sign in front of a number
>   (``-1e5``).
> * No underscore allowed after a decimal point, because this leads to ambiguity
>   with attribute access (the lexer cannot know that there is no number literal
>   in ``foo._5``).
>
> There appears to be no reason to restrict the use of underscores otherwise.
>
> The production list for integer literals would therefore look like this::
>
>    integer: decimalinteger | octinteger | hexinteger | bininteger
>    decimalinteger: nonzerodigit [decimalrest] | "0" [("0" | "_")* "0"]
>    nonzerodigit: "1"..."9"
>    decimalrest: (digit | "_")* digit
>    digit: "0"..."9"
>    octinteger: "0" ("o" | "O") (octdigit | "_")* octdigit
>    hexinteger: "0" ("x" | "X") (hexdigit | "_")* hexdigit
>    bininteger: "0" ("b" | "B") (bindigit | "_")* bindigit
>    octdigit: "0"..."7"
>    hexdigit: digit | "a"..."f" | "A"..."F"
>    bindigit: "0" | "1"
>
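
To make the integer productions above easier to experiment with, here is a minimal sketch that transcribes them into Python regular expressions. The names ``DRAFT_INTEGER`` and ``is_draft_integer`` are made up for illustration; this is not the tokenizer patch, just a direct reading of the draft grammar as quoted.

```python
import re

# Each pattern mirrors one production from the draft grammar quoted above.
DECIMAL = r"[1-9](?:[0-9_]*[0-9])?|0(?:[0_]*0)?"   # decimalinteger
OCTAL = r"0[oO][0-7_]*[0-7]"                       # octinteger
HEX = r"0[xX][0-9a-fA-F_]*[0-9a-fA-F]"             # hexinteger
BINARY = r"0[bB][01_]*[01]"                        # bininteger

# Hypothetical name: the combined "integer" production.
DRAFT_INTEGER = re.compile(f"(?:{DECIMAL}|{OCTAL}|{HEX}|{BINARY})")


def is_draft_integer(text):
    """Return True if the whole string matches the draft integer grammar."""
    return DRAFT_INTEGER.fullmatch(text) is not None


if __name__ == "__main__":
    samples = ["10_000_000", "0xDEAD_BEEF", "0b_0011_1111_0100_1110",
               "1__000", "100_", "_100", "0_x1F", "0x_"]
    for literal in samples:
        print(f"{literal!r:28} {is_draft_integer(literal)}")
```

The sketch only classifies strings. It does not compute values, and it says nothing about how the real lexer would separate a leading underscore from an identifier.
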
> For floating-point literals::
>
>    floatnumber: pointfloat | exponentfloat
>    pointfloat: [intpart] fraction | intpart "."
>    exponentfloat: (intpart | pointfloat) exponent
>    intpart: digit (digit | "_")*
>    fraction: "." intpart
>    exponent: ("e" | "E") "_"* ["+" | "-"] digit [decimalrest]
>
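
The floating-point productions can be transcribed the same way. Again, this is only a sketch of the grammar as quoted, with made-up names (``DRAFT_FLOAT``, ``is_draft_float``), and is not taken from the patch.

```python
import re

# Each pattern mirrors one production from the draft float grammar above.
INTPART = r"[0-9][0-9_]*"                                  # intpart
FRACTION = rf"\.{INTPART}"                                 # fraction
POINTFLOAT = rf"(?:(?:{INTPART})?{FRACTION}|{INTPART}\.)"  # pointfloat
EXPONENT = r"[eE]_*[+-]?[0-9](?:[0-9_]*[0-9])?"            # exponent
EXPONENTFLOAT = rf"(?:{INTPART}|{POINTFLOAT}){EXPONENT}"   # exponentfloat

# Hypothetical name: the combined "floatnumber" production.
DRAFT_FLOAT = re.compile(rf"(?:{POINTFLOAT}|{EXPONENTFLOAT})")


def is_draft_float(text):
    """Return True if the whole string matches the draft float grammar."""
    return DRAFT_FLOAT.fullmatch(text) is not None


if __name__ == "__main__":
    for literal in ["10_000_000.0", "1e_5", "1_000.5e1_0",
                    "1e-_5", "1._5", "1.5_"]:
        print(f"{literal!r:16} {is_draft_float(literal)}")
```

One thing this transcription surfaces: ``intpart`` as written ends in ``(digit | "_")*``, so a string such as ``1.5_`` matches, which appears to be at odds with the "no trailing underscores" rule in the prose.
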
>
> Alternative Syntax
> ==================
>
> Underscore Placement Rules
> --------------------------
>
> Instead of the liberal rule specified above, the use of underscores could be
> limited.  Common rules are (see the "other languages" section):
>
> * Only single underscores allowed (no consecutive underscores), and only
>   between digits.
> * Multiple consecutive underscores allowed, but only between digits.
>
> Different Separators
> --------------------
>
> A proposed alternate syntax was to use whitespace for grouping.  Although
> strings are a precedent for combining adjoining literals, the behavior can lead
> to unexpected effects which are not possible with underscores.  Also, no other
> language is known to use this rule, except for languages that generally
> disregard any whitespace.
>
> C++14 introduces apostrophes for grouping, which is not considered due to the
> conflict with Python's string literals. [1]_
>
>
> Behavior in Other Languages
> ===========================
>
> Those languages that do allow underscore grouping implement a large variety of
> rules for allowed placement of underscores.  This is a listing placing the known
> rules into three major groups.  In cases where the language spec contradicts the
> actual behavior, the actual behavior is listed.
>
> **Group 1: liberal (like this PEP)**
>
> * D [2]_
> * Perl 5 (although docs say it's more restricted) [3]_
> * Rust [4]_
> * Swift (although textual description says "between digits") [5]_
>
> **Group 2: only between digits, multiple consecutive underscores**
>
> * C# (open proposal for 7.0) [6]_
> * Java [7]_
>
> **Group 3: only between digits, only one underscore**
>
> * Ada [8]_
> * Julia (but not in the exponent part of floats) [9]_
> * Ruby (docs say "anywhere", in reality only between digits) [10]_
>
>
> Implementation
> ==============
>
> A preliminary patch that implements the specification given above has been
> posted to the issue tracker. [11]_
>

Would the implementation be easier or harder if we went with the Group 2 or
Group 3 approach? Are there any reasonable examples, allowed by Group 1 but
not by Group 3, that people have actually used in other languages?

I'm +1 on the idea, but which approach I prefer will partly depend on how
difficult each one is to implement (otherwise I'd say Group 3, since its rules
are the easiest to explain).

-Brett