[Python-Dev] PEP 515: Underscores in Numeric Literals (revision 3)

2016-02-13 Thread Georg Brandl
Hi all,

after talking to Guido and Serhiy we present the next revision
of this PEP.  It is a compromise that we are all happy with,
and a relatively restricted rule that makes additions to PEP 8
basically unnecessary.

I think the discussion has shown that supporting underscores in
the from-string constructors is valuable, therefore this is now
added to the specification section.

The remaining open question is about the reverse direction: do
we want a string formatting modifier that adds underscores as
thousands separators?

cheers,
Georg

-

PEP: 515
Title: Underscores in Numeric Literals
Version: $Revision$
Last-Modified: $Date$
Author: Georg Brandl, Serhiy Storchaka
Status: Draft
Type: Standards Track
Content-Type: text/x-rst
Created: 10-Feb-2016
Python-Version: 3.6
Post-History: 10-Feb-2016, 11-Feb-2016

Abstract and Rationale
======================

This PEP proposes to extend Python's syntax and number-from-string
constructors so that underscores can be used as visual separators for
digit grouping purposes in integral, floating-point and complex number
literals.

This is a common feature of other modern languages, and can aid
readability of long literals, or literals whose value should clearly
separate into parts, such as bytes or words in hexadecimal notation.

Examples::

    # grouping decimal numbers by thousands
    amount = 10_000_000.0

    # grouping hexadecimal addresses by words
    addr = 0xDEAD_BEEF

    # grouping bits into nibbles in a binary literal
    flags = 0b_0011_1111_0100_1110

    # same, for string conversions
    flags = int('0b_1111_0000', 2)


Specification
=============

The current proposal is to allow one underscore between digits, and
after base specifiers in numeric literals.  The underscores have no
semantic meaning, and literals are parsed as if the underscores were
absent.

Literal Grammar
---------------

The production list for integer literals would therefore look like
this::

   integer: decinteger | bininteger | octinteger | hexinteger
   decinteger: nonzerodigit (["_"] digit)* | "0" (["_"] "0")*
   bininteger: "0" ("b" | "B") (["_"] bindigit)+
   octinteger: "0" ("o" | "O") (["_"] octdigit)+
   hexinteger: "0" ("x" | "X") (["_"] hexdigit)+
   nonzerodigit: "1"..."9"
   digit: "0"..."9"
   bindigit: "0" | "1"
   octdigit: "0"..."7"
   hexdigit: digit | "a"..."f" | "A"..."F"

For floating-point and complex literals::

   floatnumber: pointfloat | exponentfloat
   pointfloat: [digitpart] fraction | digitpart "."
   exponentfloat: (digitpart | pointfloat) exponent
   digitpart: digit (["_"] digit)*
   fraction: "." digitpart
   exponent: ("e" | "E") ["+" | "-"] digitpart
   imagnumber: (floatnumber | digitpart) ("j" | "J")
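
As a rough illustration of the productions above, the grammar can be transcribed into regular expressions. This is a sketch only, not the tokenizer implementation; the names ``is_integer_literal`` and ``is_float_literal`` are hypothetical helpers introduced here for the example.

```python
import re

# Integer productions: at most one underscore before each digit after the
# first, and at most one underscore directly after a base specifier.
DECINTEGER = r"(?:[1-9](?:_?[0-9])*|0(?:_?0)*)"
BININTEGER = r"0[bB](?:_?[01])+"
OCTINTEGER = r"0[oO](?:_?[0-7])+"
HEXINTEGER = r"0[xX](?:_?[0-9a-fA-F])+"
INTEGER = re.compile(
    rf"(?:{DECINTEGER}|{BININTEGER}|{OCTINTEGER}|{HEXINTEGER})"
)

# Floating-point productions, built from the same digitpart rule.
DIGITPART = r"[0-9](?:_?[0-9])*"
FRACTION = rf"\.{DIGITPART}"
EXPONENT = rf"[eE][+-]?{DIGITPART}"
POINTFLOAT = rf"(?:(?:{DIGITPART})?{FRACTION}|{DIGITPART}\.)"
EXPONENTFLOAT = rf"(?:{DIGITPART}|{POINTFLOAT}){EXPONENT}"
FLOATNUMBER = re.compile(rf"(?:{EXPONENTFLOAT}|{POINTFLOAT})")


def is_integer_literal(text):
    """Return True if text matches the integer production in full."""
    return INTEGER.fullmatch(text) is not None


def is_float_literal(text):
    """Return True if text matches the floatnumber production in full."""
    return FLOATNUMBER.fullmatch(text) is not None
```

Under this transcription ``0xDEAD_BEEF`` and ``0b_0011_1111_0100_1110`` are accepted, while ``1__000``, ``1000_`` and ``1e_5`` are rejected.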

Constructors
------------

Following the same rules for placement, underscores will be allowed in
the following constructors:

- ``int()`` (with any base)
- ``float()``
- ``complex()``
- ``Decimal()``
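
For illustration, the constructor behaviour can be checked on an interpreter that implements this proposal (CPython 3.6 and later). The helper name ``parses`` is hypothetical and exists only for this sketch.

```python
from decimal import Decimal


def parses(constructor, text):
    """Return True if constructor(text) succeeds, False on ValueError."""
    try:
        constructor(text)
    except ValueError:
        return False
    return True


# Underscores are ignored when computing the value.
values = {
    "int": int("10_000_000"),
    "int_base2": int("0b_1111_0000", 2),
    "int_base16": int("DEAD_BEEF", 16),
    "float": float("1_000.5"),
    "complex": complex("1_000+2j"),
    "decimal": Decimal("1_000.5"),
}

# Placements outside the rule are rejected by int() and float().
rejected = ["1__000", "1000_", "_1000"]
```

Here ``values["int_base2"]`` is 240, and every string in ``rejected`` makes both ``int()`` and ``float()`` raise ``ValueError``.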


Prior Art
=========

Those languages that do allow underscore grouping implement a large
variety of rules for allowed placement of underscores.  In cases where
the language spec contradicts the actual behavior, the actual behavior
is listed.  ("single" or "multiple" refer to allowing runs of
consecutive underscores.)

* Ada: single, only between digits [8]_
* C# (open proposal for 7.0): multiple, only between digits [6]_
* C++14: single, between digits (different separator chosen) [1]_
* D: multiple, anywhere, including trailing [2]_
* Java: multiple, only between digits [7]_
* Julia: single, only between digits (but not in float exponent parts)
  [9]_
* Perl 5: multiple, basically anywhere, although docs say it's
  restricted to one underscore between digits [3]_
* Ruby: single, only between digits (although docs say "anywhere")
  [10]_
* Rust: multiple, anywhere, except for between exponent "e" and digits
  [4]_
* Swift: multiple, between digits and trailing (although textual
  description says only "between digits") [5]_


Alternative Syntax
==================

Underscore Placement Rules
--------------------------

Instead of the relatively strict rule specified above, the use of
underscores could be limited.  As we seen from other languages, common
rules include:

* Only one consecutive underscore allowed, and only between digits.
* Multiple consecutive underscores allowed, but only between digits.
* Multiple consecutive underscores allowed, in most positions except
  for the start of the literal, or special positions like after a
  decimal point.

The syntax in this PEP has ultimately been selected because it covers
the common use cases, and does not allow for syntax that would have to
be discouraged in style guides anyway.

A less common rule would be to allow underscores only every N digits
(where N could be 3 for decimal literals, or 4 for hexadecimal ones).
This is unnecessarily restrictive, especially considering the
separator placement is different in different cultures

Re: [Python-Dev] PEP 515: Underscores in Numeric Literals (revision 3)

2016-02-13 Thread Serhiy Storchaka

On 13.02.16 10:48, Georg Brandl wrote:

> Following the same rules for placement, underscores will be allowed in
> the following constructors:
>
> - ``int()`` (with any base)
> - ``float()``
> - ``complex()``
> - ``Decimal()``


What about float.fromhex()? Should underscores be allowed in it (I think 
no)?



___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] PEP 515: Underscores in Numeric Literals (revision 3)

2016-02-13 Thread Georg Brandl
On 02/13/2016 12:10 PM, Serhiy Storchaka wrote:
> On 13.02.16 10:48, Georg Brandl wrote:
>> Following the same rules for placement, underscores will be allowed in
>> the following constructors:
>>
>> - ``int()`` (with any base)
>> - ``float()``
>> - ``complex()``
>> - ``Decimal()``
> 
> What about float.fromhex()? Should underscores be allowed in it (I think 
> no)?

Good question.  It *does* accept a "0x" prefix, as does ``int(x, 16)``, so
there is some precedent for literal-like interpretation of the input here
as well.

Georg




Re: [Python-Dev] PEP 515: Underscores in Numeric Literals (revision 3)

2016-02-13 Thread Steven D'Aprano
On Sat, Feb 13, 2016 at 09:48:49AM +0100, Georg Brandl wrote:
> Hi all,
> 
> after talking to Guido and Serhiy we present the next revision
> of this PEP.  It is a compromise that we are all happy with,
> and a relatively restricted rule that makes additions to PEP 8
> basically unnecessary.
> 
> I think the discussion has shown that supporting underscores in
> the from-string constructors is valuable, therefore this is now
> added to the specification section.

What about Fraction? Currently this is legal:

py> Fraction("1/100")
Fraction(1, 100)


I think the PEP should also support underscores in Fractions:

Fraction("1/1_000_000")


> The remaining open question is about the reverse direction: do
> we want a string formatting modifier that adds underscores as
> thousands separators?

Yes please.


> Open Proposals
> ==============
> 
> It has been proposed [11]_ to extend the number-to-string formatting
> language to allow ``_`` as a thousans separator, where currently only
> ``,`` is supported.  This could be used to easily generate code with
> more readable literals.

/s/thousans/thousands/



-- 
Steve


Re: [Python-Dev] PEP 515: Underscores in Numeric Literals (revision 3)

2016-02-13 Thread Glenn Linderman

On 2/13/2016 12:48 AM, Georg Brandl wrote:

> Instead of the relatively strict rule specified above, the use of
> underscores could be limited.

This sentence doesn't really make sense.

Either s/limited/more limited/
or s/limited/further limited/
or s/limited/relaxed/

Maybe the whole section should be reworded.


Re: [Python-Dev] PEP 515: Underscores in Numeric Literals (revision 3)

2016-02-13 Thread Ethan Furman

On 02/13/2016 12:48 AM, Georg Brandl wrote:


> The remaining open question is about the reverse direction: do
> we want a string formatting modifier that adds underscores as
> thousands separators?


+0  Would be nice, but also wouldn't make much sense in other groupings.
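
For reference, the format specification that later shipped in Python 3.6 accepts ``_`` as a grouping option. The sketch below assumes such an interpreter; the function names are hypothetical. The option groups decimal output by thousands, and output in the ``b``, ``o`` and ``x`` presentation types by four digits.

```python
def group_decimal(n):
    """Format an integer with underscores every three digits."""
    return format(n, "_d")


def group_hex(n):
    """Format an integer in hexadecimal with underscores every four digits."""
    return format(n, "_x")


def group_binary(n):
    """Format an integer in binary with underscores every four digits."""
    return format(n, "_b")
```

For example, ``group_decimal(10000000)`` gives ``'10_000_000'`` and ``group_hex(0xDEADBEEF)`` gives ``'dead_beef'``. Both strings are accepted again by ``int()`` with the matching base.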



> Instead of the relatively strict rule specified above, the use of
> underscores could be limited.  As we seen from other languages, common
> rules include:


s/seen/see  or  s/we//

--
~Ethan~


Re: [Python-Dev] PEP 515: Underscores in Numeric Literals (revision 3)

2016-02-13 Thread Brett Cannon
On Sat, Feb 13, 2016, 00:49 Georg Brandl wrote:

> Hi all,
>
> after talking to Guido and Serhiy we present the next revision
> of this PEP.  It is a compromise that we are all happy with,
> and a relatively restricted rule that makes additions to PEP 8
> basically unnecessary.
>

+1 from me.


> I think the discussion has shown that supporting underscores in
> the from-string constructors is valuable, therefore this is now
> added to the specification section.
>
> The remaining open question is about the reverse direction: do
> we want a string formatting modifier that adds underscores as
> thousands separators?
>

+0

Brett

