Re: [Python-Dev] PEP 467: next round

2016-07-18 Thread Nick Coghlan
On 19 July 2016 at 08:00, Ethan Furman wrote:
> On 07/18/2016 02:45 PM, Brett Cannon wrote:
>>
>> On Mon, 18 Jul 2016 at 14:35 Alexander Belopolsky wrote:
>>>
>>> On Mon, Jul 18, 2016 at 5:01 PM, Jonathan Goble wrote:
>>>>
>>>> full(), despite its use in numpy, is also unintuitive to me (my first
>>>> thought is that it would indicate whether an object has room for more
>>>> entries).
>>>>
>>>> Perhaps bytes.fillsize?
>>>
>>>
>>> I wouldn't want to see bytes.full() either.  Maybe bytes.of_size()?
>>
>>
>> Or bytes.fromsize() to stay with the trend of naming constructor methods
>>  as from*() ?
>
>
> bytes.fromsize() sounds good to me, thanks for brainstorming that one for
> me.  I wasn't really happy with 'size()' either.

Heh, I should have finished reading the thread before replying - this
and one of my other comments were already picked up :)

Cheers,
Nick.

-- 
Nick Coghlan   |   ncogh...@gmail.com   |   Brisbane, Australia
___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] PEP 467: next round

2016-07-18 Thread Nick Coghlan
(Thanks for moving this forward, Ethan!)

On 19 July 2016 at 06:17, Ethan Furman wrote:
> * Add ``bytes.getbyte`` and ``bytearray.getbyte`` byte retrieval methods
> * Add ``bytes.iterbytes``, ``bytearray.iterbytes`` and
>   ``memoryview.iterbytes`` alternative iterators

As a possible alternative to this aspect, what if we adjusted
memoryview.cast() to also support the "s" format code from the struct
module?

At the moment, trying to use "s" gives a value error:

  >>> bview = memoryview(data).cast("s")
  Traceback (most recent call last):
    File "<stdin>", line 1, in <module>
  ValueError: memoryview: destination format must be a native single character format prefixed with an optional '@'

However, it could be supported by always interpreting it as equivalent
to "1s", such that the view produced length 1 bytes objects on
indexing and iteration, rather than integers (which is what it does
given the default "b" format).

Given "memoryview(data).cast('s')" as a basic building block, most of
the other aspects of working with bytes objects as if they were Python
2 strings should become relatively straightforward, so the question
would be whether we wanted to make it easy for people to avoid
constructing the mediating memoryview object.
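
As a rough illustration of the behaviour being discussed (length 1 bytes
objects on indexing and iteration, rather than integers), the same result can
be approximated in current Python by slicing. This is only a sketch:
iter_single_bytes is a hypothetical helper name, not an existing or proposed
API, and it does not implement the "s" format code for memoryview.cast().

```python
def iter_single_bytes(data):
    """Yield each element of a bytes-like object as a length 1 bytes object.

    Hypothetical helper for illustration only. It relies on slicing,
    which returns a sequence rather than an integer.
    """
    view = memoryview(data)
    for i in range(len(view)):
        # view[i] would be an int; view[i:i + 1] is a one-element view.
        yield bytes(view[i:i + 1])


data = b"abc"
print(list(data))                     # [97, 98, 99]
print(list(iter_single_bytes(data)))  # [b'a', b'b', b'c']
```

Going through a memoryview keeps the slices cheap for large inputs, at the
cost of constructing the mediating object mentioned above.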

> Proposals
> =========
>
> Deprecation of current "zero-initialised sequence" behaviour without removal
> ----------------------------------------------------------------------------
>
> Currently, the ``bytes`` and ``bytearray`` constructors accept an integer
> argument and interpret it as meaning to create a zero-initialised sequence
> of the given size::
>
>     >>> bytes(3)
>     b'\x00\x00\x00'
>     >>> bytearray(3)
>     bytearray(b'\x00\x00\x00')
>
> This PEP proposes to deprecate that behaviour in Python 3.6, but to leave
> it in place for at least as long as Python 2.7 is supported, possibly
> indefinitely.

I'd suggest being more explicit that this would just be a documented
deprecation, rather than a programmatic deprecation warning.

> Addition of explicit "count and byte initialised sequence" constructors
> -----------------------------------------------------------------------
>
> To replace the deprecated behaviour, this PEP proposes the addition of an
> explicit ``size`` alternative constructor as a class method on both
> ``bytes`` and ``bytearray`` whose first argument is the count, and whose
> second argument is the fill byte to use (defaults to ``\x00``)::
>
>     >>> bytes.size(3)
>     b'\x00\x00\x00'
>     >>> bytearray.size(3)
>     bytearray(b'\x00\x00\x00')
>     >>> bytes.size(5, b'\x0a')
>     b'\x0a\x0a\x0a\x0a\x0a'
>     >>> bytearray.size(5, b'\x0a')
>     bytearray(b'\x0a\x0a\x0a\x0a\x0a')

While I like the notion of having "size" in the name, the
"noun-as-constructor" phrasing doesn't read right to me. Perhaps
"fromsize" for consistency with "fromhex"?
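
To make the suggested spelling concrete, here is a minimal sketch of how a
"fromsize" constructor could behave, written as a plain function because the
class method does not exist. The function name, the cls parameter and the
single-byte check on the fill value are assumptions for illustration, not
part of the PEP text.

```python
def fromsize(cls, size, fill=b"\x00"):
    """Return cls(...) holding `size` copies of the single byte `fill`.

    Hypothetical stand-in for the proposed class method; `cls` is
    expected to be bytes or bytearray.
    """
    if len(fill) != 1:
        # Assumption: the fill value must be exactly one byte.
        raise ValueError("fill must be a single byte")
    return cls(fill * size)


print(fromsize(bytes, 3))                # b'\x00\x00\x00'
print(fromsize(bytearray, 5, b"\x0a"))   # Python shows \x0a as \n in the repr
```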

> It will behave just as the current constructors behave when passed a single
> integer.

This last paragraph feels incomplete now, given the expansion to allow
the fill value to be specified.

> Addition of "bchr" function and explicit "single byte" constructors
> -------------------------------------------------------------------
>
> As binary counterparts to the text ``chr`` function, this PEP proposes
> the addition of a ``bchr`` function and an explicit ``fromint`` alternative
> constructor as a class method on both ``bytes`` and ``bytearray``::
>
>     >>> bchr(ord("A"))
>     b'A'
>     >>> bchr(ord(b"A"))
>     b'A'
>     >>> bytes.fromint(65)
>     b'A'
>     >>> bytearray.fromint(65)
>     bytearray(b'A')

Since "fromsize" would also accept an int value, "fromint" feels
ambiguous here. Perhaps "fromord" to emphasise the integer is being
interpreted as an ordinal bytes value, rather than as a size?

The apparent "two ways to do it" here also deserves some additional explanation:

- the bchr builtin is to recreate the ord/chr/unichr trio from Python
2 under a different naming scheme
- the class method is mainly for the "bytearray.fromord" case, with
bytes.fromord added for consistency
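
The two spellings above can be sketched in current Python as follows. Both
functions are hypothetical stand-ins (neither bchr nor fromord exists), and
they simply wrap the existing bytes([x]) and bytearray([x]) calls, which
already reject values outside range(0, 256).

```python
def bchr(i):
    """Sketch of the proposed builtin: integer ordinal -> length 1 bytes."""
    return bytes([i])


def bytearray_fromord(i):
    """Sketch of the proposed bytearray.fromord class method."""
    return bytearray([i])


print(bchr(ord("A")))         # b'A'
print(bchr(ord(b"A")))        # b'A'
print(bytearray_fromord(65))  # bytearray(b'A')
```

The builtin mirrors chr for text, while the class method covers the mutable
case, where there is no literal syntax to fall back on.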

[snip sections on accessing elements as bytes object]

> Design discussion
> =================
>
> Why not rely on sequence repetition to create zero-initialised sequences?
> -------------------------------------------------------------------------
>
> Zero-initialised sequences can be created via sequence repetition::
>
>     >>> b'\x00' * 3
>     b'\x00\x00\x00'
>     >>> bytearray(b'\x00') * 3
>     bytearray(b'\x00\x00\x00')
>
> However, this was also the case when the ``bytearray`` type was originally
> designed, and the decision was made to add explicit support for it in the
> type constructor. The immutable ``bytes`` type then inherited that feature
> when it was introduced in PEP 3137.
>
> This PEP isn't revisiting that original design decision, just changing the
> spelling as users sometimes find the current behaviour of the binary sequence
> constructors surprising. In particular, 

Re: [Python-Dev] PEP 467: next round

2016-07-18 Thread Brett Cannon
On Mon, 18 Jul 2016 at 15:49 Random832 wrote:

> On Mon, Jul 18, 2016, at 17:34, Alexander Belopolsky wrote:
> > On Mon, Jul 18, 2016 at 5:01 PM, Jonathan Goble wrote:
> >
> > > full(), despite its use in numpy, is also unintuitive to me (my first
> > > thought is that it would indicate whether an object has room for more
> > > entries).
> > >
> > > Perhaps bytes.fillsize?
> >
> > I wouldn't want to see bytes.full() either.  Maybe bytes.of_size()?
>
> What's wrong with b'\0'*42?
>

It's mentioned in the PEP as to why.
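
For readers without the PEP to hand, the surprise it describes comes from the
constructor treating a bare integer as a length, while the same integer
wrapped in a list is treated as a byte value. A short sketch of the existing
behaviour in current Python (the variable names are illustrative only):

```python
size_or_value = 3

zero_filled = bytes(size_or_value)       # integer: treated as a length
single_byte = bytes([size_or_value])     # list: treated as byte values
by_repetition = b"\x00" * size_or_value  # the repetition spelling

print(zero_filled)    # b'\x00\x00\x00'
print(single_byte)    # b'\x03'
print(by_repetition)  # b'\x00\x00\x00'
```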


Re: [Python-Dev] PEP 467: next round

2016-07-18 Thread Random832
On Mon, Jul 18, 2016, at 17:34, Alexander Belopolsky wrote:
> On Mon, Jul 18, 2016 at 5:01 PM, Jonathan Goble wrote:
> 
> > full(), despite its use in numpy, is also unintuitive to me (my first
> > thought is that it would indicate whether an object has room for more
> > entries).
> >
> > Perhaps bytes.fillsize?
> 
> I wouldn't want to see bytes.full() either.  Maybe bytes.of_size()?

What's wrong with b'\0'*42?


Re: [Python-Dev] PEP 467: next round

2016-07-18 Thread Alexander Belopolsky
On Mon, Jul 18, 2016 at 6:00 PM, Ethan Furman wrote:

> bytes.fromsize() sounds good to me, thanks for brainstorming that one for
> me.
>

+1


Re: [Python-Dev] PEP 467: next round

2016-07-18 Thread Ethan Furman

On 07/18/2016 02:45 PM, Brett Cannon wrote:
> On Mon, 18 Jul 2016 at 14:35 Alexander Belopolsky wrote:
>> On Mon, Jul 18, 2016 at 5:01 PM, Jonathan Goble wrote:
>>>
>>> full(), despite its use in numpy, is also unintuitive to me (my first
>>> thought is that it would indicate whether an object has room for more
>>> entries).
>>>
>>> Perhaps bytes.fillsize?
>>
>> I wouldn't want to see bytes.full() either.  Maybe bytes.of_size()?
>
> Or bytes.fromsize() to stay with the trend of naming constructor methods
> as from*() ?

bytes.fromsize() sounds good to me, thanks for brainstorming that one for
me.  I wasn't really happy with 'size()' either.

--
~Ethan~


Re: [Python-Dev] PEP 467: next round

2016-07-18 Thread Ethan Furman

On 07/18/2016 02:01 PM, Jonathan Goble wrote:

>> This PEP isn't revisiting that original design decision, just changing the
>> spelling as users sometimes find the current behaviour of the binary sequence
>> constructors surprising. In particular, there's a reasonable case to be made
>> that ``bytes(x)`` (where ``x`` is an integer) should behave like the
>> ``bytes.byte(x)`` proposal in this PEP. Providing both behaviours as separate
>> class methods avoids that ambiguity.
>
> You have a leftover bytes.byte here.

Thanks, fixed (plus the other couple locations ;)

--
~Ethan~


Re: [Python-Dev] PEP 467: next round

2016-07-18 Thread Brett Cannon
On Mon, 18 Jul 2016 at 14:35 Alexander Belopolsky <
alexander.belopol...@gmail.com> wrote:

>
> On Mon, Jul 18, 2016 at 5:01 PM, Jonathan Goble wrote:
>
>> full(), despite its use in numpy, is also unintuitive to me (my first
>> thought is that it would indicate whether an object has room for more
>> entries).
>>
>> Perhaps bytes.fillsize?
>>
>
> I wouldn't want to see bytes.full() either.  Maybe bytes.of_size()?
>

Or bytes.fromsize() to stay with the trend of naming constructor methods as
from*() ?


Re: [Python-Dev] PEP 467: next round

2016-07-18 Thread Alexander Belopolsky
On Mon, Jul 18, 2016 at 5:01 PM, Jonathan Goble wrote:

> full(), despite its use in numpy, is also unintuitive to me (my first
> thought is that it would indicate whether an object has room for more
> entries).
>
> Perhaps bytes.fillsize?
>

I wouldn't want to see bytes.full() either.  Maybe bytes.of_size()?


Re: [Python-Dev] PEP 467: next round

2016-07-18 Thread Jonathan Goble
*de-lurks*

On Mon, Jul 18, 2016 at 4:45 PM, Alexander Belopolsky wrote:
> On Mon, Jul 18, 2016 at 4:17 PM, Ethan Furman wrote:
>>
>> - 'bytes.zeros' renamed to 'bytes.size', with optional byte filler
>>   (defaults to b'\x00')
>
>
> Seriously?  You went from a numpy-friendly feature to something rather
> numpy-hostile.
> In numpy, ndarray.size is an attribute that returns the number of elements
> in the array.
>
> The constructor that creates an arbitrary repeated value also exists and is
> called numpy.full().
>
> Even ignoring numpy, bytes.size(count, value=b'\x00') is completely
> unintuitive.  If I see bytes.size(42) in someone's code, I will think:
> "something like int.bit_length(), but in bytes."

full(), despite its use in numpy, is also unintuitive to me (my first
thought is that it would indicate whether an object has room for more
entries).

Perhaps bytes.fillsize? That would seem the most intuitive to me:
"fill an object of this size with this byte". I'm unfamiliar with
numpy, but a quick Google search suggests that this would not conflict
with anything there, if that is a concern.

> This PEP isn't revisiting that original design decision, just changing the
> spelling as users sometimes find the current behaviour of the binary sequence
> constructors surprising. In particular, there's a reasonable case to be made
> that ``bytes(x)`` (where ``x`` is an integer) should behave like the
> ``bytes.byte(x)`` proposal in this PEP. Providing both behaviours as separate
> class methods avoids that ambiguity.

You have a leftover bytes.byte here.


Re: [Python-Dev] PEP 467: next round

2016-07-18 Thread Alexander Belopolsky
On Mon, Jul 18, 2016 at 4:17 PM, Ethan Furman wrote:

> - 'bytes.zeros' renamed to 'bytes.size', with optional byte filler
>   (defaults to b'\x00')
>

Seriously?  You went from a numpy-friendly feature to something rather
numpy-hostile.
In numpy, ndarray.size is an attribute that returns the number of elements
in the array.

The constructor that creates an arbitrary repeated value also exists and is
called numpy.full().

Even ignoring numpy, bytes.size(count, value=b'\x00') is completely
unintuitive.  If I see bytes.size(42) in someone's code, I will think:
"something like int.bit_length(), but in bytes."


[Python-Dev] PEP 467: next round

2016-07-18 Thread Ethan Furman

Taking into consideration the comments from the last round:

- 'bytes.zeros' renamed to 'bytes.size', with optional byte filler
  (defaults to b'\x00')
- 'bytes.byte' renamed to 'fromint', add 'bchr' function
- deprecation and removal softened to deprecation/discouragement

---

PEP: 467
Title: Minor API improvements for binary sequences
Version: $Revision$
Last-Modified: $Date$
Author: Nick Coghlan, Ethan Furman
Status: Draft
Type: Standards Track
Content-Type: text/x-rst
Created: 2014-03-30
Python-Version: 3.6
Post-History: 2014-03-30 2014-08-15 2014-08-16 2016-06-07


Abstract
========

During the initial development of the Python 3 language specification, the
core ``bytes`` type for arbitrary binary data started as the mutable type
that is now referred to as ``bytearray``. Other aspects of operating in
the binary domain in Python have also evolved over the course of the Python
3 series.

This PEP proposes five small adjustments to the APIs of the ``bytes``,
``bytearray`` and ``memoryview`` types to make it easier to operate entirely
in the binary domain:

* Deprecate passing single integer values to ``bytes`` and ``bytearray``
* Add ``bytes.size`` and ``bytearray.size`` alternative constructors
* Add ``bytes.fromint`` and ``bytearray.fromint`` alternative constructors
* Add ``bytes.getbyte`` and ``bytearray.getbyte`` byte retrieval methods
* Add ``bytes.iterbytes``, ``bytearray.iterbytes`` and
  ``memoryview.iterbytes`` alternative iterators


Proposals
=========

Deprecation of current "zero-initialised sequence" behaviour without removal
----------------------------------------------------------------------------

Currently, the ``bytes`` and ``bytearray`` constructors accept an integer
argument and interpret it as meaning to create a zero-initialised sequence
of the given size::

    >>> bytes(3)
    b'\x00\x00\x00'
    >>> bytearray(3)
    bytearray(b'\x00\x00\x00')

This PEP proposes to deprecate that behaviour in Python 3.6, but to leave
it in place for at least as long as Python 2.7 is supported, possibly
indefinitely.

No other changes are proposed to the existing constructors.


Addition of explicit "count and byte initialised sequence" constructors
-----------------------------------------------------------------------

To replace the deprecated behaviour, this PEP proposes the addition of an
explicit ``size`` alternative constructor as a class method on both
``bytes`` and ``bytearray`` whose first argument is the count, and whose
second argument is the fill byte to use (defaults to ``\x00``)::

    >>> bytes.size(3)
    b'\x00\x00\x00'
    >>> bytearray.size(3)
    bytearray(b'\x00\x00\x00')
    >>> bytes.size(5, b'\x0a')
    b'\x0a\x0a\x0a\x0a\x0a'
    >>> bytearray.size(5, b'\x0a')
    bytearray(b'\x0a\x0a\x0a\x0a\x0a')

It will behave just as the current constructors behave when passed a single
integer.


Addition of "bchr" function and explicit "single byte" constructors
-------------------------------------------------------------------

As binary counterparts to the text ``chr`` function, this PEP proposes
the addition of a ``bchr`` function and an explicit ``fromint`` alternative
constructor as a class method on both ``bytes`` and ``bytearray``::

    >>> bchr(ord("A"))
    b'A'
    >>> bchr(ord(b"A"))
    b'A'
    >>> bytes.fromint(65)
    b'A'
    >>> bytearray.fromint(65)
    bytearray(b'A')

These methods will only accept integers in the range 0 to 255 (inclusive)::

    >>> bytes.fromint(512)
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    ValueError: integer must be in range(0, 256)

    >>> bytes.fromint(1.0)
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    TypeError: 'float' object cannot be interpreted as an integer

The documentation of the ``ord`` builtin will be updated to explicitly note
that ``bchr`` is the primary inverse operation for binary data, while ``chr``
is the inverse operation for text data, and that ``bytes.fromint`` and
``bytearray.fromint`` also exist.

Behaviourally, ``bytes.fromint(x)`` will be equivalent to the current
``bytes([x])`` (and similarly for ``bytearray``). The new spelling is
expected to be easier to discover and easier to read (especially when used
in conjunction with indexing operations on binary sequence types).

As a separate method, the new spelling will also work better with higher
order functions like ``map``.
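
The point about higher order functions can be seen by comparing the two
spellings in current Python. In this sketch, fromint is a hypothetical
module-level stand-in for the proposed ``bytes.fromint``; it is not an
existing API.

```python
def fromint(value):
    """Hypothetical stand-in for the proposed bytes.fromint class method."""
    return bytes([value])


data = b"abc"

# Today's spelling needs a lambda to wrap each integer in a list.
with_lambda = list(map(lambda x: bytes([x]), data))

# A named callable can be passed to map directly.
with_named = list(map(fromint, data))

print(with_lambda)  # [b'a', b'b', b'c']
print(with_named)   # [b'a', b'b', b'c']
```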


Addition of "getbyte" method to retrieve a single byte
------------------------------------------------------

This PEP proposes that ``bytes`` and ``bytearray`` gain the method ``getbyte``
which will always return ``bytes``::

    >>> b'abc'.getbyte(0)
    b'a'

If an index is asked for that doesn't exist, ``IndexError`` is raised::

    >>> b'abc'.getbyte(9)
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    IndexError: index out of range
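
A minimal sketch of the described behaviour, written as a free function
because the method does not exist yet. The name getbyte and its (data, index)
signature are assumptions for illustration. Note that plain slicing
(``data[i:i+1]``) would silently return an empty result for a missing index,
so the sketch indexes first to get the ``IndexError``.

```python
def getbyte(data, index):
    """Return the element at `index` as a length 1 bytes object.

    Hypothetical stand-in for the proposed method; `data` is expected
    to be a bytes or bytearray instance.
    """
    value = data[index]    # raises IndexError if the index does not exist
    return bytes([value])  # wrap the integer back into a bytes object


print(getbyte(b"abc", 0))             # b'a'
print(getbyte(bytearray(b"abc"), 2))  # b'c'
```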


Addition of optimised iterator methods that produce ``bytes`` objects