Re: [Python-Dev] PEP 467: Minor API improvements for bytes & bytearray

2014-08-18 Thread Barry Warsaw
On Aug 17, 2014, at 09:39 PM, Antoine Pitrou wrote:

>> need for a special case for a single byte.  We already have a perfectly
>> good spelling:
>> NUL = bytes([0])
>
>That is actually a very cumbersome spelling. Why should I first create a
>one-element list in order to create a one-byte bytes object?

I feel the same way every time I have to write `set(['foo'])`.

-Barry
___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] PEP 467: Minor API improvements for bytes & bytearray

2014-08-17 Thread Devin Jeanpierre
On Sun, Aug 17, 2014 at 7:14 PM, Alex Gaynor  wrote:
> I've hit basically every problem everyone here has stated, and in no uncertain
> terms am I completely opposed to deprecating anything. The Python 2 to 3
> migration is already hard enough, and already proceeding far too slowly for
> many of our tastes. Making that migration even more complex would drive me to
> the point of giving up.

Could you elaborate on what problems you think this will cause for you?

It seems to me that avoiding a bug-prone API is not particularly
complex, and moving it back to its 2.x semantics or making it not work
at all, rather than making it work differently, would make porting
applications easier. If, during porting to 3.x, you find a deprecation
warning for bytes(n), then rather than being an annoying extra change
that just churns code, this is actually a bug that's been identified.
So it's helpful even during the deprecation period.

-- Devin


Re: [Python-Dev] PEP 467: Minor API improvements for bytes & bytearray

2014-08-17 Thread Alex Gaynor
Donald Stufft writes:

>
> For the record I’ve had all of the problems that Nick states and I’m
> +1 on this change.
> 
> 
> ---
> Donald Stufft
> PGP: 7C6B 7C5D 5E2B 6356 A926 F04F 6E3C BCE9 3372 DCFA
> 

I've hit basically every problem everyone here has stated, and in no uncertain
terms am I completely opposed to deprecating anything. The Python 2 to 3
migration is already hard enough, and already proceeding far too slowly for
many of our tastes. Making that migration even more complex would drive me to
the point of giving up.

Alex



Re: [Python-Dev] PEP 467: Minor API improvements for bytes & bytearray

2014-08-17 Thread Ian Cordasco
On Sun, Aug 17, 2014 at 8:52 PM, Ethan Furman  wrote:
> On 08/17/2014 04:08 PM, Nick Coghlan wrote:
>>
>>
>> I'm fine with postponing the deprecation elements indefinitely (or just
>> deprecating bytes(int) and leaving
>> bytearray(int) alone).
>
>
> +1 on both pieces.

Perhaps postpone the deprecation to Python 4000 ;)


Re: [Python-Dev] PEP 467: Minor API improvements for bytes & bytearray

2014-08-17 Thread Ethan Furman

On 08/17/2014 04:08 PM, Nick Coghlan wrote:

> I'm fine with postponing the deprecation elements indefinitely (or just
> deprecating bytes(int) and leaving bytearray(int) alone).

+1 on both pieces.

--
~Ethan~


Re: [Python-Dev] PEP 467: Minor API improvements for bytes & bytearray

2014-08-17 Thread Ethan Furman

On 08/17/2014 02:19 PM, Raymond Hettinger wrote:
> On Aug 17, 2014, at 11:33 AM, Ethan Furman wrote:
>
>> I've had many of the problems Nick states and I'm also +1.
>
> There are two code snippets below which were taken from the standard library.

[...]

My issues are with 'bytes', not 'bytearray'.  'bytearray(10)' actually makes
sense.  I certainly have no problem with bytearray and bytes not being exactly
the same.

My primary issues with bytes are not being able to do b'abc'[2] == b'c', and
not being able to do x = b'abc'[2]; y = bytes(x); assert y == b'c'.
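
For reference, the two behaviours described above can be reproduced on Python 3 with a short sketch. The variable names are illustrative only, and the last two lines show existing spellings rather than anything proposed in the PEP.

```python
data = b'abc'

# Indexing a bytes object yields an int, not a length-1 bytes object.
x = data[2]
print(x)                   # 99
print(x == b'c')           # False

# bytes(int) creates that many zero bytes rather than a single byte.
y = bytes(x)
print(len(y))              # 99
print(y == b'c')           # False

# Existing spellings that do produce b'c':
print(data[2:3] == b'c')   # True, slicing keeps the bytes type
print(bytes([x]) == b'c')  # True, one-element iterable of ints
```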


And because of the backwards compatibility issues, I would deprecate the
current functionality (because we have a new 'better' way) but not remove it.


I pretty much agree exactly with what Donald Stufft said about it.

--
~Ethan~


Re: [Python-Dev] PEP 467: Minor API improvements for bytes & bytearray

2014-08-17 Thread Antoine Pitrou


On 17/08/2014 19:41, Raymond Hettinger wrote:

> The APIs have been around since 2.6 and AFAICT there has been zero
> demonstrated need for a special case for a single byte.  We already have
> a perfectly good spelling:
>     NUL = bytes([0])

That is actually a very cumbersome spelling. Why should I first create a
one-element list in order to create a one-byte bytes object?

> The Zen tells us we really don't need a second way to do it (actually a
> third since you can also write b'\x00') and it suggests that this special
> case isn't special enough.

b'\x00' is obviously the right way to do it in this case, but we're
concerned about the non-constant case.

The reason to instantiate bytes from a non-constant integer comes from the
unfortunate indexing and iteration behaviour of bytes objects.
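
As an illustration of the non-constant case, here is a minimal sketch of the spellings that already exist in Python 3 for turning a run-time integer into a one-byte bytes object. The variable names are made up for the example; none of these is the proposed bytes.byte().

```python
import struct

n = 0x41  # a byte value computed at run time, not a literal

via_list = bytes([n])              # the one-element list spelling
via_tuple = bytes((n,))            # any iterable of ints works
via_int = n.to_bytes(1, 'big')     # int.to_bytes, available since Python 3.2
via_struct = struct.pack('B', n)   # 'B' is the unsigned char format code

assert via_list == via_tuple == via_int == via_struct == b'A'

# Values outside range(256) are rejected rather than truncated.
try:
    bytes([256])
except ValueError:
    out_of_range_rejected = True
else:
    out_of_range_rejected = False
```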


Regards

Antoine.




Re: [Python-Dev] PEP 467: Minor API improvements for bytes & bytearray

2014-08-17 Thread Nick Coghlan
On 18 Aug 2014 09:41, "Raymond Hettinger" wrote:
>
> I encourage restraint against adding an unneeded class method that has no
> parallel elsewhere.  Right now, the learning curve is mitigated because
> bytes is very str-like and because bytearray is list-like (i.e. the method
> names have been used elsewhere and likely already learned before
> encountering bytes() or bytearray()).  Putting in a new, rarely used,
> funky method adds to the learning burden.
>
> If you do press forward with adding it (and I don't see why), then as an
> alternate constructor, the name should be from_int() or some such to avoid
> ambiguity and to make clear that it is a class method.

If I remember the sequence of events correctly, I thought of
map(bytes.byte, data) first, and then Guido suggested a dedicated
iterbytes() method later.

The step I hadn't taken (until now) was realising that the new
memoryview(data).iterbytes() capability actually combines with the existing
(bytes([b]) for b in data) to make the original bytes.byte idea unnecessary.
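
The existing generator expression mentioned above can be sketched as follows. iter_single_bytes is a hypothetical helper name used only for this example; it is not the proposed iterbytes() method and is not part of the standard library.

```python
def iter_single_bytes(data):
    """Yield each byte of a bytes-like object as a length-1 bytes object."""
    # Hypothetical helper for illustration. Slicing keeps the sequence type,
    # and bytes() normalises bytearray and memoryview slices to bytes.
    for i in range(len(data)):
        yield bytes(data[i:i + 1])


data = b'abc'

# The spelling from the thread: wrap each int in a one-element list.
from_ints = [bytes([b]) for b in data]

# The slicing-based helper gives the same result.
from_slices = list(iter_single_bytes(data))
```

Both approaches work today on bytes and bytearray; the slicing form avoids the intermediate integer entirely.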

Cheers,
Nick.


Re: [Python-Dev] PEP 467: Minor API improvements for bytes & bytearray

2014-08-17 Thread Raymond Hettinger

On Aug 17, 2014, at 4:08 PM, Nick Coghlan  wrote:

> Purely deprecating the bytes case and leaving bytearray alone would likely 
> address my concerns.

That is good progress.  Thanks :-)

Would a warning for the bytes case suffice, or do you need an actual deprecation?

> bytes.byte() thus becomes the binary equivalent of chr(), just as Python 2 
> had both chr() and unichr().
> 
> I don't recall ever needing chr() in a real program either, but I still 
> consider it an important part of clearly articulating the data model.
> 
> 


"I don't recall having ever needed this"  greatly weakens the premise that this 
is needed :-)

The APIs have been around since 2.6 and AFAICT there has been zero demonstrated
need for a special case for a single byte.  We already have a perfectly good
spelling:

   NUL = bytes([0])

The Zen tells us we really don't need a second way to do it (actually a third
since you can also write b'\x00') and it suggests that this special case isn't
special enough.

I encourage restraint against adding an unneeded class method that has no
parallel elsewhere.  Right now, the learning curve is mitigated because bytes
is very str-like and because bytearray is list-like (i.e. the method names have
been used elsewhere and likely already learned before encountering bytes() or
bytearray()).  Putting in a new, rarely used, funky method adds to the learning
burden.

If you do press forward with adding it (and I don't see why), then as an
alternate constructor, the name should be from_int() or some such to avoid
ambiguity and to make clear that it is a class method.

> iterbytes() isn't especially attractive as a method name, but it's far more
> explicit about its purpose.

I concur.  In this case, explicitness matters.


Raymond




Re: [Python-Dev] PEP 467: Minor API improvements for bytes & bytearray

2014-08-17 Thread Antoine Pitrou


On 16/08/2014 01:17, Nick Coghlan wrote:

> * Deprecate passing single integer values to ``bytes`` and ``bytearray``

I'm neutral. Ideally we wouldn't have made that mistake at the beginning.

> * Add ``bytes.zeros`` and ``bytearray.zeros`` alternative constructors
> * Add ``bytes.byte`` and ``bytearray.byte`` alternative constructors
> * Add ``bytes.iterbytes``, ``bytearray.iterbytes`` and
>   ``memoryview.iterbytes`` alternative iterators

+0.5. "iterbytes" isn't really great as a name.

Regards

Antoine.




Re: [Python-Dev] PEP 467: Minor API improvements for bytes & bytearray

2014-08-17 Thread Nick Coghlan
On 18 Aug 2014 03:07, "Raymond Hettinger" wrote:
>
>
> On Aug 17, 2014, at 1:41 AM, Nick Coghlan  wrote:
>
>> If I see "bytearray(10)" there is nothing there that suggests "this
>> creates an array of length 10 and initialises it to zero" to me. I'd
>> be more inclined to guess it would be equivalent to "bytearray([10])".
>>
>> "bytearray.zeros(10)", on the other hand, is relatively clear,
>> independently of user expectations.
>
>
> Zeros would have been great but that should have been done originally.
> The time to get API design right is at inception.
> Now, you're just breaking code and invalidating any published examples.

I'm fine with postponing the deprecation elements indefinitely (or just
deprecating bytes(int) and leaving bytearray(int) alone).

>
>>>
>>> Another thought is that the core devs should be very reluctant to
>>> deprecate anything we don't have to while the 2 to 3 transition is still
>>> in progress.  Every new deprecation of APIs that existed in Python 2.7
>>> just adds another obstacle to converting code.  Individually, the
>>> differences are trivial.  Collectively, they present a good reason to
>>> never migrate code to Python 3.
>>
>>
>> This is actually one of the inconsistencies between the Python 2 and 3
>> binary APIs:
>
>
> However, bytearray(n) is the same in both Python 2 and Python 3.
> Changing it in Python 3 increases the gulf between the two.
>
> The further we let Python 3 diverge from Python 2, the less likely that
> people will convert their code and the harder you make it to write code
> that runs under both.
>
> FWIW, I've been teaching Python full time for three years.  I cover the
> use of bytearray(n) in my classes and not a single person out of 3000+
> engineers have had a problem with it.   I seriously question the PEP's
> assertion that there is a real problem to be solved (i.e. that people
> are baffled by bytearray(bufsiz)) and that the problem is sufficiently
> painful to warrant the headaches that go along with API changes.

Yes, I'd expect engineers and networking folks to be fine with it. It isn't
how this mode of the constructor *works* that worries me, it's how it
*fails* (i.e. silently producing unexpected data rather than a type error).
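
A minimal sketch of the failure mode described above, assuming Python 3. as_bytes_strict is a hypothetical helper invented for this example to show what a type error would look like; it is not an existing or proposed API.

```python
def as_bytes_strict(value):
    """Convert a bytes-like object to bytes, rejecting integers."""
    # Hypothetical helper, not a standard library function.
    if isinstance(value, int):
        raise TypeError("expected a bytes-like object, got int")
    return bytes(value)


first = b'hello'[0]     # 104, an int rather than b'h'
silent = bytes(first)   # 104 zero bytes, and no error is raised

try:
    as_bytes_strict(first)
except TypeError:
    strict_rejected = True
else:
    strict_rejected = False
```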

Purely deprecating the bytes case and leaving bytearray alone would likely
address my concerns.

>
> The other proposal to add bytearray.byte(3) should probably be named
> bytearray.from_byte(3) for clarity.  That said, I question whether there
> is actually a use case for this.   I have never seen code that has a
> need to create a byte array of length one from a single integer.
> For the most part, the API will be easiest to learn if it matches what
> we do for lists and for array.array.

This part of the proposal came from a few things:

* many of the bytes and bytearray methods only accept bytes-like objects,
but iteration and indexing produce integers
* to mitigate the impact of the above, some (but not all) bytes and
bytearray methods now accept integers in addition to bytes-like objects
* ord() in Python 3 is only documented as accepting length 1 strings, but
also accepts length 1 bytes-like objects
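
The points in the list above can be checked with a short sketch on Python 3.3 or later. The choice of startswith() as an example of a method that still rejects integers is an illustration picked for this sketch, not something named in the thread.

```python
data = b'abc'

# ord() accepts a length-1 bytes object as well as a length-1 str.
assert ord(b'a') == 97
assert ord('a') == 97

# Some operations accept an integer in place of a bytes-like object...
assert 98 in data             # containment test
assert data.find(98) == 1     # find/index/count accept ints since 3.3
assert data.count(99) == 1

# ...while others accept only bytes-like objects.
try:
    data.startswith(97)
except TypeError:
    int_rejected = True
else:
    int_rejected = False
```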

Adding bytes.byte() makes it practical to document the binary half of ord's
behaviour, and eliminates any temptation to expand the "also accepts
integers" behaviour out to more types.

bytes.byte() thus becomes the binary equivalent of chr(), just as Python 2
had both chr() and unichr().

I don't recall ever needing chr() in a real program either, but I still
consider it an important part of clearly articulating the data model.

> Sorry Nick, but I think you're making the API worse instead of better.
> This API isn't perfect but it isn't flat-out broken either.   There is
> some unfortunate asymmetry between bytes() and bytearray() in Python 2,
> but that ship has sailed.  The current API for Python 3 is pretty good
> (though there is still a tension between wanting to be like lists and like
> strings both at the same time).

Yes. It didn't help that the docs previously expected readers to infer the
behaviour of the binary sequence methods from the string documentation -
while the new docs could still use some refinement, I've at least addressed
that part of the problem.

Cheers,
Nick.


Re: [Python-Dev] PEP 467: Minor API improvements for bytes & bytearray

2014-08-17 Thread Antoine Pitrou


On 17/08/2014 13:07, Raymond Hettinger wrote:

> FWIW, I've been teaching Python full time for three years.  I cover the
> use of bytearray(n) in my classes and not a single person out of 3000+
> engineers have had a problem with it.

This is less about bytearray() than bytes(), IMO. bytearray() is
sufficiently specialized that only experienced people will encounter it.

And while preallocating a bytearray of a certain size makes sense, it's
completely pointless for a bytes object.


Regards

Antoine.




Re: [Python-Dev] PEP 467: Minor API improvements for bytes & bytearray

2014-08-17 Thread Donald Stufft

> On Aug 17, 2014, at 5:19 PM, Raymond Hettinger wrote:
> 
> 
> On Aug 17, 2014, at 11:33 AM, Ethan Furman wrote:
> 
>> I've had many of the problems Nick states and I'm also +1.
> 
> There are two code snippets below which were taken from the standard library.
> Are you saying that:
> 1) you don't understand the code (as the pep suggests)
> 2) you are willing to break that code and everything like it
> 3) and it would be more elegantly expressed as:  
> charmap = bytearray.zeros(256)
> and
> mapping = bytearray.zeros(256)
> 
> At work, I have network engineers creating IPv4 headers and other structures
> with bytearrays initialized to zeros.  Do you really want to break all their 
> code?
> Nowhere else in Python do we create buffers that way.  Code like
> "msg, who = s.recvfrom(256)" is the norm.
> 
> Also, it is unclear if you're saying that you have an actual use case for this
> part of the proposal?
> 
>    ba = bytearray.byte(65)
> 
> And that the code would be better, clearer, and faster than the currently 
> working form?
> 
>    ba = bytearray([65])
> 
> Does there really need to be a special case for constructing a single byte?
> To me, that is akin to proposing "list.from_int(65)" as an important special
> case to replace "[65]".
> 
> If you must muck with the ever changing bytes() API, then please 
> leave the bytearray() API alone.  I think we should show some respect
> for code that is currently working and is cleanly expressible in both
> Python 2 and Python 3.  We aren't winning users with API churn.
> 
> FWIW, I'm guessing that the differing viewpoints in the thread stem
> mainly from the proponents' experiences with bytes() rather than
> from experience with bytearray() which doesn't seem to have any
> usage problems in the wild.  I've never seen a developer say they
> didn't understand what "buf = bytearray(1024)" means.   That is
> not an actual problem that needs solving (or breaking).
> 
> What may be an actual problem is code like "char = bytes(1024)"
> though I'm unclear what a user might have actually been trying
> to do with code like that.

I think this is probably correct. I generally don’t think that bytes(1024)
makes much sense at all, especially not as a default constructor. Most likely
it exists to be similar to bytearray().

I don't have a specific problem with bytearray(1024), though I do think it's
more elegantly and clearly described as bytearray.zeros(1024), but not by much.

I find bytes.byte()/bytearray.byte() to be needed as long as there isn't a
simple way to iterate over a bytes or bytearray in a way that yields bytes or
bytearrays instead of integers. To be honest I can't think of a time when I'd
actually *want* to iterate over a bytes/bytearray as integers, although I
realize there is unlikely to be a reasonable way to change that now. If
iterbytes is added I'm not sure where I'd personally use either bytes.byte() or
bytearray.byte().

In general though I think that overloading a single constructor method to do
something conceptually different based on the type of the parameter leads to
these kinds of confusing scenarios, and that having differently named
constructors for the different concepts is far clearer.
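
To make the idea of differently named constructors concrete, here is a rough sketch using a bytearray subclass. The class name Buffer and the method names zeros and from_int are assumptions borrowed from names floated in this thread; nothing like this exists on the built-in types.

```python
class Buffer(bytearray):
    """Hypothetical bytearray subclass with named alternative constructors."""

    @classmethod
    def zeros(cls, count):
        # Zero-filled buffer of the given length.
        return cls(b'\x00' * count)

    @classmethod
    def from_int(cls, value):
        # Single-byte buffer holding one integer in range(256).
        return cls([value])


buf = Buffer.zeros(4)
one = Buffer.from_int(65)
```

Each name states which concept is meant, so an unexpected integer can no longer silently select the "zero-filled buffer" behaviour.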

So given all that, I am:

* +1 for some method of iterating over both types as bytes instead of
  integers.
* +1 on adding .zeros to both types as an alternative and preferred method of
  creating a zero filled instance and deprecating the original method[1].
* -0 on adding .byte to both types as an alternative method of creating a
  single byte instance.
* -1 On changing the meaning of bytearray(1024).
* +/-0 on changing the meaning of bytes(1024), I think that bytes(1024) is
  likely to *not* be what someone wants and that what they really want is
  bytes([N]). I also think that the number one reason for someone to be doing
  bytes(N) is because they were attempting to iterate over a bytes or bytearray
  object and they got an integer. I also think that it's bad that this changes
  from 2.x to 3.x and I wish it hadn't. However I can't decide if it's worth
  reverting this at this time or not.

[1] By deprecating I mean raising a deprecation warning or something similar; my
thoughts on actually removing the other methods are listed explicitly.

---
Donald Stufft
PGP: 7C6B 7C5D 5E2B 6356 A926 F04F 6E3C BCE9 3372 DCFA



Re: [Python-Dev] PEP 467: Minor API improvements for bytes & bytearray

2014-08-17 Thread Raymond Hettinger

On Aug 17, 2014, at 11:33 AM, Ethan Furman  wrote:

> I've had many of the problems Nick states and I'm also +1.

There are two code snippets below which were taken from the standard library.
Are you saying that:
1) you don't understand the code (as the pep suggests)
2) you are willing to break that code and everything like it
3) and it would be more elegantly expressed as:  
charmap = bytearray.zeros(256)
and
mapping = bytearray.zeros(256)

At work, I have network engineers creating IPv4 headers and other structures
with bytearrays initialized to zeros.  Do you really want to break all their 
code?
Nowhere else in Python do we create buffers that way.  Code like
"msg, who = s.recvfrom(256)" is the norm.

Also, it is unclear if you're saying that you have an actual use case for this
part of the proposal?

   ba = bytearray.byte(65)

And that the code would be better, clearer, and faster than the currently 
working form?

   ba = bytearray([65])

Does there really need to be a special case for constructing a single byte?
To me, that is akin to proposing "list.from_int(65)" as an important special
case to replace "[65]".

If you must muck with the ever changing bytes() API, then please 
leave the bytearray() API alone.  I think we should show some respect
for code that is currently working and is cleanly expressible in both
Python 2 and Python 3.  We aren't winning users with API churn.

FWIW, I'm guessing that the differing viewpoints in the thread stem
mainly from the proponents' experiences with bytes() rather than
from experience with bytearray() which doesn't seem to have any
usage problems in the wild.  I've never seen a developer say they
didn't understand what "buf = bytearray(1024)" means.   That is
not an actual problem that needs solving (or breaking).

What may be an actual problem is code like "char = bytes(1024)"
though I'm unclear what a user might have actually been trying
to do with code like that.


Raymond


--- excerpts from Lib/sre_compile.py ---

charmap = bytearray(256)
for op, av in charset:
    while True:
        try:
            if op is LITERAL:
                charmap[fixup(av)] = 1
            elif op is RANGE:
                for i in range(fixup(av[0]), fixup(av[1])+1):
                    charmap[i] = 1
            elif op is NEGATE:
                out.append((op, av))
            else:
                tail.append((op, av))

...

charmap = bytes(charmap) # should be hashable

comps = {}
mapping = bytearray(256)
block = 0
data = bytearray()
for i in range(0, 65536, 256):
    chunk = charmap[i: i + 256]
    if chunk in comps:
        mapping[i // 256] = comps[chunk]
    else:
        mapping[i // 256] = comps[chunk] = block
        block += 1
        data += chunk
data = _mk_bitmap(data)
data[0:0] = [block] + _bytes_to_codes(mapping)
out.append((BIGCHARSET, data))
out += tail
return out


Re: [Python-Dev] PEP 467: Minor API improvements for bytes & bytearray

2014-08-17 Thread Ethan Furman

On 08/17/2014 10:16 AM, Donald Stufft wrote:

> For the record I’ve had all of the problems that Nick states and I’m
> +1 on this change.

I've had many of the problems Nick states and I'm also +1.

--
~Ethan~


Re: [Python-Dev] PEP 467: Minor API improvements for bytes & bytearray

2014-08-17 Thread Donald Stufft

> On Aug 17, 2014, at 1:07 PM, Raymond Hettinger wrote:
> 
> 
> On Aug 17, 2014, at 1:41 AM, Nick Coghlan wrote:
> 
>> If I see "bytearray(10)" there is nothing there that suggests "this
>> creates an array of length 10 and initialises it to zero" to me. I'd
>> be more inclined to guess it would be equivalent to "bytearray([10])".
>> 
>> "bytearray.zeros(10)", on the other hand, is relatively clear,
>> independently of user expectations.
> 
> Zeros would have been great but that should have been done originally.
> The time to get API design right is at inception.
> Now, you're just breaking code and invalidating any published examples.
> 
>>> 
>>> Another thought is that the core devs should be very reluctant to deprecate
>>> anything we don't have to while the 2 to 3 transition is still in progress.
>>> Every new deprecation of APIs that existed in Python 2.7 just adds another
>>> obstacle to converting code.  Individually, the differences are trivial.
>>> Collectively, they present a good reason to never migrate code to Python 3.
>> 
>> This is actually one of the inconsistencies between the Python 2 and 3
>> binary APIs:
> 
> However, bytearray(n) is the same in both Python 2 and Python 3.
> Changing it in Python 3 increases the gulf between the two.
> 
> The further we let Python 3 diverge from Python 2, the less likely that
> people will convert their code and the harder you make it to write code
> that runs under both.
> 
> FWIW, I've been teaching Python full time for three years.  I cover the
> use of bytearray(n) in my classes and not a single person out of 3000+
> engineers have had a problem with it.   I seriously question the PEP's
> assertion that there is a real problem to be solved (i.e. that people
> are baffled by bytearray(bufsiz)) and that the problem is sufficiently
> painful to warrant the headaches that go along with API changes.
> 
> The other proposal to add bytearray.byte(3) should probably be named
> bytearray.from_byte(3) for clarity.  That said, I question whether there is
> actually a use case for this.   I have never seen code that has a
> need to create a byte array of length one from a single integer.
> For the most part, the API will be easiest to learn if it matches what
> we do for lists and for array.array.
> 
> Sorry Nick, but I think you're making the API worse instead of better.
> This API isn't perfect but it isn't flat-out broken either.   There is some
> unfortunate asymmetry between bytes() and bytearray() in Python 2,
> but that ship has sailed.  The current API for Python 3 is pretty good
> (though there is still a tension between wanting to be like lists and like
> strings both at the same time).
> 
> 
> Raymond
> 
> 
> P.S.  The most important problem in the Python world now is getting
> Python 2 users to adopt Python 3.  The core devs need to develop
> a strong distaste for anything that makes that problem harder.
> 

For the record I’ve had all of the problems that Nick states and I’m
+1 on this change.

---
Donald Stufft
PGP: 7C6B 7C5D 5E2B 6356 A926 F04F 6E3C BCE9 3372 DCFA



Re: [Python-Dev] PEP 467: Minor API improvements for bytes & bytearray

2014-08-17 Thread Raymond Hettinger

On Aug 17, 2014, at 1:41 AM, Nick Coghlan  wrote:

> If I see "bytearray(10)" there is nothing there that suggests "this
> creates an array of length 10 and initialises it to zero" to me. I'd
> be more inclined to guess it would be equivalent to "bytearray([10])".
> 
> "bytearray.zeros(10)", on the other hand, is relatively clear,
> independently of user expectations.

Zeros would have been great but that should have been done originally.
The time to get API design right is at inception.
Now, you're just breaking code and invalidating any published examples.

>> 
>> Another thought is that the core devs should be very reluctant to deprecate
>> anything we don't have to while the 2 to 3 transition is still in progress.
>> Every new deprecation of APIs that existed in Python 2.7 just adds another
>> obstacle to converting code.  Individually, the differences are trivial.
>> Collectively, they present a good reason to never migrate code to Python 3.
> 
> This is actually one of the inconsistencies between the Python 2 and 3
> binary APIs:

However, bytearray(n) is the same in both Python 2 and Python 3.
Changing it in Python 3 increases the gulf between the two.

The further we let Python 3 diverge from Python 2, the less likely that
people will convert their code and the harder you make it to write code
that runs under both.

FWIW, I've been teaching Python full time for three years.  I cover the
use of bytearray(n) in my classes and not a single person out of 3000+
engineers have had a problem with it.   I seriously question the PEP's
assertion that there is a real problem to be solved (i.e. that people
are baffled by bytearray(bufsiz)) and that the problem is sufficiently
painful to warrant the headaches that go along with API changes.

The other proposal to add bytearray.byte(3) should probably be named
bytearray.from_byte(3) for clarity.  That said, I question whether there is
actually a use case for this.   I have never seen code that has a
need to create a byte array of length one from a single integer.
For the most part, the API will be easiest to learn if it matches what
we do for lists and for array.array.

Sorry Nick, but I think you're making the API worse instead of better.
This API isn't perfect but it isn't flat-out broken either.   There is some
unfortunate asymmetry between bytes() and bytearray() in Python 2,
but that ship has sailed.  The current API for Python 3 is pretty good
(though there is still a tension between wanting to be like lists and like
strings both at the same time).


Raymond


P.S.  The most important problem in the Python world now is getting
Python 2 users to adopt Python 3.  The core devs need to develop
a strong distaste for anything that makes that problem harder.







Re: [Python-Dev] PEP 467: Minor API improvements for bytes & bytearray

2014-08-17 Thread Nick Coghlan
On 17 August 2014 18:13, Raymond Hettinger wrote:
>
> On Aug 14, 2014, at 10:50 PM, Nick Coghlan wrote:
>
>> Key points in the proposal:
>>
>> * deprecate passing integers to bytes() and bytearray()
>
>
> I'm opposed to removing this part of the API.  It has proven useful
> and the alternative isn't very nice.   Declaring the size of fixed length
> arrays is not a new concept and is widely adopted in other languages.
> One principal use case for the bytearray is creating and manipulating
> binary data.  Initializing to zero is a common operation and should remain
> part of the core API (consider why we now have list.copy() even though
> copying with a slice remains possible and efficient).
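
As an illustration of the use case in the quoted text (a zero-initialised buffer that is then filled in field by field), here is a rough sketch that builds a minimal IPv4-style header. The field values and the documentation-range addresses are made up for the example, and the checksum is deliberately left as zero.

```python
import struct

header = bytearray(20)   # minimal IPv4 header without options, all zeros

# Version (4) and header length (5 32-bit words) share the first byte.
header[0] = (4 << 4) | 5
# Total length field at offset 2, big-endian unsigned 16-bit.
struct.pack_into('!H', header, 2, 20)
header[8] = 64           # time to live
header[9] = 6            # protocol number 6 is TCP
# Source and destination addresses at offsets 12 and 16.
header[12:16] = bytes([192, 0, 2, 1])
header[16:20] = bytes([192, 0, 2, 2])
```

Fields that are never assigned, such as the checksum at offset 10, simply stay zero, which is what makes the zero-filled starting point convenient.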

That's why the PEP proposes adding a "zeros" method, based on the name
of the corresponding NumPy construct.

The status quo has some very ugly failure modes when an integer is
passed unexpectedly: the constructor tries to create a large buffer
rather than raising a TypeError.

> I and my clients have taken advantage of this feature and it reads nicely.

If I see "bytearray(10)" there is nothing there that suggests "this
creates an array of length 10 and initialises it to zero" to me. I'd
be more inclined to guess it would be equivalent to "bytearray([10])".

"bytearray.zeros(10)", on the other hand, is relatively clear,
independently of user expectations.
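
The two readings contrasted above can be compared directly in a short sketch. Only existing constructor forms are used here, since bytearray.zeros() is a proposal and not an available method.

```python
sized = bytearray(10)      # integer argument: length 10, zero-filled
single = bytearray([10])   # iterable argument: one byte with value 10

assert len(sized) == 10 and not any(sized)
assert len(single) == 1 and single[0] == 10

# Explicit spellings of the zero-filled buffer that work today.
assert sized == bytearray(b'\x00' * 10) == bytearray([0] * 10)
```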

> The proposed deprecation would break our code and not actually make
> anything better.
>
> Another thought is that the core devs should be very reluctant to deprecate
> anything we don't have to while the 2 to 3 transition is still in progress.
> Every new deprecation of APIs that existed in Python 2.7 just adds another
> obstacle to converting code.  Individually, the differences are trivial.
> Collectively, they present a good reason to never migrate code to Python 3.

This is actually one of the inconsistencies between the Python 2 and 3
binary APIs:

Python 2.7.5 (default, Jun 25 2014, 10:19:55)
[GCC 4.8.2 20131212 (Red Hat 4.8.2-7)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> bytes(10)
'10'
>>> bytearray(10)
bytearray(b'\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00')

Users wanting well-behaved binary sequences in Python 2.7 would be
well advised to use the "future" module to get a full backport of the
actual Python 3 bytes type, rather than the approximation that is the
8-bit str in Python 2. And once they do that, they'll be able to track
the evolution of the Python 3 binary sequence behaviour without any
further trouble.
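
For comparison, a sketch of the Python 3 side of the same call. The variable
names are illustrative, and `str(10).encode("ascii")` is only my suggestion
for the closest Python 3 spelling of what the Python 2 session above returned:

```python
# Python 3: an integer argument means "this many zero bytes".
py3_result = bytes(10)

# Closest Python 3 spelling of Python 2's bytes(10), which returned '10'.
py2_like = str(10).encode("ascii")
```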

That said, I don't really mind how long the deprecation cycle is. I'd
be fine with fully supporting both in 3.5 (2015), deprecating the main
constructor in favour of the explicit zeros() method in 3.6 (2017) and
dropping the legacy behaviour in 3.7 (2018).

Regards,
Nick.

-- 
Nick Coghlan   |   ncogh...@gmail.com   |   Brisbane, Australia


Re: [Python-Dev] PEP 467: Minor API improvements for bytes & bytearray

2014-08-17 Thread Raymond Hettinger

On Aug 14, 2014, at 10:50 PM, Nick Coghlan  wrote:

> Key points in the proposal:
> 
> * deprecate passing integers to bytes() and bytearray()

I'm opposed to removing this part of the API.  It has proven useful
and the alternative isn't very nice.   Declaring the size of fixed length
arrays is not a new concept and is widely adopted in other languages.
One principal use case for the bytearray is creating and manipulating
binary data.  Initializing to zero is a common operation and should remain
part of the core API (consider why we now have list.copy() even though
copying with a slice remains possible and efficient).

I and my clients have taken advantage of this feature and it reads nicely.
The proposed deprecation would break our code and not actually make
anything better.

Another thought is that the core devs should be very reluctant to deprecate
anything we don't have to while the 2 to 3 transition is still in progress.   
Every new deprecation of APIs that existed in Python 2.7 just adds another
obstacle to converting code.  Individually, the differences are trivial.  
Collectively, they present a good reason to never migrate code to Python 3.


Raymond


 



Re: [Python-Dev] PEP 467: Minor API improvements for bytes & bytearray

2014-08-15 Thread Nick Coghlan
On 16 August 2014 03:48, Guido van Rossum  wrote:
> This feels chatty. I'd like the PEP to call out the specific proposals and
> put the more verbose motivation later.

I realised that some of that history was actually completely
irrelevant now, so I culled a fair bit of it entirely.

> It took me a long time to realize
> that you don't want to deprecate bytes([1, 2, 3]), but only bytes(3).

I've split out the four subproposals into their own sections, so
hopefully this is clearer now.

> Also
> your mention of bytes.byte() as the counterpart to ord() confused me -- I
> think it's more similar to chr().

This was just a case of me using the wrong word - I meant "inverse"
rather than "counterpart".

> I don't like iterbytes as a builtin, let's
> keep it as a method on affected types.

Done. I also added an explanation of the benefits it offers over the
more generic "map(bytes.byte, data)", as well as more precise
semantics for how it will work with memoryview objects.

New draft is live at http://www.python.org/dev/peps/pep-0467/, as well
as being included inline below.

Regards,
Nick.

===

PEP: 467
Title: Minor API improvements for bytes and bytearray
Version: $Revision$
Last-Modified: $Date$
Author: Nick Coghlan 
Status: Draft
Type: Standards Track
Content-Type: text/x-rst
Created: 2014-03-30
Python-Version: 3.5
Post-History: 2014-03-30 2014-08-15 2014-08-16


Abstract
========

During the initial development of the Python 3 language specification, the
core ``bytes`` type for arbitrary binary data started as the mutable type
that is now referred to as ``bytearray``. Other aspects of operating in
the binary domain in Python have also evolved over the course of the Python
3 series.

This PEP proposes four small adjustments to the APIs of the ``bytes``,
``bytearray`` and ``memoryview`` types to make it easier to operate entirely
in the binary domain:

* Deprecate passing single integer values to ``bytes`` and ``bytearray``
* Add ``bytes.zeros`` and ``bytearray.zeros`` alternative constructors
* Add ``bytes.byte`` and ``bytearray.byte`` alternative constructors
* Add ``bytes.iterbytes``, ``bytearray.iterbytes`` and
  ``memoryview.iterbytes`` alternative iterators


Proposals
=========

Deprecation of current "zero-initialised sequence" behaviour
------------------------------------------------------------

Currently, the ``bytes`` and ``bytearray`` constructors accept an integer
argument and interpret it as meaning to create a zero-initialised sequence
of the given size::

    >>> bytes(3)
    b'\x00\x00\x00'
    >>> bytearray(3)
    bytearray(b'\x00\x00\x00')

This PEP proposes to deprecate that behaviour in Python 3.5, and remove it
entirely in Python 3.6.

No other changes are proposed to the existing constructors.


Addition of explicit "zero-initialised sequence" constructors
-------------------------------------------------------------

To replace the deprecated behaviour, this PEP proposes the addition of an
explicit ``zeros`` alternative constructor as a class method on both
``bytes`` and ``bytearray``::

    >>> bytes.zeros(3)
    b'\x00\x00\x00'
    >>> bytearray.zeros(3)
    bytearray(b'\x00\x00\x00')

It will behave just as the current constructors behave when passed a single
integer.
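
Since ``zeros`` is only proposed here and does not exist on released types,
the following is a sketch of the intended behaviour using a free function.
The function name and its checks are assumptions for illustration, not the
PEP's implementation.

```python
def zeros(cls, count):
    """Stand-in for the proposed cls.zeros(count); not a real method.

    cls is expected to be bytes or bytearray.
    """
    if not isinstance(count, int):
        raise TypeError("count must be an integer")
    if count < 0:
        # Mirror the current constructors, which reject negative sizes.
        raise ValueError("negative count")
    return cls(b"\x00" * count)


immutable = zeros(bytes, 3)
mutable = zeros(bytearray, 3)
```

Both results compare equal to what ``bytes(3)`` and ``bytearray(3)`` return
today.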

The specific choice of ``zeros`` as the alternative constructor name is taken
from the corresponding initialisation function in NumPy (although, as these
are 1-dimensional sequence types rather than N-dimensional matrices, the
constructors take a length as input rather than a shape tuple).


Addition of explicit "single byte" constructors
-----------------------------------------------

As binary counterparts to the text ``chr`` function, this PEP proposes the
addition of an explicit ``byte`` alternative constructor as a class method
on both ``bytes`` and ``bytearray``::

    >>> bytes.byte(3)
    b'\x03'
    >>> bytearray.byte(3)
    bytearray(b'\x03')

These methods will only accept integers in the range 0 to 255 (inclusive)::

    >>> bytes.byte(512)
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    ValueError: bytes must be in range(0, 256)

    >>> bytes.byte(1.0)
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    TypeError: 'float' object cannot be interpreted as an integer

The documentation of the ``ord`` builtin will be updated to explicitly note
that ``bytes.byte`` is the inverse operation for binary data, while ``chr``
is the inverse operation for text data.

Behaviourally, ``bytes.byte(x)`` will be equivalent to the current
``bytes([x])`` (and similarly for ``bytearray``). The new spelling is
expected to be easier to discover and easier to read (especially when used
in conjunction with indexing operations on binary sequence types).

As a separate method, the new spelling will also work better with higher
order functions like ``map``.
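
A sketch of how this would combine with ``map``, using a free function as a
stand-in because ``bytes.byte`` is only proposed here. The name ``byte`` and
the sample data are assumptions for illustration; the function relies on the
stated equivalence with ``bytes([x])``.

```python
def byte(value):
    """Stand-in for the proposed bytes.byte(value); not a real method."""
    return bytes([value])


data = b"abc"
pieces = list(map(byte, data))  # each element becomes a length-1 bytes object

# Out-of-range integers are rejected by the underlying constructor.
try:
    byte(512)
except ValueError:
    out_of_range_rejected = True
else:
    out_of_range_rejected = False
```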


Addition of optimised iterator methods that produce ``bytes`` objects
---------------------------------------------------------------------

Re: [Python-Dev] PEP 467: Minor API improvements for bytes & bytearray

2014-08-15 Thread Victor Stinner
2014-08-15 7:50 GMT+02:00 Nick Coghlan :
> As far as I am aware, that last item poses the only open question,
> with the alternative being to add an "iterbytes" builtin (...)

Do you have examples of use cases for a builtin function? I only found
5 uses of the bytes((byte,)) constructor in the standard library:

$ grep -E 'bytes\(\([^)]+, *\)\)' $(find -name "*.py")
./Lib/quopri.py:c = bytes((c,))
./Lib/quopri.py:c = bytes((c,))
./Lib/base64.py:b32tab = [bytes((i,)) for i in _b32alphabet]
./Lib/base64.py:_a85chars = [bytes((i,)) for i in range(33, 118)]
./Lib/base64.py:_b85chars = [bytes((i,)) for i in _b85alphabet]

bytes.iterbytes() can be used in 4 cases out of 5. Adding a new builtin
for a single line in the whole standard library doesn't look right.
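
To illustrate the list-comprehension cases, here is a rough sketch of what an
iterbytes-style helper could do for them. The generator below is my own
stand-in built on slicing, not the proposed implementation, and the sample
alphabet is made up.

```python
def iterbytes(data):
    """Yield each element of a bytes-like object as a length-1 bytes object.

    Illustrative stand-in for the proposed method.
    """
    for index in range(len(data)):
        # Slicing keeps the binary type; bytes() normalises the result.
        yield bytes(data[index:index + 1])


alphabet = b"ABC"
via_constructor = [bytes((i,)) for i in alphabet]  # spelling used in base64.py
via_iterbytes = list(iterbytes(alphabet))
```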

Victor


Re: [Python-Dev] PEP 467: Minor API improvements for bytes & bytearray

2014-08-15 Thread Victor Stinner
2014-08-15 21:54 GMT+02:00 Serhiy Storchaka :
> 15.08.14 08:50, Nick Coghlan wrote:
>> * add bytes.zeros() and bytearray.zeros() as a replacement
>
> b'\0' * n and bytearray(b'\0') * n look like good replacements to me. No need
> to learn a new method. And they work right now.

FYI there is a pending patch for bytearray(int) to use calloc()
instead of malloc(). It's faster for buffers larger than 1 MB:
http://bugs.python.org/issue21644

I'm not sure that the optimization is really useful.

Victor


Re: [Python-Dev] PEP 467: Minor API improvements for bytes & bytearray

2014-08-15 Thread Serhiy Storchaka

15.08.14 08:50, Nick Coghlan wrote:

> * add bytes.zeros() and bytearray.zeros() as a replacement


b'\0' * n and bytearray(b'\0') * n look like good replacements to me. No need 
to learn a new method. And they work right now.
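
A quick check of that equivalence on current Python 3 (the value of `n` is
arbitrary):

```python
n = 4

repeated_bytes = b"\0" * n
repeated_bytearray = bytearray(b"\0") * n

assert repeated_bytes == bytes(n)
assert repeated_bytearray == bytearray(n)
assert type(repeated_bytearray) is bytearray
```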



> * add bytes.iterbytes(), bytearray.iterbytes() and memoryview.iterbytes()


What are the use cases for this? I suppose the main use case may be writing 
code compatible with both 2.7 and 3.x. But in that case you need a 
wrapper (because these types have no iterbytes() method in 2.7). And 
how large would the advantage of this method be over 
``map(bytes.byte, data)``?





Re: [Python-Dev] PEP 467: Minor API improvements for bytes & bytearray

2014-08-15 Thread Guido van Rossum
This feels chatty. I'd like the PEP to call out the specific proposals and
put the more verbose motivation later. It took me a long time to realize
that you don't want to deprecate bytes([1, 2, 3]), but only bytes(3). Also
your mention of bytes.byte() as the counterpart to ord() confused me -- I
think it's more similar to chr(). I don't like iterbytes as a builtin,
let's keep it as a method on affected types.


On Thu, Aug 14, 2014 at 10:50 PM, Nick Coghlan  wrote:

> I just posted an updated version of PEP 467 after recently finishing
> the updates to the Python 3.4+ binary sequence docs to decouple them
> from the str docs.
>
> Key points in the proposal:
>
> * deprecate passing integers to bytes() and bytearray()
> * add bytes.zeros() and bytearray.zeros() as a replacement
> * add bytes.byte() and bytearray.byte() as counterparts to ord() for
> binary data
> * add bytes.iterbytes(), bytearray.iterbytes() and memoryview.iterbytes()
>
> As far as I am aware, that last item poses the only open question,
> with the alternative being to add an "iterbytes" builtin with a
> definition along the lines of the following:
>
> def iterbytes(data):
>     try:
>         getiter = type(data).__iterbytes__
>     except AttributeError:
>         iter = map(bytes.byte, data)
>     else:
>         iter = getiter(data)
>     return iter
>
> Regards,
> Nick.
>
> PEP URL: http://www.python.org/dev/peps/pep-0467/
>
> Full PEP text:
> =
> PEP: 467
> Title: Minor API improvements for bytes and bytearray
> Version: $Revision$
> Last-Modified: $Date$
> Author: Nick Coghlan 
> Status: Draft
> Type: Standards Track
> Content-Type: text/x-rst
> Created: 2014-03-30
> Python-Version: 3.5
> Post-History: 2014-03-30 2014-08-15
>
>
> Abstract
> ========
>
> During the initial development of the Python 3 language specification, the
> core ``bytes`` type for arbitrary binary data started as the mutable type
> that is now referred to as ``bytearray``. Other aspects of operating in
> the binary domain in Python have also evolved over the course of the Python
> 3 series.
>
> This PEP proposes a number of small adjustments to the APIs of the
> ``bytes`` and ``bytearray`` types to make it easier to operate entirely in
> the binary domain.
>
>
> Background
> ==========
>
> To simplify the task of writing the Python 3 documentation, the ``bytes``
> and ``bytearray`` types were documented primarily in terms of the way they
> differed from the Unicode based Python 3 ``str`` type. Even when I
> `heavily revised the sequence documentation
> `__ in 2012, I retained
> that
> simplifying shortcut.
>
> However, it turns out that this approach to the documentation of these types
> had a problem: it doesn't adequately introduce users to their hybrid nature,
> where they can be manipulated *either* as a "sequence of integers" type,
> *or* as ``str``-like types that assume ASCII compatible data.
>
> That oversight has now been corrected, with the binary sequence types now
> being documented entirely independently of the ``str`` documentation in
> `Python 3.4+ <
> https://docs.python.org/3/library/stdtypes.html#binary-sequence-types-bytes-bytearray-memoryview
> >`__
>
> The confusion isn't just a documentation issue, however, as there are also
> some lingering design quirks from an earlier pre-release design where there
> was *no* separate ``bytearray`` type, and instead the core ``bytes`` type
> was mutable (with no immutable counterpart).
>
> Finally, additional experience with using the existing Python 3 binary
> sequence types in real world applications has suggested it would be
> beneficial to make it easier to convert integers to length 1 bytes objects.
>
>
> Proposals
> =========
>
> As a "consistency improvement" proposal, this PEP is actually about a few
> smaller micro-proposals, each aimed at improving the usability of the binary
> data model in Python 3. Proposals are motivated by one of two main factors:
>
> * removing remnants of the original design of ``bytes`` as a mutable type
> * allowing users to easily convert integer values to a length 1 ``bytes``
>   object
>
>
> Alternate Constructors
> ----------------------
>
> The ``bytes`` and ``bytearray`` constructors currently accept an integer
> argument, but interpret it to mean a zero-filled object of the given length.
> This is a legacy of the original design of ``bytes`` as a mutable type,
> rather than a particularly intuitive behaviour for users. It has become
> especially confusing now that some other ``bytes`` interfaces treat integers
> and the corresponding length 1 bytes instances as equivalent input.
> Compare::
>
>     >>> b"\x03" in bytes([1, 2, 3])
>     True
>     >>> 3 in bytes([1, 2, 3])
>     True
>
>     >>> bytes(b"\x03")
>     b'\x03'
>     >>> bytes(3)
>     b'\x00\x00\x00'
>
> This PEP proposes that the current handling of integers in the bytes and
> byte