Re: [Python-Dev] PEP 467: Minor API improvements to bytes, bytearray, and memoryview
On 06/07/2016 02:34 PM, Koos Zevenhoven wrote:

> Why not bytes.viewbytes (or whatever name) so that one could also
> subscript it? And if it were a property, one could perhaps
> conveniently get the n'th byte:
>
> b'abcde'.viewbytes[n]   # compared to b'abcde'[n:n+1]

AFAICT, 'viewbytes' doesn't add much over bytes itself if we add a
'getbyte' method.

> Also, would it not be more clear to call the int -> bytes method
> something like bytes.fromint or bytes.fromord and introduce the same
> thing on str? And perhaps allow multiple arguments to create a
> str/bytes of length > 1. I guess this may violate TOOWTDI, but anyway,
> just a thought.

Yes, it would. Changing to 'bytes.fromint'.

--
~Ethan~
___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] PEP 467: Minor API improvements to bytes, bytearray, and memoryview
On 06/07/2016 10:42 PM, Serhiy Storchaka wrote:
> On 07.06.16 23:28, Ethan Furman wrote:

>> * Add ``bytes.iterbytes``, ``bytearray.iterbytes`` and
>>   ``memoryview.iterbytes`` alternative iterators
>
> "Byte" is an alias to "octet" (8-bit integer) in modern terminology.

Maybe so, but not, to my knowledge, in Python terminology.

> Iterating bytes and bytearray already produce bytes.

No, it produces integers:

    >>> for b in b'abcid':
    ...     print(b)
    ...
    97
    98
    99
    105
    100

--
~Ethan~
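For readers following along, here is a minimal sketch of the distinction being argued: plain iteration yields integers, while slicing yields bytes objects. The name `iterbytes` is borrowed from the PEP's proposed method, but it is written here as a stand-alone function; that is an assumption made for illustration, not the PEP's implementation.

```python
def iterbytes(data):
    """Yield each element of a bytes-like object as a length-1 bytes object.

    Hypothetical stand-alone helper; the PEP proposes a method instead.
    """
    for i in range(len(data)):
        # A one-element slice keeps the sequence type; bytes() normalises
        # bytearray and memoryview slices to immutable bytes.
        yield bytes(data[i:i + 1])


data = b'abcid'
as_ints = list(data)              # plain iteration yields integers
as_bytes = list(iterbytes(data))  # the helper yields bytes objects
print(as_ints)
print(as_bytes)
```

The same generator also accepts bytearray and one-dimensional byte-oriented memoryview inputs, since both support one-element slices.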
Re: [Python-Dev] PEP 467: Minor API improvements to bytes, bytearray, and memoryview
On 9 June 2016 at 19:21, Barry Warsaw wrote:
> On Jun 07, 2016, at 01:28 PM, Ethan Furman wrote:
>
>>Deprecation of current "zero-initialised sequence" behaviour
>>
>>
>>Currently, the ``bytes`` and ``bytearray`` constructors accept an integer
>>argument and interpret it as meaning to create a zero-initialised sequence of
>>the given size::
>>
>>    >>> bytes(3)
>>    b'\x00\x00\x00'
>>    >>> bytearray(3)
>>    bytearray(b'\x00\x00\x00')
>>
>>This PEP proposes to deprecate that behaviour in Python 3.6, and remove it
>>entirely in Python 3.7.
>>
>>No other changes are proposed to the existing constructors.
>
> Does it need to be *actually* removed? That does break existing code for not
> a lot of benefit. Yes, the default constructor is a little wonky, but with
> the addition of the new constructors, and the fact that you're not proposing
> to eventually change the default constructor, removal seems unnecessary.
> Besides, once it's removed, what would `bytes(3)` actually do? The PEP
> doesn't say.

Raise TypeError, presumably. However, I agree this isn't worth the
hassle of breaking working code, especially since truly ludicrous
values will fail promptly with MemoryError - it's only a particular
range of values that fit within the limits of the machine, but also
push it into heavy swapping, that is a potential problem.

> Also, since you're proposing to add `bytes.byte(3)` have you considered also
> adding an optional count argument? E.g. `bytes.byte(3, count=7)` would yield
> b'\x03\x03\x03\x03\x03\x03\x03'. That seems like it could be useful.

The purpose of bytes.byte() in the PEP is to provide a way to
roundtrip ord() calls with binary inputs, since the current spelling
is pretty unintuitive:

    >>> ord("A")
    65
    >>> chr(ord("A"))
    'A'
    >>> ord(b"A")
    65
    >>> bytes([ord(b"A")])
    b'A'

That said, perhaps it would make more sense for the corresponding
round-trip to be:

    >>> bchr(ord("A"))
    b'A'

With the "b" prefix on "chr" reflecting the "b" prefix on the output.

This also inverts the chr/unichr pairing that existed in Python 2
(replacing it with bchr/chr), and is hence very friendly to
compatibility modules like six and future (future.builtins already
provides a chr that behaves like the Python 3 one, and bchr would be
much easier to add to that than a new bytes object method).

In terms of an efficient memory-preallocation interface, the
equivalent NumPy operation to request a pre-filled array is
"numpy.full":
http://docs.scipy.org/doc/numpy-1.10.1/reference/generated/numpy.full.html
(there's also an inplace mutation operation, "fill")

For bytes and bytearray though, that has an unfortunate name collision
with "zfill", which refers to zero-padding numeric values for fixed
width display.

If the PEP just added bchr() to complement chr(), and [bytes,
bytearray].zeros() as a more discoverable alternative to passing
integers to the default constructor, I think that would be a decent
step forward, and the question of pre-initialising with arbitrary
values can be deferred for now (and perhaps left to NumPy
indefinitely).

Cheers,
Nick.

--
Nick Coghlan | ncogh...@gmail.com | Brisbane, Australia
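The round trip discussed in this message can be tried out with a few lines of pure Python. This is only a sketch: `bchr` is not a builtin, and the function below is a hypothetical stand-in built on the existing `bytes([i])` spelling.

```python
def bchr(i):
    """Return a length-1 bytes object for an integer in range(256).

    Hypothetical helper mirroring chr(); not part of the standard library.
    """
    # bytes([i]) already raises ValueError for values outside 0..255,
    # so no extra range check is added here.
    return bytes([i])


round_trip_text = chr(ord("A"))        # str -> int -> str
round_trip_binary = bchr(ord(b"A"))    # bytes -> int -> bytes
print(round_trip_text, round_trip_binary)
```

Writing it as a free function rather than a method is what would let a compatibility module ship the same name for Python 2, which is the point made about six and future.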
Re: [Python-Dev] PEP 467: Minor API improvements to bytes, bytearray, and memoryview
On Jun 8, 2016 8:13 AM, "Paul Sokolovsky" wrote:
>
> Hello,
>
> On Wed, 8 Jun 2016 14:45:22 +0300
> Serhiy Storchaka wrote:
>
> []
>
> > > $ ./run-bench-tests bench/bytealloc*
> > > bench/bytealloc:
> > > 3.333s (+00.00%) bench/bytealloc-1-bytes_n.py
> > > 11.244s (+237.35%) bench/bytealloc-2-repeat.py
> >
> > If the performance of creating an immutable array of n zero bytes is
> > important in MicroPython, it is worth to optimize b"\0" * n.
>
> No matter how you optimize calloc + something, it's always slower than
> just calloc.

`bytes(n)` *is* calloc + something. It's a lookup of and call to a
global function. (Unless MicroPython optimizes away lookups for
builtins, in which case it can theoretically optimize b"\0".__mul__.)

On the other hand, b"\0" is a constant, and * is an operator lookup
that succeeds on the first argument (meaning, perhaps, a successful
branch prediction). As a constant, it is only created once, so there's
no intermediate object created.

AFAICT, the first requires optimizing global function lookups + calls,
and the second requires optimizing lookup and *successful* application
of __mul__ (versus failure + fallback to some __rmul__), and
repetitions of a particular `bytes` object (which can be interned and
checked against). That means there is room for either to win,
depending on the efforts of the implementers.

(However, `bytearray` has no syntax for literals (and therefore easy
constants), and is a more valid and, AFAIK, more practical concern.)
Re: [Python-Dev] PEP 467: Minor API improvements to bytes, bytearray, and memoryview
On Wed, Jun 08, 2016 at 10:04:08AM +0200, Victor Stinner wrote:

> It's common that users complain that Python core developers like
> breaking the compatibility at each release.

No more common than users complaining that Python features are badly
designed and crufty and should be fixed.

Whatever we do, we can't win. If we fix misfeatures, people complain.
If we don't fix them, people complain. Sometimes the same people,
depending on their specific needs. "Fix this, because it annoys me, but
don't fix that, because I'm used to it and it doesn't annoy me any
more."

*shrug*

Ultimately it comes down to a subjective feeling as to which is worse.
My own subjective feeling is that, in the long run, we'll be better off
fixing bytes than keeping it, and the longer we wait to fix it, the
harder it will be.

--
Steve
Re: [Python-Dev] PEP 467: Minor API improvements to bytes, bytearray, and memoryview
On Jun 08, 2016, at 02:01 AM, Martin Panter wrote:

>Bytes.byte() is a great idea. But what’s the point or use case of
>bytearray.byte(), a mutable array of one pre-defined byte?

I like Bytes.byte() too. I would guess you'd want the same method on
bytearray for duck typing APIs.

-Barry
Re: [Python-Dev] PEP 467: Minor API improvements to bytes, bytearray, and memoryview
Hello,

On Wed, 8 Jun 2016 14:45:22 +0300
Serhiy Storchaka wrote:

[]

> > $ ./run-bench-tests bench/bytealloc*
> > bench/bytealloc:
> > 3.333s (+00.00%) bench/bytealloc-1-bytes_n.py
> > 11.244s (+237.35%) bench/bytealloc-2-repeat.py
>
> If the performance of creating an immutable array of n zero bytes is
> important in MicroPython, it is worth to optimize b"\0" * n.

No matter how you optimize calloc + something, it's always slower than
just calloc.

> For now CPython is the main implementation of Python 3

Indeed, and it already has bytes(N). So, perhaps nothing should be done
about it except leaving it alone. Perhaps, more discussion should go
into whether there's need for .iterbytes() if there's [i:i+1] already.
(I personally skip that, as I find [i:i+1] perfectly ok, and while I
can't understand how people may be not ok with it up to wanting
something more, I leave such possibility).

> and bytes(n)
> is slower than b"\0" * n in CPython.

--
Best regards,
 Paul  mailto:pmis...@gmail.com
Re: [Python-Dev] PEP 467: Minor API improvements to bytes, bytearray, and memoryview
On 08.06.16 14:26, Paul Sokolovsky wrote:
> On Wed, 8 Jun 2016 14:05:19 +0300
> Serhiy Storchaka wrote:
>> On 08.06.16 13:37, Paul Sokolovsky wrote:
>>>> The obvious way to create the bytes object of length n is b'\0' * n.
>>>
>>> That's very inefficient: it requires allocating useless b'\0', then
>>> a generic function to repeat arbitrary memory block N times. If
>>> there's a talk of Python to not be laughed at for being SLOW, there
>>> would rather be efficient ways to deal with blocks of binary data.
>>
>> Do you have any evidences for this claim?
>
> Yes, it's written above, let me repeat it: bytes(n) is (can be)
> calloc(1, n) underlyingly, while b"\0" * n is a more complex algorithm.
>
>> $ ./python -m timeit -s 'n = 1' -- 'bytes(n)'
>> 100 loops, best of 3: 1.32 usec per loop
>> $ ./python -m timeit -s 'n = 1' -- 'b"\0" * n'
>> 100 loops, best of 3: 0.858 usec per loop
>
> I don't know how inefficient CPython's bytes(n) or how efficient
> repetition (maybe 1-byte repetitions are optimized into memset()?), but
> MicroPython (where bytes(n) is truly calloc(n)) gives expected results:
>
> $ ./run-bench-tests bench/bytealloc*
> bench/bytealloc:
> 3.333s (+00.00%) bench/bytealloc-1-bytes_n.py
> 11.244s (+237.35%) bench/bytealloc-2-repeat.py

If the performance of creating an immutable array of n zero bytes is
important in MicroPython, it is worth to optimize b"\0" * n.

For now CPython is the main implementation of Python 3 and bytes(n) is
slower than b"\0" * n in CPython.
Re: [Python-Dev] PEP 467: Minor API improvements to bytes, bytearray, and memoryview
Hello,

On Wed, 8 Jun 2016 14:05:19 +0300
Serhiy Storchaka wrote:

> On 08.06.16 13:37, Paul Sokolovsky wrote:
> >> The obvious way to create the bytes object of length n is b'\0' *
> >> n.
> >
> > That's very inefficient: it requires allocating useless b'\0', then
> > a generic function to repeat arbitrary memory block N times. If
> > there's a talk of Python to not be laughed at for being SLOW, there
> > would rather be efficient ways to deal with blocks of binary data.
>
> Do you have any evidences for this claim?

Yes, it's written above, let me repeat it: bytes(n) is (can be)
calloc(1, n) underlyingly, while b"\0" * n is a more complex algorithm.

>
> $ ./python -m timeit -s 'n = 1' -- 'bytes(n)'
> 100 loops, best of 3: 1.32 usec per loop
> $ ./python -m timeit -s 'n = 1' -- 'b"\0" * n'
> 100 loops, best of 3: 0.858 usec per loop

I don't know how inefficient CPython's bytes(n) or how efficient
repetition (maybe 1-byte repetitions are optimized into memset()?), but
MicroPython (where bytes(n) is truly calloc(n)) gives expected results:

$ ./run-bench-tests bench/bytealloc*
bench/bytealloc:
3.333s (+00.00%) bench/bytealloc-1-bytes_n.py
11.244s (+237.35%) bench/bytealloc-2-repeat.py

--
Best regards,
 Paul  mailto:pmis...@gmail.com
Re: [Python-Dev] PEP 467: Minor API improvements to bytes, bytearray, and memoryview
On 08.06.16 13:37, Paul Sokolovsky wrote:
>> The obvious way to create the bytes object of length n is b'\0' * n.
>
> That's very inefficient: it requires allocating useless b'\0', then
> a generic function to repeat arbitrary memory block N times. If
> there's a talk of Python to not be laughed at for being SLOW, there
> would rather be efficient ways to deal with blocks of binary data.

Do you have any evidences for this claim?

$ ./python -m timeit -s 'n = 1' -- 'bytes(n)'
100 loops, best of 3: 1.32 usec per loop
$ ./python -m timeit -s 'n = 1' -- 'b"\0" * n'
100 loops, best of 3: 0.858 usec per loop
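The command-line measurement quoted in this message can also be repeated from inside Python with the standard `timeit` module. The sketch below is an assumption about how one might set that up: the helper name `time_zero_fill` and the chosen sizes are illustrative, and the numbers it prints depend entirely on the interpreter and machine, so none are shown here.

```python
import timeit


def time_zero_fill(n, number=10000):
    """Time both spellings of "n zero bytes" and return the two totals.

    Returns (seconds for bytes(n), seconds for b"\\0" * n), each summed
    over `number` runs. Illustrative helper, not from the thread.
    """
    t_ctor = timeit.timeit('bytes(n)', globals={'n': n}, number=number)
    t_repeat = timeit.timeit('b"\\0" * n', globals={'n': n}, number=number)
    return t_ctor, t_repeat


for size in (1, 1000, 100000):
    ctor, repeat = time_zero_fill(size)
    print("n=%d  bytes(n): %.6fs  repetition: %.6fs" % (size, ctor, repeat))
```

Trying several sizes matters because the n = 1 case mostly measures call overhead rather than the cost of filling memory.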
Re: [Python-Dev] PEP 467: Minor API improvements to bytes, bytearray, and memoryview
Hello,

On Wed, 8 Jun 2016 11:53:06 +0300
Serhiy Storchaka wrote:

> On 08.06.16 11:04, Victor Stinner wrote:
> >> Currently, the ``bytes`` and ``bytearray`` constructors accept an
> >> integer argument and interpret it as meaning to create a
> >> zero-initialised sequence of the given size::
> >> (...)
> >> This PEP proposes to deprecate that behaviour in Python 3.6, and
> >> remove it entirely in Python 3.7.
> >
> > I'm opposed to this change (presented like that). Please stop
> > breaking the backward compatibility in minor versions.
>
> The argument for deprecating bytes(n) is that this has different
> meaning in Python 2,

That's artifact (as in: defect) of "bytes" (apparently) being a flat
alias of "str" in Python2, without trying to validate its arguments.
It would be sad if thinkos in Python2 implementation dictate how
Python3 should work. It's not too late to fix it in Python2 by issuing
a CVE along the lines of "Lack of argument validation in Python2
bytes() constructor may lead to insecure code."

> and when backport a code to Python 2 or write
> 2+3 compatible code there is a risk to make a mistake. This argument
> is not applicable to bytearray(n).
>
> > *If* you still want to deprecate bytes(n), you must introduce an
> > helper working on *all* Python versions. Obviously, the helper must
> > be available and work for Python 2.7. Maybe it can be the six
> > module. Maybe something else.
>
> The obvious way to create the bytes object of length n is b'\0' * n.

That's very inefficient: it requires allocating useless b'\0', then a
generic function to repeat arbitrary memory block N times. If there's a
talk of Python to not be laughed at for being SLOW, there would rather
be efficient ways to deal with blocks of binary data.

> It works in all Python versions starting from 2.6. I don't see the
> need in bytes(n) and bytes.zeros(n). There are no special methods for
> creating a list or a string of size n.

So, above, unless you specifically mean having bytearray.zero() and not
having bytes.zero(). But then the whole purpose of the presented PEP is
make API more, not less consistent. Having random gaps in bytes vs
bytearray API isn't going to help anyone.

--
Best regards,
 Paul  mailto:pmis...@gmail.com
Re: [Python-Dev] PEP 467: Minor API improvements to bytes, bytearray, and memoryview
On 08.06.16 02:03, Nick Coghlan wrote:
> That said, it occurs to me that there's a reasonably strong
> composability argument in favour of a view-based approach: a view will
> work with operator.itemgetter() and other sequence consuming APIs,
> while special methods won't. The "like-memoryview-but-not" view type
> could also take any bytes-like object as input, similar to memoryview
> itself.

Something like:

    class chunks:
        def __init__(self, seq, size):
            self._seq = seq
            self._size = size

        def __len__(self):
            return len(self._seq) // self._size

        def __getitem__(self, i):
            chunk = self._seq[i: i + self._size]
            if len(chunk) != self._size:
                raise IndexError
            return chunk

(but needs more checks and slices support).

It would be useful for general sequences too.
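As posted, the `chunks` sketch slices from offset `i` while `__len__` counts whole chunks, so the two only agree when `size` is 1. The variant below is one possible reading, assuming item `i` is meant to start at offset `i * size`. The class name `Chunks`, the default `size=1`, and the negative-index handling are additions for illustration and are not part of the original message; slice support is still left out.

```python
import operator


class Chunks:
    """Read-only view of a sequence as consecutive, non-overlapping pieces.

    Illustrative variant of the posted sketch; assumes item i covers
    seq[i * size:(i + 1) * size].
    """

    def __init__(self, seq, size=1):
        if size < 1:
            raise ValueError("size must be positive")
        self._seq = seq
        self._size = size

    def __len__(self):
        # Only complete chunks are counted; a short tail is ignored.
        return len(self._seq) // self._size

    def __getitem__(self, i):
        n = len(self)
        if i < 0:
            i += n  # support negative indices like built-in sequences
        if not 0 <= i < n:
            raise IndexError("chunk index out of range")
        start = i * self._size
        return self._seq[start:start + self._size]


pairs = list(Chunks(b'abcdef', 2))
first_and_last = operator.itemgetter(0, -1)(Chunks(b'abcde'))
print(pairs, first_and_last)
```

With `size=1` over a bytes object this behaves like the view discussed in the thread: indexing returns length-1 bytes objects, raises IndexError when out of range, and composes with `operator.itemgetter()`.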
Re: [Python-Dev] PEP 467: Minor API improvements to bytes, bytearray, and memoryview
On 08.06.16 11:04, Victor Stinner wrote:
>> Currently, the ``bytes`` and ``bytearray`` constructors accept an
>> integer argument and interpret it as meaning to create a
>> zero-initialised sequence of the given size::
>> (...)
>> This PEP proposes to deprecate that behaviour in Python 3.6, and
>> remove it entirely in Python 3.7.
>
> I'm opposed to this change (presented like that). Please stop
> breaking the backward compatibility in minor versions.

The argument for deprecating bytes(n) is that this has different
meaning in Python 2, and when backport a code to Python 2 or write 2+3
compatible code there is a risk to make a mistake. This argument is not
applicable to bytearray(n).

> *If* you still want to deprecate bytes(n), you must introduce an
> helper working on *all* Python versions. Obviously, the helper must
> be available and work for Python 2.7. Maybe it can be the six
> module. Maybe something else.

The obvious way to create the bytes object of length n is b'\0' * n. It
works in all Python versions starting from 2.6. I don't see the need in
bytes(n) and bytes.zeros(n). There are no special methods for creating
a list or a string of size n.
Re: [Python-Dev] PEP 467: Minor API improvements to bytes, bytearray, and memoryview
Hi,

> Currently, the ``bytes`` and ``bytearray`` constructors accept an integer
> argument and interpret it as meaning to create a zero-initialised sequence
> of the given size::
> (...)
> This PEP proposes to deprecate that behaviour in Python 3.6, and remove it
> entirely in Python 3.7.

I'm opposed to this change (presented like that). Please stop breaking
the backward compatibility in minor versions.

I have been porting Python 2 code to Python 3 for more than 2 years.
First, Python 3 only proposed to immediately drop Python 2 support
using the 2to3 tool. It simply doesn't work because you must port
incrementally all dependencies, so you must write code working with
Python 2 and Python 3 using the same code base. A few people tried to
duplicate repositories, projects, project name, etc. to have one
version for Python 2 and one version for Python 3, but IMHO it's even
worse. It's very difficult to handle dependencies using that.

It took a few years until six was widely used and that pip was popular
enough to be able to add six as a *dependency* (and not put an old copy
in the project).

Basically, you propose to introduce a backward incompatible change for
free (I fail to see the benefit of replacing bytes(n) with
bytes.zeros(n)) and without obvious way to write code compatible with
Python <= 3.6 and Python >= 3.7. Moreover, a single cycle is way too
short to port all code in the wild.

It's common that users complain that Python core developers like
breaking the compatibility at each release. Recently, I saw a list of
applications which need to be ported to Python 3.5, while they work
perfectly on Python 3.4.

*If* you still want to deprecate bytes(n), you must introduce an helper
working on *all* Python versions. Obviously, the helper must be
available and work for Python 2.7. Maybe it can be the six module.
Maybe something else.

In Perl 5, there is a nice "use 5.12;" pragma to explicitly ask to keep
the compatibility with Perl 5.12. This pragma allows to change the
language more easily, since you can port code file by file. I don't
know if it's technically possible in Python, maybe not for all kinds of
backward incompatible changes.

Victor
[Python-Dev] PEP 467: Minor API improvements to bytes, bytearray, and memoryview
Ethan Furman writes:

> * Deprecate passing single integer values to ``bytes`` and
>   ``bytearray``

Why? This is a slightly awkward idiom compared to .zeros (EITBI etc),
but your 32-bit clock will roll over before we can actually remove it.
There are a lot of languages that do this kind of initialization of
arrays based on ``count``. If you want to do something useful here,
add an optional argument (here in ridiculous :-) generality:

    bytes(count, tile=[0]) -> bytes(tile * count)

where ``tile`` is a Sequence of a type that is acceptable to bytes
anyway, or Sequence[int], which is treated as

    b"".join([bytes(chr(i)) for i in tile] * count)

Interpretation of ``count`` of course is bikesheddable, with at least
one alternative interpretation (length of result bytes, with last tile
truncated if necessary).

> * Add ``bytes.zeros`` and ``bytearray.zeros`` alternative constructors

This is an API break if you take the deprecation as a mandate (which
eventual removal does indicate). And backward compatibility for
clients of the bytes API means that we violate TOOWTDI indefinitely,
on a constructor of quite specialized utility. Yuck.

-1 on both.

Barry Warsaw writes later in thread:

> We can't change bytes.__getitem__ but we can add another method
> that returns single byte objects? I think it's still a bit of a
> pain to extract single bytes even with .iterbytes().

+1 ISTM that more than the other changes, this is the most important
one.

Steve
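The ``tile`` idea in this message, including the alternative reading of ``count``, can be sketched as two ordinary functions. Both names below are hypothetical: the ``bytes(count, tile=...)`` signature does not exist, and this is just one way the two interpretations might behave.

```python
def tiled_bytes(count, tile=(0,)):
    """Return `tile` repeated `count` times as a bytes object.

    Hypothetical stand-in for the suggested bytes(count, tile=[0]).
    `tile` may be a bytes-like object or a sequence of ints in range(256).
    """
    return bytes(tile) * count


def tiled_bytes_to_length(length, tile=(0,)):
    """Alternative reading: exactly `length` bytes, last tile truncated."""
    unit = bytes(tile)
    if not unit:
        raise ValueError("tile must not be empty")
    repeats = -(-length // len(unit))  # ceiling division
    return (unit * repeats)[:length]


zeros = tiled_bytes(3)
pattern = tiled_bytes(2, [1, 2])
truncated = tiled_bytes_to_length(5, b'ab')
print(zeros, pattern, truncated)
```

The default tile of a single zero keeps the existing zero-fill behaviour as a special case, which seems to be the point of the suggestion.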
Re: [Python-Dev] PEP 467: Minor API improvements to bytes, bytearray, and memoryview
On 07.06.16 23:28, Ethan Furman wrote:
> Minor changes: updated version numbers, add punctuation.
>
> The current text seems to take into account Guido's last comments.
>
> Thoughts before asking for acceptance?
>
> PEP: 467
> Title: Minor API improvements for binary sequences
> Version: $Revision$
> Last-Modified: $Date$
> Author: Nick Coghlan
> Status: Draft
> Type: Standards Track
> Content-Type: text/x-rst
> Created: 2014-03-30
> Python-Version: 3.5
> Post-History: 2014-03-30 2014-08-15 2014-08-16
>
> Abstract
> ========
>
> During the initial development of the Python 3 language specification,
> the core ``bytes`` type for arbitrary binary data started as the
> mutable type that is now referred to as ``bytearray``. Other aspects
> of operating in the binary domain in Python have also evolved over the
> course of the Python 3 series.
>
> This PEP proposes four small adjustments to the APIs of the ``bytes``,
> ``bytearray`` and ``memoryview`` types to make it easier to operate
> entirely in the binary domain:
>
> * Deprecate passing single integer values to ``bytes`` and
>   ``bytearray``
> * Add ``bytes.zeros`` and ``bytearray.zeros`` alternative constructors
> * Add ``bytes.byte`` and ``bytearray.byte`` alternative constructors
> * Add ``bytes.iterbytes``, ``bytearray.iterbytes`` and
>   ``memoryview.iterbytes`` alternative iterators

"Byte" is an alias to "octet" (8-bit integer) in modern terminology.
Iterating bytes and bytearray already produce bytes. Wouldn't this be
confusing? Maybe name these methods "iterbytestrings", since they add
str-like behavior?
Re: [Python-Dev] PEP 467: Minor API improvements to bytes, bytearray, and memoryview
On 7 June 2016 at 21:56, Nick Coghlan wrote:
> On 7 June 2016 at 14:33, Paul Sokolovsky wrote:
>> Ethan Furman wrote:
>>> Deprecation of current "zero-initialised sequence" behaviour
>>>
>>>
>>> Currently, the ``bytes`` and ``bytearray`` constructors accept an
>>> integer argument and interpret it as meaning to create a
>>> zero-initialised sequence of the given size::
>>>
>>>     >>> bytes(3)
>>>     b'\x00\x00\x00'
>>>     >>> bytearray(3)
>>>     bytearray(b'\x00\x00\x00')
>>>
>>> This PEP proposes to deprecate that behaviour in Python 3.6, and
>>> remove it entirely in Python 3.7.
>>
>> Why the desire to break applications of thousands and thousands of
>> people?
>
> Same argument as any deprecation: to make existing and future defects
> easier to find or easier to debug.
>
> That said, this is the main part I was referring to in the other
> thread when I mentioned some of the constructor changes were
> potentially controversial and probably not worth the hassle - it's the
> only one with the potential to break currently working code, while the
> others are just a matter of choosing suitable names.

An argument against deprecating bytearray(n) in particular is that
this is supported in Python 2. I think I have (ab)used this fact to
work around the problem with bytes(n) in Python 2 & 3 compatible code.
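The message does not spell out the workaround, so the following is only a guess at what such 2 & 3 compatible code might look like: it goes through `bytearray(n)`, which zero-fills on both major versions, and converts the result to the native immutable byte string. The helper name `zero_bytes` is hypothetical.

```python
def zero_bytes(n):
    """Return n zero bytes as the native immutable byte string type.

    Hypothetical helper. bytearray(n) creates n zero bytes on both
    Python 2 and Python 3, whereas on Python 2 bytes is an alias of str,
    so bytes(3) there gives the text '3' rather than three zero bytes.
    """
    return bytes(bytearray(n))


three = zero_bytes(3)
print(repr(three))
```

The alternative raised elsewhere in the thread, `b'\0' * n`, gives the same result without the intermediate bytearray.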
Re: [Python-Dev] PEP 467: Minor API improvements to bytes, bytearray, and memoryview
On Wed, Jun 08, 2016 at 02:17:12AM +0300, Paul Sokolovsky wrote:
> Hello,
>
> On Tue, 07 Jun 2016 15:46:00 -0700
> Ethan Furman wrote:
>
> > On 06/07/2016 02:33 PM, Paul Sokolovsky wrote:
> >
> > >> This PEP proposes to deprecate that behaviour in Python 3.6, and
> > >> remove it entirely in Python 3.7.
> > >
> > > Why the desire to break applications of thousands and thousands of
> > > people?

I'm not so sure that *thousands* of people are relying on this
behaviour, but your point is taken that it is a backwards-incompatible
change.

> > > Besides, bytes(3) behavior is very logical. Everyone who
> > > knows what malloc(3) does also knows what bytes(3) does.

Most Python coders are not C coders. Knowing C is not and should not be
a pre-requisite for using Python.

> > > Who
> > > doesn't, can learn, and eventually be grateful that learning Python
> > > actually helped them to learn other language as well.

I really don't think that learning Python will help with C.

> > Two reasons:
> >
> > 1) bytes are immutable, so creating a 3-byte 0x00 string seems
> > ridiculous;
>
> There's nothing ridiculous in sending N zero bytes over network,
> writing to a file, transferring to a hardware device.

True, but there is a good way of writing N identical bytes, not limited
to nulls, using the replication operator:

    py> b'\xff'*10
    b'\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff'

which is more useful than `bytes(10)` since that can only produce
zeroes.

> That however
> raises questions e.g. how to (efficiently) fill a (subsection) of
> bytearray with something but a 0

Slicing.

    py> b = bytearray(10)
    py> b[4:4] = b'\xff'*4
    py> b
    bytearray(b'\x00\x00\x00\x00\xff\xff\xff\xff\x00\x00\x00\x00\x00\x00')

--
Steve
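One detail worth noting about slice assignment on a bytearray: an empty target slice such as `b[4:4]` inserts the new bytes and grows the buffer, while a target slice as wide as the replacement overwrites in place. A short sketch of both, using only standard bytearray behaviour (the variable names are illustrative):

```python
# Empty target slice: the four bytes are inserted, so the length grows.
buf = bytearray(10)
buf[4:4] = b'\xff' * 4
inserted = bytes(buf)

# Target slice of the same width: the bytes are overwritten in place.
buf = bytearray(10)
buf[4:8] = b'\xff' * 4
overwritten = bytes(buf)

print(len(inserted), inserted)
print(len(overwritten), overwritten)
```

For filling a subsection of a fixed-size buffer, the second form is the one that keeps the length unchanged.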
Re: [Python-Dev] PEP 467: Minor API improvements to bytes, bytearray, and memoryview
Hello,

On Tue, 07 Jun 2016 15:46:00 -0700
Ethan Furman wrote:

> On 06/07/2016 02:33 PM, Paul Sokolovsky wrote:
>
> >> This PEP proposes to deprecate that behaviour in Python 3.6, and
> >> remove it entirely in Python 3.7.
> >
> > Why the desire to break applications of thousands and thousands of
> > people? Besides, bytes(3) behavior is very logical. Everyone who
> > knows what malloc(3) does also knows what bytes(3) does. Who
> > doesn't, can learn, and eventually be grateful that learning Python
> > actually helped them to learn other language as well.
>
> Two reasons:
>
> 1) bytes are immutable, so creating a 3-byte 0x00 string seems
> ridiculous;

There's nothing ridiculous in sending N zero bytes over network,
writing to a file, transferring to a hardware device. That however
raises questions e.g. how to (efficiently) fill a (subsection) of
bytearray with something but a 0, and how to apply all that
consistently to array.array, but I don't even want to bring it, because
the answer will be "we need first to deal with subjects of this PEP".

>
> 2) Python is not C, and the vagaries of malloc are not relevant to
> Python.

Yes, but Python has always had some traits nicely similar to C, (%
formatting, os.read/write at the fingertips, this bytes/bytearray
constructor, etc.), and that certainly catered for sizable share of its
audience. It's nice that nowadays Python is truly multi-paradigm and
taught to pre-schools and used by folks who know how to analyze data
much better than how to allocate memory to hold that data in the first
place. But hopefully people who used Python since 1.x as a nice
system-level integration language, concise without much ambiguity
(definitely less than other languages, maybe COBOL excluded) shouldn't
suffer and have their stuff broken.

>
> However, there is little point in breaking working code, so a
> deprecation without removal is fine by me.

Thanks.

>
> --
> ~Ethan~

--
Best regards,
 Paul  mailto:pmis...@gmail.com
Re: [Python-Dev] PEP 467: Minor API improvements to bytes, bytearray, and memoryview
On 7 June 2016 at 15:22, Koos Zevenhoven wrote:
> On Wed, Jun 8, 2016 at 12:57 AM, Barry Warsaw wrote:
>> On Jun 07, 2016, at 09:40 PM, Brett Cannon wrote:
>>
>>>On Tue, 7 Jun 2016 at 14:38 Paul Sokolovsky wrote:
>>>> What's wrong with b[i:i+1] ?
>>>It always succeeds while indexing can trigger an IndexError.
>>
>> Right. You want a method with the semantics of __getitem__() but that
>> returns the desired type.
>>
>
> And if this is called __getitem__ (with slices delegated to
> bytes.__getitem__) and implemented in a class, one has a view. Maybe
> I'm missing something, but I fail to understand what makes this
> significantly more problematic than an iterator. Ok, I guess we might
> also need __len__.

Right, it's the fact that a view is a much broader API than we need,
since most of the operations on the base type are already fine. The
two alternate operations that people are interested in are:

- like indexing, but producing bytes instead of ints
- like iteration, but producing bytes instead of ints

That said, it occurs to me that there's a reasonably strong
composability argument in favour of a view-based approach: a view will
work with operator.itemgetter() and other sequence consuming APIs,
while special methods won't. The "like-memoryview-but-not" view type
could also take any bytes-like object as input, similar to memoryview
itself.

Cheers,
Nick.

P.S. I'm starting to remember why I stopped working on this - I'm
genuinely unsure of the right way forward, so I wasn't prepared to
advocate strongly for the particular approach in the PEP :)

--
Nick Coghlan | ncogh...@gmail.com | Brisbane, Australia
Re: [Python-Dev] PEP 467: Minor API improvements to bytes, bytearray, and memoryview
On 06/07/2016 02:33 PM, Paul Sokolovsky wrote:

>> This PEP proposes to deprecate that behaviour in Python 3.6, and remove
>> it entirely in Python 3.7.
>
> Why the desire to break applications of thousands and thousands of
> people? Besides, bytes(3) behavior is very logical. Everyone who knows
> what malloc(3) does also knows what bytes(3) does. Who doesn't, can
> learn, and eventually be grateful that learning Python actually helped
> them to learn other language as well.

Two reasons:

1) bytes are immutable, so creating a 3-byte 0x00 string seems ridiculous;
2) Python is not C, and the vagaries of malloc are not relevant to Python.

However, there is little point in breaking working code, so a deprecation
without removal is fine by me.

--
~Ethan~
Re: [Python-Dev] PEP 467: Minor API improvements to bytes, bytearray, and memoryview
On Wed, Jun 8, 2016 at 12:57 AM, Barry Warsaw wrote:
> On Jun 07, 2016, at 09:40 PM, Brett Cannon wrote:
>
>>On Tue, 7 Jun 2016 at 14:38 Paul Sokolovsky wrote:
>>> What's wrong with b[i:i+1] ?
>>It always succeeds while indexing can trigger an IndexError.
>
> Right. You want a method with the semantics of __getitem__() but that
> returns the desired type.

And if this is called __getitem__ (with slices delegated to
bytes.__getitem__) and implemented in a class, one has a view. Maybe
I'm missing something, but I fail to understand what makes this
significantly more problematic than an iterator. Ok, I guess we might
also need __len__.

-- Koos
Re: [Python-Dev] PEP 467: Minor API improvements to bytes, bytearray, and memoryview
On Jun 07, 2016, at 09:40 PM, Brett Cannon wrote:

>On Tue, 7 Jun 2016 at 14:38 Paul Sokolovsky wrote:
>> What's wrong with b[i:i+1] ?
>It always succeeds while indexing can trigger an IndexError.

Right. You want a method with the semantics of __getitem__() but that
returns the desired type.

-Barry
Re: [Python-Dev] PEP 467: Minor API improvements to bytes, bytearray, and memoryview
Ignore that message. I hit send before brain and hands were fully in sync.

> -Original Message-
> From: tritium-l...@sdamon.com [mailto:tritium-l...@sdamon.com]
> Sent: Tuesday, June 7, 2016 5:51 PM
> To: 'Nick Coghlan' <ncogh...@gmail.com>; 'Barry Warsaw' <ba...@python.org>
> Cc: python-dev@python.org
> Subject: RE: [Python-Dev] PEP 467: Minor API improvements to bytes,
> bytearray, and memoryview
Re: [Python-Dev] PEP 467: Minor API improvements to bytes, bytearray, and memoryview
On 7 June 2016 at 14:33, Paul Sokolovsky wrote:
> Hello,
>
> On Tue, 07 Jun 2016 13:28:13 -0700
> Ethan Furman wrote:
>
>> Minor changes: updated version numbers, add punctuation.
>>
>> The current text seems to take into account Guido's last comments.
>>
>> Thoughts before asking for acceptance?
>>
> []
>
>> Deprecation of current "zero-initialised sequence" behaviour
>>
>> Currently, the ``bytes`` and ``bytearray`` constructors accept an
>> integer argument and interpret it as meaning to create a
>> zero-initialised sequence of the given size::
>>
>>     >>> bytes(3)
>>     b'\x00\x00\x00'
>>     >>> bytearray(3)
>>     bytearray(b'\x00\x00\x00')
>>
>> This PEP proposes to deprecate that behaviour in Python 3.6, and
>> remove it entirely in Python 3.7.
>
> Why the desire to break applications of thousands and thousands of
> people?

Same argument as any deprecation: to make existing and future defects
easier to find or easier to debug.

That said, this is the main part I was referring to in the other
thread when I mentioned some of the constructor changes were
potentially controversial and probably not worth the hassle - it's the
only one with the potential to break currently working code, while the
others are just a matter of choosing suitable names.

Cheers,
Nick.

--
Nick Coghlan | ncogh...@gmail.com | Brisbane, Australia
Re: [Python-Dev] PEP 467: Minor API improvements to bytes, bytearray, and memoryview
> -Original Message-
> From: Python-Dev [mailto:python-dev-bounces+tritium-list=sdamon@python.org]
> On Behalf Of Nick Coghlan
> Sent: Tuesday, June 7, 2016 5:40 PM
> To: Barry Warsaw <ba...@python.org>
> Cc: python-dev@python.org
> Subject: Re: [Python-Dev] PEP 467: Minor API improvements to bytes,
> bytearray, and memoryview
>
> On 7 June 2016 at 14:31, Barry Warsaw <ba...@python.org> wrote:
> > On Jun 07, 2016, at 01:28 PM, Ethan Furman wrote:
> >
> >>* Add ``bytes.iterbytes``, ``bytearray.iterbytes`` and
> >>  ``memoryview.iterbytes`` alternative iterators
> >
> > +1 but I want to go just a little farther.
> >
> > We can't change bytes.__getitem__ but we can add another method that
> > returns single byte objects? I think it's still a bit of a pain to
> > extract single bytes even with .iterbytes().
> >
> > Maybe .iterbytes can take a single index argument (blech) or add a
> > method like .byte_at(i). I'll let you bikeshed on the name.
>
> Perhaps:
>
>     data.getbyte(i)
>     data.iterbytes()

data.getbyte(index_or_slice_object) ?

While it might not be... ideal... to create a sliceable live view object, we
can have a method that accepts a slice, even if we have to create it
manually (or at least make it convenient for those who wish to wrap a bytes
object in their own type and blindly pass the first-non-self arg of a custom
__getitem__ to the method).

> The rationale for "Why not a live view?" is that an iterator is simple
> to define and implement, while we know from experience with memoryview
> and the various dict views that live views are a minefield for folks
> defining new container types. Since this PEP would in some sense
> change what it means to implement a full "bytes-like object", it's
> worth keeping implementation complexity in mind.
>
> Cheers,
> Nick.
>
> --
> Nick Coghlan | ncogh...@gmail.com | Brisbane, Australia
Re: [Python-Dev] PEP 467: Minor API improvements to bytes, bytearray, and memoryview
On Tue, 7 Jun 2016 at 14:38 Paul Sokolovsky wrote:

> Hello,
>
> On Tue, 7 Jun 2016 17:31:19 -0400
> Barry Warsaw wrote:
>
> > On Jun 07, 2016, at 01:28 PM, Ethan Furman wrote:
> >
> > >* Add ``bytes.iterbytes``, ``bytearray.iterbytes`` and
> > >  ``memoryview.iterbytes`` alternative iterators
> >
> > +1 but I want to go just a little farther.
> >
> > We can't change bytes.__getitem__ but we can add another method that
> > returns single byte objects? I think it's still a bit of a pain to
> > extract single bytes even with .iterbytes().
> >
> > Maybe .iterbytes can take a single index argument (blech) or add a
> > method like .byte_at(i). I'll let you bikeshed on the name.
>
> What's wrong with b[i:i+1] ?

It always succeeds while indexing can trigger an IndexError.
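To illustrate the point made in the reply above, here is a short sketch using only existing Python 3 behaviour. The variable names are chosen for illustration; nothing here depends on the PEP.

```python
data = b"abc"

# Indexing yields an int and raises IndexError when out of range
assert data[0] == 97
try:
    data[10]
except IndexError:
    index_failed = True
else:
    index_failed = False

# Slicing yields bytes and quietly returns b'' when out of range
in_range = data[0:1]
out_of_range = data[10:11]
```

So `b[i:i+1]` gives the desired type but loses the bounds check, which is why a dedicated accessor was being discussed.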
Re: [Python-Dev] PEP 467: Minor API improvements to bytes, bytearray, and memoryview
On 7 June 2016 at 14:31, Barry Warsaw wrote:
> On Jun 07, 2016, at 01:28 PM, Ethan Furman wrote:
>
>>* Add ``bytes.iterbytes``, ``bytearray.iterbytes`` and
>>  ``memoryview.iterbytes`` alternative iterators
>
> +1 but I want to go just a little farther.
>
> We can't change bytes.__getitem__ but we can add another method that returns
> single byte objects? I think it's still a bit of a pain to extract single
> bytes even with .iterbytes().
>
> Maybe .iterbytes can take a single index argument (blech) or add a method like
> .byte_at(i). I'll let you bikeshed on the name.

Perhaps:

    data.getbyte(i)
    data.iterbytes()

The rationale for "Why not a live view?" is that an iterator is simple
to define and implement, while we know from experience with memoryview
and the various dict views that live views are a minefield for folks
defining new container types. Since this PEP would in some sense
change what it means to implement a full "bytes-like object", it's
worth keeping implementation complexity in mind.

Cheers,
Nick.

--
Nick Coghlan | ncogh...@gmail.com | Brisbane, Australia
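The two spellings suggested above do not exist as methods today, so the following is only a sketch of the intended semantics written as free functions. The names `getbyte` and `iterbytes` are borrowed from the message; their exact signatures are an assumption, and this version handles integer indices only.

```python
def getbyte(data, index):
    """Hypothetical helper: like data[index], but returns length-1 bytes."""
    # data[index] raises IndexError for an out-of-range index,
    # so the bounds check of normal indexing is preserved
    return bytes([data[index]])


def iterbytes(data):
    """Hypothetical helper: iterate over data as length-1 bytes objects."""
    for value in data:
        yield bytes([value])


second = getbyte(b"abc", 1)
pieces = list(iterbytes(bytearray(b"ab")))
```

Both helpers are simple to write, which matches the "simple to define and implement" rationale given for preferring them over a live view.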
Re: [Python-Dev] PEP 467: Minor API improvements to bytes, bytearray, and memoryview
Hello,

On Tue, 7 Jun 2016 17:31:19 -0400
Barry Warsaw wrote:

> On Jun 07, 2016, at 01:28 PM, Ethan Furman wrote:
>
> >* Add ``bytes.iterbytes``, ``bytearray.iterbytes`` and
> >  ``memoryview.iterbytes`` alternative iterators
>
> +1 but I want to go just a little farther.
>
> We can't change bytes.__getitem__ but we can add another method that
> returns single byte objects? I think it's still a bit of a pain to
> extract single bytes even with .iterbytes().
>
> Maybe .iterbytes can take a single index argument (blech) or add a
> method like .byte_at(i). I'll let you bikeshed on the name.

What's wrong with b[i:i+1] ?

--
Best regards,
 Paul                          mailto:pmis...@gmail.com
Re: [Python-Dev] PEP 467: Minor API improvements to bytes, bytearray, and memoryview
On Tue, Jun 7, 2016 at 11:28 PM, Ethan Furman wrote:
>
> Minor changes: updated version numbers, add punctuation.
>
> The current text seems to take into account Guido's last comments.
>
> Thoughts before asking for acceptance?
>
> PEP: 467
> Title: Minor API improvements for binary sequences
> Version: $Revision$
> Last-Modified: $Date$
> Author: Nick Coghlan
> Status: Draft
> Type: Standards Track
> Content-Type: text/x-rst
> Created: 2014-03-30
> Python-Version: 3.5
> Post-History: 2014-03-30 2014-08-15 2014-08-16
>
>
> Abstract
> ========
>
> During the initial development of the Python 3 language specification, the
> core ``bytes`` type for arbitrary binary data started as the mutable type
> that is now referred to as ``bytearray``. Other aspects of operating in the
> binary domain in Python have also evolved over the course of the Python 3
> series.
>
> This PEP proposes four small adjustments to the APIs of the ``bytes``,
> ``bytearray`` and ``memoryview`` types to make it easier to operate entirely
> in the binary domain:
>
> * Deprecate passing single integer values to ``bytes`` and ``bytearray``
> * Add ``bytes.zeros`` and ``bytearray.zeros`` alternative constructors
> * Add ``bytes.byte`` and ``bytearray.byte`` alternative constructors
> * Add ``bytes.iterbytes``, ``bytearray.iterbytes`` and
>   ``memoryview.iterbytes`` alternative iterators
>

Why not bytes.viewbytes (or whatever name) so that one could also
subscript it? And if it were a property, one could perhaps
conveniently get the n'th byte:

    b'abcde'.viewbytes[n]   # compared to b'abcde'[n:n+1]

Also, would it not be more clear to call the int -> bytes method
something like bytes.fromint or bytes.fromord and introduce the same
thing on str? And perhaps allow multiple arguments to create a
str/bytes of length > 1. I guess this may violate TOOWTDI, but anyway,
just a thought.

-- Koos
Re: [Python-Dev] PEP 467: Minor API improvements to bytes, bytearray, and memoryview
On Jun 07, 2016, at 01:28 PM, Ethan Furman wrote:

>* Add ``bytes.iterbytes``, ``bytearray.iterbytes`` and
>  ``memoryview.iterbytes`` alternative iterators

+1 but I want to go just a little farther.

We can't change bytes.__getitem__ but we can add another method that returns
single byte objects? I think it's still a bit of a pain to extract single
bytes even with .iterbytes().

Maybe .iterbytes can take a single index argument (blech) or add a method like
.byte_at(i). I'll let you bikeshed on the name.

Cheers,
-Barry
[Python-Dev] PEP 467: Minor API improvements to bytes, bytearray, and memoryview
Minor changes: updated version numbers, add punctuation.

The current text seems to take into account Guido's last comments.

Thoughts before asking for acceptance?

PEP: 467
Title: Minor API improvements for binary sequences
Version: $Revision$
Last-Modified: $Date$
Author: Nick Coghlan
Status: Draft
Type: Standards Track
Content-Type: text/x-rst
Created: 2014-03-30
Python-Version: 3.5
Post-History: 2014-03-30 2014-08-15 2014-08-16


Abstract
========

During the initial development of the Python 3 language specification, the
core ``bytes`` type for arbitrary binary data started as the mutable type
that is now referred to as ``bytearray``. Other aspects of operating in the
binary domain in Python have also evolved over the course of the Python 3
series.

This PEP proposes four small adjustments to the APIs of the ``bytes``,
``bytearray`` and ``memoryview`` types to make it easier to operate entirely
in the binary domain:

* Deprecate passing single integer values to ``bytes`` and ``bytearray``
* Add ``bytes.zeros`` and ``bytearray.zeros`` alternative constructors
* Add ``bytes.byte`` and ``bytearray.byte`` alternative constructors
* Add ``bytes.iterbytes``, ``bytearray.iterbytes`` and
  ``memoryview.iterbytes`` alternative iterators


Proposals
=========

Deprecation of current "zero-initialised sequence" behaviour
------------------------------------------------------------

Currently, the ``bytes`` and ``bytearray`` constructors accept an integer
argument and interpret it as meaning to create a zero-initialised sequence
of the given size::

    >>> bytes(3)
    b'\x00\x00\x00'
    >>> bytearray(3)
    bytearray(b'\x00\x00\x00')

This PEP proposes to deprecate that behaviour in Python 3.6, and remove it
entirely in Python 3.7.

No other changes are proposed to the existing constructors.
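As a small illustration of the behaviour this section describes, the sketch below contrasts the integer form of the constructor with the iterable form, using only what Python 3 does today. The variable names are illustrative.

```python
# An integer argument is a length: three NUL bytes
zeroed = bytes(3)

# An iterable of integers is the content: one byte with value 3
single = bytes([3])

# bytearray follows the same rule, but the result is mutable
buf = bytearray(3)
buf[0] = 0xFF
```

The two spellings differ by a pair of brackets yet produce quite different results, which is the readability concern behind the proposed deprecation.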
Addition of explicit "zero-initialised sequence" constructors
-------------------------------------------------------------

To replace the deprecated behaviour, this PEP proposes the addition of an
explicit ``zeros`` alternative constructor as a class method on both
``bytes`` and ``bytearray``::

    >>> bytes.zeros(3)
    b'\x00\x00\x00'
    >>> bytearray.zeros(3)
    bytearray(b'\x00\x00\x00')

It will behave just as the current constructors behave when passed a single
integer.

The specific choice of ``zeros`` as the alternative constructor name is taken
from the corresponding initialisation function in NumPy (although, as these
are 1-dimensional sequence types rather than N-dimensional matrices, the
constructors take a length as input rather than a shape tuple).


Addition of explicit "single byte" constructors
-----------------------------------------------

As binary counterparts to the text ``chr`` function, this PEP proposes the
addition of an explicit ``byte`` alternative constructor as a class method
on both ``bytes`` and ``bytearray``::

    >>> bytes.byte(3)
    b'\x03'
    >>> bytearray.byte(3)
    bytearray(b'\x03')

These methods will only accept integers in the range 0 to 255 (inclusive)::

    >>> bytes.byte(512)
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    ValueError: bytes must be in range(0, 256)

    >>> bytes.byte(1.0)
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    TypeError: 'float' object cannot be interpreted as an integer

The documentation of the ``ord`` builtin will be updated to explicitly note
that ``bytes.byte`` is the inverse operation for binary data, while ``chr``
is the inverse operation for text data.

Behaviourally, ``bytes.byte(x)`` will be equivalent to the current
``bytes([x])`` (and similarly for ``bytearray``). The new spelling is
expected to be easier to discover and easier to read (especially when used
in conjunction with indexing operations on binary sequence types).

As a separate method, the new spelling will also work better with higher
order functions like ``map``.
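Since ``bytes.byte`` is only proposed here, the following sketch uses a stand-in function built on the ``bytes([x])`` equivalence stated above. The function name `byte` is an assumption made for illustration and is not an existing API.

```python
def byte(value):
    """Hypothetical stand-in for the proposed bytes.byte()."""
    # bytes([value]) already enforces the 0..255 range and the int type
    return bytes([value])


# Round trip with ord(), mirroring chr(ord("A")) for text
round_trip = byte(ord(b"A"))

# A plain callable composes with higher order functions like map
mapped = list(map(byte, b"abc"))

try:
    byte(512)
except ValueError:
    rejected_out_of_range = True
else:
    rejected_out_of_range = False
```

The `map` line shows the higher-order-function argument: `bytes([x])` needs a lambda to be used this way, while a named constructor does not.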
Addition of optimised iterator methods that produce ``bytes`` objects
---------------------------------------------------------------------

This PEP proposes that ``bytes``, ``bytearray`` and ``memoryview`` gain an
optimised ``iterbytes`` method that produces length 1 ``bytes`` objects
rather than integers::

    for x in data.iterbytes():
        # x is a length 1 ``bytes`` object, rather than an integer

The method can be used with arbitrary buffer exporting objects by wrapping
them in a ``memoryview`` instance first::

    for x in memoryview(data).iterbytes():
        # x is a length 1 ``bytes`` object, rather than an integer

For ``memoryview``, the semantics of ``iterbytes()`` are defined such that::

    memview.tobytes() == b''.join(memview.iterbytes())

This allows the raw bytes of the memory view to be iterated over without
needing to make a copy, regardless of the defined shape and format.

The main advantage this method offers over the ``map(bytes.byte, data)``
approach is that it
Re: [Python-Dev] PEP 467: Minor API improvements for bytes bytearray
On Aug 17, 2014, at 09:39 PM, Antoine Pitrou wrote: need for a special case for a single byte. We already have a perfectly good spelling: NUL = bytes([0]) That is actually a very cumbersome spelling. Why should I first create a one-element list in order to create a one-byte bytes object? I feel the same way every time I have to write `set(['foo'])`. -Barry ___ Python-Dev mailing list Python-Dev@python.org https://mail.python.org/mailman/listinfo/python-dev Unsubscribe: https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] PEP 467: Minor API improvements for bytes bytearray
On Aug 14, 2014, at 10:50 PM, Nick Coghlan ncogh...@gmail.com wrote:

> Key points in the proposal:
>
> * deprecate passing integers to bytes() and bytearray()

I'm opposed to removing this part of the API. It has proven useful
and the alternative isn't very nice. Declaring the size of fixed
length arrays is not a new concept and is widely adopted in other
languages. One principal use case for the bytearray is creating
and manipulating binary data. Initializing to zero is common operation
and should remain part of the core API (consider why we now have
list.copy() even though copying with a slice remains possible and
efficient).

I and my clients have taken advantage of this feature and it reads nicely.
The proposed deprecation would break our code and not actually make
anything better.

Another thought is that the core devs should be very reluctant to deprecate
anything we don't have to while the 2 to 3 transition is still in progress.
Every new deprecation of APIs that existed in Python 2.7 just adds another
obstacle to converting code. Individually, the differences are trivial.
Collectively, they present a good reason to never migrate code to Python 3.


Raymond
Re: [Python-Dev] PEP 467: Minor API improvements for bytes bytearray
On 17 August 2014 18:13, Raymond Hettinger raymond.hettin...@gmail.com wrote:
> On Aug 14, 2014, at 10:50 PM, Nick Coghlan ncogh...@gmail.com wrote:
>
>> Key points in the proposal:
>>
>> * deprecate passing integers to bytes() and bytearray()
>
> I'm opposed to removing this part of the API. It has proven useful
> and the alternative isn't very nice. Declaring the size of fixed
> length arrays is not a new concept and is widely adopted in other
> languages. One principal use case for the bytearray is creating
> and manipulating binary data. Initializing to zero is common operation
> and should remain part of the core API (consider why we now have
> list.copy() even though copying with a slice remains possible and
> efficient).

That's why the PEP proposes adding a zeros method, based on the name
of the corresponding NumPy construct.

The status quo has some very ugly failure modes when an integer is
passed unexpectedly, and tries to create a large buffer, rather than
throwing a type error.

> I and my clients have taken advantage of this feature and it reads nicely.

If I see bytearray(10) there is nothing there that suggests this
creates an array of length 10 and initialises it to zero to me. I'd
be more inclined to guess it would be equivalent to bytearray([10]).

bytearray.zeros(10), on the other hand, is relatively clear,
independently of user expectations.

> The proposed deprecation would break our code and not actually make
> anything better.
>
> Another thought is that the core devs should be very reluctant to deprecate
> anything we don't have to while the 2 to 3 transition is still in progress.
> Every new deprecation of APIs that existed in Python 2.7 just adds another
> obstacle to converting code. Individually, the differences are trivial.
> Collectively, they present a good reason to never migrate code to Python 3.

This is actually one of the inconsistencies between the Python 2 and 3
binary APIs:

    Python 2.7.5 (default, Jun 25 2014, 10:19:55)
    [GCC 4.8.2 20131212 (Red Hat 4.8.2-7)] on linux2
    Type "help", "copyright", "credits" or "license" for more information.
    >>> bytes(10)
    '10'
    >>> bytearray(10)
    bytearray(b'\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00')

Users wanting well-behaved binary sequences in Python 2.7 would be
well advised to use the "future" module to get a full backport of the
actual Python 3 bytes type, rather than the approximation that is the
8-bit str in Python 2. And once they do that, they'll be able to track
the evolution of the Python 3 binary sequence behaviour without any
further trouble.

That said, I don't really mind how long the deprecation cycle is. I'd
be fine with fully supporting both in 3.5 (2015), deprecating the main
constructor in favour of the explicit zeros() method in 3.6 (2017) and
dropping the legacy behaviour in 3.7 (2018).

Regards,
Nick.

--
Nick Coghlan | ncogh...@gmail.com | Brisbane, Australia
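A small sketch of the "integer passed unexpectedly" failure mode mentioned in the message above, using current Python 3 behaviour. The `checksum` function is a made-up example, not taken from the thread, and the values are kept tiny on purpose.

```python
def checksum(payload):
    """Illustrative helper that expects an iterable of byte values."""
    # bytes() accepts an iterable of ints, but it also accepts a bare int
    return sum(bytes(payload)) % 256


# Intended use: the content is [1, 2, 3]
expected = checksum([1, 2, 3])

# Accidental use: 3 is treated as a length, giving b'\x00\x00\x00',
# so the call succeeds quietly instead of raising TypeError
accidental = checksum(3)
```

With a large integer the same mistake would try to allocate a buffer of that size, which is the uglier variant of the problem described above.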
Re: [Python-Dev] PEP 467: Minor API improvements for bytes bytearray
On Aug 17, 2014, at 1:41 AM, Nick Coghlan ncogh...@gmail.com wrote:

> If I see bytearray(10) there is nothing there that suggests this
> creates an array of length 10 and initialises it to zero to me. I'd
> be more inclined to guess it would be equivalent to bytearray([10]).
>
> bytearray.zeros(10), on the other hand, is relatively clear,
> independently of user expectations.

Zeros would have been great but that should have been done originally.
The time to get API design right is at inception.
Now, you're just breaking code and invalidating any published examples.

>> Another thought is that the core devs should be very reluctant to deprecate
>> anything we don't have to while the 2 to 3 transition is still in progress.
>> Every new deprecation of APIs that existed in Python 2.7 just adds another
>> obstacle to converting code. Individually, the differences are trivial.
>> Collectively, they present a good reason to never migrate code to Python 3.
>
> This is actually one of the inconsistencies between the Python 2 and 3
> binary APIs:

However, bytearray(n) is the same in both Python 2 and Python 3.
Changing it in Python 3 increases the gulf between the two.

The further we let Python 3 diverge from Python 2, the less likely that
people will convert their code and the harder you make it to write code
that runs under both.

FWIW, I've been teaching Python full time for three years. I cover the
use of bytearray(n) in my classes and not a single person out of 3000+
engineers have had a problem with it. I seriously question the PEP's
assertion that there is a real problem to be solved (i.e. that people
are baffled by bytearray(bufsiz)) and that the problem is sufficiently
painful to warrant the headaches that go along with API changes.

The other proposal to add bytearray.byte(3) should probably be named
bytearray.from_byte(3) for clarity. That said, I question whether there is
actually a use case for this. I have never seen code that has a
need to create a byte array of length one from a single integer.
For the most part, the API will be easiest to learn if it matches what
we do for lists and for array.array.

Sorry Nick, but I think you're making the API worse instead of better.
This API isn't perfect but it isn't flat-out broken either. There is some
unfortunate asymmetry between bytes() and bytearray() in Python 2,
but that ship has sailed. The current API for Python 3 is pretty good
(though there is still a tension between wanting to be like lists and like
strings both at the same time).


Raymond


P.S. The most important problem in the Python world now is getting
Python 2 users to adopt Python 3. The core devs need to develop
a strong distaste for anything that makes that problem harder.
Re: [Python-Dev] PEP 467: Minor API improvements for bytes bytearray
On Aug 17, 2014, at 1:07 PM, Raymond Hettinger raymond.hettin...@gmail.com wrote:

> FWIW, I've been teaching Python full time for three years. I cover the
> use of bytearray(n) in my classes and not a single person out of 3000+
> engineers have had a problem with it. I seriously question the PEP's
> assertion that there is a real problem to be solved (i.e. that people
> are baffled by bytearray(bufsiz)) and that the problem is sufficiently
> painful to warrant the headaches that go along with API changes.

For the record I’ve had all of the problems that Nick states and I’m
+1 on this change.

---
Donald Stufft
PGP: 7C6B 7C5D 5E2B 6356 A926 F04F 6E3C BCE9 3372 DCFA
Re: [Python-Dev] PEP 467: Minor API improvements for bytes bytearray
On 08/17/2014 10:16 AM, Donald Stufft wrote:

> For the record I’ve had all of the problems that Nick states and I’m
> +1 on this change.

I've had many of the problems Nick states and I'm also +1.

--
~Ethan~
Re: [Python-Dev] PEP 467: Minor API improvements for bytes bytearray
On Aug 17, 2014, at 11:33 AM, Ethan Furman et...@stoneleaf.us wrote:
> I've had many of the problems Nick states and I'm also +1.

There are two code snippets below which were taken from the standard library. Are you saying that:

1) you don't understand the code (as the PEP suggests)
2) you are willing to break that code and everything like it
3) and it would be more elegantly expressed as:

    charmap = bytearray.zeros(256)

and

    mapping = bytearray.zeros(256)

At work, I have network engineers creating IPv4 headers and other structures with bytearrays initialized to zeros. Do you really want to break all their code? Nowhere else in Python do we create buffers that way. Code like "msg, who = s.recvfrom(256)" is the norm.

Also, it is unclear if you're saying that you have an actual use case for this part of the proposal?

    ba = bytearray.byte(65)

And that the code would be better, clearer, and faster than the currently working form?

    ba = bytearray([65])

Does there really need to be a special case for constructing a single byte? To me, that is akin to proposing list.from_int(65) as an important special case to replace [65].

If you must muck with the ever changing bytes() API, then please leave the bytearray() API alone. I think we should show some respect for code that is currently working and is cleanly expressible in both Python 2 and Python 3. We aren't winning users with API churn.

FWIW, I'm guessing that the differing viewpoints in the thread stem mainly from the proponents' experiences with bytes() rather than from experience with bytearray(), which doesn't seem to have any usage problems in the wild. I've never seen a developer say they didn't understand what "buf = bytearray(1024)" means. That is not an actual problem that needs solving (or breaking).

What may be an actual problem is code like "char = bytes(1024)", though I'm unclear what a user might have actually been trying to do with code like that.
Raymond

--- excerpts from Lib/sre_compile.py ---

    charmap = bytearray(256)
    for op, av in charset:
        while True:
            try:
                if op is LITERAL:
                    charmap[fixup(av)] = 1
                elif op is RANGE:
                    for i in range(fixup(av[0]), fixup(av[1])+1):
                        charmap[i] = 1
                elif op is NEGATE:
                    out.append((op, av))
                else:
                    tail.append((op, av))
    ...
    charmap = bytes(charmap) # should be hashable
    comps = {}
    mapping = bytearray(256)
    block = 0
    data = bytearray()
    for i in range(0, 65536, 256):
        chunk = charmap[i: i + 256]
        if chunk in comps:
            mapping[i // 256] = comps[chunk]
        else:
            mapping[i // 256] = comps[chunk] = block
            block += 1
            data += chunk
    data = _mk_bitmap(data)
    data[0:0] = [block] + _bytes_to_codes(mapping)
    out.append((BIGCHARSET, data))
    out += tail
    return out
Re: [Python-Dev] PEP 467: Minor API improvements for bytes bytearray
On Aug 17, 2014, at 5:19 PM, Raymond Hettinger raymond.hettin...@gmail.com wrote: On Aug 17, 2014, at 11:33 AM, Ethan Furman et...@stoneleaf.us mailto:et...@stoneleaf.us wrote: I've had many of the problems Nick states and I'm also +1. There are two code snippets below which were taken from the standard library. Are you saying that: 1) you don't understand the code (as the pep suggests) 2) you are willing to break that code and everything like it 3) and it would be more elegantly expressed as: charmap = bytearray.zeros(256) and mapping = bytearray.zeros(256) At work, I have network engineers creating IPv4 headers and other structures with bytearrays initialized to zeros. Do you really want to break all their code? No where else in Python do we create buffers that way. Code like msg, who = s.recvfrom(256) is the norm. Also, it is unclear if you're saying that you have an actual use case for this part of the proposal? ba = bytearray.byte(65) And than the code would be better, clearer, and faster than the currently working form? ba = bytearray([65]) Does there really need to be a special case for constructing a single byte? To me, that is akin to proposing list.from_int(65) as an important special case to replace [65]. If you must muck with the ever changing bytes() API, then please leave the bytearray() API alone. I think we should show some respect for code that is currently working and is cleanly expressible in both Python 2 and Python 3. We aren't winning users with API churn. FWIW, I guessing that the differing view points in the thread stem mainly from the proponents experiences with bytes() rather than from experience with bytearray() which doesn't seem to have any usage problems in the wild. I've never seen a developer say they didn't understand what buf = bytearray(1024) means. That is not an actual problem that needs solving (or breaking). 
> What may be an actual problem is code like char = bytes(1024) though I'm unclear what a user might have actually been trying to do with code like that.

I think this is probably correct. I generally don’t think that bytes(1024) makes much sense at all, especially not as a default constructor. Most likely it exists to be similar to bytearray(). I don't have a specific problem with bytearray(1024), though I do think it's more elegantly and clearly described as bytearray.zeros(1024), but not by much.

I find bytes.byte()/bytearray.byte() to be needed as long as there isn't a simple way to iterate over a bytes or bytearray in a way that yields bytes or bytearrays instead of integers. To be honest I can't think of a time when I'd actually *want* to iterate over a bytes/bytearray as integers, although I realize there is unlikely to be a reasonable method to change that now. If iterbytes is added I'm not sure where I'd personally use either bytes.byte() or bytearray.byte().

In general though I think that overloading a single constructor method to do something conceptually different based on the type of the parameter leads to these kinds of confusing scenarios, and that having differently named constructors for the different concepts is far clearer.

So given all that, I am:

* +1 for some method of iterating over both types as bytes instead of integers.
* +1 on adding .zeros to both types as an alternative and preferred method of creating a zero filled instance and deprecating the original method[1].
* -0 on adding .byte to both types as an alternative method of creating a single byte instance.
* -1 on changing the meaning of bytearray(1024).
* +/-0 on changing the meaning of bytes(1024). I think that bytes(1024) is likely to *not* be what someone wants and that what they really want is bytes([N]). I also think that the number one reason for someone to be doing bytes(N) is because they were attempting to iterate over a bytes or bytearray object and they got an integer. I also think that it's bad that this changes from 2.x to 3.x and I wish it hadn't. However I can't decide if it's worth reverting this at this time or not.

[1] By deprecating I mean raise a deprecation warning, or something similar, but my thoughts on actually removing the other methods are listed explicitly.

--- Donald Stufft PGP: 7C6B 7C5D 5E2B 6356 A926 F04F 6E3C BCE9 3372 DCFA
Re: [Python-Dev] PEP 467: Minor API improvements for bytes bytearray
On 17/08/2014 13:07, Raymond Hettinger wrote:
> FWIW, I've been teaching Python full time for three years. I cover the use of bytearray(n) in my classes and not a single person out of 3000+ engineers has had a problem with it.

This is less about bytearray() than bytes(), IMO. bytearray() is sufficiently specialized that only experienced people will encounter it. And while preallocating a bytearray of a certain size makes sense, it's completely pointless for a bytes object.

Regards

Antoine.
Re: [Python-Dev] PEP 467: Minor API improvements for bytes bytearray
On 18 Aug 2014 03:07, Raymond Hettinger raymond.hettin...@gmail.com wrote: On Aug 17, 2014, at 1:41 AM, Nick Coghlan ncogh...@gmail.com wrote: If I see bytearray(10) there is nothing there that suggests this creates an array of length 10 and initialises it to zero to me. I'd be more inclined to guess it would be equivalent to bytearray([10]). bytearray.zeros(10), on the other hand, is relatively clear, independently of user expectations. Zeros would have been great but that should have been done originally. The time to get API design right is at inception. Now, you're just breaking code and invalidating any published examples. I'm fine with postponing the deprecation elements indefinitely (or just deprecating bytes(int) and leaving bytearray(int) alone). Another thought is that the core devs should be very reluctant to deprecate anything we don't have to while the 2 to 3 transition is still in progress. Every new deprecation of APIs that existed in Python 2.7 just adds another obstacle to converting code. Individually, the differences are trivial. Collectively, they present a good reason to never migrate code to Python 3. This is actually one of the inconsistencies between the Python 2 and 3 binary APIs: However, bytearray(n) is the same in both Python 2 and Python 3. Changing it in Python 3 increases the gulf between the two. The further we let Python 3 diverge from Python 2, the less likely that people will convert their code and the harder you make it to write code that runs under both. FWIW, I've been teaching Python full time for three years. I cover the use of bytearray(n) in my classes and not a single person out of 3000+ engineers have had a problem with it. I seriously question the PEP's assertion that there is a real problem to be solved (i.e. that people are baffled by bytearray(bufsiz)) and that the problem is sufficiently painful to warrant the headaches that go along with API changes. Yes, I'd expect engineers and networking folks to be fine with it. 
It isn't how this mode of the constructor *works* that worries me, it's how it *fails* (i.e. silently producing unexpected data rather than a type error). Purely deprecating the bytes case and leaving bytearray alone would likely address my concerns. The other proposal to add bytearray.byte(3) should probably be named bytearray.from_byte(3) for clarity. That said, I question whether there is actually a use case for this. I have never seen seen code that has a need to create a byte array of length one from a single integer. For the most part, the API will be easiest to learn if it matches what we do for lists and for array.array. This part of the proposal came from a few things: * many of the bytes and bytearray methods only accept bytes-like objects, but iteration and indexing produce integers * to mitigate the impact of the above, some (but not all) bytes and bytearray methods now accept integers in addition to bytes-like objects * ord() in Python 3 is only documented as accepting length 1 strings, but also accepts length 1 bytes-like objects Adding bytes.byte() makes it practical to document the binary half of ord's behaviour, and eliminates any temptation to expand the also accepts integers behaviour out to more types. bytes.byte() thus becomes the binary equivalent of chr(), just as Python 2 had both chr() and unichr(). I don't recall ever needing chr() in a real program either, but I still consider it an important part of clearly articulating the data model. Sorry Nick, but I think you're making the API worse instead of better. This API isn't perfect but it isn't flat-out broken either. There is some unfortunate asymmetry between bytes() and bytearray() in Python 2, but that ship has sailed. The current API for Python 3 is pretty good (though there is still a tension between wanting to be like lists and like strings both at the same time). Yes. 
It didn't help that the docs previously expected readers to infer the behaviour of the binary sequence methods from the string documentation - while the new docs could still use some refinement, I've at least addressed that part of the problem.

Cheers, Nick.
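For readers following along, the round trip through ord() discussed above can be sketched with constructs that already exist. The helper name byte_from_int below is hypothetical: it only stands in for the proposed bytes.byte() and is not an API from the PEP or from any Python release.

```python
def byte_from_int(value):
    """Hypothetical stand-in for the proposed bytes.byte(): int -> length-1 bytes."""
    # bytes([x]) already validates its element: ValueError outside
    # range(256), TypeError for non-integers such as floats.
    return bytes([value])


# Text round trip: chr() inverts ord().
assert chr(ord("A")) == "A"

# Binary round trip: ord() accepts a length-1 bytes object,
# and the one-element-list spelling inverts it.
assert ord(b"A") == 65
assert byte_from_int(ord(b"A")) == b"A"
```

Wrapping the existing bytes([x]) spelling keeps the range and type checks that the PEP text describes for the proposed method, without assuming anything about how the real method would be implemented.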
Re: [Python-Dev] PEP 467: Minor API improvements for bytes bytearray
On 16/08/2014 01:17, Nick Coghlan wrote:
> * Deprecate passing single integer values to ``bytes`` and ``bytearray``

I'm neutral. Ideally we wouldn't have made that mistake at the beginning.

> * Add ``bytes.zeros`` and ``bytearray.zeros`` alternative constructors
> * Add ``bytes.byte`` and ``bytearray.byte`` alternative constructors
> * Add ``bytes.iterbytes``, ``bytearray.iterbytes`` and ``memoryview.iterbytes`` alternative iterators

+0.5. iterbytes isn't really great as a name.

Regards

Antoine.
Re: [Python-Dev] PEP 467: Minor API improvements for bytes bytearray
On Aug 17, 2014, at 4:08 PM, Nick Coghlan ncogh...@gmail.com wrote:
> Purely deprecating the bytes case and leaving bytearray alone would likely address my concerns.

That is good progress. Thanks :-)

Would a warning for the bytes case suffice, or do you need an actual deprecation?

> bytes.byte() thus becomes the binary equivalent of chr(), just as Python 2 had both chr() and unichr(). I don't recall ever needing chr() in a real program either, but I still consider it an important part of clearly articulating the data model.

"I don't recall having ever needed this" greatly weakens the premise that this is needed :-)

The APIs have been around since 2.6 and AFAICT there has been zero demonstrated need for a special case for a single byte. We already have a perfectly good spelling:

    NUL = bytes([0])

The Zen tells us we really don't need a second way to do it (actually a third since you can also write b'\x00') and it suggests that this special case isn't special enough.

I encourage restraint against adding an unneeded class method that has no parallel elsewhere. Right now, the learning curve is mitigated because bytes is very str-like and because bytearray is list-like (i.e. the method names have been used elsewhere and likely already learned before encountering bytes() or bytearray()). Putting in a new, rarely used, funky method adds to the learning burden.

If you do press forward with adding it (and I don't see why), then as an alternate constructor, the name should be from_int() or some such to avoid ambiguity and to make clear that it is a class method.

> iterbytes() isn't especially attractive as a method name, but it's far more explicit about its purpose.

I concur. In this case, explicitness matters.

Raymond
Re: [Python-Dev] PEP 467: Minor API improvements for bytes bytearray
On 18 Aug 2014 09:41, Raymond Hettinger raymond.hettin...@gmail.com wrote:
> I encourage restraint against adding an unneeded class method that has no parallel elsewhere. Right now, the learning curve is mitigated because bytes is very str-like and because bytearray is list-like (i.e. the method names have been used elsewhere and likely already learned before encountering bytes() or bytearray()). Putting in new, rarely used funky method adds to the learning burden.
>
> If you do press forward with adding it (and I don't see why), then as an alternate constructor, the name should be from_int() or some such to avoid ambiguity and to make clear that it is a class method.

If I remember the sequence of events correctly, I thought of map(bytes.byte, data) first, and then Guido suggested a dedicated iterbytes() method later.

The step I hadn't taken (until now) was realising that the new memoryview(data).iterbytes() capability actually combines with the existing (bytes([b]) for b in data) to make the original bytes.byte idea unnecessary.

Cheers, Nick.
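The existing generator-expression spelling mentioned above can be wrapped in a small helper to show what "iterating as bytes" means in practice. This is only a sketch: the name iter_as_bytes is an assumption for illustration, and it is not the iterbytes() method proposed in the PEP.

```python
def iter_as_bytes(data):
    """Yield each element of a binary sequence as a length-1 bytes object.

    Hypothetical helper: iterating bytes, bytearray, or a byte-format
    memoryview yields integers, so each one is rewrapped with bytes([b]).
    """
    return (bytes([b]) for b in data)


# Plain iteration yields integers; the helper yields length-1 bytes.
assert list(b"abc") == [97, 98, 99]
assert list(iter_as_bytes(b"abc")) == [b"a", b"b", b"c"]
```

The same call works unchanged on a bytearray or on memoryview(b"abc"), since both also iterate as integers in the range 0 to 255.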
Re: [Python-Dev] PEP 467: Minor API improvements for bytes bytearray
On 17/08/2014 19:41, Raymond Hettinger wrote:
> The APIs have been around since 2.6 and AFAICT there has been zero demonstrated need for a special case for a single byte. We already have a perfectly good spelling: NUL = bytes([0])

That is actually a very cumbersome spelling. Why should I first create a one-element list in order to create a one-byte bytes object?

> The Zen tells us we really don't need a second way to do it (actually a third since you can also write b'\x00') and it suggests that this special case isn't special enough.

b'\x00' is obviously the right way to do it in this case, but we're concerned about the non-constant case. The reason to instantiate bytes from a non-constant integer comes from the unfortunate indexing and iteration behaviour of bytes objects.

Regards

Antoine.
Re: [Python-Dev] PEP 467: Minor API improvements for bytes bytearray
On 08/17/2014 02:19 PM, Raymond Hettinger wrote:
> On Aug 17, 2014, at 11:33 AM, Ethan Furman wrote:
>> I've had many of the problems Nick states and I'm also +1.
>
> There are two code snippets below which were taken from the standard library. [...]

My issues are with 'bytes', not 'bytearray'. 'bytearray(10)' actually makes sense. I certainly have no problem with bytearray and bytes not being exactly the same.

My primary issues with bytes are not being able to do b'abc'[2] == b'c', and not being able to do x = b'abc'[2]; y = bytes(x); assert y == b'c'.

And because of the backwards compatibility issues I would deprecate, because we have a new 'better' way, but not remove, the current functionality.

I pretty much agree exactly with what Donald Stufft said about it.

-- ~Ethan~
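The two complaints above are easy to reproduce on any Python 3 interpreter. The snippet below is just an illustration of current behaviour using standard built-ins; the variable names are arbitrary.

```python
data = b"abc"

# Indexing a bytes object yields an integer, not a length-1 bytes object.
x = data[2]
assert x == 99
assert x != b"c"

# Slicing, by contrast, yields bytes.
assert data[2:3] == b"c"

# Passing that integer back to the constructor silently produces
# 99 zero bytes rather than b"c"; no exception is raised.
y = bytes(x)
assert y == b"\x00" * 99

# The one-element-list spelling is what recovers the original byte.
assert bytes([x]) == b"c"
```

This is the "silently producing unexpected data rather than a type error" failure mode raised earlier in the thread.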
Re: [Python-Dev] PEP 467: Minor API improvements for bytes bytearray
On 08/17/2014 04:08 PM, Nick Coghlan wrote:
> I'm fine with postponing the deprecation elements indefinitely (or just deprecating bytes(int) and leaving bytearray(int) alone).

+1 on both pieces.

-- ~Ethan~
Re: [Python-Dev] PEP 467: Minor API improvements for bytes bytearray
On Sun, Aug 17, 2014 at 8:52 PM, Ethan Furman et...@stoneleaf.us wrote:
> On 08/17/2014 04:08 PM, Nick Coghlan wrote:
>> I'm fine with postponing the deprecation elements indefinitely (or just deprecating bytes(int) and leaving bytearray(int) alone).
>
> +1 on both pieces.

Perhaps postpone the deprecation to Python 4000 ;)
Re: [Python-Dev] PEP 467: Minor API improvements for bytes bytearray
Donald Stufft donald at stufft.io writes:
> For the record I’ve had all of the problems that Nick states and I’m +1 on this change.
>
> --- Donald Stufft PGP: 7C6B 7C5D 5E2B 6356 A926 F04F 6E3C BCE9 3372 DCFA

I've hit basically every problem everyone here has stated, and I am, in no uncertain terms, completely opposed to deprecating anything. The Python 2 to 3 migration is already hard enough, and already proceeding far too slowly for many of our tastes. Making that migration even more complex would drive me to the point of giving up.

Alex
Re: [Python-Dev] PEP 467: Minor API improvements for bytes bytearray
On Sun, Aug 17, 2014 at 7:14 PM, Alex Gaynor alex.gay...@gmail.com wrote:
> I've hit basically every problem everyone here has stated, and in no uncertain terms am I completely opposed to deprecating anything. The Python 2 to 3 migration is already hard enough, and already proceeding far too slowly for many of our tastes. Making that migration even more complex would drive me to the point of giving up.

Could you elaborate what problems you are thinking this will cause for you?

It seems to me that avoiding a bug-prone API is not particularly complex, and moving it back to its 2.x semantics or making it not work entirely, rather than making it work differently, would make porting applications easier.

If, during porting to 3.x, you find a deprecation warning for bytes(n), then rather than being annoying code churny extra changes, this is actually a bug that's been identified. So it's helpful even during the deprecation period.

-- Devin
Re: [Python-Dev] PEP 467: Minor API improvements for bytes bytearray
This feels chatty. I'd like the PEP to call out the specific proposals and put the more verbose motivation later. It took me a long time to realize that you don't want to deprecate bytes([1, 2, 3]), but only bytes(3). Also your mention of bytes.byte() as the counterpart to ord() confused me -- I think it's more similar to chr(). I don't like iterbytes as a builtin, let's keep it as a method on affected types. On Thu, Aug 14, 2014 at 10:50 PM, Nick Coghlan ncogh...@gmail.com wrote: I just posted an updated version of PEP 467 after recently finishing the updates to the Python 3.4+ binary sequence docs to decouple them from the str docs. Key points in the proposal: * deprecate passing integers to bytes() and bytearray() * add bytes.zeros() and bytearray.zeros() as a replacement * add bytes.byte() and bytearray.byte() as counterparts to ord() for binary data * add bytes.iterbytes(), bytearray.iterbytes() and memoryview.iterbytes() As far as I am aware, that last item poses the only open question, with the alternative being to add an iterbytes builtin with a definition along the lines of the following: def iterbytes(data): try: getiter = type(data).__iterbytes__ except AttributeError: iter = map(bytes.byte, data) else: iter = getiter(data) return iter Regards, Nick. PEP URL: http://www.python.org/dev/peps/pep-0467/ Full PEP text: = PEP: 467 Title: Minor API improvements for bytes and bytearray Version: $Revision$ Last-Modified: $Date$ Author: Nick Coghlan ncogh...@gmail.com Status: Draft Type: Standards Track Content-Type: text/x-rst Created: 2014-03-30 Python-Version: 3.5 Post-History: 2014-03-30 2014-08-15 Abstract During the initial development of the Python 3 language specification, the core ``bytes`` type for arbitrary binary data started as the mutable type that is now referred to as ``bytearray``. Other aspects of operating in the binary domain in Python have also evolved over the course of the Python 3 series. 
This PEP proposes a number of small adjustments to the APIs of the ``bytes`` and ``bytearray`` types to make it easier to operate entirely in the binary domain. Background == To simplify the task of writing the Python 3 documentation, the ``bytes`` and ``bytearray`` types were documented primarily in terms of the way they differed from the Unicode based Python 3 ``str`` type. Even when I `heavily revised the sequence documentation http://hg.python.org/cpython/rev/463f52d20314`__ in 2012, I retained that simplifying shortcut. However, it turns out that this approach to the documentation of these types had a problem: it doesn't adequately introduce users to their hybrid nature, where they can be manipulated *either* as a sequence of integers type, *or* as ``str``-like types that assume ASCII compatible data. That oversight has now been corrected, with the binary sequence types now being documented entirely independently of the ``str`` documentation in `Python 3.4+ https://docs.python.org/3/library/stdtypes.html#binary-sequence-types-bytes-bytearray-memoryview `__ The confusion isn't just a documentation issue, however, as there are also some lingering design quirks from an earlier pre-release design where there was *no* separate ``bytearray`` type, and instead the core ``bytes`` type was mutable (with no immutable counterpart). Finally, additional experience with using the existing Python 3 binary sequence types in real world applications has suggested it would be beneficial to make it easier to convert integers to length 1 bytes objects. Proposals = As a consistency improvement proposal, this PEP is actually about a few smaller micro-proposals, each aimed at improving the usability of the binary data model in Python 3. 
Proposals are motivated by one of two main factors:

* removing remnants of the original design of ``bytes`` as a mutable type
* allowing users to easily convert integer values to a length 1 ``bytes`` object

Alternate Constructors
----------------------

The ``bytes`` and ``bytearray`` constructors currently accept an integer argument, but interpret it to mean a zero-filled object of the given length. This is a legacy of the original design of ``bytes`` as a mutable type, rather than a particularly intuitive behaviour for users. It has become especially confusing now that some other ``bytes`` interfaces treat integers and the corresponding length 1 bytes instances as equivalent input. Compare::

    >>> b"\x03" in bytes([1, 2, 3])
    True
    >>> 3 in bytes([1, 2, 3])
    True

    >>> bytes(b"\x03")
    b'\x03'
    >>> bytes(3)
    b'\x00\x00\x00'

This PEP proposes that the current handling of integers in the bytes and bytearray constructors be deprecated in Python 3.5 and targeted for removal in Python 3.7, being replaced by two more
Re: [Python-Dev] PEP 467: Minor API improvements for bytes bytearray
15.08.14 08:50, Nick Coghlan wrote:
> * add bytes.zeros() and bytearray.zeros() as a replacement

b'\0' * n and bytearray(b'\0') * n look like good replacements to me. No need to learn a new method. And it works right now.

> * add bytes.iterbytes(), bytearray.iterbytes() and memoryview.iterbytes()

What are the use cases for this? I suppose that the main use case may be writing code compatible with 2.7 and 3.x. But in this case you need a wrapper (because these types in 2.7 have no iterbytes() method). And how large would the advantage of this method be over ``map(bytes.byte, data)``?
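The repetition spelling suggested above can be checked against the integer constructor directly. This is a small illustration with standard built-ins only; n and buf are arbitrary names.

```python
n = 4

# Sequence repetition gives the same zero-filled result as the
# integer constructor, for both the immutable and mutable types.
assert b"\0" * n == bytes(n) == b"\x00\x00\x00\x00"
assert bytearray(b"\0") * n == bytearray(n)

# Repeating a bytearray yields a bytearray, so the buffer stays mutable.
buf = bytearray(b"\0") * n
buf[0] = 0x45
assert isinstance(buf, bytearray)
assert buf == bytearray(b"\x45\x00\x00\x00")
```

Both spellings also run unchanged on Python 2.7 for bytearray, which is the compatibility point made elsewhere in the thread.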
Re: [Python-Dev] PEP 467: Minor API improvements for bytes bytearray
2014-08-15 21:54 GMT+02:00 Serhiy Storchaka storch...@gmail.com:
> 15.08.14 08:50, Nick Coghlan wrote:
>> * add bytes.zeros() and bytearray.zeros() as a replacement
>
> b'\0' * n and bytearray(b'\0') * n look good replacements to me. No need to learn new method. And it works right now.

FYI there is a pending patch for bytearray(int) to use calloc() instead of malloc(). It's faster for buffers larger than 1 MB: http://bugs.python.org/issue21644

I'm not sure that the optimization is really useful.

Victor
Re: [Python-Dev] PEP 467: Minor API improvements for bytes bytearray
2014-08-15 7:50 GMT+02:00 Nick Coghlan ncogh...@gmail.com:
> As far as I am aware, that last item poses the only open question, with the alternative being to add an iterbytes builtin (...)

Do you have examples of use cases for a builtin function? I only found 5 usages of the bytes((byte,)) constructor in the standard library:

    $ grep -E 'bytes\(\([^)]+, *\)\)' $(find -name *.py)
    ./Lib/quopri.py:c = bytes((c,))
    ./Lib/quopri.py:c = bytes((c,))
    ./Lib/base64.py:b32tab = [bytes((i,)) for i in _b32alphabet]
    ./Lib/base64.py:_a85chars = [bytes((i,)) for i in range(33, 118)]
    ./Lib/base64.py:_b85chars = [bytes((i,)) for i in _b85alphabet]

bytes.iterbytes() can be used in 4 cases out of 5. Adding a new builtin for a single line in the whole standard library doesn't look right.

Victor
Re: [Python-Dev] PEP 467: Minor API improvements for bytes bytearray
On 16 August 2014 03:48, Guido van Rossum gu...@python.org wrote: This feels chatty. I'd like the PEP to call out the specific proposals and put the more verbose motivation later. I realised that some of that history was actually completely irrelevant now, so I culled a fair bit of it entirely. It took me a long time to realize that you don't want to deprecate bytes([1, 2, 3]), but only bytes(3). I've split out the four subproposals into their own sections, so hopefully this is clearer now. Also your mention of bytes.byte() as the counterpart to ord() confused me -- I think it's more similar to chr(). This was just a case of me using the wrong word - I meant inverse rather than counterpart. I don't like iterbytes as a builtin, let's keep it as a method on affected types. Done. I also added an explanation of the benefits it offers over the more generic map(bytes.byte, data), as well as more precise semantics for how it will work with memoryview objects. New draft is live at http://www.python.org/dev/peps/pep-0467/, as well as being included inline below. Regards, Nick. === PEP: 467 Title: Minor API improvements for bytes and bytearray Version: $Revision$ Last-Modified: $Date$ Author: Nick Coghlan ncogh...@gmail.com Status: Draft Type: Standards Track Content-Type: text/x-rst Created: 2014-03-30 Python-Version: 3.5 Post-History: 2014-03-30 2014-08-15 2014-08-16 Abstract During the initial development of the Python 3 language specification, the core ``bytes`` type for arbitrary binary data started as the mutable type that is now referred to as ``bytearray``. Other aspects of operating in the binary domain in Python have also evolved over the course of the Python 3 series. 
This PEP proposes four small adjustments to the APIs of the ``bytes``, ``bytearray`` and ``memoryview`` types to make it easier to operate entirely in the binary domain: * Deprecate passing single integer values to ``bytes`` and ``bytearray`` * Add ``bytes.zeros`` and ``bytearray.zeros`` alternative constructors * Add ``bytes.byte`` and ``bytearray.byte`` alternative constructors * Add ``bytes.iterbytes``, ``bytearray.iterbytes`` and ``memoryview.iterbytes`` alternative iterators Proposals = Deprecation of current zero-initialised sequence behaviour Currently, the ``bytes`` and ``bytearray`` constructors accept an integer argument and interpret it as meaning to create a zero-initialised sequence of the given size:: bytes(3) b'\x00\x00\x00' bytearray(3) bytearray(b'\x00\x00\x00') This PEP proposes to deprecate that behaviour in Python 3.5, and remove it entirely in Python 3.6. No other changes are proposed to the existing constructors. Addition of explicit zero-initialised sequence constructors - To replace the deprecated behaviour, this PEP proposes the addition of an explicit ``zeros`` alternative constructor as a class method on both ``bytes`` and ``bytearray``:: bytes.zeros(3) b'\x00\x00\x00' bytearray.zeros(3) bytearray(b'\x00\x00\x00') It will behave just as the current constructors behave when passed a single integer. 
The specific choice of ``zeros`` as the alternative constructor name is taken from the corresponding initialisation function in NumPy (although, as these are 1-dimensional sequence types rather than N-dimensional matrices, the constructors take a length as input rather than a shape tuple).

Addition of explicit single byte constructors
---------------------------------------------

As binary counterparts to the text ``chr`` function, this PEP proposes the addition of an explicit ``byte`` alternative constructor as a class method on both ``bytes`` and ``bytearray``::

    >>> bytes.byte(3)
    b'\x03'
    >>> bytearray.byte(3)
    bytearray(b'\x03')

These methods will only accept integers in the range 0 to 255 (inclusive)::

    >>> bytes.byte(512)
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    ValueError: bytes must be in range(0, 256)

    >>> bytes.byte(1.0)
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    TypeError: 'float' object cannot be interpreted as an integer

The documentation of the ``ord`` builtin will be updated to explicitly note that ``bytes.byte`` is the inverse operation for binary data, while ``chr`` is the inverse operation for text data.

Behaviourally, ``bytes.byte(x)`` will be equivalent to the current ``bytes([x])`` (and similarly for ``bytearray``). The new spelling is expected to be easier to discover and easier to read (especially when used in conjunction with indexing operations on binary sequence types).

As a separate method, the new spelling will also work better with higher order functions like ``map``.

Addition of optimised iterator methods that produce ``bytes`` objects
---------------------------------------------------------------------
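To make concrete what iteration that produces ``bytes`` objects would yield, here is a minimal sketch written against existing APIs only. ``byte`` and ``iterbytes`` are hypothetical module-level helpers, not the proposed methods: ``byte`` relies on the equivalence with ``bytes([x])`` described for the single byte constructors, and ``iterbytes`` is the generic ``map``-based form rather than an optimised implementation.

```python
def byte(x):
    """Hypothetical stand-in for the proposed bytes.byte(x)."""
    # bytes([x]) already raises ValueError outside range(0, 256) and
    # TypeError for non-integers such as 1.0.
    return bytes([x])


def iterbytes(data):
    """Return an iterator of length-1 bytes objects for a binary sequence."""
    return map(byte, data)


data = b"abc"
assert list(data) == [97, 98, 99]                   # plain iteration yields ints
assert list(iterbytes(data)) == [b"a", b"b", b"c"]  # length-1 bytes objects
assert byte(ord(b"A")) == b"A"                      # inverse of ord() for binary data
```

The same helper accepts a ``bytearray`` as input, since iterating one also yields integers.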
[Python-Dev] PEP 467: Minor API improvements for bytes bytearray
I just posted an updated version of PEP 467 after recently finishing the updates to the Python 3.4+ binary sequence docs to decouple them from the str docs.

Key points in the proposal:

* deprecate passing integers to bytes() and bytearray()
* add bytes.zeros() and bytearray.zeros() as a replacement
* add bytes.byte() and bytearray.byte() as counterparts to ord() for binary data
* add bytes.iterbytes(), bytearray.iterbytes() and memoryview.iterbytes()

As far as I am aware, that last item poses the only open question, with the alternative being to add an iterbytes builtin with a definition along the lines of the following:

    def iterbytes(data):
        try:
            getiter = type(data).__iterbytes__
        except AttributeError:
            iter = map(bytes.byte, data)
        else:
            iter = getiter(data)
        return iter

Regards,
Nick.

PEP URL: http://www.python.org/dev/peps/pep-0467/

Full PEP text:

=

PEP: 467
Title: Minor API improvements for bytes and bytearray
Version: $Revision$
Last-Modified: $Date$
Author: Nick Coghlan ncogh...@gmail.com
Status: Draft
Type: Standards Track
Content-Type: text/x-rst
Created: 2014-03-30
Python-Version: 3.5
Post-History: 2014-03-30 2014-08-15

Abstract
========

During the initial development of the Python 3 language specification, the core ``bytes`` type for arbitrary binary data started as the mutable type that is now referred to as ``bytearray``. Other aspects of operating in the binary domain in Python have also evolved over the course of the Python 3 series.

This PEP proposes a number of small adjustments to the APIs of the ``bytes`` and ``bytearray`` types to make it easier to operate entirely in the binary domain.

Background
==========

To simplify the task of writing the Python 3 documentation, the ``bytes`` and ``bytearray`` types were documented primarily in terms of the way they differed from the Unicode based Python 3 ``str`` type. Even when I `heavily revised the sequence documentation <http://hg.python.org/cpython/rev/463f52d20314>`__ in 2012, I retained that simplifying shortcut.
However, it turns out that this approach to the documentation of these types had a problem: it doesn't adequately introduce users to their hybrid nature, where they can be manipulated *either* as a "sequence of integers" type, *or* as ``str``-like types that assume ASCII compatible data.

That oversight has now been corrected, with the binary sequence types now being documented entirely independently of the ``str`` documentation in `Python 3.4+ <https://docs.python.org/3/library/stdtypes.html#binary-sequence-types-bytes-bytearray-memoryview>`__.

The confusion isn't just a documentation issue, however, as there are also some lingering design quirks from an earlier pre-release design where there was *no* separate ``bytearray`` type, and instead the core ``bytes`` type was mutable (with no immutable counterpart).

Finally, additional experience with using the existing Python 3 binary sequence types in real world applications has suggested it would be beneficial to make it easier to convert integers to length 1 bytes objects.

Proposals
=========

As a consistency improvement proposal, this PEP is actually about a few smaller micro-proposals, each aimed at improving the usability of the binary data model in Python 3. Proposals are motivated by one of two main factors:

* removing remnants of the original design of ``bytes`` as a mutable type
* allowing users to easily convert integer values to a length 1 ``bytes`` object

Alternate Constructors
----------------------

The ``bytes`` and ``bytearray`` constructors currently accept an integer argument, but interpret it to mean a zero-filled object of the given length. This is a legacy of the original design of ``bytes`` as a mutable type, rather than a particularly intuitive behaviour for users. It has become especially confusing now that some other ``bytes`` interfaces treat integers and the corresponding length 1 bytes instances as equivalent input.
Compare::

    >>> b"\x03" in bytes([1, 2, 3])
    True
    >>> 3 in bytes([1, 2, 3])
    True

    >>> bytes(b"\x03")
    b'\x03'
    >>> bytes(3)
    b'\x00\x00\x00'

This PEP proposes that the current handling of integers in the bytes and bytearray constructors be deprecated in Python 3.5 and targeted for removal in Python 3.7, being replaced by two more explicit alternate constructors provided as class methods. The initial python-ideas thread [ideas-thread1]_ that spawned this PEP was specifically aimed at deprecating this constructor behaviour.

Firstly, a ``byte`` constructor is proposed that converts integers in the range 0 to 255 (inclusive) to a ``bytes`` object::

    >>> bytes.byte(3)
    b'\x03'
    >>> bytearray.byte(3)
    bytearray(b'\x03')
    >>> bytes.byte(512)
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    ValueError: bytes must be in range(0, 256)

One specific use case for this alternate constructor is