(Thanks for moving this forward, Ethan!)

On 19 July 2016 at 06:17, Ethan Furman <et...@stoneleaf.us> wrote:
> * Add ``bytes.getbyte`` and ``bytearray.getbyte`` byte retrieval methods
> * Add ``bytes.iterbytes``, ``bytearray.iterbytes`` and
>   ``memoryview.iterbytes`` alternative iterators

As a possible alternative to this aspect, what if we adjusted
memoryview.cast() to also support the "s" format code from the struct
module?

At the moment, trying to use "s" gives a ValueError:

    >>> bview = memoryview(data).cast("s")
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    ValueError: memoryview: destination format must be a native single character format prefixed with an optional '@'

However, it could be supported by always interpreting it as equivalent
to "1s", such that the view produced length 1 bytes objects on
indexing and iteration, rather than integers (which is what it does
given the default "B" format).

Given "memoryview(data).cast('s')" as a basic building block, most of
the other aspects of working with bytes objects as if they were Python
2 strings should become relatively straightforward, so the question
would be whether we wanted to make it easy for people to avoid
constructing the mediating memoryview object.

> Proposals
> =========
>
> Deprecation of current "zero-initialised sequence" behaviour without removal
> ----------------------------------------------------------------------------
>
> Currently, the ``bytes`` and ``bytearray`` constructors accept an integer
> argument and interpret it as meaning to create a zero-initialised sequence
> of the given size::
>
>     >>> bytes(3)
>     b'\x00\x00\x00'
>     >>> bytearray(3)
>     bytearray(b'\x00\x00\x00')
>
> This PEP proposes to deprecate that behaviour in Python 3.6, but to leave
> it in place for at least as long as Python 2.7 is supported, possibly
> indefinitely.

I'd suggest being more explicit that this would just be a documented
deprecation, rather than a programmatic deprecation warning.

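Returning to the memoryview.cast("s") idea for a moment: the behaviour
described there (length 1 bytes objects on iteration, rather than
integers) can be approximated today with a small helper. This is only
a rough sketch; the name iter_as_bytes is made up for illustration and
is not part of the PEP or of any existing API.

```python
def iter_as_bytes(data):
    """Yield each element of a bytes-like object as a length-1 bytes object.

    Hypothetical helper: approximates what iterating over a
    memoryview(data).cast("s") view would produce if "s" were supported.
    """
    view = memoryview(data).cast("B")  # flat view of unsigned bytes
    for i in range(len(view)):
        # Slicing gives a one-element sub-view; bytes() copies that byte out.
        yield bytes(view[i:i + 1])


data = b"abc"
as_ints = list(memoryview(data))     # current behaviour: integers
as_bytes = list(iter_as_bytes(data)) # emulated "s" behaviour: length-1 bytes
```

The helper still constructs the mediating memoryview internally, which
is exactly the step the question above asks whether users should be
able to skip.
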
> Addition of explicit "count and byte initialised sequence" constructors
> -----------------------------------------------------------------------
>
> To replace the deprecated behaviour, this PEP proposes the addition of an
> explicit ``size`` alternative constructor as a class method on both
> ``bytes`` and ``bytearray`` whose first argument is the count, and whose
> second argument is the fill byte to use (defaults to ``\x00``)::
>
>     >>> bytes.size(3)
>     b'\x00\x00\x00'
>     >>> bytearray.size(3)
>     bytearray(b'\x00\x00\x00')
>     >>> bytes.size(5, b'\x0a')
>     b'\x0a\x0a\x0a\x0a\x0a'
>     >>> bytearray.size(5, b'\x0a')
>     bytearray(b'\x0a\x0a\x0a\x0a\x0a')

While I like the notion of having "size" in the name, the
"noun-as-constructor" phrasing doesn't read right to me. Perhaps
"fromsize" for consistency with "fromhex"?

> It will behave just as the current constructors behave when passed a single
> integer.

This last paragraph feels incomplete now, given the expansion to allow
the fill value to be specified.

> Addition of "bchr" function and explicit "single byte" constructors
> -------------------------------------------------------------------
>
> As binary counterparts to the text ``chr`` function, this PEP proposes
> the addition of a ``bchr`` function and an explicit ``fromint`` alternative
> constructor as a class method on both ``bytes`` and ``bytearray``::
>
>     >>> bchr(ord("A"))
>     b'A'
>     >>> bchr(ord(b"A"))
>     b'A'
>     >>> bytes.fromint(65)
>     b'A'
>     >>> bytearray.fromint(65)
>     bytearray(b'A')

Since "fromsize" would also accept an int value, "fromint" feels
ambiguous here. Perhaps "fromord" to emphasise the integer is being
interpreted as an ordinal bytes value, rather than as a size?

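To make the two naming suggestions concrete, here is a rough
pure-Python sketch of what "fromsize" and "fromord" might do, written
as free functions that take the target class as their first argument.
The names follow the suggestions in this mail rather than the PEP
text, and none of these functions exist in any released Python; the
error handling is an assumption on my part.

```python
def fromsize(cls, count, fill=b"\x00"):
    """Hypothetical stand-in for a bytes.fromsize / bytearray.fromsize method."""
    if count < 0:
        raise ValueError("negative count")
    if len(fill) != 1:
        raise ValueError("fill must be a single byte")
    # Sequence repetition does the actual work.
    return cls(fill * count)


def fromord(cls, value):
    """Hypothetical stand-in for a bytes.fromord / bytearray.fromord method."""
    # The constructors raise ValueError unless 0 <= value <= 255.
    return cls([value])


def bchr(value):
    """Hypothetical builtin: binary counterpart to chr()."""
    return fromord(bytes, value)


zeros = fromsize(bytes, 3)
newlines = fromsize(bytearray, 5, b"\x0a")
letter = bchr(ord("A"))
```

Seen side by side, both constructors take an int, but one reads it as
a length and the other as a byte value, which is the ambiguity the
"fromint" spelling leaves open.
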
The apparent "two ways to do it" here also deserves some additional
explanation:

- the bchr builtin is to recreate the ord/chr/unichr trio from Python 2
  under a different naming scheme
- the class method is mainly for the "bytearray.fromord" case, with
  bytes.fromord added for consistency

[snip sections on accessing elements as bytes object]

> Design discussion
> =================
>
> Why not rely on sequence repetition to create zero-initialised sequences?
> -------------------------------------------------------------------------
>
> Zero-initialised sequences can be created via sequence repetition::
>
>     >>> b'\x00' * 3
>     b'\x00\x00\x00'
>     >>> bytearray(b'\x00') * 3
>     bytearray(b'\x00\x00\x00')
>
> However, this was also the case when the ``bytearray`` type was originally
> designed, and the decision was made to add explicit support for it in the
> type constructor. The immutable ``bytes`` type then inherited that feature
> when it was introduced in PEP 3137.
>
> This PEP isn't revisiting that original design decision, just changing the
> spelling as users sometimes find the current behaviour of the binary
> sequence constructors surprising. In particular, there's a reasonable case
> to be made that ``bytes(x)`` (where ``x`` is an integer) should behave like
> the ``bytes.byte(x)`` proposal in this PEP. Providing both behaviours as
> separate class methods avoids that ambiguity.

This note will need some tweaks to match the updated method names in
the proposal.

Cheers,
Nick.

--
Nick Coghlan   |   ncogh...@gmail.com   |   Brisbane, Australia