Guido van Rossum wrote:
> [PEP 3137]
>>> **Open Issue:** I'm undecided on whether indexing bytes and buffer
>>> objects should return small ints (like the bytes type in 3.0a1, and
>>> like lists or array.array('B')), or bytes/buffer objects of length 1
>>> (like the str type). The latter (str-like) approach will ease porting
>>> code from Python 2.x; but it makes it harder to extract values from a
>>> bytes array.
>
> On 9/26/07, Brett Cannon <[EMAIL PROTECTED]> wrote:
>> How much do you care about making the 2 -> 3 transition easy? If you
>> don't go the str way then comparisons like ``bytes_[0] == b"A"`` won't
>> work unless you allow comparisons between ints and length 1
>> bytes/buffers. Extracting a single item is not horrendous if you pass
>> it to int().
>>
>> Personally I say go with the list-like semantics. Having the
>> following code return false seems odd (but not ridiculous) to me::
>>
>>     stuff = bytes([0, 1])
>>     stuff[1] = 42
>>     stuff[1] == 42
>>
>> So unless int comparisons are allowed I am -0 on the str-like semantics.
>
> int comparisons would stick out like a sore thumb, especially since
> they can only be reasonably made to work on 1-byte strings.
>
> I'm still undecided (despite Marcin's eloquent argument for ints as
> bytes) but I'm open for votes for this case.

Accepting an iterable of integers in the constructor strongly suggests
that a byte sequence contains integers between 0 and 255 inclusive, not
length 1 byte sequences.

And I think that's the cleanest conceptual model for them as well. A
byte sequence doesn't contain length 1 byte sequences, it contains bytes
(i.e. numbers between 0 and 255 inclusive).
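
To make that model concrete, here is a minimal sketch. It assumes the list-like semantics argued for above, where indexing and iteration yield small ints and only slicing yields a byte sequence; the variable names are purely illustrative.

```python
# Build a byte sequence from an iterable of ints in range(256).
data = bytes([120, 121, 122])

# Assumption: list-like semantics, so indexing gives back an int.
first = data[0]

# Iteration yields the same ints that went into the constructor.
values = list(data)

# Slicing, by contrast, gives back a byte sequence of length 1.
head = data[0:1]
```

Under this model the constructor and iteration round-trip cleanly: what goes in as ints comes back out as ints.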

For direct comparison, a slice works fine:

    if data[0:1] == b'x':
        print("Starts with x!")

The only problematic case is something like iterating over a byte
sequence, where we have an integer and want to compare it to a length
1 byte string. With just the simple conceptual model, we would have to
write one of:

    if val == b'x'[0]:
    if bytes([val]) == b'x':
    if val == ord(b'x'):
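
As a quick check that the three spellings agree, here is a small sketch. It assumes iteration over a byte sequence yields ints; the helper name spellings() is made up for this example.

```python
def spellings(val):
    """Evaluate all three comparisons for a single int taken from a byte sequence."""
    return (
        val == b'x'[0],        # index the literal to get an int
        bytes([val]) == b'x',  # wrap the int back into a length 1 byte sequence
        val == ord(b'x'),      # ord() of a length 1 byte sequence gives its int value
    )

# Assumption: iterating over a byte sequence yields ints.
results = [spellings(val) for val in b'axb']
```

Each tuple in results is either all True or all False, so the choice between the three is purely one of taste.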

I don't think it's worth breaking the conceptual model of the data type
just to reduce the simplest spelling of that comparison by 3 characters.

However, I do think it may be worth having an additional iterator on
bytes and buffer objects:

    def fragments(self, size=1):  # Could do with a better name
        # Step by size so that consecutive fragments do not overlap
        for i in range(0, len(self), size):
            yield self[i:i+size]

Then the problematic example could be written:

    for val in data.fragments():
        if val == b'x':
            print("Found an x!")
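
Since fragments() is only a suggestion and not an existing method, the idea can be tried out as a free function. This is a rough sketch, assuming that non-overlapping chunks are what is wanted when size is larger than 1; the function name and signature are hypothetical.

```python
def fragments(data, size=1):
    """Yield successive non-overlapping slices of data, each of length size.

    Hypothetical helper: the last fragment may be shorter than size.
    """
    for i in range(0, len(data), size):
        yield data[i:i + size]

# Each fragment is a length 1 byte sequence, so it compares directly to b'x'.
found = [frag for frag in fragments(b'axbx') if frag == b'x']

# With a larger size, the input is split into consecutive chunks.
pairs = list(fragments(b'abcde', 2))
```

Because every fragment is a slice, the comparison against b'x' needs no int conversion at all.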

Cheers,
Nick.

--
Nick Coghlan | [EMAIL PROTECTED] | Brisbane, Australia
---------------------------------------------------------------
http://www.boredomandlaziness.org