CC: @wesm 
This is very interesting. I thought Python bytes were immutable (hence the copy
assumption), but IMHO pyarrow plays with fire here:
```
>>> import pyarrow as pa
>>> import numpy as np
>>> a = b'123456'
>>> a[0] = 77 # bytes are immutable
...
TypeError: 'bytes' object does not support item assignment
>>> b = pa.py_buffer(a)  # but with pyarrow bytes can be changed in-situ
>>> arr = np.frombuffer(b, dtype=np.uint8)
>>> arr
array([49, 50, 51, 52, 53, 54], dtype=uint8)
>>> arr[0] = 66 # change immutable 'a' in place
>>> a
b'B23456'
>>> hash(a)
-4581532003987476523
>>> arr[0] = 77
>>> a
b'M23456'
>>> hash(a) # hash is unchanged even though the contents differ
-4581532003987476523
```
But note that numpy respects immutability:
```
>>> arr2 = np.frombuffer(a, dtype=np.uint8)
>>> arr2
array([77, 50, 51, 52, 53, 54], dtype=uint8)
>>> arr2[0] = 88
...
ValueError: assignment destination is read-only
```
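
For what it's worth, my understanding is that the difference comes down to the read-only flag of the buffer protocol: `bytes` exports a read-only buffer, and `np.frombuffer` honours that flag. Below is a minimal stdlib-only sketch of that flag, plus the defensive option of writing to a copy instead. The name `scratch` is just illustrative, and I am assuming (not asserting) that this flag is what gets lost on the pyarrow path above.

```python
# Stdlib-only sketch: how the buffer protocol advertises immutability.
a = b'123456'

# bytes objects export a read-only buffer.
view = memoryview(a)
print(view.readonly)

# Writing through a read-only view is rejected with a TypeError.
try:
    view[0] = 77
except TypeError as exc:
    print("write rejected:", exc)

# A consumer that needs to write can work on its own mutable copy,
# which leaves the original bytes object untouched.
scratch = bytearray(a)
memoryview(scratch)[0] = 77
print(scratch, a)
```

A consumer could check `readonly` before handing out a writable array, or fall back to a copy as shown with `scratch`.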

[ Full content available at: https://github.com/apache/arrow/pull/2536 ]