CC: @wesm

This is very interesting. I thought Python bytes are immutable (hence the copy assumption) but pyarrow plays with fire here, IMHO:

```
>>> import pyarrow as pa
>>> import numpy as np
>>> a = b'123456'
>>> a[0] = 77  # bytes are immutable
...
TypeError: 'bytes' object does not support item assignment
>>> b = pa.py_buffer(a)  # but with pyarrow bytes can be changed in-situ
>>> arr = np.frombuffer(b, dtype=np.uint8)
>>> arr
array([49, 50, 51, 52, 53, 54], dtype=uint8)
>>> arr[0] = 66  # change immutable 'a' in place
>>> a
b'B23456'
>>> hash(a)
-4581532003987476523
>>> arr[0] = 77
>>> a
b'M23456'
>>> hash(a)  # hash value is constant
-4581532003987476523
```

But note that numpy respects immutability:

```
>>> arr2 = np.frombuffer(a, dtype=np.uint8)
>>> arr2
array([77, 50, 51, 52, 53, 54], dtype=uint8)
>>> arr2[0] = 88
...
ValueError: assignment destination is read-only
```
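For comparison, here is a minimal stdlib-only sketch of the read-only flag that the buffer protocol exposes, which is presumably what numpy honours in the second snippet above. The helper name `probe_buffer` is made up for illustration and is not part of pyarrow or numpy. It only shows that `bytes` exports a read-only buffer, while an explicit `bytearray` copy is safely writable:

```python
def probe_buffer(obj):
    """Return (readonly_flag, write_succeeded) for a buffer-protocol object.

    Hypothetical helper for illustration only.
    """
    view = memoryview(obj)
    try:
        view[0] = 77  # attempt an in-place write through the buffer
        wrote = True
    except TypeError:
        # memoryview refuses writes when the exporter marks the buffer read-only
        wrote = False
    return view.readonly, wrote


immutable = b'123456'
mutable = bytearray(immutable)  # explicit copy, so 'immutable' is never touched

bytes_result = probe_buffer(immutable)      # read-only buffer, write rejected
bytearray_result = probe_buffer(mutable)    # writable buffer, write accepted
```

A consumer that propagates `readonly` the same way would reject the `arr[0] = 66` assignment instead of silently changing `a` underneath its cached hash.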
[ Full content available at: https://github.com/apache/arrow/pull/2536 ] This message was relayed via gitbox.apache.org for [email protected]
