Gabe Joseph created ARROW-4675:
----------------------------------
Summary: [Python] Error serializing bool ndarray in py2 and
deserializing in py3
Key: ARROW-4675
URL: https://issues.apache.org/jira/browse/ARROW-4675
Project: Apache Arrow
Issue Type: Bug
Components: Python
Affects Versions: 0.12.0
Environment: * pyarrow 0.12.0
* numpy 1.16.1
* Python 3.7.0, 2.7.15
* (macOS 10.13.6)
Reporter: Gabe Joseph
{{np.bool}} is the only dtype I've found that causes this issue. Both empty and
non-empty arrays cause it.
The issue only manifests when going from py2 to py3; serializing and
deserializing within the same Python version succeeds, as does serializing in
py3 and deserializing in py2.
This appears to be because a Python 2 {{str}} is deserialized in Python 3 as
{{bytes}}; the value would need to be {{unicode}} on the py2 side to come back
as {{str}} in py3. I suspect something in the serialization implementation
writes the dtype (only for bool arrays?) as a {{str}}, but I haven't dug into
it yet.
{code:bash}
(two)bash-3.2$ python cereal.py
(two)bash-3.2$ cat cereal.py
# Python 2
import numpy as np
import pyarrow as pa
data = np.array([], dtype=np.dtype('bool'))
buf = pa.serialize(data).to_buffer()
outstream = pa.output_stream("buffer")
outstream.write(buf)
outstream.close()
# ...switch to python 3 venv...
(three)bash-3.2$ cat decereal.py
# Python 3
import numpy as np
import pyarrow as pa
instream = pa.input_stream("buffer")
buf = instream.read()
data = pa.deserialize(buf)
print(data)
(three)bash-3.2$ python3 decereal.py
Traceback (most recent call last):
File "decereal.py", line 10, in <module>
data = pa.deserialize(buf)
File "pyarrow/serialization.pxi", line 448, in pyarrow.lib.deserialize
File "pyarrow/serialization.pxi", line 411, in pyarrow.lib.deserialize_from
File "pyarrow/serialization.pxi", line 262, in
pyarrow.lib.SerializedPyObject.deserialize
File "pyarrow/serialization.pxi", line 175, in
pyarrow.lib.SerializationContext._deserialize_callback
TypeError: can only concatenate str (not "bytes") to str
{code}
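For illustration, the failure mode suggested by the traceback can be reproduced in Python 3 without pyarrow. This is only a sketch under assumptions: {{describe_dtype}} and {{to_text}} are hypothetical helpers, not pyarrow functions, and which value inside {{_deserialize_callback}} actually arrives as {{bytes}} has not been confirmed.

```python
# Python 3 only. Hypothetical stand-ins; none of this is pyarrow code.

def describe_dtype(dtype_name):
    # Concatenates a str literal with the given name, which fails
    # with TypeError if dtype_name is bytes instead of str.
    return "dtype: " + dtype_name


def to_text(value, encoding="ascii"):
    # Decode bytes (what a Python 2 str becomes after the round trip)
    # into a Python 3 str; leave real str values untouched.
    if isinstance(value, bytes):
        return value.decode(encoding)
    return value


try:
    describe_dtype(b"bool")  # bytes, as if written by a Python 2 str
    error_message = None
except TypeError as exc:
    error_message = str(exc)

print(error_message)
print(describe_dtype(to_text(b"bool")))
```

Decoding on the reading side is one possible workaround; writing the value as {{unicode}} on the py2 side, as suggested above, would avoid the mismatch at the source.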