Ben Epstein created ARROW-14320:
-----------------------------------

             Summary: Pyarrow array to_numpy array corrupts numpy dtype
                 Key: ARROW-14320
                 URL: https://issues.apache.org/jira/browse/ARROW-14320
             Project: Apache Arrow
          Issue Type: Bug
          Components: Python
    Affects Versions: 5.0.0
            Reporter: Ben Epstein

When converting a one-dimensional array to numpy, the dtype is preserved:
{code:java}
import pyarrow as pa

x = pa.array([.234, .345, .456])
x.to_numpy().dtype  # dtype('float64')
{code}
But when doing the same for a nested (multi-dimensional) array, the dtype is lost *and cannot be set manually*:
{code:java}
import numpy as np
import pyarrow as pa

x = pa.array([[1, 2, 3], [4, 5, 6]]).to_numpy(zero_copy_only=False)
print(x.dtype)  # object
x.astype(np.float64)  # ValueError: setting an array element with a sequence.
{code}
In other words, numpy sees a 1-D object array whose elements are themselves arrays, not a uniform 2-D array. The only way I have found to get the proper dtype is to convert the result to a Python list and then back to a numpy array.

Is there another way to achieve this? I know that pyarrow doesn't support converting ndarrays with ndim>1 to Arrow (https://issues.apache.org/jira/browse/ARROW-5645), but I was curious whether this can be achieved going the other way.