[ https://issues.apache.org/jira/browse/ARROW-8010?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Joris Van den Bossche updated ARROW-8010:
-----------------------------------------
    Summary: [Python] Fixed size list not convertible to Numpy Array / pandas Series  (was: [Python] Fixed size list not convertible to Numpy Array)

> [Python] Fixed size list not convertible to Numpy Array / pandas Series
> -----------------------------------------------------------------------
>
>                 Key: ARROW-8010
>                 URL: https://issues.apache.org/jira/browse/ARROW-8010
>             Project: Apache Arrow
>          Issue Type: Improvement
>          Components: Python
>    Affects Versions: 0.16.0
>         Environment: Ubuntu 19.10 + python 3.7
>            Reporter: Paul Balanca
>            Priority: Major
>
> Fixed size lists of base types (e.g. int, float, ...) are not convertible to
> a Numpy array.
> The following code:
> {code:java}
> import pyarrow as pa
> t = pa.list_(pa.float32(), 2)
> arr = pa.array([[1, 2], [3, 4], [5, 6]], type=t)
> arr.to_numpy(){code}
> raises a not-implemented Arrow error, as there is no Pandas block equivalent.
> It sounds reasonable that the conversion to Pandas fails, but I would expect
> a natural conversion to a Numpy array since, according to the Fixed Size List
> Layout ([https://arrow.apache.org/docs/format/Columnar.html#]), the former
> could be mapped to a 2-dimensional row-major matrix (e.g. 3x2 in the previous
> example).
> Note that we can get the expected result with a workaround based on flatten:
> {code:java}
> arr.flatten().to_numpy().reshape((-1, t.list_size)){code}
> This form of memory representation is quite natural if one wants to use
> Apache Arrow for in-memory collections of 2D/3D points, where we wish to have
> coordinates contiguous in memory.

--
This message was sent by Atlassian Jira
(v8.3.4#803005)