[ https://issues.apache.org/jira/browse/ARROW-7907?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17056573#comment-17056573 ]
Wes McKinney commented on ARROW-7907:
-------------------------------------

This looks like it was fixed in https://github.com/apache/arrow/commit/6ff156972ac426ef88b1e6674b975a6c61ef852d. I'll add a unit test to exercise the 0-length slice path.

> [Python] Conversion to pandas of empty table with timestamp type aborts
> -----------------------------------------------------------------------
>
>                 Key: ARROW-7907
>                 URL: https://issues.apache.org/jira/browse/ARROW-7907
>             Project: Apache Arrow
>          Issue Type: Improvement
>          Components: Python
>            Reporter: Joris Van den Bossche
>            Priority: Major
>             Fix For: 0.17.0
>
>
> Creating an empty table:
> {code}
> In [1]: table = pa.table({'a': pa.array([], type=pa.timestamp('us'))})
> In [2]: table['a']
> Out[2]:
> <pyarrow.lib.ChunkedArray object at 0x7fbb783e8098>
> [
>   []
> ]
> In [3]: table.to_pandas()
> Out[3]:
> Empty DataFrame
> Columns: [a]
> Index: []
> {code}
> the above works. But the ChunkedArray still has 1 empty chunk. When filtering
> data, you can actually get no chunks, and this fails:
> {code}
> In [4]: table2 = table.slice(0, 0)
> In [5]: table2['a']
> Out[5]:
> <pyarrow.lib.ChunkedArray object at 0x7fbb783aa4a8>
> [
> ]
> In [6]: table2.to_pandas()
> ../src/arrow/table.cc:48: Check failed: (chunks.size()) > (0) cannot construct ChunkedArray from empty vector and omitted type
> ...
> Aborted (core dumped)
> {code}
> and this seems to happen specifically for timestamp type, and specifically
> with non-ns unit (e.g. with us as above, which is the default in arrow).
> I noticed this when reading a parquet file of the taxi dataset, where the
> filter I used resulted in an empty batch.

--
This message was sent by Atlassian Jira
(v8.3.4#803005)