[ https://issues.apache.org/jira/browse/ARROW-9456?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17157216#comment-17157216 ]
Antoine Pitrou commented on ARROW-9456:
---------------------------------------

cc [~jorisvandenbossche]

> [Python] Dataset segfault when not importing pyarrow.parquet
> -------------------------------------------------------------
>
>                 Key: ARROW-9456
>                 URL: https://issues.apache.org/jira/browse/ARROW-9456
>             Project: Apache Arrow
>          Issue Type: Bug
>          Components: Python
>            Reporter: Maarten Breddels
>            Priority: Major
>             Fix For: 1.0.0
>
>
> To reproduce:
> # import pyarrow.parquet  # if we skip this...
> import pyarrow as pa
> import pyarrow.dataset as ds
> import glob
> ds = pa.dataset.dataset('/data/taxi_parquet/data_0.parquet')
> ds.to_table()  # this will crash
>
> $ python pyarrow/crash.py dev
> terminate called after throwing an instance of 'parquet::ParquetException'
>   what():  The file only has 19 columns, requested metadata for column: 1049198736
> [1]    1559395 abort (core dumped)  python pyarrow/crash.py
>
> When the import is there, it will work fine.
>

--
This message was sent by Atlassian Jira
(v8.3.4#803005)