[ https://issues.apache.org/jira/browse/ARROW-6642?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Joris Van den Bossche reassigned ARROW-6642:
--------------------------------------------

    Assignee: Joris Van den Bossche

> [Python] chained access of ParquetDataset's metadata segfaults
> --------------------------------------------------------------
>
>                 Key: ARROW-6642
>                 URL: https://issues.apache.org/jira/browse/ARROW-6642
>             Project: Apache Arrow
>          Issue Type: Bug
>          Components: Python
>            Reporter: Joris Van den Bossche
>            Assignee: Joris Van den Bossche
>            Priority: Major
>              Labels: parquet, pull-request-available
>             Fix For: 0.15.0
>
>          Time Spent: 10m
>  Remaining Estimate: 0h
>
> Creating and reading a Parquet dataset:
> {code}
> import pyarrow as pa
> import pyarrow.parquet as pq
> table = pa.table({'a': [1, 2, 3]})
> pq.write_table(table, '__test_statistics_segfault.parquet')
> dataset = pq.ParquetDataset('__test_statistics_segfault.parquet')
> dataset_piece = dataset.pieces[0]
> {code}
> If you access the metadata and a column's statistics in steps, this works fine:
> {code}
> meta = dataset_piece.get_metadata()
> row = meta.row_group(0)
> col = row.column(0)
> {code}
> but doing it chained in one step segfaults:
> {code}
> dataset_piece.get_metadata().row_group(0).column(0)
> {code}
> {{dataset_piece.get_metadata().row_group(0)}} still works, but appending {{.column(0)}} to it segfaults.

--
This message was sent by Atlassian Jira
(v8.3.4#803005)