deamontg opened a new issue #9160:
URL: https://github.com/apache/arrow/issues/9160
I'm trying to read a Parquet file using pyarrow's `read_table()`, and I would
like to keep only the rows where a column is None (null). I've tried the following:
```
import pyarrow as pa
import pyarrow.parquet as pq
table = pa.Table.from_arrays([[None, None, 'a', 'b', 'a', 'c']],
                             names=['column'])
pq.write_table(table, 'data.parquet')
table = pq.read_table('data.parquet', filters=[[('column', '=', None)]])
```
This example does not work, as the read-in table contains no records. How
can I properly filter a column for None values when reading a table?