Joris Van den Bossche created ARROW-5436:
--------------------------------------------

             Summary: [Python] expose filters argument in parquet.read_table
                 Key: ARROW-5436
                 URL: https://issues.apache.org/jira/browse/ARROW-5436
             Project: Apache Arrow
          Issue Type: Improvement
          Components: Python
            Reporter: Joris Van den Bossche
             Fix For: 0.14.0


Currently, the {{parquet.read_table}} function can be used both for reading a
single file (interface to ParquetFile) and for reading a directory (interface
to ParquetDataset).

ParquetDataset has some extra keywords such as {{filters}} that would be nice 
to expose through {{read_table}} as well.

Of course, one can always use {{ParquetDataset}} directly when its full power
is needed, but for pandas, which wraps pyarrow, it is easier to pass keywords
straight through to {{parquet.read_table}} than to choose between calling
{{read_table}} and {{ParquetDataset}}. Context: https://github.com/pandas-dev/pandas/issues/26551
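
To illustrate the request, here is a minimal sketch of the pass-through pattern. It is not pyarrow code: {{SingleFile}}, {{Directory}} and {{_read_dataset}} are hypothetical stand-ins for the two code paths, and this {{read_table}} is only a toy with the same name. The filter shape is modeled on the (column, op, value) tuples that {{ParquetDataset}} accepts. Raising an error when {{filters}} is given for a single file is an assumption of the sketch, not something this issue decides.

```python
import operator
from collections import namedtuple

# Hypothetical stand-ins for the two kinds of input; these are NOT pyarrow classes.
SingleFile = namedtuple("SingleFile", ["rows"])
# partitions: list of (partition_keys_dict, rows) pairs, e.g. one per year=2019/ directory
Directory = namedtuple("Directory", ["partitions"])

_OPS = {
    "=": operator.eq, "==": operator.eq, "!=": operator.ne,
    "<": operator.lt, "<=": operator.le, ">": operator.gt, ">=": operator.ge,
}


def _read_dataset(directory, filters=None):
    """Stand-in for the dataset path: skip partitions that fail any filter."""
    rows = []
    for keys, part_rows in directory.partitions:
        if all(_OPS[op](keys[col], value) for col, op, value in (filters or [])):
            rows.extend(part_rows)
    return rows


def read_table(source, filters=None):
    """Toy dispatcher: one entry point that forwards `filters` to the dataset path."""
    if isinstance(source, Directory):
        return _read_dataset(source, filters=filters)
    if filters is not None:
        # Assumption of this sketch: filters only make sense for directories.
        raise ValueError("filters are only supported for directories in this sketch")
    return list(source.rows)


data = Directory([({"year": 2018}, [1, 2]), ({"year": 2019}, [3])])
print(read_table(data, filters=[("year", "=", 2019)]))  # [3]
print(read_table(SingleFile([7, 8])))                   # [7, 8]
```

With a single entry point like this, a wrapper such as pandas only needs to forward its keyword arguments and does not have to decide which reader to call.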



