Joris Van den Bossche created ARROW-9459:
--------------------------------------------

             Summary: [C++][Dataset] Make collecting/parsing statistics optional for ParquetFragment
                 Key: ARROW-9459
                 URL: https://issues.apache.org/jira/browse/ARROW-9459
             Project: Apache Arrow
          Issue Type: Improvement
          Components: C++
            Reporter: Joris Van den Bossche


See some timing checks here: 
https://github.com/dask/dask/pull/6346#issuecomment-656548675

Parsing all statistics, even from a centralized {{_metadata}} file, can be quite 
expensive. If you know in advance that you will not use them (e.g. you 
only filter on the partition fields and otherwise read all 
data), it would be useful to have an option to disable parsing statistics.

cc [~rjzamora] [~bkietz] [~fsaintjacques]



