Joris Van den Bossche created ARROW-7547:
--------------------------------------------

             Summary: [C++] [Python] [Dataset] Additional reader options in ParquetFileFormat
                 Key: ARROW-7547
                 URL: https://issues.apache.org/jira/browse/ARROW-7547
             Project: Apache Arrow
          Issue Type: Improvement
          Components: C++ - Dataset, Python
            Reporter: Joris Van den Bossche


[looking into using the datasets machinery in the current Python Parquet code]

In the current Python API, we expose several options that influence how a Parquet file is read (e.g. {{read_dictionary}}, which reads the specified BYTE_ARRAY columns directly into a dictionary type, or {{memory_map}} and {{buffer_size}}).

Those could be added to {{ParquetFileFormat}}.