[ https://issues.apache.org/jira/browse/ARROW-2057?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Uwe L. Korn updated ARROW-2057:
-------------------------------
    Description: 
It would be useful to be able to set the size of data pages (within Parquet column chunks) from Python. The current default is set to 1MiB at https://github.com/apache/parquet-cpp/blob/0875e43010af485e1c0b506d77d7e0edc80c66cc/src/parquet/properties.h#L81. It might be useful in some situations to lower this for more granular access. We should provide this value as a parameter to {{pyarrow.parquet.write_table}}.

  was:
It would be useful to be able to set the size of data pages (within Parquet column chunks) from Python

> [Python] Configure size of data pages in pyarrow.parquet.write_table
> --------------------------------------------------------------------
>
>                 Key: ARROW-2057
>                 URL: https://issues.apache.org/jira/browse/ARROW-2057
>             Project: Apache Arrow
>          Issue Type: Improvement
>          Components: Python
>            Reporter: Wes McKinney
>            Assignee: Uwe L. Korn
>            Priority: Major
>              Labels: beginner
>             Fix For: 0.10.0