[ https://issues.apache.org/jira/browse/PARQUET-1685?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16961207#comment-16961207 ]
Xinli Shang commented on PARQUET-1685:
--------------------------------------

Sounds good, [~gszadovszky] and [~rdblue]. I will broadcast this on the 'dev' mailing list to see whether there are any objections, or whether anyone knows of an application that relies on the actual min/max values.

> Truncate the stored min and max for String statistics to reduce the footer size
> --------------------------------------------------------------------------------
>
>                 Key: PARQUET-1685
>                 URL: https://issues.apache.org/jira/browse/PARQUET-1685
>             Project: Parquet
>          Issue Type: Improvement
>          Components: parquet-mr
>    Affects Versions: 1.10.1
>            Reporter: Xinli Shang
>            Assignee: Xinli Shang
>            Priority: Major
>             Fix For: 1.12.0
>
>
> Iceberg has a cool feature that truncates the stored min and max statistics to
> minimize the metadata size. We can borrow it to truncate them in Parquet as well,
> reducing the size of the footer, or even the page header. Here is the code in
> Iceberg:
> [https://github.com/apache/incubator-iceberg/blob/master/api/src/main/java/org/apache/iceberg/util/UnicodeUtil.java].
>

--
This message was sent by Atlassian Jira
(v8.3.4#803005)
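
For illustration, the truncation idea in the issue description can be sketched as below. This is a minimal sketch and not the Iceberg or parquet-mr code: the class `StatsTruncation`, the methods `truncateMin` and `truncateMax`, and the `maxCodePoints` parameter are all hypothetical names. It assumes well-formed strings, a non-negative length limit, and ordering by Unicode code point. The stored min can simply be a prefix, since a prefix never sorts after the original. The stored max has to be a prefix whose last code point is incremented so that it still sorts at or above the original.

```java
final class StatsTruncation {
    private StatsTruncation() {}

    /** Lower bound: keep the first maxCodePoints code points (a prefix sorts <= the original). */
    public static String truncateMin(String value, int maxCodePoints) {
        if (value.codePointCount(0, value.length()) <= maxCodePoints) {
            return value; // already short enough, store as-is
        }
        return value.substring(0, value.offsetByCodePoints(0, maxCodePoints));
    }

    /** Upper bound: truncate, then increment the last code point that can be incremented. */
    public static String truncateMax(String value, int maxCodePoints) {
        if (value.codePointCount(0, value.length()) <= maxCodePoints) {
            return value; // already short enough, store as-is
        }
        StringBuilder sb = new StringBuilder(
            value.substring(0, value.offsetByCodePoints(0, maxCodePoints)));
        // Walk backwards until a code point is found that has a valid successor.
        for (int i = maxCodePoints - 1; i >= 0; i--) {
            int offset = sb.offsetByCodePoints(0, i);
            int next = sb.codePointAt(offset) + 1;
            if (next == Character.MIN_SURROGATE) {
                next = Character.MAX_SURROGATE + 1; // skip the surrogate range
            }
            if (next <= Character.MAX_CODE_POINT) {
                sb.setLength(offset);      // drop everything from this position on
                sb.appendCodePoint(next);  // replace it with the incremented code point
                return sb.toString();
            }
        }
        // Assumption: when no shorter upper bound exists, keep the original value.
        return value;
    }
}
```

With a limit of 3 code points, `truncateMin("parquet", 3)` gives `"par"` and `truncateMax("parquet", 3)` gives `"pas"`, so the real value still lies between the stored bounds. Readers that use the statistics only for filtering stay correct, while an application that expects the exact min/max value would see a difference, which is the concern raised in the comment above.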