[ https://issues.apache.org/jira/browse/IMPALA-10879?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Attila Jeges resolved IMPALA-10879.
-----------------------------------
Resolution: Implemented

> Add parquet stats to iceberg manifest
> -------------------------------------
>
> Key: IMPALA-10879
> URL: https://issues.apache.org/jira/browse/IMPALA-10879
> Project: IMPALA
> Issue Type: Improvement
> Components: Backend, Frontend
> Affects Versions: Impala 4.0.0
> Reporter: Attila Jeges
> Assignee: Attila Jeges
> Priority: Major
> Labels: impala-iceberg
>
> Parquet stats should be written to the Iceberg manifest as per-datafile metrics.
> This task is specifically about the following metrics:
> - column_sizes : Map from column id to the total size on disk of all regions
>   that store the column. Does not include bytes necessary to read other
>   columns, like footers. Leave null for row-oriented formats.
> - null_value_counts : Map from column id to number of null values in the
>   column.
> - lower_bounds : Map from column id to lower bound in the column serialized
>   as binary. Each value must be less than or equal to all non-null, non-NaN
>   values in the column for the file.
> - upper_bounds : Map from column id to upper bound in the column serialized
>   as binary. Each value must be greater than or equal to all non-null, non-NaN
>   values in the column for the file.
>
> Iceberg manifest doc:
> https://iceberg.apache.org/spec/#manifests
>
> lower_bounds and upper_bounds values should be single-value serialized to
> binary:
> https://iceberg.apache.org/spec/#appendix-d-single-value-serialization
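
For readers unfamiliar with the single-value serialization mentioned above, here is a minimal Python sketch of how a lower or upper bound could be encoded for a few primitive types. It is an illustration only, not Impala's implementation: the function name serialize_bound, the type-name strings, and the column ids are hypothetical. The byte layouts (little-endian fixed-width numbers, UTF-8 strings without a length prefix) follow the Iceberg spec appendix linked above.

```python
import struct


def serialize_bound(type_name, value):
    """Encode one bound value as bytes (hypothetical helper, a few types only)."""
    if type_name == "boolean":
        return b"\x01" if value else b"\x00"      # 0x00 is false, non-zero is true
    if type_name in ("int", "date"):              # date is days since 1970-01-01
        return struct.pack("<i", value)           # 4 bytes, little-endian
    if type_name == "long":
        return struct.pack("<q", value)           # 8 bytes, little-endian
    if type_name == "float":
        return struct.pack("<f", value)           # 4-byte IEEE 754, little-endian
    if type_name == "double":
        return struct.pack("<d", value)           # 8-byte IEEE 754, little-endian
    if type_name == "string":
        return value.encode("utf-8")              # raw UTF-8 bytes, no length prefix
    raise ValueError("type not covered by this sketch: " + type_name)


# Per-datafile metric maps keyed by column id (ids 1 and 2 are made up).
lower_bounds = {1: serialize_bound("int", 3), 2: serialize_bound("string", "apple")}
upper_bounds = {1: serialize_bound("int", 42), 2: serialize_bound("string", "pear")}
```

Types such as decimal, timestamp, uuid, and binary have their own encodings in the spec and are deliberately left out of this sketch.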