[GitHub] [parquet-format] tustvold commented on a diff in pull request #197: PARQUET-2261: add statistics for better estimating unencoded/uncompressed sizes and finer grained filtering

2023-09-07 Thread via GitHub
tustvold commented on code in PR #197: URL: https://github.com/apache/parquet-format/pull/197#discussion_r1319059263 ## src/main/thrift/parquet.thrift: ## @@ -191,6 +191,73 @@ enum FieldRepetitionType { REPEATED = 2; } +/** + * A histogram of repetition and definition

[GitHub] [parquet-format] tustvold commented on a diff in pull request #197: PARQUET-2261: add statistics for better estimating unencoded/uncompressed sizes and finer grained filtering

2023-09-07 Thread via GitHub
tustvold commented on code in PR #197: URL: https://github.com/apache/parquet-format/pull/197#discussion_r1318957040 ## src/main/thrift/parquet.thrift: ## @@ -191,6 +191,73 @@ enum FieldRepetitionType { REPEATED = 2; } +/** + * A histogram of repetition and definition

[GitHub] [parquet-format] tustvold commented on a diff in pull request #197: PARQUET-2261: add statistics for better estimating unencoded/uncompressed sizes and finer grained filtering

2023-09-07 Thread via GitHub
tustvold commented on code in PR #197: URL: https://github.com/apache/parquet-format/pull/197#discussion_r1318386752 ## src/main/thrift/parquet.thrift: ## @@ -191,6 +191,73 @@ enum FieldRepetitionType { REPEATED = 2; } +/** + * A histogram of repetition and definition

[GitHub] [parquet-format] tustvold commented on a diff in pull request #197: PARQUET-2261: add statistics for better estimating unencoded/uncompressed sizes and finer grained filtering

2023-09-07 Thread via GitHub
tustvold commented on code in PR #197: URL: https://github.com/apache/parquet-format/pull/197#discussion_r1318362226 ## src/main/thrift/parquet.thrift: ## @@ -191,6 +191,73 @@ enum FieldRepetitionType { REPEATED = 2; } +/** + * A histogram of repetition and definition

[GitHub] [parquet-format] tustvold commented on a diff in pull request #197: PARQUET-2261: add statistics for better estimating unencoded/uncompressed sizes and finer grained filtering

2023-09-07 Thread via GitHub
tustvold commented on code in PR #197: URL: https://github.com/apache/parquet-format/pull/197#discussion_r1318314149 ## src/main/thrift/parquet.thrift: ## @@ -529,7 +596,15 @@ struct DataPageHeader { /** Encoding used for repetition levels **/ 4: required Encoding