hcrosse opened a new issue, #15677: URL: https://github.com/apache/iceberg/issues/15677
### Feature Request / Improvement

The `write.parquet.*` table properties cover most Parquet writer settings (compression, page size, etc.), but there is no property to control the [DataPage version](https://github.com/apache/parquet-format/blob/master/src/main/thrift/parquet.thrift#L653-L657) (V1 vs V2). Not all consumers fully support V2, leading to incompatibilities when different Iceberg writers produce files with different page versions.

iceberg-java exposes control over this via [`WriteBuilder.writerVersion()`](https://github.com/apache/iceberg/blob/main/parquet/src/main/java/org/apache/iceberg/parquet/Parquet.java#L268). iceberg-go [hardcodes V2](https://github.com/apache/iceberg-go/blob/main/table/internal/parquet_files.go#L210), while pyiceberg and iceberg-rust inherit their Arrow library defaults (V1 in both cases).

A standard table property would let this be configured at the table level, rather than relying on implementation-specific APIs.

Proposed property:

| Property | Default | Description |
|----------|---------|-------------|
| `write.parquet.page-version` | Suggested: `1` | Parquet data page version. Supported values: `1` (DataPage V1), `2` (DataPage V2). |

A default of `1` would match the current output of iceberg-java, pyiceberg, and iceberg-rust. This follows the existing pattern of `write.parquet.*` properties for compression, page size, etc. Table-level configuration would enable more consistent output across writers that adopt it.

Also, let me know if this should be moved to a proposal: the scope seemed small, but it is a spec change.

### Query engine

None

### Willingness to contribute

- [x] I can contribute this improvement/feature independently
- [x] I would be willing to contribute this improvement/feature with guidance from the Iceberg community
- [ ] I cannot contribute this improvement/feature at this time
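For illustration, here is a rough sketch of how an Arrow-based writer might resolve the proposed property. The property name and default come from the table above. The helper `resolve_data_page_version` and its constants are hypothetical and are not part of pyiceberg or any other Iceberg implementation. The returned strings are meant to match the `"1.0"` / `"2.0"` values that `pyarrow.parquet.write_table` accepts for its `data_page_version` argument.

```python
from typing import Mapping

# Proposed in this issue; not an existing Iceberg table property.
PAGE_VERSION_PROPERTY = "write.parquet.page-version"
PAGE_VERSION_DEFAULT = "1"

# Maps the proposed property values to the strings pyarrow expects
# for the `data_page_version` argument of `pyarrow.parquet.write_table`.
_PYARROW_DATA_PAGE_VERSION = {"1": "1.0", "2": "2.0"}


def resolve_data_page_version(properties: Mapping[str, str]) -> str:
    """Return the pyarrow data page version for a table's properties.

    Hypothetical helper: falls back to the suggested default of `1`
    when the property is absent and rejects unsupported values.
    """
    raw = str(properties.get(PAGE_VERSION_PROPERTY, PAGE_VERSION_DEFAULT)).strip()
    try:
        return _PYARROW_DATA_PAGE_VERSION[raw]
    except KeyError:
        raise ValueError(
            f"Unsupported {PAGE_VERSION_PROPERTY}: {raw!r} (expected '1' or '2')"
        ) from None
```

A writer could then pass the result through, for example `pq.write_table(table, path, data_page_version=resolve_data_page_version(props))`, so that the table property rather than the library default decides the page version.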
To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at: [email protected]
