The GitHub Actions job "Flink CI" on iceberg.git/fix/prune-columns-nested-fields has failed. Run started by GitHub user IgorBerman (triggered by IgorBerman).
Head commit for run:
610fd128c1a3388a14696e3706570c380b51bc16 / Igor Berman <[email protected]>
Parquet: Fix nested field pruning in PruneColumns

When pruning nested structures (lists, maps, structs), the PruneColumns
visitor was incorrectly returning the original unpruned field when the
container's field ID was in the selectedIds set, even when child fields
had been pruned.

This fix ensures that:
1. In struct(): When a field is selected and has been pruned
   (field != originalField), use the pruned version instead of the original.
2. In list(): Check for pruned element first before checking if elementId
   is selected, ensuring nested pruning is applied.
3. In map(): Similarly check for pruned value before checking selected
   keys/values.
4. Add validatePrunedField() to verify pruned fields maintain compatibility
   with original fields (same name, ID, and repetition).

This enables proper column pruning for deeply nested schemas like:
  list<struct<field1, nested_list: list<struct<a, b, c, d>>>>

When projecting only field1 and nested_list[].a, b, the fix ensures
fields c and d are properly pruned from the Parquet projection schema.

Note: When a struct is explicitly selected (SELECT struct_field,
struct_field.sub_field), the full struct is preserved because
field == originalField in that case.

Report URL: https://github.com/apache/iceberg/actions/runs/19920312914

With regards,
GitHub Actions via GitBox
