kosiew opened a new pull request, #21284:
URL: https://github.com/apache/datafusion/pull/21284

   ## Which issue does this PR close?
   
   * Part of #20835
   
   ---
   
   ## Rationale for this change
   
   DataFusion already supports recursive schema evolution for several nested container types (e.g., List, LargeList, ListView, Dictionary), but `FixedSizeList` containing `Struct` values was missing equivalent support.
   
   This gap led to inconsistent behavior where:
   
   * Planner and runtime casting logic were not aligned
   * Valid schema evolution scenarios (e.g., additive nullable fields) were not supported
   * Error handling for incompatible changes was incomplete or inconsistent
   
   This PR addresses these issues by introducing full support for `FixedSizeList<Struct>` in both planning and execution paths, ensuring consistent and predictable behavior.
   
   ---
   
   ## What changes are included in this PR?
   
   * **Runtime casting support**
   
     * Added `cast_fixed_size_list_column` to recursively cast `FixedSizeList` inner values
     * Enforces size equality between source and target lists
     * Reuses existing nested struct casting logic for inner fields
   
   * **Refactoring and reuse**
   
     * Introduced `downcast_list_array` helper to unify array downcasting logic across list types
     * Simplified existing `List` and `ListView` casting implementations
   
   * **Validation enhancements**
   
     * Extended `validate_data_type_compatibility` to:
   
       * Enforce fixed-size equality before validating nested fields
       * Ensure consistent failure modes for incompatible schema evolution
   
   * **Planner/runtime parity**
   
     * Updated `requires_nested_struct_cast` to include `FixedSizeList`
     * Ensured planner checks mirror runtime behavior
     * Clarified intent via comments in planner and runtime paths
   
   * **Test coverage**
   
     * Added unit tests for:
   
       * Incompatible nested type changes
       * Non-nullable field additions
       * Fixed-size mismatch failures
       * `requires_nested_struct_cast` behavior
     * Added sqllogictest coverage for:
   
       * Additive nullable fields
       * Null and all-null cases
       * Field reordering
       * Incompatible nested type failures
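
   The validation and planner checks described above can be illustrated with a small, self-contained Python model. This is only a sketch under stated assumptions, not DataFusion's Rust implementation: the classes `Primitive`, `Field`, `Struct`, and `FixedSizeList` and the functions `needs_nested_struct_cast` and `validate_compatibility` are hypothetical stand-ins, and leaf types are treated as compatible only when identical, which is stricter than a real cast matrix.

```python
from dataclasses import dataclass
from typing import Tuple, Union


@dataclass(frozen=True)
class Primitive:
    name: str  # e.g. "int32" or "utf8" (illustrative only)


@dataclass(frozen=True)
class Field:
    name: str
    dtype: "DataType"
    nullable: bool = True


@dataclass(frozen=True)
class Struct:
    fields: Tuple[Field, ...]


@dataclass(frozen=True)
class FixedSizeList:
    item: "DataType"
    size: int


DataType = Union[Primitive, Struct, FixedSizeList]


def needs_nested_struct_cast(source: DataType, target: DataType) -> bool:
    """Hypothetical planner-side check: is a struct-aware cast required?"""
    if isinstance(source, Struct) and isinstance(target, Struct):
        return True
    if isinstance(source, FixedSizeList) and isinstance(target, FixedSizeList):
        # Look through the container at the element type.
        return needs_nested_struct_cast(source.item, target.item)
    return False


def validate_compatibility(source: DataType, target: DataType) -> None:
    """Hypothetical validation: raise ValueError on incompatible evolution."""
    if isinstance(source, FixedSizeList) and isinstance(target, FixedSizeList):
        # Size equality is checked before any nested field is inspected.
        if source.size != target.size:
            raise ValueError(
                f"FixedSizeList size mismatch: {source.size} vs {target.size}"
            )
        validate_compatibility(source.item, target.item)
    elif isinstance(source, Struct) and isinstance(target, Struct):
        by_name = {f.name: f for f in source.fields}
        for tf in target.fields:
            sf = by_name.get(tf.name)
            if sf is None:
                # A field missing from the source can only be backfilled
                # with nulls, so it must be nullable in the target.
                if not tf.nullable:
                    raise ValueError(f"cannot add non-nullable field '{tf.name}'")
            else:
                validate_compatibility(sf.dtype, tf.dtype)
    elif source != target:
        # Assumption: leaf types must match exactly in this model.
        raise ValueError(f"incompatible types: {source} vs {target}")
```

   In this model, a `FixedSizeList` of size 2 whose struct gains a nullable field passes validation, while a size change, a non-nullable addition, or a changed leaf type raises `ValueError`.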
   
   ---
   
   ## Are these changes tested?
   
   Yes.
   
   * **Unit tests** validate:
   
     * Failure cases (size mismatch, incompatible types, non-nullable additions)
     * Correct detection of nested struct casting requirements
   
   * **End-to-end sqllogictests** validate:
   
     * Real-world schema evolution scenarios for `FixedSizeList<Struct>`
     * Additive evolution with null backfilling
     * Field reordering compatibility
     * Proper error propagation for incompatible schemas
   
   These tests cover both the previous failure modes and the correct behavior after this change.
   
   ---
   
   ## Are there any user-facing changes?
   
   Yes.
   
   * Users can now perform schema evolution on Parquet data involving `FixedSizeList<Struct>` columns.
   
   * Supported scenarios include:
   
     * Adding nullable fields
     * Field reordering
     * Handling null values
   
   * The following constraints are now enforced consistently:
   
     * `FixedSizeList` sizes must match between source and target
     * Incompatible nested type changes are rejected
     * Non-nullable field additions without defaults fail
   
   No breaking API changes are introduced.
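
   To make the supported scenarios concrete, here is a minimal Python sketch of how a value-level cast could behave. It is a model only, not the Rust `cast_fixed_size_list_column` from this PR: the function name `cast_fixed_size_list_rows`, the row representation (a list of dicts per row, `None` for nulls), and the choice to drop source fields absent from the target are all assumptions made for illustration.

```python
from typing import Any, Dict, List, Optional

# One row of a FixedSizeList<Struct> column: None (null list) or a list of
# exactly `size` items, each a dict (struct value) or None (null struct).
Row = Optional[List[Optional[Dict[str, Any]]]]


def cast_fixed_size_list_rows(
    rows: List[Row],
    source_size: int,
    target_size: int,
    target_fields: List[str],
) -> List[Row]:
    """Hypothetical model of casting struct items inside a fixed-size list."""
    # Constraint: list sizes must match before any item is touched.
    if source_size != target_size:
        raise ValueError(
            f"FixedSizeList size mismatch: {source_size} vs {target_size}"
        )
    out: List[Row] = []
    for row in rows:
        if row is None:
            out.append(None)  # a null list stays null
            continue
        if len(row) != source_size:
            raise ValueError("row length does not match the declared size")
        out.append(
            [
                # Null struct items stay null. Otherwise fields are emitted in
                # target order, and fields missing from the source become None.
                None if item is None else {n: item.get(n) for n in target_fields}
                for item in row
            ]
        )
    return out


rows = [[{"x": 1, "y": 2}, {"x": 3, "y": 4}], None]
evolved = cast_fixed_size_list_rows(rows, 2, 2, ["y", "x", "z"])
```

   Here `evolved` reorders `x` and `y`, backfills the new nullable field `z` with `None`, and keeps the null row as `None`. Passing different source and target sizes raises `ValueError`.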
   
   ---
   
   ## LLM-generated code disclosure
   
   This PR includes LLM-generated code and comments. All LLM-generated content has been manually reviewed and tested.
   
   ---
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

