fhan688 opened a new pull request, #18899:
URL: https://github.com/apache/hudi/pull/18899

   ### Describe the issue this Pull Request addresses
   
     Spark SQL CTAS for Hudi tables can write incorrect values for multi-level 
partition fields when the partition columns in the SELECT output are not 
ordered the same as the table partition spec.
   
     For example, a table created with:
   
     ```sql
     partitioned by (year, month, day)
     ```
   
     can receive a CTAS query whose output is:
   
     ```sql
     select ..., month, day, year
     ```
   
     The CTAS path currently forwards the resolved query output as-is, so the 
downstream write path may interpret partition field values by position instead 
of by the declared table partition order. In the example above, a positional 
read could treat the `month` value as `year`.
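
The failure mode can be modeled in a few lines of plain Python. This is an illustrative sketch, not Hudi code: it assumes the write path reads the trailing output columns as partition values in declared order, and the `id` column, the sample row, and both helper functions are hypothetical.

```python
# Illustrative model only; none of these names exist in Hudi or Spark.
declared_partitions = ["year", "month", "day"]   # partitioned by (year, month, day)
query_output = ["id", "month", "day", "year"]    # select id, month, day, year
row = {"id": 1, "month": "06", "day": "15", "year": "2024"}


def bind_by_position(output_cols, partitions, values):
    """Assumption: the last len(partitions) output columns are taken as the
    partition values, in the order the partitions were declared."""
    trailing = output_cols[-len(partitions):]
    return {p: values[c] for p, c in zip(partitions, trailing)}


def bind_by_name(partitions, values):
    """Look up each partition value by its column name."""
    return {p: values[p] for p in partitions}


positional = bind_by_position(query_output, declared_partitions, row)
by_name = bind_by_name(declared_partitions, row)
```

Under that assumption, `positional` pairs `year` with the row's `month` value, while `by_name` keeps every value with its own field.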
   
     This PR fixes the issue by aligning the CTAS query output with the table's 
partition order during analysis.
   
   ### Summary and Changelog
   
     This change aligns CTAS query output with the Hudi table partition field 
order before creating `CreateHoodieTableAsSelectCommand`.
   
     Changes:
     - Reorder CTAS partition attributes according to 
`table.partitionColumnNames` in `ResolveImplementationsEarly`.
     - Preserve non-partition columns in their original query output order.
     - Use Spark's session resolver for partition field matching.
     - Avoid adding a projection when the CTAS output is already aligned.
     - Add Spark SQL DDL tests for multi-level partition CTAS with both ordered 
and out-of-order partition columns.
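
The reordering rules listed above can be sketched in plain Python. This is a rough model, not the Scala code in this PR: `align_output` and `case_insensitive` are hypothetical names, the resolver is reduced to a two-argument predicate (Spark's session resolver is case-insensitive unless `spark.sql.caseSensitive` is enabled), and placing the reordered partition columns back into the slots they already occupied is an assumption about placement that the description does not spell out.

```python
from typing import Callable, List

Resolver = Callable[[str, str], bool]


def case_insensitive(a: str, b: str) -> bool:
    """Stand-in for a case-insensitive name resolver."""
    return a.lower() == b.lower()


def align_output(output: List[str],
                 partition_columns: List[str],
                 resolver: Resolver = case_insensitive) -> List[str]:
    """Reorder partition columns in `output` to follow `partition_columns`.

    Non-partition columns keep their positions. The original list object is
    returned unchanged when it is already aligned or cannot be aligned safely.
    """
    # Find the output column matching each declared partition column.
    ordered = []
    for p in partition_columns:
        match = next((c for c in output if resolver(c, p)), None)
        if match is None:
            return output  # a partition column is missing: leave as-is
        ordered.append(match)

    # Positions in the output currently held by partition columns.
    slots = [i for i, c in enumerate(output)
             if any(resolver(c, p) for p in partition_columns)]
    if len(slots) != len(ordered):
        return output  # ambiguous (duplicate) matches: leave as-is

    aligned = list(output)
    for i, c in zip(slots, ordered):
        aligned[i] = c
    # Returning the same object models "no extra projection is added".
    return output if aligned == output else aligned
```

For the example in this description, `align_output(["id", "month", "day", "year"], ["year", "month", "day"])` yields `["id", "year", "month", "day"]`, and an already-aligned list comes back as the same object.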
   
   ### Impact
   
     No public API, config, or storage format changes.
   
     This fixes Spark SQL CTAS behavior for Hudi partitioned tables. CTAS now 
correctly handles multi-level partition columns even when the SELECT list 
orders partition fields differently from the `PARTITIONED BY` clause.
   
   ### Risk Level
   
     low
   
     The change is scoped to Hudi Spark SQL CTAS analysis for resolved Hudi 
tables. Non-partitioned CTAS and already-aligned CTAS plans keep the existing 
behavior. Tests for both COW and MOR table types were added to the existing 
`TestCreateTable` CTAS coverage.
   
   ### Documentation Update
   
     none
   
     This is a bug fix with no new feature, config, or public API change.
   
   ### Contributor's checklist
   
     - [x] Read through [contributor's 
guide](https://hudi.apache.org/contribute/how-to-contribute)
     - [x] Enough context is provided in the sections above
     - [x] Adequate tests were added if applicable

