kevinjqliu opened a new pull request, #16308:
URL: https://github.com/apache/iceberg/pull/16308

   Backport of #15832 to `spark/v3.4`.
   
   Adds the `output-sort-order-id` write option (and 
`SparkWriteConf.outputSortOrderId`) and threads the resolved sort-order id 
through `SparkWrite` and `SparkPositionDeltaWrite` so written data files record 
the sort order in their manifest entry. `SparkShufflingFileRewriteRunner` 
resolves the matching table sort order via `SortOrderUtil.findTableSortOrder` 
and logs a warning when no match exists.
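
The matching step described above can be sketched in plain Java. This is a minimal illustration, not the code in this patch: `SortOrderMatchSketch`, `matchingSortOrderId`, and the map-of-field-lists model of a table's sort orders are hypothetical stand-ins, and it does not reproduce the real signature of `SortOrderUtil.findTableSortOrder`. Falling back to the unsorted order (id 0 in Iceberg) when nothing matches is also an assumption; the description only says a warning is logged.

```java
import java.util.List;
import java.util.Map;
import java.util.logging.Logger;

public class SortOrderMatchSketch {
  private static final Logger LOG = Logger.getLogger("SortOrderMatchSketch");

  // Iceberg reserves sort order id 0 for the unsorted order.
  static final int UNSORTED_ORDER_ID = 0;

  /**
   * Hypothetical helper: returns the id of the table sort order whose fields
   * equal the rewrite's sort fields. Logs a warning and returns the unsorted
   * id (an assumed fallback) when no table sort order matches.
   */
  static int matchingSortOrderId(
      Map<Integer, List<String>> tableSortOrders, List<String> rewriteSortFields) {
    for (Map.Entry<Integer, List<String>> entry : tableSortOrders.entrySet()) {
      if (entry.getValue().equals(rewriteSortFields)) {
        return entry.getKey();
      }
    }
    LOG.warning("No table sort order matches " + rewriteSortFields
        + "; written files will not record a sort order id");
    return UNSORTED_ORDER_ID;
  }

  public static void main(String[] args) {
    // Toy model of a table with two registered sort orders (id -> fields).
    Map<Integer, List<String>> tableSortOrders =
        Map.of(1, List.of("ts ASC"), 2, List.of("id ASC", "ts DESC"));

    // Matches the order registered under id 2.
    System.out.println(matchingSortOrderId(tableSortOrders, List.of("id ASC", "ts DESC")));
    // No match: a warning is logged and the unsorted id is returned.
    System.out.println(matchingSortOrderId(tableSortOrders, List.of("region ASC")));
  }
}
```

In this toy model the matched id is what would be threaded through to the writer so that each data file's manifest entry records it.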
   
   ### Adaptation note
   
   v3.4 `SparkWrite` still names the parameter `partitionedFanoutEnabled` (v3.5 
renamed it to `useFanoutWriter`). I kept the v3.4 name and added the new 
`sortOrderId` parameter alongside it. All other files match the v3.5 patch.
   
   ### Validation
   
   - `./gradlew -DsparkVersions=3.4 :iceberg-spark:iceberg-spark-3.4_2.12:test --tests "*TestSparkWriteConf.testSortOrder*"` (new tests pass)
   - `./gradlew -DsparkVersions=3.4 :iceberg-spark:iceberg-spark-3.4_2.12:test --tests "org.apache.iceberg.spark.source.TestSparkDataWrite"` (passes)
   - spark-extensions tests compile


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

