alliasgher opened a new pull request, #875:
URL: https://github.com/apache/iceberg-go/pull/875

   ## Summary
   
   The `SortOrderID` field on `WriteTask` was declared but never set by any 
producer, so every data file written via the 
standard/partitioned/rolling/equality-delete paths ended up with a nil 
`sort_order_id` in the manifest entry. This leaves readers with no way to know 
which sort order the file was written against.
   
   This PR threads `SortOrderID` end-to-end:
   
   * `WriteFileInfo` gains a `SortOrderID` field and 
`DataFileStatistics.ToDataFile` now calls `DataFileBuilder.SortOrderID` so the 
value lands on the final `DataFile`.
   * `arrow_utils.filesToDataFiles`, `writer.writeFile`, and `writerFactory` 
all pass the table's default sort order id down to `WriteFileInfo`, and the 
equality-delete / position-delete / rolling data writer paths populate the 
`WriteTask.SortOrderID` accordingly.
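
As a rough illustration of the threading described above, the sketch below carries a sort order id from a write-info struct through a builder onto the final data file. All names here (`writeFileInfo`, `dataFileBuilder`, `toDataFile`) are hypothetical stand-ins, not the actual iceberg-go types or signatures. The pointer field is an assumption used to model the nil `sort_order_id` state.

```go
package main

import "fmt"

// writeFileInfo is a hypothetical stand-in for the metadata handed to a
// file writer; it is not the real iceberg-go WriteFileInfo.
type writeFileInfo struct {
	fileName    string
	sortOrderID int
}

// dataFile is a hypothetical stand-in for the data file recorded in a
// manifest entry. A nil sortOrderID models "sort order not recorded".
type dataFile struct {
	path        string
	sortOrderID *int
}

// dataFileBuilder is a hypothetical builder that accumulates dataFile fields.
type dataFileBuilder struct{ df dataFile }

func newDataFileBuilder(path string) *dataFileBuilder {
	return &dataFileBuilder{df: dataFile{path: path}}
}

// SortOrderID records the sort order id on the file being built.
func (b *dataFileBuilder) SortOrderID(id int) *dataFileBuilder {
	b.df.sortOrderID = &id
	return b
}

// Build returns the accumulated dataFile.
func (b *dataFileBuilder) Build() dataFile { return b.df }

// toDataFile copies the sort order id from the write info onto the data
// file, which is the step that was missing before the fix.
func toDataFile(info writeFileInfo) dataFile {
	return newDataFileBuilder(info.fileName).
		SortOrderID(info.sortOrderID).
		Build()
}

func main() {
	// Without the builder call, the id stays nil (the pre-fix behavior).
	unset := newDataFileBuilder("a.parquet").Build()
	fmt.Println("unset is nil:", unset.sortOrderID == nil)

	// With the id threaded through, it lands on the final data file.
	df := toDataFile(writeFileInfo{fileName: "b.parquet", sortOrderID: 1})
	fmt.Println("sort order id:", *df.sortOrderID)
}
```

The point of the sketch is that the id must be forwarded explicitly at each hop; a field that exists but is never set by a producer reaches the manifest as nil.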
   
   Fixes #842
   
   ## Test plan
   
   - [x] `go test ./table/...`
   - [x] `go vet ./...`
   - [x] New assertion: `withSortOrderIDMatching` added to 
`defaultPositionDeleteMatching` in 
`TestPositionDeletePartitionedFanoutWriterProcessBatch`; `mockDataFile` gains 
an overridable sort order id so this path is covered end-to-end.
   
   Signed-off-by: Ali <[email protected]>


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]
