andygrove opened a new pull request, #3666:
URL: https://github.com/apache/datafusion-comet/pull/3666

   ## Summary
   
   - Disable native columnar-to-row conversion for plans containing 
`CometBatchScanExec` to fix correctness issues with Iceberg merge-on-read 
deletes
   - Rename `hasScanUsingMutableBuffers` to `hasScanIncompatibleWithNativeC2R` 
to reflect its broadened scope
   
   ## Root Cause
   
   When Iceberg applies merge-on-read position deletes, columns are wrapped in `CometSelectionVector`, which maps logical row indices to physical row indices via a `rowIdMapping` array. The native C2R path (`NativeUtil.exportBatch`) exports the **raw underlying Arrow array** without applying this mapping, then tells native code to process `numValues()` rows sequentially. As a result, it reads the first N physical rows instead of the rows at the mapped indices, producing wrong data.
   
   The JVM C2R works correctly because `CometSelectionVector.getInt(i)` 
internally remaps through `selectionIndices[i]` to access the correct physical 
row.
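   The difference between the two read paths can be illustrated with a toy model. This is a hypothetical sketch: `PHYSICAL`, `ROW_ID_MAPPING`, and the class name are illustrative stand-ins for the selection-vector mechanism described above, not Comet's actual classes.

   ```java
   import java.util.Arrays;

   public class SelectionVectorDemo {
       // Physical rows in the underlying Arrow array (some have been deleted).
       static final int[] PHYSICAL = {10, 20, 30, 40, 50};
       // rowIdMapping: logical row i -> physical row index (rows 1 and 3 deleted).
       static final int[] ROW_ID_MAPPING = {0, 2, 4};

       // Buggy native-style read: exports the raw array and processes the
       // first numValues() rows sequentially, ignoring the mapping.
       static int[] readSequential(int n) {
           return Arrays.copyOf(PHYSICAL, n);
       }

       // Correct JVM-style read: each access remaps the logical index
       // through the selection indices to reach the right physical row.
       static int[] readMapped(int n) {
           int[] out = new int[n];
           for (int i = 0; i < n; i++) {
               out[i] = PHYSICAL[ROW_ID_MAPPING[i]];
           }
           return out;
       }

       public static void main(String[] args) {
           // Sequential read returns [10, 20, 30]: the first 3 physical rows,
           // including deleted row 1. Mapped read returns [10, 30, 50].
           System.out.println(Arrays.toString(readSequential(ROW_ID_MAPPING.length)));
           System.out.println(Arrays.toString(readMapped(ROW_ID_MAPPING.length)));
       }
   }
   ```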
   
   ## Fix
   
   Expand the native C2R compatibility check in `EliminateRedundantTransitions` 
to also detect `CometBatchScanExec` (which wraps external V2 readers like 
Iceberg's `BatchScanExec`). When detected, fall back to 
`CometColumnarToRowExec` (JVM-based).
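   The shape of such a compatibility check can be sketched as a recursive scan over the plan tree. This is an assumption-laden illustration: `PlanNode` and the string-based node matching are stand-ins for Spark's actual plan classes, not the real `EliminateRedundantTransitions` code.

   ```java
   import java.util.List;

   public class NativeC2RCheck {
       // Hypothetical stand-in for a Spark physical plan node.
       record PlanNode(String name, List<PlanNode> children) {}

       // Returns true if any node in the plan is a scan that is incompatible
       // with native columnar-to-row, e.g. CometBatchScanExec wrapping an
       // external V2 reader such as Iceberg's BatchScanExec.
       static boolean hasScanIncompatibleWithNativeC2R(PlanNode plan) {
           if (plan.name().equals("CometBatchScanExec")) {
               return true;
           }
           return plan.children().stream()
                   .anyMatch(NativeC2RCheck::hasScanIncompatibleWithNativeC2R);
       }

       public static void main(String[] args) {
           PlanNode scan = new PlanNode("CometBatchScanExec", List.of());
           PlanNode plan = new PlanNode("ProjectExec", List.of(scan));
           // When the check fires, the planner would fall back to the
           // JVM-based CometColumnarToRowExec instead of native C2R.
           System.out.println(hasScanIncompatibleWithNativeC2R(plan));
       }
   }
   ```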
   
   ## Test plan
   
   - [x] `CometNativeColumnarToRowSuite` — 26 tests pass
   - [x] `CometExecSuite` — 88 tests pass
   - [x] No golden file changes needed (`CometBatchScanExec` is not used in 
TPC-DS plans)
   - [ ] Verify Iceberg-Java integration tests pass in CI (tag with `[iceberg]`)

