andygrove opened a new pull request, #3773:
URL: https://github.com/apache/datafusion-comet/pull/3773

   ## Which issue does this PR close?
   
   Closes #3770.
   
   ## Rationale for this change
   
   `CometArrowConverters` reused Arrow buffers between batches via 
`arrowWriter.reset()` on a shared `VectorSchemaRoot`. This was intended to 
reduce JVM allocations, but because `ScanExec` always performed a full deep 
copy (`copy_array`) when `arrow_ffi_safe=false`, both JVM and native 
allocations existed simultaneously with no benefit — the optimization was 
counterproductive.
   
   ## What changes are included in this PR?
   
   - **Remove buffer reuse** in `CometArrowConverters`: each batch now gets a 
fresh `VectorSchemaRoot`. The JVM-side root is closed immediately after export; 
native ref-counting keeps the memory alive until the native plan releases the 
arrays.
   - **Remove `CopyMode` enum** from `copy.rs` and simplify 
`copy_or_unpack_array` to always `Arc::clone` non-dictionary arrays (dictionary 
arrays are still unpacked via cast+copy).
   - **Remove `arrow_ffi_safe` field** from the `Scan` proto message, 
`ScanExec` struct, and all call sites.
   - **Remove `isFfiSafe` overrides** from `CometLocalTableScanExec` and 
`CometSparkToColumnarExec`.
   - **Remove `isFfiSafe` method and `setArrowFfiSafe` call** from `CometSink`.
   - **Remove `.setArrowFfiSafe(false)`** from `CometDataWritingCommand`.
   - **Update docs** in `ffi.md` and `adding_a_new_operator.md` to reflect the 
new ownership model.
   
   With fresh buffers per batch, native code can safely `Arc::clone` instead of 
deep copying — a genuine performance improvement.
   
   ## How are these changes tested?
   
   Covered by existing tests (`cargo test`, `make test-jvm`). No new behavior 
is introduced; this removes an unnecessary deep copy on the native side.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to