tokoko opened a new pull request, #3520:
URL: https://github.com/apache/datafusion-comet/pull/3520

   Closes #3518
   
   ## What changes are included in this PR?
   
   - Introduces a new `tryZeroCopyConvert` method in `CometArrowConverters` 
which receives a `ColumnarBatch` of any type and returns a `ColumnarBatch` of 
`CometVector` objects if the input is composed of `ArrowColumnVector` objects, 
or `None` otherwise.
   - The columnar conversion path in `CometSparkToColumnarExec` always tries 
`tryZeroCopyConvert` first and falls back to the current flow if zero-copy 
conversion is impossible.
   - The implementation **ignores the batchSize configuration**, as honoring 
it would be a lot more involved with zero-copy. I think zero-copy is more 
important in this case, especially if you assume that whatever operator 
produces the input has a similar configuration of its own. Happy to change 
the implementation if you disagree, though.
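
For readers skimming the description, here is a minimal sketch of the try-then-fall-back control flow described above. It is not the code in this PR: the types below are simplified stand-ins for Spark's `ColumnarBatch` and `ArrowColumnVector` and for Comet's `CometVector`, and `copyConvert` and `convert` are hypothetical names standing for the existing copying path and its caller. It also assumes that "composed of `ArrowColumnVector` objects" means every column is Arrow-backed.

```scala
// Stand-in types for illustration only; the real classes live in Spark and Comet.
sealed trait ColumnVector { def name: String }
final case class ArrowColumnVector(name: String) extends ColumnVector
final case class OnHeapColumnVector(name: String) extends ColumnVector
final case class CometVector(wrapped: ArrowColumnVector) extends ColumnVector {
  def name: String = wrapped.name
}

final case class ColumnarBatch(columns: Seq[ColumnVector], numRows: Int)

object ZeroCopySketch {
  // Assumption: zero-copy applies only when every column is Arrow-backed.
  // Each Arrow column is wrapped as-is, so no data is copied and the
  // incoming batch size is preserved (batchSize is not consulted).
  def tryZeroCopyConvert(batch: ColumnarBatch): Option[ColumnarBatch] = {
    val arrowColumns = batch.columns.collect { case a: ArrowColumnVector => a }
    if (arrowColumns.length == batch.columns.length)
      Some(ColumnarBatch(arrowColumns.map(CometVector(_)), batch.numRows))
    else
      None
  }

  // Hypothetical placeholder for the existing, copying conversion path.
  def copyConvert(batch: ColumnarBatch): ColumnarBatch =
    ColumnarBatch(
      batch.columns.map(c => CometVector(ArrowColumnVector(c.name))),
      batch.numRows)

  // Always try zero-copy first, then fall back to the copying path.
  def convert(batch: ColumnarBatch): ColumnarBatch =
    tryZeroCopyConvert(batch).getOrElse(copyConvert(batch))
}
```

Returning an `Option` keeps the decision in one place: the caller does not need to inspect column types itself, it only needs a `getOrElse` to reach the existing path.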
   
   ## How are these changes tested?
   
   - Added tests that convert hand-crafted `ColumnarBatch` objects, as there 
is no out-of-the-box data source in Spark that produces a `ColumnarBatch` of 
`ArrowColumnVector` objects.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

