robtandy commented on issue #46:
URL: https://github.com/apache/datafusion-ray/issues/46#issuecomment-2596051911

   The current implementation i have streams object references out of each 
partition through an intermediate Actor to hold the queues.    Then subsequent 
stages connect to the intermediate actor and read streaming object references 
from the queue.   They then have to `ray.get()` on the references to 
materialize the IPC encoded batch.
   
   I was really hoping for more performance and scale out of this approach, and 
I particularly like that it reuses the RepartitionExec's as implemented in 
DataFusion.   But alas.
   
   My work isn't documented well yet but if you have time to discuss it sync, 
I'd appreciate your eyes on it and how we can improve it.  Let me know if you 
have time to pair program for a bit so I can demo.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org
For additional commands, e-mail: github-h...@datafusion.apache.org

Reply via email to