yjshen commented on pull request #1104:
URL: https://github.com/apache/arrow-datafusion/pull/1104#issuecomment-946339508


   > also cc @yjshen in case we missed any item needed from your native spark 
executor work.
   
   Thanks, @houqp. I think most of what I need is covered by the `Resource 
Management` section. I'm currently prototyping a memory-limited version of 
`SortExec`.
   
   On the Ballista side, I think a broadcast join would be a great addition. 
We could also add a sort-based shuffle writer to keep memory usage low, and 
have each task write a single map output file, which avoids creating too many 
small files when the number of output partitions is large.
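
   To illustrate the single-output-file idea, here is a minimal sketch, not Ballista's actual shuffle writer. It assumes records arrive already tagged with their target partition; `Record` and `write_sorted_shuffle` are hypothetical names made up for this example. Records are sorted by partition, written back to back into one output, and an offset index tells a reader where each partition starts and ends. A real memory-limited writer would also spill sorted runs to disk and merge them, which this sketch leaves out.

```rust
use std::io::{self, Write};

/// Hypothetical record type: a target output partition plus an opaque payload.
pub struct Record {
    pub partition: usize,
    pub payload: Vec<u8>,
}

/// Sorts `records` by partition, writes all payloads back to back into `out`,
/// and returns an index of `num_partitions + 1` byte offsets.
/// Partition `p` occupies bytes `index[p]..index[p + 1]` of the output.
pub fn write_sorted_shuffle<W: Write>(
    mut records: Vec<Record>,
    num_partitions: usize,
    out: &mut W,
) -> io::Result<Vec<u64>> {
    // Reject records that point at a partition outside the declared range.
    if records.iter().any(|r| r.partition >= num_partitions) {
        return Err(io::Error::new(
            io::ErrorKind::InvalidInput,
            "record partition out of range",
        ));
    }

    // `sort_by_key` is stable, so input order is kept within each partition.
    records.sort_by_key(|r| r.partition);

    let mut index = Vec::with_capacity(num_partitions + 1);
    let mut offset = 0u64;
    let mut iter = records.iter().peekable();
    for p in 0..num_partitions {
        // Start offset of partition `p` (empty partitions get a zero-length range).
        index.push(offset);
        while let Some(r) = iter.next_if(|r| r.partition == p) {
            out.write_all(&r.payload)?;
            offset += r.payload.len() as u64;
        }
    }
    // Final entry marks the end of the last partition.
    index.push(offset);
    Ok(index)
}
```

   With this layout each map task produces one data stream plus one small index, rather than one file per output partition, so the file count grows with the number of tasks instead of tasks times partitions.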


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.


