Re: [PR] Add spilling to RepartitionExec [datafusion]

via GitHub Tue, 21 Oct 2025 10:34:46 -0700


adriangb commented on PR #18014:
URL: https://github.com/apache/datafusion/pull/18014#issuecomment-3428138106


   > What I suggest is:
   > 
   > 1. Merge this PR
   > 2. File a follow-on ticket describing a more sophisticated algorithm for
   >    reading/writing spill files to reduce the syscall overhead (e.g. writing
   >    multiple batches to the same file when possible), leaving a ticket
   >    reference in the comments
   > 3. Maybe add a logging message or some other hint when the spilling
   >    happens to help debugging
   
   I already have a follow-up: https://github.com/pydantic/datafusion/pull/40
   
   I'll go ahead and merge this PR to avoid needing re-review, and we can
   continue the discussion in the follow-up PR.
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


