adriangb commented on PR #18014: URL: https://github.com/apache/datafusion/pull/18014#issuecomment-3428138106
> What I suggest is:
>
> 1. Merge this PR
> 2. File a follow on ticket describing a more sophisticated algorithm for reading/writing spill files to reduce the syscall overhead, (e.g. writing multiple batches to the same file when possible, etc) leaving a ticket reference in the comments
> 3. Maybe add a logging message or some other hint when the spilling happens to help debugging

I have a followup already: https://github.com/pydantic/datafusion/pull/40

I'll go ahead and merge this PR to avoid needing re-review and we can continue in that one
