adriangb commented on PR #18207: URL: https://github.com/apache/datafusion/pull/18207#issuecomment-3470369777
@2010YOUY01 this is ready for review now. There are existing [unit style tests that manually push data through some partitions but not others](https://github.com/apache/datafusion/pull/18014/files#diff-e8fa07bcbe7db22bdd43eacfa713d380acaf651770da26cb06957030034ebb93) and demonstrate that if there were some imbalances the spilling we've added to RepartitionExec would allow the query to continue instead of failing. I added an integration style test that triggered this with a real world query (one that was already used in other SLT tests actually) but [I had to remove it because the order was not deterministic](https://github.com/apache/datafusion/pull/18207/commits/d7af8efb06bae772f71e7ffe4f06cb5867ec94b2) and I could not find a way to make the order deterministic and get spilling. However it should be reproducible outside of this change if you'd want. One thing I'll note about this change specifically and the existing spilling infrastructure that was a bit disappointing: it seems like it's not possible to read and write from the same spill file. That is, we need to finish writing a spill file before we can read from it. This means that the read side needs wait for a spill file to be full or for the stream to end before it can start reading, which is bound to add latency. I don't think that is a limitation of this change, rather a limitation of the existing `InProgressSpillFile`, etc. I decided not to fight it any more but if you think this interpretation is wrong and it is possible to use an active file as a FIFO read/write queue I'm open to exploring more. Overall I'm not 100% certain what positive impact this will have in the real world, it's very hard to trigger specific memory scenarios in a real query. However I think this is an improvement on the per-batch spilling that we added in #18014 which was in turn an improvement on the query failing. I also think the structures added here are pretty isolated but also maybe re-usable and nicely packaged up. They'd be easy to use in other operators if they prove to work well or easy to rip out of RepartitionExec if they don't. So there is little harm in adding them. I see this as part of the strategy of "get as many operators to implement spilling as possible", which is complementary to other [interesting big picture discussions around memory management](https://github.com/apache/datafusion/issues/17334#issuecomment-3447056234). Personally I would like to close the chapter on RepartitionExec and move onto thinking about those other big picture bits. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected] --------------------------------------------------------------------- To unsubscribe, e-mail: [email protected] For additional commands, e-mail: [email protected]
