adriangb commented on PR #21882: URL: https://github.com/apache/datafusion/pull/21882#issuecomment-4437114961
IIRC that test was added when we added spilling to RepartitionExec. Conceptually the test is simple: if RepartitionExec is configured to preserve order and it spills, we need to make sure that spilling did not shuffle the data. The orchestration, however, is difficult: forcing a RepartitionExec to spill usually requires skewed upstream partition consumption rates. You could try changing the test to use, e.g., a GroupBy, or maybe we can use a RepartitionExec in isolation if we pull from the streams in the right way. I think the structure can be changed quite a bit as long as we preserve the semantic meaning of the test. I am not surprised that it is pretty fragile to changes.
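
To make the invariant concrete, here is a rough, std-only sketch of what the test is meant to check. It does not use any DataFusion API: `SpillingChannel` and `run_skewed` are hypothetical names invented for illustration, and the "spill file" is just a second in-memory queue. The assumption is that a slow consumer (the skew) is what pushes the buffer past its limit, and the check is that the output is still in input order after a spill has happened.

```rust
use std::collections::VecDeque;

/// Hypothetical stand-in for one output channel of a repartition operator.
/// This is NOT DataFusion's implementation; the spill file is simulated
/// with a second FIFO queue.
struct SpillingChannel {
    mem_limit: usize,
    in_memory: VecDeque<u64>,
    spilled: VecDeque<u64>,
    spill_count: usize,
}

impl SpillingChannel {
    fn new(mem_limit: usize) -> Self {
        Self {
            mem_limit,
            in_memory: VecDeque::new(),
            spilled: VecDeque::new(),
            spill_count: 0,
        }
    }

    fn push(&mut self, batch: u64) {
        // Once anything has spilled, later batches must spill too;
        // otherwise they could overtake the batches already on "disk".
        if !self.spilled.is_empty() || self.in_memory.len() >= self.mem_limit {
            self.spilled.push_back(batch);
            self.spill_count += 1;
        } else {
            self.in_memory.push_back(batch);
        }
    }

    fn pop(&mut self) -> Option<u64> {
        // In-memory batches are always older than spilled ones,
        // so draining memory first keeps FIFO order.
        self.in_memory
            .pop_front()
            .or_else(|| self.spilled.pop_front())
    }
}

/// Pushes the sorted values 0..n and consumes only one value per
/// `consume_every` pushes, which models a slow (skewed) consumer.
/// `consume_every` must be greater than zero.
/// Returns the values in the order they came out, plus the spill count.
fn run_skewed(n: u64, mem_limit: usize, consume_every: u64) -> (Vec<u64>, usize) {
    let mut channel = SpillingChannel::new(mem_limit);
    let mut out = Vec::new();
    for value in 0..n {
        channel.push(value);
        if (value + 1) % consume_every == 0 {
            if let Some(v) = channel.pop() {
                out.push(v);
            }
        }
    }
    // Drain whatever is left once the producer is done.
    while let Some(v) = channel.pop() {
        out.push(v);
    }
    (out, channel.spill_count)
}

fn main() {
    let (out, spills) = run_skewed(100, 4, 5);
    // The scenario is only meaningful if a spill actually happened.
    assert!(spills > 0, "expected the slow consumer to force a spill");
    // The semantic check: spilling must not reorder the data.
    assert!(out.windows(2).all(|w| w[0] <= w[1]), "order was not preserved");
    assert_eq!(out.len(), 100);
}
```

Whatever shape the real test ends up taking, the two assertions are the part worth preserving: one that proves a spill occurred, and one that proves the ordering survived it.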
