zhuqi-lucas commented on PR #16196: URL: https://github.com/apache/datafusion/pull/16196#issuecomment-2949124590
> @zhuqi-lucas, I took the liberty of parametrizing the tests and hardening them. We now have two failing (ignored) tests (the filter test, which is probably trivial to fix and more related to filter itself, and a pure interleave test without the aggregation). > > The test harness is now good for a first cut. I hope to see whether our approach bears fruit next week, or learn whether we face some fundamental issue. Thank you @ozankabak , it makes sense to me. And i will also think about the solution for the fail test case. > > Tests pass even if add in the "pretending" (because the join code seems to yield naturally) > > The hash join test I have does fail so I dug into this. It's passing for you for two reasons: > > * There's an aggregation in that test, so the plan has emission type final and the yield wrapper gets injected > * If you remove the aggregation there's still a RepartitionExec which itself has an ad hoc version of the cooperative yield logic in place at https://github.com/apache/datafusion/blob/main/datafusion/physical-plan/src/repartition/mod.rs#L980. It also decouples producer and consumer via the distribution channels. If the consumer drains faster than the producer fills you'll get a natural pending that way. Thank you @pepijnve , i agree, i still didn't add the corner case for join, i think it will also happen similar to filter, but the solution may be similar. I will remove the aggregation for the join corner case. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected] --------------------------------------------------------------------- To unsubscribe, e-mail: [email protected] For additional commands, e-mail: [email protected]
