akurmustafa commented on issue #18202: URL: https://github.com/apache/datafusion/issues/18202#issuecomment-3435012243
Can you share the physical plans with and without the `EnforceDistribution` rule? That would make it much easier to pinpoint the problem.

I haven't verified this, but my suspicion is that when the `EnforceDistribution` rule is on, the generated physical plan is multi-partitioned. The partitioning is likely done on the hash of the `code` column (since you have `partition by code`). However, the cardinality of `code` is very low: only 3 distinct values. DataFusion also likely allocates 8 partitions in the final plan, so at least 5 of them would receive no data during execution. Operators higher in the plan may then wait indefinitely for data from the empty partitions, and at some point you get a timeout error.

This is just a guess, though; I haven't reproduced the issue. If you share the plans, we can discuss further to understand the cause.
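To make the empty-partition arithmetic concrete, here is a minimal sketch in plain Python, not DataFusion code. It assumes 3 distinct `code` values (the keys `"A"`, `"B"`, `"C"` are hypothetical) hashed into 8 partitions, and uses CRC32 only as a deterministic stand-in for whatever hash function DataFusion actually applies.

```python
import zlib
from collections import Counter


def partition_of(key: str, num_partitions: int) -> int:
    # Stand-in hash for illustration only; DataFusion uses its own hashing.
    return zlib.crc32(key.encode("utf-8")) % num_partitions


def rows_per_partition(keys, num_partitions):
    # Count how many rows land in each partition index.
    counts = Counter(partition_of(k, num_partitions) for k in keys)
    return [counts.get(p, 0) for p in range(num_partitions)]


# Hypothetical input: 3000 rows, but only 3 distinct values of `code`.
keys = ["A", "B", "C"] * 1000
num_partitions = 8

counts = rows_per_partition(keys, num_partitions)
empty = sum(1 for c in counts if c == 0)

print(counts)
print(f"{empty} of {num_partitions} partitions are empty")
```

Whatever hash is used, 3 distinct keys can occupy at most 3 partitions, so at least 5 of the 8 stay empty. Whether those empty partitions actually cause the hang is still unconfirmed; the plans would tell.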
