gene-bordegaray commented on issue #20195: URL: https://github.com/apache/datafusion/issues/20195#issuecomment-4164369724
> Keep in mind there are two different partitioning: the hash table partitioning is explicitly different (different hash seed) from the repartition / stream level partitioning.

Yes, and we highlighted this. The conclusion that I, and I think others, came to is that if a join sees that each side declares Hash partitioning on the same keys, then DF should be able to route the filter from partition x on the build side directly to partition x on the probe side. If a user wants to declare hash partitioning on their data, it is their responsibility to uphold that promise in order to get correct results.
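
To make the routing idea concrete, here is a minimal toy sketch in plain Python. It is not DataFusion code: `partition`, `build_filters`, `probe_with_routed_filters`, `declared_hash`, and `broken_hash` are all hypothetical names made up for illustration, and the per-partition "filter" is assumed to be a simple set of build-side keys. It shows that routing filter x to probe partition x keeps exactly the matching rows when both sides honor the declared partitioning, and loses matches when the probe side does not.

```python
# Toy model only: every name below is hypothetical, not a DataFusion API.

def partition(rows, key, n, hash_fn):
    """Split rows into n partitions using hash_fn(row[key]) % n."""
    parts = [[] for _ in range(n)]
    for row in rows:
        parts[hash_fn(row[key]) % n].append(row)
    return parts

def build_filters(build_parts, key):
    """One filter per build partition: here, just the set of keys seen."""
    return [{row[key] for row in part} for part in build_parts]

def probe_with_routed_filters(probe_parts, filters, key):
    """Apply build-side filter i only to probe-side partition i."""
    return [
        [row for row in part if row[key] in flt]
        for part, flt in zip(probe_parts, filters)
    ]

N = 4
declared_hash = lambda k: k      # stand-in for the declared hash function
broken_hash = lambda k: k + 1    # a probe side that breaks the declaration

build = [{"id": k} for k in (1, 2, 3, 8)]
probe = [{"id": k} for k in range(12)]

build_parts = partition(build, "id", N, declared_hash)
filters = build_filters(build_parts, "id")

# Both sides honor the declared partitioning: routing x -> x is safe.
probe_parts = partition(probe, "id", N, declared_hash)
kept = sorted(
    row["id"]
    for part in probe_with_routed_filters(probe_parts, filters, "id")
    for row in part
)

# Probe side claims the same partitioning but actually uses another one:
# matching rows land in the wrong partition and are filtered out.
bad_parts = partition(probe, "id", N, broken_hash)
bad_kept = sorted(
    row["id"]
    for part in probe_with_routed_filters(bad_parts, filters, "id")
    for row in part
)

print("honored declaration:", kept)
print("broken declaration: ", bad_kept)
```

In this toy model the second case silently drops rows that should have joined, which is the kind of incorrect result a user would be responsible for if their data does not actually satisfy the hash partitioning they declared.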
