Re: [I] support dynamic filtering on partitioned data from file source [datafusion]

via GitHub Tue, 31 Mar 2026 15:44:24 -0700


gene-bordegaray commented on issue #20195:
URL: https://github.com/apache/datafusion/issues/20195#issuecomment-4164369724


   > Keep in mind there are two different partitioning: the hash table 
partitioning is explicitly different (different hash seed) from the repartition 
/ stream level partitioning.
   
   Yes, and we highlighted this. The conclusion that I, and I think others, 
came to is that if a join sees that each side declares Hash partitioning on the 
same keys, then DF should be able to just route the filter of partition x on 
the build side to partition x on the probe side.
   
   If a user wants to declare hash partitioning on their data, it is their 
responsibility to ensure they uphold the promises they make to get correct 
results.
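
To make the per-partition routing concrete, here is a minimal sketch in plain Rust. It is not DataFusion code: `partition_of`, `hash_partition`, `build_filters`, and `apply_filters` are hypothetical names, the "dynamic filter" is simplified to the set of build-side keys in each partition, and the standard library's `DefaultHasher` stands in for whatever hash both sides have promised to share.

```rust
use std::collections::hash_map::DefaultHasher;
use std::collections::HashSet;
use std::hash::{Hash, Hasher};

/// Hypothetical stand-in for the stream-level partitioning hash.
/// Both sides of the join must agree on this function.
fn partition_of(key: i64, num_partitions: usize) -> usize {
    let mut hasher = DefaultHasher::new();
    key.hash(&mut hasher);
    (hasher.finish() % num_partitions as u64) as usize
}

/// Split join keys into `num_partitions` buckets using `partition_of`.
fn hash_partition(keys: &[i64], num_partitions: usize) -> Vec<Vec<i64>> {
    let mut parts: Vec<Vec<i64>> = vec![Vec::new(); num_partitions];
    for &key in keys {
        parts[partition_of(key, num_partitions)].push(key);
    }
    parts
}

/// Build one filter per build-side partition.
/// Simplification: the filter is just the set of keys seen in that partition.
fn build_filters(build_parts: &[Vec<i64>]) -> Vec<HashSet<i64>> {
    build_parts
        .iter()
        .map(|part| part.iter().copied().collect())
        .collect()
}

/// Route filter x to probe partition x and keep only the rows that pass it.
fn apply_filters(probe_parts: &[Vec<i64>], filters: &[HashSet<i64>]) -> Vec<Vec<i64>> {
    probe_parts
        .iter()
        .zip(filters)
        .map(|(rows, filter)| {
            rows.iter()
                .copied()
                .filter(|key| filter.contains(key))
                .collect()
        })
        .collect()
}
```

When both inputs go through `hash_partition` with the same partition count, every probe row lands in the partition whose filter was built from the matching build rows, so no match is lost. If the probe side only claims that partitioning but its rows are actually laid out differently, `apply_filters` will silently drop matching rows, which is the "uphold the promises" point above.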


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


