LiaCastaneda opened a new issue, #17526:
URL: https://github.com/apache/datafusion/issues/17526

   ### Is your feature request related to a problem or challenge?
   
   In DataFusion, a HashJoin always polls the left (build) side first and then 
the right (probe) side. Under a normal execution flow, this means we do not 
need to wait on the probe side for dynamic filters to be updated, since the 
build side completes first.
   
   However, internally we have an optimization (a physical optimizer rule) where 
we poll data from the right side even before the left side has finished. This 
is achieved by inserting a `BufferExec` node, which improves join query 
performance. (cc @gabotechs since you did this optimization; I don't know if 
there were plans to contribute it to DataFusion.)
   
   If we want to enable dynamic filtering while keeping this optimization 
active, we need a mechanism for the probe side to wait for dynamic filters to 
be populated. Our current, simple approach is to make the probe side sleep 
briefly and then check whether the `DynamicFilterExpr` is ready, i.e. whether 
it holds the filter values from all build-side partitions.
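
To illustrate the sleep-and-check pattern described above, here is a minimal sketch. It is not our actual code and not a DataFusion API: `FilterProgress`, `mark_partition_done`, and `wait_by_polling` are hypothetical names, and it uses plain threads and atomics rather than async streams.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::thread;
use std::time::Duration;

/// Hypothetical stand-in for shared dynamic-filter state (not a DataFusion type).
struct FilterProgress {
    total_partitions: usize,
    reported: AtomicUsize,
}

impl FilterProgress {
    fn new(total_partitions: usize) -> Self {
        Self {
            total_partitions,
            reported: AtomicUsize::new(0),
        }
    }

    /// Build side: called once per partition after its filter values are published.
    fn mark_partition_done(&self) {
        self.reported.fetch_add(1, Ordering::Release);
    }

    /// True once every build-side partition has reported.
    fn is_ready(&self) -> bool {
        self.reported.load(Ordering::Acquire) >= self.total_partitions
    }
}

/// Probe side: sleep, re-check, repeat. Returns how many times it slept.
fn wait_by_polling(progress: &FilterProgress, interval: Duration) -> usize {
    let mut sleeps = 0;
    while !progress.is_ready() {
        thread::sleep(interval);
        sleeps += 1;
    }
    sleeps
}
```

The weakness is the `interval`: too short wastes CPU, too long delays the probe side after the filters are already complete.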
   
   
   
   ### Describe the solution you'd like
   
   I think a more robust approach would be to expose a function in the API that 
explicitly awaits the readiness of dynamic filters (i.e., waits until all 
partitions from the build side have produced their filter values), so that the 
probe side can safely proceed without doing arbitrary sleeps.
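
As a rough sketch of the kind of mechanism I have in mind (assumptions: `FilterReadiness`, `partition_done`, and `wait_ready` are made-up names, not existing DataFusion APIs, and this version blocks a thread with a `Condvar`, whereas the real thing would presumably be an async method):

```rust
use std::sync::{Condvar, Mutex};

/// Hypothetical readiness tracker; not part of the DataFusion API.
struct FilterReadiness {
    /// Number of build-side partitions that have not reported yet.
    remaining: Mutex<usize>,
    all_done: Condvar,
}

impl FilterReadiness {
    fn new(build_partitions: usize) -> Self {
        Self {
            remaining: Mutex::new(build_partitions),
            all_done: Condvar::new(),
        }
    }

    /// Build side: call once per partition after its filter values are published.
    fn partition_done(&self) {
        let mut remaining = self.remaining.lock().unwrap();
        *remaining = remaining.saturating_sub(1);
        if *remaining == 0 {
            // Wake every probe-side waiter now that the filter is complete.
            self.all_done.notify_all();
        }
    }

    /// Probe side: block until every build partition has reported.
    fn wait_ready(&self) {
        let guard = self.remaining.lock().unwrap();
        let _guard = self
            .all_done
            .wait_while(guard, |remaining| *remaining > 0)
            .unwrap();
    }
}
```

With something like this, the probe side is woken exactly when the last build partition reports, with no sleep interval to tune.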
   
   ### Describe alternatives you've considered
   
   _No response_
   
   ### Additional context
   
   This is just an enhancement suggestion we came up with while testing the 
API, not a hard requirement :)


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

