LiaCastaneda commented on PR #19761:
URL: https://github.com/apache/datafusion/pull/19761#issuecomment-3935106429

   👋 Something I noticed while using `BufferExec` in our service is that when 
the build side of a `HashJoinExec` (`INNER` join) returns 0 rows, the probe 
side is still fully consumed. IIUC, the short-circuit 
[here](https://github.com/apache/datafusion/blob/ace9cd44b7356d60e6d69d0b98ac3f5606d55507/datafusion/physical-plan/src/joins/hash_join/stream.rs#L647)
 only skips the hash join lookup work, but `fetch_probe_batch` still runs for 
every probe batch until the stream is exhausted. I think this happens 
regardless of whether there is a `BufferExec` or not.
   
   Would it make sense to detect an empty build side right after 
`collect_build_side` completes and, for join types where an empty build side 
implies empty output, drop the probe stream immediately and jump to `Completed`?
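
   To make the proposal concrete, here is a rough, self-contained sketch of the state transition I have in mind. It is not DataFusion code: `JoinKind`, `State`, `state_after_build`, and `probe_batches_pulled` are hypothetical names, the enum is a simplified stand-in for the real join type, and it assumes the build side is the left input.

```rust
// Simplified stand-in for a join type enum; NOT DataFusion's actual `JoinType`.
#[allow(dead_code)]
#[derive(Clone, Copy, Debug, PartialEq)]
enum JoinKind {
    Inner,
    Left,
    Right,
    Full,
    LeftSemi,
    LeftAnti,
    RightSemi,
    RightAnti,
}

/// Assuming the build side is the left input: returns true when an empty
/// build side guarantees an empty join result, so the probe side is not needed.
/// `Right`, `Full`, and `RightAnti` still emit probe rows, so they return false.
fn empty_build_implies_empty_output(kind: JoinKind) -> bool {
    matches!(
        kind,
        JoinKind::Inner
            | JoinKind::Left
            | JoinKind::LeftSemi
            | JoinKind::LeftAnti
            | JoinKind::RightSemi
    )
}

// Hypothetical, reduced set of stream states for illustration only.
#[derive(Debug, PartialEq)]
enum State {
    FetchProbeBatch,
    Completed,
}

/// Decides the next state once the build side has been collected.
fn state_after_build(kind: JoinKind, build_rows: usize) -> State {
    if build_rows == 0 && empty_build_implies_empty_output(kind) {
        State::Completed
    } else {
        State::FetchProbeBatch
    }
}

/// Returns how many probe batches get pulled under this scheme.
/// The probe "stream" is modeled as a plain iterator of batches.
fn probe_batches_pulled<I>(kind: JoinKind, build_rows: usize, probe: I) -> usize
where
    I: Iterator<Item = Vec<i64>>,
{
    match state_after_build(kind, build_rows) {
        // Short-circuit: drop the probe side without polling it at all.
        State::Completed => {
            drop(probe);
            0
        }
        // Otherwise the probe side is consumed as it is today.
        State::FetchProbeBatch => probe.count(),
    }
}
```

   With this, an `INNER` join with an empty build side would pull zero probe batches, while a join type that must still emit probe rows keeps the current behavior.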


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

