ctsk commented on PR #16153: URL: https://github.com/apache/datafusion/pull/16153#issuecomment-2906567750
This optimization is neat and already covers the common case of joins on primary keys. I think we can further optimize the join hash table, even for cases where *some* keys might have chains. Instead of looking for a 0 value in the `next` array to detect the end of a chain, we can encode whether there is a next value in the top bit of the current value. That saves a lookup in the `next` array on every probe that has at least one match. I don't know how well this plays with the streaming join hash map though =)
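
To make the idea concrete, here is a minimal sketch of the top-bit encoding. It is not the DataFusion implementation: `TaggedChainMap`, `HAS_NEXT`, `heads`, and the use of `std::collections::HashMap` are all hypothetical stand-ins chosen for illustration, and the sketch assumes row indices fit in 63 bits. Because the flag replaces the 0 sentinel, the sketch stores plain row indices rather than offset ones.

```rust
use std::collections::HashMap;

/// Top bit of a stored value: set when another row follows in the chain.
const HAS_NEXT: u64 = 1 << 63;

/// Hypothetical, simplified stand-in for a chained join hash map.
struct TaggedChainMap {
    /// key hash -> tagged index of the most recently inserted row
    heads: HashMap<u64, u64>,
    /// next[row] -> tagged index of the previous row with the same hash;
    /// only read when the value that pointed at `row` had HAS_NEXT set
    next: Vec<u64>,
}

impl TaggedChainMap {
    fn with_rows(num_rows: usize) -> Self {
        Self {
            heads: HashMap::new(),
            next: vec![0; num_rows],
        }
    }

    fn insert(&mut self, hash: u64, row: u64) {
        assert!(row & HAS_NEXT == 0, "row index must fit in 63 bits");
        match self.heads.get_mut(&hash) {
            Some(head) => {
                // The old head (with its own flag) becomes the successor of `row`.
                self.next[row as usize] = *head;
                *head = row | HAS_NEXT;
            }
            None => {
                // First row for this hash: no flag, so probes never touch `next`.
                self.heads.insert(hash, row);
            }
        }
    }

    /// Returns all build-side rows for `hash`, newest first.
    fn probe(&self, hash: u64) -> Vec<u64> {
        let mut out = Vec::new();
        let Some(&head) = self.heads.get(&hash) else {
            return out;
        };
        let mut cur = head;
        loop {
            let row = cur & !HAS_NEXT;
            out.push(row);
            if cur & HAS_NEXT == 0 {
                // End of chain is known from the value itself; `next` is not read.
                break;
            }
            cur = self.next[row as usize];
        }
        out
    }
}
```

In this sketch a key with a single build-side row is resolved from `heads` alone, and a chain of n rows reads `next` n - 1 times instead of n, since the last row's end-of-chain status is already known from the value that pointed at it.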