[ https://issues.apache.org/jira/browse/CALCITE-2973?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16838246#comment-16838246 ]
Lai Zhou commented on CALCITE-2973:
-----------------------------------

[~zabetak], regarding the query you mentioned,
{code:java}
SELECT e.name
FROM emp e
INNER JOIN department d
ON e.address.zipcode = d.zipcode
{code}
I added a test for it and found that the RexFieldAccess `e.address.zipcode` is converted to a new RexInputRef by JoinPushExpressionsRule, see [https://github.com/apache/calcite/blob/6afa38bae794462e6e250237a1b60cc4220b2885/core/src/main/java/org/apache/calcite/plan/RelOptUtil.java#L3290].

Please see the latest commit; there is a test named `leftOuterJoinWithPredicateContainsRexFieldAccess` in EnumerableJoinTest.

I admit that the rule-based approach you proposed would also work for this issue. But I still think it is a little complicated, and introducing a new projection seems to add computational overhead.

> Allow theta joins that have equi conditions to be executed using a hash join algorithm
> --------------------------------------------------------------------------------------
>
>                 Key: CALCITE-2973
>                 URL: https://issues.apache.org/jira/browse/CALCITE-2973
>             Project: Calcite
>          Issue Type: New Feature
>          Components: core
>    Affects Versions: 1.19.0
>            Reporter: Lai Zhou
>            Priority: Minor
>              Labels: pull-request-available
>             Fix For: 1.20.0
>
>          Time Spent: 10m
>  Remaining Estimate: 0h
>
> Currently, the EnumerableMergeJoinRule supports only inner equi-joins.
> If users run a theta-join query on a large dataset (such as 10000*10000),
> the nested-loop join takes dozens of times longer than the sort-merge
> join.
> So if we can apply the merge-join or hash-join rule to a theta join, it will
> improve performance greatly.

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)