[ https://issues.apache.org/jira/browse/CALCITE-2973?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16818940#comment-16818940 ]
Stamatis Zampetakis commented on CALCITE-2973:
----------------------------------------------

In terms of code re-use, it would seem more natural for the join to handle only the equality part of the condition and leave the remaining condition to be evaluated afterwards. As Julian mentioned, when outer joins are involved the filter cannot be applied after the join, but I have the impression that a projection could achieve the same result (i.e., nullify the left/right part when a certain condition holds). An additional benefit is that if we could break a theta join into an equi-join plus a filter/projection (using a rule), more users could exploit it.

In terms of semantics, having the join operator do all the work is more intuitive and the plan is easier to understand, so in the end I haven't made up my mind about which approach is best.

> Allow theta joins that have equi conditions to be executed using a hash join algorithm
> --------------------------------------------------------------------------------------
>
>                 Key: CALCITE-2973
>                 URL: https://issues.apache.org/jira/browse/CALCITE-2973
>             Project: Calcite
>          Issue Type: New Feature
>          Components: core
>    Affects Versions: 1.19.0
>            Reporter: Lai Zhou
>            Priority: Minor
>              Labels: pull-request-available
>          Time Spent: 10m
>  Remaining Estimate: 0h
>
> Currently the EnumerableMergeJoinRule supports only inner equi-joins.
> If users run a theta-join query over a large dataset (such as 10000*10000), the nested-loop join takes dozens of times longer than the sort-merge join.
> So if we can apply the merge-join or hash-join rule to a theta join, it will improve performance greatly.
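
To make the discussion above concrete, here is a minimal sketch of the "join operator does all the work" variant: a hash join that builds and probes on the equi key and evaluates the remaining (non-equi) condition per candidate pair during the probe. This is not Calcite code; the class `ThetaHashJoinSketch`, the method `leftJoin`, and its parameters are hypothetical names chosen for illustration, and the sketch ignores SQL NULL-key semantics.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.BiPredicate;
import java.util.function.Function;

/** Hypothetical illustration only; not part of Calcite. */
final class ThetaHashJoinSketch {

  /**
   * Left outer join on "leftKey = rightKey AND residual".
   * Each output row is {leftRow, rightRow}; rightRow is null when no
   * right row satisfies the full condition for that left row.
   * Assumption: keys are non-null (SQL NULL handling is omitted).
   */
  static <L, R, K> List<Object[]> leftJoin(
      List<L> left, List<R> right,
      Function<L, K> leftKey, Function<R, K> rightKey,
      BiPredicate<L, R> residual) {
    // Build phase: hash the right input on the equi-join key.
    Map<K, List<R>> table = new HashMap<>();
    for (R r : right) {
      table.computeIfAbsent(rightKey.apply(r), k -> new ArrayList<>()).add(r);
    }
    // Probe phase: only rows sharing the key are tested against the residual.
    List<Object[]> out = new ArrayList<>();
    for (L l : left) {
      boolean matched = false;
      List<R> candidates =
          table.getOrDefault(leftKey.apply(l), Collections.emptyList());
      for (R r : candidates) {
        if (residual.test(l, r)) {
          out.add(new Object[] {l, r});
          matched = true;
        }
      }
      // Outer-join semantics: emit the left row once, padded with null.
      if (!matched) {
        out.add(new Object[] {l, null});
      }
    }
    return out;
  }

  public static void main(String[] args) {
    // Rows are {key, value}; condition: l.key = r.key AND l.value < r.value.
    List<int[]> left = Arrays.asList(
        new int[] {1, 10}, new int[] {1, 50}, new int[] {2, 7});
    List<int[]> right = Arrays.asList(
        new int[] {1, 20}, new int[] {1, 30}, new int[] {3, 5});
    List<Object[]> out =
        leftJoin(left, right, l -> l[0], r -> r[0], (l, r) -> l[1] < r[1]);
    System.out.println(out.size()); // prints 4
  }
}
```

In this example the left row {1, 50} finds key matches but fails the residual for both, so it is emitted once with a null right side. Running a plain equi left join first and filtering on `l.value < r.value` afterwards would instead keep only the two rows for {1, 10}, which illustrates why the filter cannot simply be applied after an outer join.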