[ https://issues.apache.org/jira/browse/CALCITE-2973?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16815943#comment-16815943 ]
Lai Zhou edited comment on CALCITE-2973 at 4/18/19 3:04 AM:
------------------------------------------------------------

[~julianhyde], [~zabetak], [~hyuan]

I made a PR to improve EnumerableJoin. Since EnumerableMergeJoin is never taken, I changed the summary to "Allow theta joins that have equi conditions to be executed using a hash join algorithm."

A join rel node is now converted to an EnumerableJoin if it has mixed equi and non-equi conditions.
See [EnumerableJoinRule.java#L62|https://github.com/apache/calcite/blob/16098ab6ff68797b4eaad90718dcae8e83047e2b/core/src/main/java/org/apache/calcite/adapter/enumerable/EnumerableJoinRule.java#L62]

EnumerableJoin can now handle a per-row condition: I introduced remainCondition to generate the predicate for the join.
See [EnumerableJoin.java#L250|https://github.com/apache/calcite/blob/16098ab6ff68797b4eaad90718dcae8e83047e2b/core/src/main/java/org/apache/calcite/adapter/enumerable/EnumerableJoin.java#L250]

I also introduced a new algorithm to support a join with a predicate.
See [EnumerableDefaults.java#L1061|https://github.com/apache/calcite/blob/16098ab6ff68797b4eaad90718dcae8e83047e2b/linq4j/src/main/java/org/apache/calcite/linq4j/EnumerableDefaults.java#L1061]

was (Author: hhlai1990):

[~julianhyde], [~zabetak], [~hyuan]

I made a PR to improve EnumerableJoin. Since EnumerableMergeJoin is never taken, I changed the summary to "Allow theta joins that have equi conditions to be executed using a hash join algorithm."

A join rel node is now converted to an EnumerableJoin if it has mixed equi and non-equi conditions.
See [https://github.com/apache/calcite/blob/16098ab6ff68797b4eaad90718dcae8e83047e2b/core/src/main/java/org/apache/calcite/adapter/enumerable/EnumerableJoinRule.java#L62|https://github.com/apache/calcite/blob/2251c82f209612d8ae31e2e7a42acdb2bcb15d55/core/src/main/java/org/apache/calcite/adapter/enumerable/EnumerableJoinRule.java#L62]

EnumerableJoin can now handle a per-row condition: I introduced remainCondition to generate the predicate for the join.
See [https://github.com/apache/calcite/blob/16098ab6ff68797b4eaad90718dcae8e83047e2b/core/src/main/java/org/apache/calcite/adapter/enumerable/EnumerableJoin.java#L250|https://github.com/apache/calcite/blob/2251c82f209612d8ae31e2e7a42acdb2bcb15d55/core/src/main/java/org/apache/calcite/adapter/enumerable/EnumerableJoin.java#L250]

I also introduced a new algorithm to support a join with a predicate.
See [https://github.com/apache/calcite/blob/16098ab6ff68797b4eaad90718dcae8e83047e2b/linq4j/src/main/java/org/apache/calcite/linq4j/EnumerableDefaults.java#L1061|https://github.com/apache/calcite/blob/2251c82f209612d8ae31e2e7a42acdb2bcb15d55/linq4j/src/main/java/org/apache/calcite/linq4j/EnumerableDefaults.java#L1061]

> Allow theta joins that have equi conditions to be executed using a hash join algorithm
> --------------------------------------------------------------------------------------
>
>                 Key: CALCITE-2973
>                 URL: https://issues.apache.org/jira/browse/CALCITE-2973
>             Project: Calcite
>          Issue Type: New Feature
>          Components: core
>    Affects Versions: 1.19.0
>            Reporter: Lai Zhou
>            Priority: Minor
>              Labels: pull-request-available
>          Time Spent: 10m
>  Remaining Estimate: 0h
>
> Currently the EnumerableMergeJoinRule supports only inner equi joins.
> If users run a theta-join query on a large dataset (such as 10000*10000),
> the nested-loop join will take dozens of times longer than a sort-merge
> join.
> So if we can apply a merge-join or hash-join rule to a theta join, it will
> improve performance greatly.
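To illustrate the idea discussed in this issue (hash on the equi part of the condition, then apply the remaining non-equi part as a per-row predicate), here is a minimal sketch in plain Java. It is not the code from the PR or from Calcite's EnumerableDefaults; the class name ThetaHashJoinSketch, the method hashJoin, and all parameter names are hypothetical, and it covers only the inner-join case. SQL null-key handling is ignored for brevity.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.BiFunction;
import java.util.function.BiPredicate;
import java.util.function.Function;

/** Hypothetical sketch: inner hash join with a residual (non-equi) predicate. */
public class ThetaHashJoinSketch {

  public static <L, R, K, T> List<T> hashJoin(
      List<L> left, List<R> right,
      Function<L, K> leftKey, Function<R, K> rightKey,
      BiPredicate<L, R> residual,
      BiFunction<L, R, T> resultSelector) {
    // Build phase: index the right input by its equi-join key.
    Map<K, List<R>> table = new HashMap<>();
    for (R r : right) {
      table.computeIfAbsent(rightKey.apply(r), k -> new ArrayList<>()).add(r);
    }
    // Probe phase: look up each left row's key, then test the residual
    // predicate only against rows that already matched on the key.
    List<T> out = new ArrayList<>();
    for (L l : left) {
      List<R> bucket = table.get(leftKey.apply(l));
      if (bucket == null) {
        continue; // no right row shares this key
      }
      for (R r : bucket) {
        if (residual.test(l, r)) {
          out.add(resultSelector.apply(l, r));
        }
      }
    }
    return out;
  }

  public static void main(String[] args) {
    // Rows are {deptId, amount}; the data is made up for the example.
    List<int[]> emps = Arrays.asList(
        new int[] {10, 100}, new int[] {10, 300}, new int[] {20, 200});
    List<int[]> caps = Arrays.asList(
        new int[] {10, 250}, new int[] {20, 150}, new int[] {30, 999});
    // Equivalent condition: e.deptId = c.deptId AND e.amount < c.amount
    List<String> result = hashJoin(
        emps, caps,
        e -> e[0], c -> c[0],
        (e, c) -> e[1] < c[1],
        (e, c) -> e[0] + ":" + e[1] + "<" + c[1]);
    System.out.println(result); // prints [10:100<250]
  }
}
```

Compared with a nested-loop join, the residual predicate here is evaluated only for pairs that share an equi key, so the number of predicate evaluations depends on bucket sizes rather than on the full cross product.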
-- This message was sent by Atlassian JIRA (v7.6.3#76005)