[ https://issues.apache.org/jira/browse/CALCITE-2973?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16808324#comment-16808324 ]
Lai Zhou commented on CALCITE-2973:
-----------------------------------

[~julianhyde], [~zabetak], good idea. I have just created a new rule for my application, to avoid changing calcite-core.

I'll make a PR later to allow theta joins to be executed using a merge join or hash join.

I drew a table to describe the relationship between join types and join operators:

|| ||inner||non-inner||
|*only equi condition*|EnumerableJoin|EnumerableJoin|
|*only non-equi condition*|EnumerableJoin|EnumerableThetaJoin|
|*mixed equi and non-equi condition*|EnumerableJoin+EnumerableFilter or EnumerableMergeJoin (changed)|EnumerableThetaJoin or EnumerableMergeJoin or EnumerableHashJoin|

If a join is non-inner and has both equi and non-equi conditions, we have three choices for planning it.

EnumerableThetaJoin and EnumerableMergeJoin each already have a corresponding rule. What do you think about introducing a new rule (EnumerableThetaHashJoinRule) to allow theta joins to be executed using a hash join?

> Allow theta joins to be executed using a merge join algorithm
> -------------------------------------------------------------
>
>                 Key: CALCITE-2973
>                 URL: https://issues.apache.org/jira/browse/CALCITE-2973
>             Project: Calcite
>          Issue Type: New Feature
>          Components: core
>    Affects Versions: 1.19.0
>            Reporter: Lai Zhou
>            Priority: Minor
>
> Now the EnumerableMergeJoinRule only supports inner equi joins.
> If users run a theta-join query on a large dataset (such as 10000*10000),
> the nested-loop join process will take dozens of times longer than the
> sort-merge join process.
> So if we can apply a merge-join or hash-join rule to a theta join, it will
> improve the performance greatly.

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
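
To illustrate the hash-join proposal in the comment above, here is a minimal sketch of a left outer join whose condition mixes an equi part and a non-equi part. It is not Calcite code: the class ThetaHashJoinSketch, the method leftHashJoin, and the parameter names are all hypothetical, and it assumes join keys are non-null. The equi part is used to build and probe a hash table, and the non-equi part is checked as a residual predicate on each candidate pair. For a non-inner join the residual cannot simply be applied as a filter after the join, because a left row whose candidates all fail the residual must still be emitted once, padded with null.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.BiPredicate;
import java.util.function.Function;

/** Hypothetical illustration only; this is not part of Calcite. */
final class ThetaHashJoinSketch {

  private ThetaHashJoinSketch() {}

  /**
   * Left outer join on "leftKey(l) = rightKey(r) AND residual(l, r)".
   * Each result is a two-element array {left, right}; right is null when
   * no right row satisfies the full condition.
   * Assumption: keys are non-null (SQL NULL-key semantics are not modelled).
   */
  static <L, R, K> List<Object[]> leftHashJoin(
      List<L> left,
      List<R> right,
      Function<L, K> leftKey,
      Function<R, K> rightKey,
      BiPredicate<L, R> residual) {
    // Build phase: index the right input by its equi-join key.
    Map<K, List<R>> index = new HashMap<>();
    for (R r : right) {
      index.computeIfAbsent(rightKey.apply(r), k -> new ArrayList<>()).add(r);
    }

    // Probe phase: look up candidates by key, then test the non-equi part.
    List<Object[]> out = new ArrayList<>();
    for (L l : left) {
      boolean matched = false;
      List<R> candidates =
          index.getOrDefault(leftKey.apply(l), Collections.<R>emptyList());
      for (R r : candidates) {
        if (residual.test(l, r)) {
          out.add(new Object[] {l, r});
          matched = true;
        }
      }
      // Outer-join padding: keep the left row even if nothing matched.
      if (!matched) {
        out.add(new Object[] {l, null});
      }
    }
    return out;
  }
}
```

With rows modelled as int[] {key, value}, a condition such as "l.key = r.key AND l.value > r.value" would be passed as the key extractors l -> l[0] and r -> r[0] plus the residual (l, r) -> l[1] > r[1]. Only rows that share a key are compared, so the cost depends on bucket sizes rather than on the full cross product that a nested-loop join examines.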