[jira] [Commented] (CALCITE-2973) Allow theta joins to be executed using a merge join algorithm

Lai Zhou (JIRA) Tue, 02 Apr 2019 20:42:15 -0700


    [ 
https://issues.apache.org/jira/browse/CALCITE-2973?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16808324#comment-16808324
 ]


Lai Zhou commented on CALCITE-2973:
-----------------------------------

[~julianhyde], [~zabetak], good idea.

I just created a new rule in my application, to avoid changing 
calcite-core.

I'll make a PR later to allow theta joins to be executed using a merge join or 
hash join.

I drew a table to describe the relationship between join types and join operators:


|| ||inner||non-inner||
|*only equi condition*|EnumerableJoin|EnumerableJoin|
|*only non-equi condition*|EnumerableJoin|EnumerableThetaJoin|
|*mixed equi and non-equi condition*|EnumerableJoin+EnumerableFilter or EnumerableMergeJoin (changed)|EnumerableThetaJoin or EnumerableMergeJoin or EnumerableHashJoin|
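
To illustrate the mixed-condition inner case in the table, here is a minimal sketch of a sort-merge join that uses the equi key to align the two inputs and then applies the non-equi part as a residual predicate. It assumes both inputs are already sorted ascending on the key in column 0. The class and method names (ThetaMergeJoinSketch, innerJoin) are hypothetical and are not Calcite's EnumerableMergeJoin.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.BiPredicate;

/** Hypothetical illustration only; not a Calcite class. */
public class ThetaMergeJoinSketch {

  /**
   * Inner join of two inputs that are already sorted ascending on column 0.
   * The equi key (column 0) drives the merge; the residual predicate
   * carries the non-equi part of the join condition.
   */
  public static List<int[][]> innerJoin(
      List<int[]> left, List<int[]> right,
      BiPredicate<int[], int[]> residual) {
    List<int[][]> out = new ArrayList<>();
    int i = 0;
    int j = 0;
    while (i < left.size() && j < right.size()) {
      int lk = left.get(i)[0];
      int rk = right.get(j)[0];
      if (lk < rk) {
        i++;
      } else if (lk > rk) {
        j++;
      } else {
        // Find the run of rows sharing this key on each side.
        int iEnd = i;
        while (iEnd < left.size() && left.get(iEnd)[0] == lk) {
          iEnd++;
        }
        int jEnd = j;
        while (jEnd < right.size() && right.get(jEnd)[0] == lk) {
          jEnd++;
        }
        // Only pairs inside the runs are tested against the residual.
        for (int a = i; a < iEnd; a++) {
          for (int b = j; b < jEnd; b++) {
            if (residual.test(left.get(a), right.get(b))) {
              out.add(new int[][] {left.get(a), right.get(b)});
            }
          }
        }
        i = iEnd;
        j = jEnd;
      }
    }
    return out;
  }
}
```

The non-equi predicate is evaluated only on pairs that already agree on the equi key, so the quadratic work is limited to each run of equal keys rather than the whole cross product.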

If a join is non-inner and has both equi and non-equi conditions, we 
have 3 choices for planning it.

Currently, EnumerableThetaJoin and EnumerableMergeJoin each have a 
corresponding rule.

What do you think if I introduce a new rule (EnumerableThetaHashJoinRule) to 
allow theta joins to be executed using a hash join?
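
As a rough sketch of what a hash join with a mixed condition would do for a non-inner (here, left outer) join: the equi keys drive the hash lookup, the non-equi part is checked as a residual predicate on each candidate pair, and left rows with no passing match are padded with null. The names below (ThetaHashJoinSketch, leftJoin) are hypothetical, the code is not taken from Calcite, and it ignores SQL NULL-key semantics.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.BiPredicate;
import java.util.function.Function;

/** Hypothetical illustration only; not a Calcite class. */
public class ThetaHashJoinSketch {

  /**
   * Left outer hash join with a residual (non-equi) predicate.
   * Each result row is a pair {leftRow, rightRowOrNull}.
   * Assumption: keys are non-null; SQL NULL handling is left out.
   */
  public static <L, R, K> List<Object[]> leftJoin(
      List<L> left, List<R> right,
      Function<L, K> leftKey, Function<R, K> rightKey,
      BiPredicate<L, R> residual) {
    // Build phase: index the right input by its equi-join key.
    Map<K, List<R>> index = new HashMap<>();
    for (R r : right) {
      index.computeIfAbsent(rightKey.apply(r), k -> new ArrayList<>()).add(r);
    }
    // Probe phase: look up candidates by key, then apply the residual.
    List<Object[]> out = new ArrayList<>();
    for (L l : left) {
      boolean matched = false;
      List<R> candidates =
          index.getOrDefault(leftKey.apply(l), Collections.<R>emptyList());
      for (R r : candidates) {
        if (residual.test(l, r)) {
          out.add(new Object[] {l, r});
          matched = true;
        }
      }
      // Outer-join padding: keep the left row even without a match.
      if (!matched) {
        out.add(new Object[] {l, null});
      }
    }
    return out;
  }
}
```

The point of the sketch is that the residual predicate has to be evaluated inside the join, before deciding whether to pad with null. Putting a filter on top of an equi join, as is possible for inner joins, would give wrong results for outer joins.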

 

 

> Allow theta joins to be executed using a merge join algorithm
> -------------------------------------------------------------
>
>                 Key: CALCITE-2973
>                 URL: https://issues.apache.org/jira/browse/CALCITE-2973
>             Project: Calcite
>          Issue Type: New Feature
>          Components: core
>    Affects Versions: 1.19.0
>            Reporter: Lai Zhou
>            Priority: Minor
>
> Currently, EnumerableMergeJoinRule only supports inner equi joins.
> If users run a theta-join query on a large dataset (such as 10000*10000), 
> the nested-loop join takes dozens of times longer than a sort-merge 
> join.
> So if we can apply a merge-join or hash-join rule to a theta join, it will 
> improve performance greatly.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
