[jira] [Commented] (CALCITE-2979) Add a block-based nested loop join algorithm

Danny Chan (JIRA) Fri, 24 May 2019 23:17:26 -0700


    [ 
https://issues.apache.org/jira/browse/CALCITE-2979?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16848051#comment-16848051
 ]


Danny Chan commented on CALCITE-2979:
-------------------------------------

A runtime filter [1] serves a similar purpose, namely improving join 
efficiency. While this issue aims to reduce the number of times the inner 
relation is scanned, a runtime filter aims to reduce the total amount of data 
that needs to be scanned. Do we need to support runtime filters for hash joins 
in Calcite?

[1] https://www.cloudera.com/documentation/enterprise/5-9-x/topics/impala_runtime_filtering.html
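
To make the distinction concrete, here is a minimal sketch of a runtime filter 
applied to a hash join. It is an illustration only, not Calcite's or Impala's 
implementation: the class RuntimeFilterSketch and the methods scan and join are 
hypothetical names, and an exact key set stands in for the compact structures 
(often Bloom filters) that real engines tend to use. The point it shows is that 
the filter is derived from the build side at run time and applied while 
scanning the probe side, so non-matching rows never reach the join.

```java
import java.util.AbstractMap;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.function.Function;
import java.util.function.Predicate;

// Hypothetical sketch; none of these names exist in Calcite.
final class RuntimeFilterSketch {
  private RuntimeFilterSketch() {}

  /** Stand-in for a table scan that evaluates a pushed-down predicate. */
  static <P> List<P> scan(List<P> table, Predicate<P> pushedFilter) {
    List<P> out = new ArrayList<>();
    for (P row : table) {
      if (pushedFilter.test(row)) {
        out.add(row);
      }
    }
    return out;
  }

  /** Inner equi-join that filters the probe side with the build-side keys. */
  static <B, P, K> List<Map.Entry<B, P>> join(
      List<B> build, List<P> probeTable,
      Function<B, K> buildKey, Function<P, K> probeKey) {
    // 1. Build the hash table on the (usually smaller) build side.
    Map<K, List<B>> hashTable = new HashMap<>();
    for (B b : build) {
      hashTable.computeIfAbsent(buildKey.apply(b), k -> new ArrayList<>()).add(b);
    }
    // 2. Derive the runtime filter. Assumption: an exact key set; a real
    //    engine would more likely ship an approximate, compact filter.
    Set<K> runtimeFilter = hashTable.keySet();
    // 3. Scan the probe side with the filter pushed down.
    List<P> survivors =
        scan(probeTable, p -> runtimeFilter.contains(probeKey.apply(p)));
    // 4. Probe the hash table with the surviving rows only.
    List<Map.Entry<B, P>> result = new ArrayList<>();
    for (P p : survivors) {
      for (B b : hashTable.get(probeKey.apply(p))) {
        result.add(new AbstractMap.SimpleImmutableEntry<>(b, p));
      }
    }
    return result;
  }
}
```

With an exact key set every surviving row has a match, so the lookup in step 4 
never returns null; with an approximate filter the probe would also have to 
handle false positives.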

> Add a block-based nested loop join algorithm
> --------------------------------------------
>
>                 Key: CALCITE-2979
>                 URL: https://issues.apache.org/jira/browse/CALCITE-2979
>             Project: Calcite
>          Issue Type: Improvement
>          Components: core
>    Affects Versions: 1.19.0
>            Reporter: Stamatis Zampetakis
>            Assignee: Khawla Mouhoubi
>            Priority: Major
>              Labels: performance
>
> Currently, Calcite provides a tuple-based nested loop join algorithm 
> implemented through EnumerableCorrelate and EnumerableDefaults.correlateJoin. 
> This means that for each tuple of the outer relation we probe (set variables) 
> in the inner relation.
> The goal of this issue is to add a new algorithm (or extend the correlateJoin 
> method) which first gathers blocks (batches) of tuples from the outer 
> relation and then probes the inner relation once per block.
> There are cases (e.g., indexes) where the inner relation can be accessed by 
> more than one value, which can greatly improve performance, in particular 
> when the outer relation is big.
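
The block-based idea in the description above could be sketched roughly as 
follows. This is only a sketch under stated assumptions, not the proposed 
Calcite implementation: the class BlockNestedLoopJoinSketch, the Inner 
interface and its lookup method are hypothetical, and the inner relation is 
assumed to support a multi-key lookup (for instance through an index). It 
covers an inner equi-join only.

```java
import java.util.AbstractMap;
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.function.Function;

// Hypothetical sketch; none of these names exist in Calcite.
final class BlockNestedLoopJoinSketch {
  private BlockNestedLoopJoinSketch() {}

  /** Assumed inner relation that can be probed with several keys at once. */
  interface Inner<K, R> {
    List<R> lookup(Set<K> keys);
  }

  static <L, R, K> List<Map.Entry<L, R>> join(
      Iterable<L> outer, int blockSize,
      Function<L, K> leftKey, Function<R, K> rightKey, Inner<K, R> inner) {
    if (blockSize < 1) {
      throw new IllegalArgumentException("blockSize must be positive");
    }
    List<Map.Entry<L, R>> result = new ArrayList<>();
    List<L> block = new ArrayList<>(blockSize);
    for (L row : outer) {
      block.add(row);
      if (block.size() == blockSize) {
        probe(block, leftKey, rightKey, inner, result);
        block.clear();
      }
    }
    if (!block.isEmpty()) {
      probe(block, leftKey, rightKey, inner, result); // last, partial block
    }
    return result;
  }

  private static <L, R, K> void probe(
      List<L> block, Function<L, K> leftKey, Function<R, K> rightKey,
      Inner<K, R> inner, List<Map.Entry<L, R>> out) {
    // Collect the distinct join keys of the whole block.
    Set<K> keys = new LinkedHashSet<>();
    for (L row : block) {
      keys.add(leftKey.apply(row));
    }
    // Probe the inner relation once for the block, then group by key.
    Map<K, List<R>> matches = new HashMap<>();
    for (R r : inner.lookup(keys)) {
      matches.computeIfAbsent(rightKey.apply(r), k -> new ArrayList<>()).add(r);
    }
    // Pair each outer row of the block with its matching inner rows.
    for (L row : block) {
      List<R> hits =
          matches.getOrDefault(leftKey.apply(row), Collections.emptyList());
      for (R r : hits) {
        out.add(new AbstractMap.SimpleImmutableEntry<>(row, r));
      }
    }
  }
}
```

With a block size of 1 this degenerates to the tuple-based behaviour, one 
probe per outer tuple; with larger blocks the number of probes drops to 
roughly the outer row count divided by the block size.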



