[ 
https://issues.apache.org/jira/browse/IGNITE-21277?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Maksim Zhuravkov reassigned IGNITE-21277:
-----------------------------------------

    Assignee: Maksim Zhuravkov

> Sql. Partition pruning. Port partitionExtractor from AI2.
> ---------------------------------------------------------
>
>                 Key: IGNITE-21277
>                 URL: https://issues.apache.org/jira/browse/IGNITE-21277
>             Project: Ignite
>          Issue Type: Improvement
>          Components: sql
>    Affects Versions: 3.0.0-beta2
>            Reporter: Maksim Zhuravkov
>            Assignee: Maksim Zhuravkov
>            Priority: Major
>              Labels: ignite-3
>
> In order to prune unnecessary partitions we need to obtain information that 
> includes possible "values" of colocation key columns from filter expressions 
> for every scan operators prior to statement execution. This can be 
> accomplished by traversing an expression tree of scan's filter and collecting 
> expressions with colocation key columns (this data is called partition 
> pruning metadata for simplicity).
> 1. Implement a component that takes a physical plan and analyses filter 
> expressions of every scan operator and creates (if possible) an expression 
> that includes all colocated columns. (The PartitionExtractor from patch 
> (https://github.com/apache/ignite/pull/10928/files) can be used a reference 
> implementation).
> Expression types to analyze:
>  * AND
>  * EQUALS
>  * IS_FALSE
>  * IS_NOT_DISTINCT_FROM
>  * IS_NOT_FALSE
>  * IS_NOT_TRUE
>  * IS_TRUE
>  * NOT
>  * OR
>  * SEARCH (operation that tests whether a value is included in a certain 
> range)
> 2. Update QueryPlan to include partition pruning metadata for every scan 
> operator (source_id = <partition pruning metadata>).
> Basic examples:
> {code:java}
> Statement: 
> SELECT * FROM t WHERE pk = 7 OR pk = 42
> Partition metadata: 
> t's source_id = [pk=7 || pk = 42] // Assuming colocation key is equal to 
> primary key, || denotes OR operation 
> Statement: 
> SELECT * FROM t WHERE pk = 7 OR col1 = 1
> Partition metadata: [] // Empty, because col1 is not part of a colocation key.
> Statement: 
> SELECT * FROM t_colo_key1_colo_key2 WHERE colo_key1= 42
> Partition metadata: [] // Empty, because colo_key2 is missing 
> {code}
> —
> *Additional examples - partition pruning is possible*
> Dynamic parameters:
> {code:java}
> SELECT * FROM t WHERE pk = ?1 
> Partition pruning metadata: t = [ pk = ?1 ]
> {code}
> Colocation columns reside inside a nested expression:
> {code:java}
> SELECT * FROM t WHERE col1 = col2 AND (col2 = 100 AND pk = 2) 
> Partition pruning metadata: t = [ pk = 2 ]
> {code}
> Multiple keys:
> {code:java}
> SELECT * FROM t WHERE col_c1 = 1 AND col_c2 = 2 
> Partition pruning metadata:  t = [ (col_c1 = 1, col_c2 = 2) ]
> {code}
> Complex expression with multiple keys:
> {code:java}
> SELECT * FROM t WHERE (col_col1 = 100 and col_col2 = 4) OR (col_col1 = 4 and 
> col_col2 = 100)
> Partition pruning metadata: t = [ (col_col1 = 100, col_col2 = 4) || (col_col1 
> = 4, col_col2 = 100) ]
> {code}
> Multiple tables, assuming that filter b_id = 42 is pushed into scan b, 
> because a_id = b_id:
> {code:java}
> SELECT * FROM a JOIN b WHERE a_id = b_id AND a_id = 42 
> Partition pruning metadata: a= [ a_id=42 ], b=[ b_id=42 ]
> {code}
> ---
> *Additional examples - partition pruning is not possible*
> Columns named col* are not part of colocation key:
> {code:java}
> SELECT * FROM t WHERE col1 = 10 
> // Filter does not use colocation key columns.
> Partition pruning metadata: [] 
> {code}
> {code:java}
> SELECT * FROM t WHERE col1 = col2 OR pk = 42 
> // We need to scan all partitions to figure out which tuples have ‘col1 = 
> col2’
> Partition pruning metadata: [] 
> {code}
> {code:java}
> SELECT * FROM t WHERE col_col1 = 10 AND col_col2 OR col_col1 = 42
> // Although first expression uses all colocation key columns the second one 
> only uses some.
> Partition pruning metadata: [] 
> {code}
> {code:java}
> SELECT * FROM t WHERE col_c1 = 1 OR col_c2 = 2 
> // We need to scan all partitions to figure out which tuples have col_c1 = 1 
> OR col_c2 = 2.
> Partition pruning metadata: [] 
> {code}
> {code:java}
> SELECT * FROM t WHERE col_col1 = col_col2 OR col_col2 = 42 
> // We need to scan all partitions to figure out which tuples have ‘col_col1 = 
> col_col2’
> Partition pruning metadata: [] 
> {code}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

Reply via email to