[ https://issues.apache.org/jira/browse/IGNITE-16493?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17494024#comment-17494024 ]
Aleksey Plekhanov commented on IGNITE-16493: -------------------------------------------- Looks like there are some problems in decorrelation procedure in Calcite, for example, query: {noformat} CREATE TABLE integers(i INTEGER) WITH TEMPLATE=REPLICATED; INSERT INTO integers VALUES (1), (2), (3), (NULL); SELECT i, EXISTS(SELECT i FROM integers WHERE i IS NULL OR i>i1.i*10) FROM integers i1;{noformat} Produces plan like: {noformat} IgniteProject(I=[$2], EXPR$1=[IS NOT NULL($5)]): rowcount = 4.15, cumulative cost = IgniteCost [rowCount=37.15, cpu=73.15, memory=28.0, io=0.0, network=0.0], id = 3318 IgniteNestedLoopJoin(condition=[=($3, $4)], joinType=[left], variablesSet=[[]]): rowcount = 4.15, cumulative cost = IgniteCost [rowCount=33.0, cpu=69.0, memory=28.0, io=0.0, network=0.0], id = 3317 IgniteTableScan(table=[[PUBLIC, INTEGERS]], projects=[[$t0, $t1, $t2, *($t2, 10)]], requiredColumns=[{0, 1, 2}]): rowcount = 4.0, cumulative cost = IgniteCost [rowCount=4.0, cpu=4.0, memory=0.0, io=0.0, network=0.0], id = 1048 IgniteProject($f3=[$0], $f1=[true]): rowcount = 1.0, cumulative cost = IgniteCost [rowCount=25.0, cpu=49.0, memory=20.0, io=0.0, network=0.0], id = 3316 IgniteSingleHashAggregate(group=[{0}]): rowcount = 1.0, cumulative cost = IgniteCost [rowCount=24.0, cpu=48.0, memory=20.0, io=0.0, network=0.0], id = 3315 IgniteProject($f3=[$3]): rowcount = 2.0, cumulative cost = IgniteCost [rowCount=22.0, cpu=46.0, memory=16.0, io=0.0, network=0.0], id = 3314 IgniteNestedLoopJoin(condition=[OR(IS NULL($2), >($2, $3))], joinType=[inner], variablesSet=[[]]): rowcount = 2.0, cumulative cost = IgniteCost [rowCount=20.0, cpu=44.0, memory=16.0, io=0.0, network=0.0], id = 3313 IgniteTableScan(table=[[PUBLIC, INTEGERS]]): rowcount = 4.0, cumulative cost = IgniteCost [rowCount=4.0, cpu=4.0, memory=0.0, io=0.0, network=0.0], id = 269 IgniteSingleHashAggregate(group=[{0}]): rowcount = 2.0, cumulative cost = IgniteCost [rowCount=8.0, cpu=8.0, memory=8.0, io=0.0, network=0.0], id = 3312 IgniteTableScan(table=[[PUBLIC, INTEGERS]], projects=[[*($t0, 10)]], requiredColumns=[{2}]): rowcount = 4.0, cumulative cost = IgniteCost [rowCount=4.0, cpu=4.0, memory=0.0, io=0.0, network=0.0], id = 282{noformat} In this plan {{IgniteNestedLoopJoin(condition=[=($3, $4)], joinType=[left])}} filter out {{null}} values derived from the right hand and return wrong result (\{{(null, false)}} instead of {{{}(null, true){}}}) > Calcite engine. Decorrelation after subquery rewrite > ---------------------------------------------------- > > Key: IGNITE-16493 > URL: https://issues.apache.org/jira/browse/IGNITE-16493 > Project: Ignite > Issue Type: Improvement > Reporter: Aleksey Plekhanov > Priority: Major > Labels: calcite, calcite2-required, calcite3-required > > Currently decorrelation is performed in {{SqlToRelConverter}}, but after this > {{PlannerPhase.HEP_DECORRELATE}} planning phase is executed (which actually > rewrites subqueries into correlates, but doesn't perform decorrelation). In > some cases, other types of join can cost less than correlated nested loop > join, so we can try to decorrelate the query plan again after > {{PlannerPhase.HEP_DECORRELATE}} phase (call > {{RelDecorrelator.decorrelateQuery}}). > {{PlannerPhase.HEP_DECORRELATE}} should be also renamed to something like > "HEP_REWRITE_SUBQUERY". -- This message was sent by Atlassian Jira (v8.20.1#820001)