Hello Tim Armstrong, Impala Public Jenkins, I'd like you to reexamine a change. Please visit
http://gerrit.cloudera.org:8080/12939 to look at the new patch set (#4).

Change subject: IMPALA-8386: Fix incorrect equivalence conjuncts not treated as identity
......................................................................

IMPALA-8386: Fix incorrect equivalence conjuncts not treated as identity

When generating single node plans for inline views, Impala creates some
equivalence conjuncts based on slot equivalences. However, these conjuncts
may eventually be substituted into identity predicates (e.g. a = a), which
can incorrectly reject rows with NULLs. We already have some logic to remove
this kind of conjunct, but the existing checks miss some cases.

For example, consider the following tables and query:

  table A      table B               table C
  +------+     +------+--------+     +------+------+
  | a_id |     | b_id | amount |     | a_id | b_id |
  +------+     +------+--------+     +------+------+
  | 1    |     | 1    | 10     |     | 1    | 1    |
  | 2    |     | 1    | 20     |     | 2    | 2    |
  +------+     | 2    | NULL   |     +------+------+
               +------+--------+

  select * from
    (select t2.a_id, t2.amount1, t2.amount2
     from a
     left outer join (
       select c.a_id, amount as amount1, amount as amount2
       from b join c on b.b_id = c.b_id
     ) t2 on a.a_id = t2.a_id
    ) t1;

The query has 11 slots. The valueTransferGraph (slot equivalences) has 3
strongly connected components:
 * {slot0 (b.b_id), slot1 (c.b_id)}
 * {slot2 (c.a_id), slot4 (t2.a_id), slot8 (t1.a_id)}
 * {slot3 (b.amount), slot5 (t2.amount1), slot6 (t2.amount2),
    slot9 (t1.amount1), slot10 (t1.amount2)}

In SingleNodePlanner#migrateConjunctsToInlineView, when dealing with inline
view t1, a predicate "t1.amount1 = t1.amount2" is first created by
Analyzer#createEquivConjuncts, then substituted using the smap_ of the
inline view to become "t2.amount1 = t2.amount2". At this point it is not yet
an identity, so it still passes the IdentityPredicate check. However, the
substituted predicate is finally resolved to "amount = amount" and assigned
to the left outer join node, so rows with NULLs are incorrectly rejected.
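
To illustrate why an identity predicate such as "amount = amount" is not
harmless, the sketch below loads the example tables into an in-memory SQLite
database and runs the query twice. SQLite is used purely as an assumption for
illustration; it is not Impala and does not reproduce Impala's plan. The
function name run_example and the explicit WHERE clause are hypothetical: the
WHERE clause only stands in for the identity conjunct, showing that NULL = NULL
does not evaluate to true under standard SQL semantics.

```python
import sqlite3


def run_example():
    """Return (rows of the original query, rows with an identity filter added)."""
    conn = sqlite3.connect(":memory:")
    # Tables and data copied from the example in the commit message.
    conn.executescript("""
        CREATE TABLE a (a_id INTEGER);
        CREATE TABLE b (b_id INTEGER, amount INTEGER);
        CREATE TABLE c (a_id INTEGER, b_id INTEGER);
        INSERT INTO a VALUES (1), (2);
        INSERT INTO b VALUES (1, 10), (1, 20), (2, NULL);
        INSERT INTO c VALUES (1, 1), (2, 2);
    """)
    query = """
        SELECT * FROM (
          SELECT t2.a_id, t2.amount1, t2.amount2
          FROM a LEFT OUTER JOIN (
            SELECT c.a_id, amount AS amount1, amount AS amount2
            FROM b JOIN c ON b.b_id = c.b_id
          ) t2 ON a.a_id = t2.a_id
        ) t1
    """
    # Original query: the row whose amount is NULL must be returned.
    expected = sorted(conn.execute(query).fetchall(), key=repr)
    # Hypothetical stand-in for the identity conjunct: both columns resolve to
    # b.amount, so this is effectively "amount = amount".
    filtered = sorted(
        conn.execute(query + " WHERE t1.amount1 = t1.amount2").fetchall(),
        key=repr,
    )
    conn.close()
    return expected, filtered


if __name__ == "__main__":
    expected, filtered = run_example()
    print("without identity predicate:", expected)
    print("with identity predicate:   ", filtered)
```

The first result keeps the row for a_id = 2 with NULL amounts, while the second
drops it, which is the class of wrong result an unremoved identity conjunct
can cause.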

Actually, when checking IdentityPredicates, we need to check the final
resolved version of them using base table slots (baseTblSmap_). So the
predicate "t1.amount1 = t1.amount2" will be resolved to "amount = amount"
and won't pass the IdentityPredicate check.

Tests:
 * Add plan tests in PlannerTest/inline-view.test
 * Run all tests locally in CORE exploration strategy

Change-Id: Ia87aa9db2de85f0716e4854a88727aad593773fa
---
M fe/src/main/java/org/apache/impala/analysis/Analyzer.java
M fe/src/main/java/org/apache/impala/planner/SingleNodePlanner.java
M testdata/workloads/functional-planner/queries/PlannerTest/inline-view.test
3 files changed, 221 insertions(+), 20 deletions(-)

  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/39/12939/4
--
To view, visit http://gerrit.cloudera.org:8080/12939
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ia87aa9db2de85f0716e4854a88727aad593773fa
Gerrit-Change-Number: 12939
Gerrit-PatchSet: 4
Gerrit-Owner: Quanlong Huang <huangquanl...@gmail.com>
Gerrit-Reviewer: Impala Public Jenkins <impala-public-jenk...@cloudera.com>
Gerrit-Reviewer: Quanlong Huang <huangquanl...@gmail.com>
Gerrit-Reviewer: Tim Armstrong <tarmstr...@cloudera.com>