Zach Amsden has posted comments on this change. Change subject: IMPALA-5547: Rework FK/PK join detection. ......................................................................
Patch Set 1: (3 comments) http://gerrit.cloudera.org:8080/#/c/7257/1/fe/src/main/java/org/apache/impala/planner/JoinNode.java File fe/src/main/java/org/apache/impala/planner/JoinNode.java: Line 91: protected List<List<EqJoinConjunctScanSlots>> fkPkEqJoinConjuncts_; Class member is only used in one function in the base class. Maybe document that this is preserved just for printing values and wrap it with a getter so it can't be modified? Line 253: return getFkPkJoinCardinality(eqJoinConjunctSlots, lhsCard, rhsCard); This computes estimates based on both PK conjuncts and non-PK conjuncts using the same formula, instead of filtering out non-PK joins. If we have a RHS which is an extremely selective non-PK, this could bias the cardinality estimate to a very low value, since we take the minimum computed cardinality from these assumed PK values. Is there any value in preserving the <List<List<>>> construction, or should getFkPkEqJoinConjuncts consider the conjuncts by tid order and just return a flattened list (which can be passed to getFkPkJoinCardinality) Line 275: for (List<EqJoinConjunctScanSlots> fkPkCandi: scanSlotsByTids.values()) { fkPkCandi - this could use a better name, it isn't clear to me what this is trying to describe. -- To view, visit http://gerrit.cloudera.org:8080/7257 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I49074fe743a28573cff541ef7dbd0edd88892067 Gerrit-PatchSet: 1 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Alex Behm <alex.b...@cloudera.com> Gerrit-Reviewer: Zach Amsden <zams...@cloudera.com> Gerrit-HasComments: Yes