Dimitris Tsirogiannis has posted comments on this change.

Change subject: IMPALA-5547: Rework FK/PK join detection.
......................................................................


Patch Set 1:

(17 comments)

I have one comment about testing. We have lots of test changes but it's not 
easy to verify if the end result makes sense or not. I think it may worth 
having a planner test with explain_level > 2 that covers a few interesting join 
cases so that we can see what kind of pk/fk conjuncts are detected and what the 
impact is on estimated join cardinalities.

http://gerrit.cloudera.org:8080/#/c/7257/1/fe/src/main/java/org/apache/impala/planner/HashJoinNode.java
File fe/src/main/java/org/apache/impala/planner/HashJoinNode.java:

PS1, Line 170: else if (fkPkEqJoinConjuncts_.isEmpty()) {
             :           output.append("assumed fk/pk");
I haven't read the entire patch yet, so I am curious what this case represents.


PS1, Line 178: for (int j = 0; j < slots.size(); ++j) {
             :               output.append(slots.get(j).toString());
             :               if (j + 1 != slots.size()) output.append(", ");
             :             }
Can't you use the Guava's Joiner class here? Joiner.on(",").join(slots)


http://gerrit.cloudera.org:8080/#/c/7257/1/fe/src/main/java/org/apache/impala/planner/JoinNode.java
File fe/src/main/java/org/apache/impala/planner/JoinNode.java:

PS1, Line 37: import 
org.apache.kudu.client.shaded.com.google.common.collect.Maps;
?


PS1, Line 271: scanSlotsByTids
nit: Each pair represents two joined tables, no? So, maybe rename this to 
'scanSlotsByJoinedTids'? A bit more explicit on what this thing represents.


PS1, Line 281: numRows
rhsTableCardinality?


Line 290: 
nit: extra line


PS1, Line 294: conjuncts
conjunct slots


PS1, Line 300:     Preconditions.checkState(joinOp_.isInnerJoin() || 
joinOp_.isOuterJoin());
This private function is only called in L253? I think you can remove this check.


PS1, Line 312: lhsNdv
Can this ever be 0?


PS1, Line 313: rhsNumRows
Same here, can this be 0?


PS1, Line 318: result = Math.min(result, joinCard);
That part I think is very important and I am not sure it is explained anywhere. 
Maybe expand the comment in L209?


PS1, Line 329: conjuncts
conjunct slots


PS1, Line 341: slots.lhs().getStats().getNumDistinctValues();
nit: these are used in couple of places. Since you already have the 
EqJoinConjunctScanSlots class, why don't your wrap them in some nice util 
functions there (e.g. getLhsNdvs() or something like that).


PS1, Line 343: slots.lhs().getParent().getTable().getNumRows();
same comment as above.


PS1, Line 373: Preconditions
I don't think these checks are useful. This private constructor is only called 
through the static function that has tight control on what gets passed in this 
ctor.


PS1, Line 382: .
nit: remove


PS1, Line 399: tupleDesc
nit: inline


-- 
To view, visit http://gerrit.cloudera.org:8080/7257
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I49074fe743a28573cff541ef7dbd0edd88892067
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Alex Behm <alex.b...@cloudera.com>
Gerrit-Reviewer: Dimitris Tsirogiannis <dtsirogian...@cloudera.com>
Gerrit-Reviewer: Mostafa Mokhtar <mmokh...@cloudera.com>
Gerrit-Reviewer: Zach Amsden <zams...@cloudera.com>
Gerrit-HasComments: Yes

Reply via email to