Paul Rogers has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/12136 )

Change subject: IMPALA-8021: Add estimated cardinality to EXPLAIN output
......................................................................


Patch Set 12:

(6 comments)

Tim, thanks much for your review. I've addressed your comments.

Also, I went ahead and added another line: listing the partition pruning 
predicates in the HDFS scan node. It had been requested, and I found it 
necessary to understand the scan output cardinality.

Finally nailed all the affected tests. (We have issues with our E-to-E tests, 
but that is another story.) https://jenkins.impala.io/job/pre-review-test/265/

http://gerrit.cloudera.org:8080/#/c/12136/4/fe/src/main/java/org/apache/impala/planner/PlanNode.java
File fe/src/main/java/org/apache/impala/planner/PlanNode.java:

http://gerrit.cloudera.org:8080/#/c/12136/4/fe/src/main/java/org/apache/impala/planner/PlanNode.java@412
PS4, Line 412:
> I'd prefer appendExplainCardinality() or appendCardinality() - usually the
After reflecting on the layout, I decided to simplify it some, at the cost of 
an extra line for HDFS scans. While having somewhat shorter output is good, the 
shorter output did not, on reflection, justify the extra code complexity. The 
row-size, cardinality fields now appear on a separate line. Given that, the 
body of this function moved back into the above explain function where it 
originally resided.


http://gerrit.cloudera.org:8080/#/c/12136/4/fe/src/main/java/org/apache/impala/planner/PlanNode.java@413
PS4, Line 413:   /**
> I sort-of feel that row-size isn't necessary for STANDARD explain level - i
While I might have agreed originally, it does turn out that the planner makes 
its build/probe decision based on hash table size, not just cardinality. Hash 
table size is row-size * cardinality. So, to understand why a join plan ended 
up the way it did, we need to understand hash table size, which requires 
row-size. The other justification is that, since we add a line for cardinality, 
no harm in adding the row-size field to that line.


http://gerrit.cloudera.org:8080/#/c/12136/4/fe/src/test/java/org/apache/impala/testutil/TestUtils.java
File fe/src/test/java/org/apache/impala/testutil/TestUtils.java:

http://gerrit.cloudera.org:8080/#/c/12136/4/fe/src/test/java/org/apache/impala/testutil/TestUtils.java@157
PS4, Line 157:
> Something slightly non-obvious is that if all rules SKIP, then it's treated
Added a comment. The SKIP as MATCH issue is a nuisance. But, since this code 
seldom changes, and it is us that use it (not the customer), the advantages of 
simplicity seemed to outweigh the benefits of adding complexity to handle this 
obscure case.


http://gerrit.cloudera.org:8080/#/c/12136/4/fe/src/test/java/org/apache/impala/testutil/TestUtils.java@229
PS4, Line 229: match
> name seems misleading since it's the ratio, not the difference
Done


http://gerrit.cloudera.org:8080/#/c/12136/4/fe/src/test/java/org/apache/impala/testutil/TestUtils.java@230
PS4, Line 230:  =
> shouldn't this be 0.95?
Thanks! This is why we do code reviews...


http://gerrit.cloudera.org:8080/#/c/12136/4/fe/src/test/java/org/apache/impala/testutil/TestUtils.java@302
PS4, Line 302:         }
Just for reference, notice that the old code consumed three two per pass: once 
in the if(...contains...) line, a second on the if(...equals...) line.



--
To view, visit http://gerrit.cloudera.org:8080/12136
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ie9aa2d715b04cbb279aaffec8c5692686562d986
Gerrit-Change-Number: 12136
Gerrit-PatchSet: 12
Gerrit-Owner: Paul Rogers <prog...@cloudera.com>
Gerrit-Reviewer: Bharath Vissapragada <bhara...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <impala-public-jenk...@cloudera.com>
Gerrit-Reviewer: Paul Rogers <prog...@cloudera.com>
Gerrit-Reviewer: Tim Armstrong <tarmstr...@cloudera.com>
Gerrit-Comment-Date: Fri, 04 Jan 2019 19:28:42 +0000
Gerrit-HasComments: Yes

Reply via email to