Bharath Vissapragada has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11565 )

Change subject: IMPALA-7659: Populate NULL count while computing column stats
......................................................................


Patch Set 7: Code-Review-1

(2 comments)

Tim, you raised perfectly valid concerns. I'll wait for 
https://gerrit.cloudera.org/#/c/11920/ to be merged, since it adds the necessary 
unit-test infrastructure. The perf test results are in the inline comment below. 
(Parking this as -1 while the other GVO is running.)

http://gerrit.cloudera.org:8080/#/c/11565/7//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/11565/7//COMMIT_MSG@16
PS7, Line 16: Tests: Updated the affected tests to include the null counts.
> FWIW, IMPALA-7842 provides a starter set of cardinality tests based on expo
Tim, I agree we need a unit test for this. Thanks for pointing it out. I'll 
wait for IMPALA-7842 to go in, rebase on top of it, and add the test there, 
since it provides the necessary plumbing.


http://gerrit.cloudera.org:8080/#/c/11565/6/fe/src/main/java/org/apache/impala/analysis/ComputeStatsStmt.java
File fe/src/main/java/org/apache/impala/analysis/ComputeStatsStmt.java:

http://gerrit.cloudera.org:8080/#/c/11565/6/fe/src/main/java/org/apache/impala/analysis/ComputeStatsStmt.java@251
PS6, Line 251:
> IMPALA-1003 doesn't include many breadcrumbs. I do wonder if there were oth
I tried this on the store_sales table from the 1TB TPC-DS Parquet dataset. The 
table has 22 non-partition columns and ~2.8B rows.

I ran the child query with and without the null count and saw a 7-8% slowdown: 
~64s without the null count vs. ~70s with it. (I warmed up the cluster by 
running the queries until their runtimes stabilized.)
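
For reference, the two child-query variants being compared can be sketched 
roughly as below. This is only an illustration, not the actual 
ComputeStatsStmt code: the class name, method name, and column aliases are 
hypothetical, and the assumptions are that the variant without the null count 
emits a constant -1 placeholder and that the variant with it counts NULLs using 
a conditional COUNT.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch; not the real ComputeStatsStmt implementation.
public class ColumnStatsQuerySketch {

  // Builds a per-column stats query for the given table and columns.
  // withNullCount=false: assumes a constant -1 ("unknown") placeholder.
  // withNullCount=true: assumes NULLs are counted with a conditional COUNT.
  static String buildChildQuery(String table, List<String> columns,
      boolean withNullCount) {
    List<String> exprs = new ArrayList<>();
    for (String col : columns) {
      exprs.add("NDV(" + col + ") AS " + col + "_ndv");
      if (withNullCount) {
        exprs.add("COUNT(CASE WHEN " + col
            + " IS NULL THEN 1 ELSE NULL END) AS " + col + "_nulls");
      } else {
        exprs.add("CAST(-1 AS BIGINT) AS " + col + "_nulls");
      }
    }
    return "SELECT " + String.join(", ", exprs) + " FROM " + table;
  }

  public static void main(String[] args) {
    // Two store_sales columns from the TPC-DS schema, used as sample input.
    List<String> cols = Arrays.asList("ss_item_sk", "ss_customer_sk");
    System.out.println(buildChildQuery("store_sales", cols, false));
    System.out.println(buildChildQuery("store_sales", cols, true));
  }
}
```

The only difference between the two generated queries is the extra per-column 
aggregate, which is where the additional scan-time cost would come from.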

I don't see why "refresh" would be affected, unless I'm missing something.



--
To view, visit http://gerrit.cloudera.org:8080/11565
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ic68f8b4c3756eb1980ce299a602a7d56db1e507a
Gerrit-Change-Number: 11565
Gerrit-PatchSet: 7
Gerrit-Owner: Anonymous Coward <piotr.findei...@gmail.com>
Gerrit-Reviewer: Anonymous Coward <piotr.findei...@gmail.com>
Gerrit-Reviewer: Bharath Vissapragada <bhara...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <impala-public-jenk...@cloudera.com>
Gerrit-Reviewer: Paul Rogers <par0...@yahoo.com>
Gerrit-Reviewer: Tim Armstrong <tarmstr...@cloudera.com>
Gerrit-Reviewer: Vuk Ercegovac <vercego...@cloudera.com>
Gerrit-Comment-Date: Thu, 06 Dec 2018 04:03:25 +0000
Gerrit-HasComments: Yes
