Bharath Vissapragada has posted comments on this change. ( http://gerrit.cloudera.org:8080/11565 )
Change subject: IMPALA-7659: Populate NULL count while computing column stats
......................................................................


Patch Set 7: Code-Review-1

(2 comments)

Tim, you raised perfectly valid concerns. I'll wait for https://gerrit.cloudera.org/#/c/11920/ to be merged since it adds the necessary unit-test infrastructure. Please find the perf test result in the comment. (Parking it as -1 while the other GVO is running.)

http://gerrit.cloudera.org:8080/#/c/11565/7//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/11565/7//COMMIT_MSG@16
PS7, Line 16: Tests: Updated the affected tests to include the null counts.
> FWIW, IMPALA-7842 provides a starter set of cardinality tests based on expo

Tim, I agree we need a unit test for this. Thanks for pointing it out. I'll wait for IMPALA-7842 to go in, rebase on top of it, and add the test there since it provides the necessary plumbing.


http://gerrit.cloudera.org:8080/#/c/11565/6/fe/src/main/java/org/apache/impala/analysis/ComputeStatsStmt.java
File fe/src/main/java/org/apache/impala/analysis/ComputeStatsStmt.java:

http://gerrit.cloudera.org:8080/#/c/11565/6/fe/src/main/java/org/apache/impala/analysis/ComputeStatsStmt.java@251
PS6, Line 251:
> IMPALA-1003 doesn't include many breadcrumbs. I do wonder if there were oth

I tried this on the store_sales table from the 1TB tpcds/parquet dataset. The table has 22 non-partitioned columns and ~2.8B rows. I ran the child query with and without the null count and noticed a 7-8% slowdown: ~64s without the null count versus ~70s with it. (I warmed up the cluster by running the queries until their runtime stabilized.)

I don't see why "refresh" would be affected?
(unless I'm missing something)

--
To view, visit http://gerrit.cloudera.org:8080/11565
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ic68f8b4c3756eb1980ce299a602a7d56db1e507a
Gerrit-Change-Number: 11565
Gerrit-PatchSet: 7
Gerrit-Owner: Anonymous Coward <piotr.findei...@gmail.com>
Gerrit-Reviewer: Anonymous Coward <piotr.findei...@gmail.com>
Gerrit-Reviewer: Bharath Vissapragada <bhara...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <impala-public-jenk...@cloudera.com>
Gerrit-Reviewer: Paul Rogers <par0...@yahoo.com>
Gerrit-Reviewer: Tim Armstrong <tarmstr...@cloudera.com>
Gerrit-Reviewer: Vuk Ercegovac <vercego...@cloudera.com>
Gerrit-Comment-Date: Thu, 06 Dec 2018 04:03:25 +0000
Gerrit-HasComments: Yes