Dimitris Tsirogiannis has posted comments on this change.

Change subject: IMPALA-2373: Extrapolate row counts for HDFS tables.
......................................................................
Patch Set 4: Code-Review+1

(2 comments)

http://gerrit.cloudera.org:8080/#/c/6840/4/fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java
File fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java:

PS4, Line 624: Otherwise, the input cardinality is based on the per-partition row count stats
:    * and/or the table-level row count stats, depending on which of those are available.
:    * Partitions without stats are ignored.
That part describes the logic in computeCardinalities. Sure you want to leave it here?

http://gerrit.cloudera.org:8080/#/c/6840/3/tests/metadata/test_explain.py
File tests/metadata/test_explain.py:

PS3, Line 127: 50
> This specific test case may seem clear cut, but I don't think the general i
There are definitely ways to improve this. If you haven't done so, can you file a JIRA with the improvements we've talked about so far? e.g. using exact stats when available with some change detection mechanism, etc.

--
To view, visit http://gerrit.cloudera.org:8080/6840
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I972c8a03ed70211734631a7dc9085cb33622ebc4
Gerrit-PatchSet: 4
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Alex Behm <alex.b...@cloudera.com>
Gerrit-Reviewer: Alex Behm <alex.b...@cloudera.com>
Gerrit-Reviewer: Dimitris Tsirogiannis <dtsirogian...@cloudera.com>
Gerrit-Reviewer: Marcel Kornacker <mar...@cloudera.com>
Gerrit-Reviewer: Mostafa Mokhtar <mmokh...@cloudera.com>
Gerrit-HasComments: Yes