Qifan Chen has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16098 )

Change subject: IMPALA-9744: Treat corrupt table stats as missing to avoid bad 
plans
......................................................................


Patch Set 22:

(2 comments)

Thanks for the review comments.

http://gerrit.cloudera.org:8080/#/c/16098/14/fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java
File fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java:

http://gerrit.cloudera.org:8080/#/c/16098/14/fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java@1201
PS14, Line 1201: n
> Okay - and just to confirm, for partitioned tables, tbl_.getNumRows() == th
The table RC remains at -1 if the stats are computed via Hive's version of the 
update stats command, in both of the following arrangements:
1. Two partitions: one with missing stats and one with corrupt stats.
2. Two partitions: one with missing stats and one with corrupt stats, followed 
by Hive compute stats, and then two more partitions added.

The table RC is updated properly only when the stats are updated with Impala's 
version. If additional partitions are then added, the table RC remains 
unchanged. I believe this is the problem you mentioned to me a while back.
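
To make the arrangements above concrete, here is a minimal Python sketch of 
treating corrupt partition stats the same way as missing ones. It is a toy 
model, not Impala code: `classify` and `table_row_count` are hypothetical 
names, and the corruption rule used here (a row count below -1, or zero rows 
in a partition with non-empty files) is an assumption made for illustration.

```python
from typing import List, Tuple

MISSING = -1  # row count value meaning "no stats"


def classify(num_rows: int, total_bytes: int) -> str:
    """Label one partition's stats as 'valid', 'missing', or 'corrupt'."""
    if num_rows == MISSING:
        return "missing"
    # Assumed corruption rule, for illustration only.
    if num_rows < MISSING or (num_rows == 0 and total_bytes > 0):
        return "corrupt"
    return "valid"


def table_row_count(partitions: List[Tuple[int, int]]) -> int:
    """Sum partition row counts; report MISSING if any partition is unusable.

    Each partition is a (num_rows, total_bytes) pair.
    """
    total = 0
    for num_rows, total_bytes in partitions:
        if classify(num_rows, total_bytes) != "valid":
            return MISSING  # corrupt stats are treated like missing stats
        total += num_rows
    return total


# Arrangement 1: one partition with missing stats, one with corrupt stats.
print(table_row_count([(-1, 4096), (0, 4096)]))
```

In this model an empty partition with zero rows and zero bytes still counts as 
valid, so only a row count that contradicts the file sizes is discarded.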

See screenshots from Slack.


http://gerrit.cloudera.org:8080/#/c/16098/17/tests/metadata/test_compute_stats.py
File tests/metadata/test_compute_stats.py:

http://gerrit.cloudera.org:8080/#/c/16098/17/tests/metadata/test_compute_stats.py@174
PS17, Line 174:
> you can just add a partition, run compute stats on the table, and then load
Please refer to my other comment, in reply to your first question.

I white-box tested the mixed case of two good partitions (stats computed via 
Impala compute stats) and two partitions with missing/corrupt stats, and 
verified that the code works fine.

I wonder whether the mixed case needs to be added to this test, since doing so 
would require a dedicated method and more complexity.
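
For reference, the mixed case can be written down as a small self-contained 
check. The sketch below is plain Python, not a test against the Impala test 
framework. `usable` and `estimate_rows` are hypothetical helpers, and scaling 
the rows-per-byte ratio of the good partitions to the remaining ones is only 
an assumption about what a sensible estimate might look like, not a 
description of what the patch does.

```python
from typing import List, Tuple


def usable(num_rows: int, total_bytes: int) -> bool:
    # Assumed rule: -1 is missing; below -1, or 0 rows with data, is corrupt.
    if num_rows < 0:
        return False
    return not (num_rows == 0 and total_bytes > 0)


def estimate_rows(partitions: List[Tuple[int, int]]) -> int:
    """Estimate total rows from (num_rows, total_bytes) pairs, or -1."""
    good = [(r, b) for r, b in partitions if usable(r, b)]
    bad = [(r, b) for r, b in partitions if not usable(r, b)]
    good_rows = sum(r for r, _ in good)
    if not bad:
        return good_rows  # every partition has trustworthy stats
    good_bytes = sum(b for _, b in good)
    if good_bytes == 0:
        return -1  # nothing to extrapolate from
    bad_bytes = sum(b for _, b in bad)
    # Scale the known rows-per-byte ratio to the partitions without stats.
    return good_rows + round(good_rows * bad_bytes / good_bytes)


# Two good partitions, one with missing stats, one with corrupt stats.
mixed = [(100, 1000), (300, 3000), (-1, 2000), (0, 2000)]
print(estimate_rows(mixed))
```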



--
To view, visit http://gerrit.cloudera.org:8080/16098
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I9f4c64616ff7c0b6d5a48f2b5331325feeff3576
Gerrit-Change-Number: 16098
Gerrit-PatchSet: 22
Gerrit-Owner: Qifan Chen <[email protected]>
Gerrit-Reviewer: Aman Sinha <[email protected]>
Gerrit-Reviewer: Impala Public Jenkins <[email protected]>
Gerrit-Reviewer: Qifan Chen <[email protected]>
Gerrit-Reviewer: Sahil Takiar <[email protected]>
Gerrit-Reviewer: Tim Armstrong <[email protected]>
Gerrit-Comment-Date: Thu, 23 Jul 2020 16:31:59 +0000
Gerrit-HasComments: Yes
