Paul Rogers has posted comments on this change. ( http://gerrit.cloudera.org:8080/11528 )
Change subject: Use NDV=1 for a Column with all nulls ...................................................................... Patch Set 10: (3 comments) Thanks for the continued reviews. Cleaned up a few more items. Answered some questions. http://gerrit.cloudera.org:8080/#/c/11528/8//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/11528/8//COMMIT_MSG@31 PS8, Line 31: > what types of changes are these-- did they manage to change the structure o Sorry. The note is trying to explain that 1) there is a planner change (that is the gist of the problem resolution), and 2) the change occurs only in the very narrow conditions spelled out in this note. The line in question here explains why the adjustment is applied only to NDV of 0-1 (and not, say, 0,-10). If we have NDV=10, and column is nullable, then we certainly might want to adjust NDV to 11. But, if we do, many tests fail due to changed cardinality, changed memory and so on. (The tests "fail" in the sense that the text plans have changed. The structure is the same, just the numbers differ.) The original bug complained that, with NDV=0, the planner did misfire, did produce an incorrect plan. It did so because, in aggregation, we multiply cardinalities by 0, resulting in an overall cardinality of 0. The new tests verify that, after the adjustments, cardinality estimates are no longer zeroed out in aggregate nodes, but instead bubble up. This will, presumably, fix the original problem. (The original bug did not give sufficient information to completely reproduce the particular query that failed.) http://gerrit.cloudera.org:8080/#/c/11528/8/testdata/NullTable/large_data.csv File testdata/NullTable/large_data.csv: http://gerrit.cloudera.org:8080/#/c/11528/8/testdata/NullTable/large_data.csv@1 PS8, Line 1: > Looks like theres several conventions under testdata... I see testdata/data Created a new NullRows folder, moved this file there and renamed it to "data.csv". http://gerrit.cloudera.org:8080/#/c/11528/8/testdata/datasets/functional/functional_schema_template.sql File testdata/datasets/functional/functional_schema_template.sql: http://gerrit.cloudera.org:8080/#/c/11528/8/testdata/datasets/functional/functional_schema_template.sql@1355 PS8, Line 1355: ==== > I have no idea, but I counted BASE_TABLE_NAME and see it shows up 105 times Removed the Kudu portion. -- To view, visit http://gerrit.cloudera.org:8080/11528 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ife657a43c9cafc451bd12ddf857dcb7169e97459 Gerrit-Change-Number: 11528 Gerrit-PatchSet: 10 Gerrit-Owner: Paul Rogers <par0...@yahoo.com> Gerrit-Reviewer: Impala Public Jenkins <impala-public-jenk...@cloudera.com> Gerrit-Reviewer: Paul Rogers <par0...@yahoo.com> Gerrit-Reviewer: Philip Zeyliger <phi...@cloudera.com> Gerrit-Reviewer: Vuk Ercegovac <vercego...@cloudera.com> Gerrit-Comment-Date: Tue, 02 Oct 2018 19:53:05 +0000 Gerrit-HasComments: Yes