Gabor Kaszab has posted comments on this change. ( http://gerrit.cloudera.org:8080/16226 )
Change subject: IMPALA-9942: DataSketches HLL shouldn't take empty strings as distinct values ...................................................................... Patch Set 4: (6 comments) I have some comments related to your tests but other than that, I'm fine with this patch. Thanks! http://gerrit.cloudera.org:8080/#/c/16226/2/testdata/workloads/functional-query/queries/QueryTest/datasketches-hll.test File testdata/workloads/functional-query/queries/QueryTest/datasketches-hll.test: http://gerrit.cloudera.org:8080/#/c/16226/2/testdata/workloads/functional-query/queries/QueryTest/datasketches-hll.test@171 PS2, Line 171: # as if the whole data was sketched together into > This test adds around 5 seconds to the whole run (around 20% increase), but Thanks for the explanation. Let's keep it then. http://gerrit.cloudera.org:8080/#/c/16226/3/testdata/workloads/functional-query/queries/QueryTest/datasketches-hll.test File testdata/workloads/functional-query/queries/QueryTest/datasketches-hll.test: http://gerrit.cloudera.org:8080/#/c/16226/3/testdata/workloads/functional-query/queries/QueryTest/datasketches-hll.test@145 PS3, Line 145: select ds_hll_estimate(b), ds_hll_estimate(c) from functional_parquet.nulltable; Please adjust the comment above that this now covers full empty string inputs as well. http://gerrit.cloudera.org:8080/#/c/16226/3/testdata/workloads/functional-query/queries/QueryTest/datasketches-hll.test@171 PS3, Line 171: hole data w To follow the naming convention: empty_string http://gerrit.cloudera.org:8080/#/c/16226/3/testdata/workloads/functional-query/queries/QueryTest/datasketches-hll.test@172 PS3, Line 172: I'd move this also to a new line. http://gerrit.cloudera.org:8080/#/c/16226/3/testdata/workloads/functional-query/queries/QueryTest/datasketches-hll.test@172 PS3, Line 172: select : ds_hll_estimate(ds_hll_union(date_sketch)), : ds_hll_estimate(ds_hll_union(float_sketch)) : from sketch_store; : ---- TYPES Please adjust the indentation here (after a linebreak the following lines should be indented for more readability) http://gerrit.cloudera.org:8080/#/c/16226/3/testdata/workloads/functional-query/queries/QueryTest/datasketches-hll.test@177 PS3, Line 177: nit: by empty value you mean empty string, right? -- To view, visit http://gerrit.cloudera.org:8080/16226 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ie7648217bbe2f66b817788f131c062f349b1e9ad Gerrit-Change-Number: 16226 Gerrit-PatchSet: 4 Gerrit-Owner: Adam Tamas <[email protected]> Gerrit-Reviewer: Adam Tamas <[email protected]> Gerrit-Reviewer: Gabor Kaszab <[email protected]> Gerrit-Reviewer: Impala Public Jenkins <[email protected]> Gerrit-Comment-Date: Tue, 28 Jul 2020 12:21:47 +0000 Gerrit-HasComments: Yes
