Adam Tamas has uploaded a new patch set (#4). ( http://gerrit.cloudera.org:8080/16226 )
Change subject: IMPALA-9942: DataSketches HLL shouldn't take empty strings as distinct values
......................................................................

IMPALA-9942: DataSketches HLL shouldn't take empty strings as distinct values

In Hive, empty strings don't count as separate values when querying
count(distinct) estimates using the Apache DataSketches HLL algorithm on
strings and varchars. For compatibility's sake, Impala should not count
them either.

Tests:
- added extra tests for HLL with empty strings

Change-Id: Ie7648217bbe2f66b817788f131c062f349b1e9ad
---
M be/src/exprs/aggregate-functions-ir.cc
M testdata/workloads/functional-query/queries/QueryTest/datasketches-hll.test
2 files changed, 26 insertions(+), 4 deletions(-)

  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/26/16226/4
--
To view, visit http://gerrit.cloudera.org:8080/16226
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ie7648217bbe2f66b817788f131c062f349b1e9ad
Gerrit-Change-Number: 16226
Gerrit-PatchSet: 4
Gerrit-Owner: Adam Tamas <[email protected]>
Gerrit-Reviewer: Adam Tamas <[email protected]>
Gerrit-Reviewer: Gabor Kaszab <[email protected]>
Gerrit-Reviewer: Impala Public Jenkins <[email protected]>
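
For readers unfamiliar with the behavior this change describes, here is a minimal sketch of "skip empty strings before updating the sketch". It is not the code from aggregate-functions-ir.cc: `ExactDistinctCounter` and `CountDistinctIgnoringEmpty` are hypothetical names, and an exact `std::unordered_set` stands in for the DataSketches HLL sketch so the example stays self-contained.

```cpp
#include <cstddef>
#include <string>
#include <unordered_set>
#include <vector>

// Hypothetical stand-in for an HLL sketch. A real sketch returns an
// approximate estimate; this one counts exactly, for illustration only.
class ExactDistinctCounter {
 public:
  // Assumed behavior, per the commit message: an empty string is not
  // fed into the sketch, so it never counts as a distinct value.
  void Update(const std::string& value) {
    if (value.empty()) return;
    seen_.insert(value);
  }

  std::size_t Estimate() const { return seen_.size(); }

 private:
  std::unordered_set<std::string> seen_;
};

// Feeds every value through the counter and returns the distinct count.
std::size_t CountDistinctIgnoringEmpty(const std::vector<std::string>& values) {
  ExactDistinctCounter counter;
  for (const auto& v : values) counter.Update(v);
  return counter.Estimate();
}
```

With this rule, an input of "a", "", "b", "a", "" yields 2 distinct values, and a column holding only empty strings yields 0.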
