Gabor Kaszab has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17008 )

Change subject: IMPALA-10463: Implement ds_theta_sketch() and 
ds_theat_estimate() functions
......................................................................


Patch Set 2:

(7 comments)

http://gerrit.cloudera.org:8080/#/c/17008/2//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/17008/2//COMMIT_MSG@7
PS2, Line 7: ds_theat_estimate
nit: typo


http://gerrit.cloudera.org:8080/#/c/17008/2//COMMIT_MSG@13
PS2, Line 13: ds_theat_estimate
nit: same typo


http://gerrit.cloudera.org:8080/#/c/17008/2//COMMIT_MSG@28
PS2, Line 28:    see IMPALA-10464.
I'd also include some highlights from that perf measurement doc in the commit 
msg. An additional section would probably be a good place for them.


http://gerrit.cloudera.org:8080/#/c/17008/2/be/src/exprs/aggregate-functions-ir.cc
File be/src/exprs/aggregate-functions-ir.cc:

http://gerrit.cloudera.org:8080/#/c/17008/2/be/src/exprs/aggregate-functions-ir.cc@1646
PS2, Line 1646: SerializeCompactDsThetaSketch
As far as I can see, unlike HLL, Theta doesn't compact the sketch here; it just 
serializes it. The function name therefore doesn't reflect what actually 
happens inside the function. Please rename it to SerializeDsThetaSketch().


http://gerrit.cloudera.org:8080/#/c/17008/2/be/src/exprs/aggregate-functions-ir.cc@1899
PS2, Line 1899:   datasketches::compact_theta_sketch* sketch_ptr =
I'm a bit lost here. Could you help me understand why the union_sketch needs 
to be converted to a compact_theta_sketch? Can't you return the union_sketch?


http://gerrit.cloudera.org:8080/#/c/17008/2/be/src/exprs/datasketches-functions-ir.cc
File be/src/exprs/datasketches-functions-ir.cc:

http://gerrit.cloudera.org:8080/#/c/17008/2/be/src/exprs/datasketches-functions-ir.cc@110
PS2, Line 110: return 0;
HLL returns a null here. Have you checked the behaviour in Hive, so that the 
two systems stay in sync?
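
For illustration only, a minimal sketch of what a null-returning variant could 
look like. It assumes line 110 is the early-return path for a NULL or empty 
input, which the quoted line alone does not show. The StringVal and BigIntVal 
structs below are simplified stand-ins for the impala_udf types, and 
EstimateOrNull() is a hypothetical name, not a function in the patch.

```cpp
#include <cstdint>

// Simplified stand-ins for the impala_udf value types (illustration only).
struct StringVal {
  bool is_null;
  int len;
};

struct BigIntVal {
  bool is_null;
  int64_t val;
  static BigIntVal null() { return BigIntVal{true, 0}; }
};

// Hypothetical helper: yields NULL rather than 0 when there is no sketch to
// estimate, mirroring the behaviour described above for HLL.
BigIntVal EstimateOrNull(const StringVal& serialized_sketch, int64_t estimate) {
  if (serialized_sketch.is_null || serialized_sketch.len == 0) {
    return BigIntVal::null();
  }
  return BigIntVal{false, estimate};
}
```

Whether NULL or 0 is the right answer still depends on what Hive does for the 
same input.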


http://gerrit.cloudera.org:8080/#/c/17008/2/testdata/workloads/functional-query/queries/QueryTest/datasketches-theta.test
File 
testdata/workloads/functional-query/queries/QueryTest/datasketches-theta.test:

http://gerrit.cloudera.org:8080/#/c/17008/2/testdata/workloads/functional-query/queries/QueryTest/datasketches-theta.test@138
PS2, Line 138: # Check that ds_theta_estimate returns error for strings that 
are not serialized sketches.
Please add a test where ds_theta_estimate() is used on an HLL sketch. I guess we 
expect an error there.



--
To view, visit http://gerrit.cloudera.org:8080/17008
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I14f24c16b815eec75cf90bb92c8b8b0363dcbfbc
Gerrit-Change-Number: 17008
Gerrit-PatchSet: 2
Gerrit-Owner: Fucun Chu <chufu...@hotmail.com>
Gerrit-Reviewer: Gabor Kaszab <gaborkas...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <impala-public-jenk...@cloudera.com>
Gerrit-Comment-Date: Tue, 09 Feb 2021 15:13:30 +0000
Gerrit-HasComments: Yes
