Quanlong Huang has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/18413 )

Change subject: IMPALA-11205: Implement CORR() function
......................................................................


Patch Set 7:

(16 comments)

The patch is looking good now!

http://gerrit.cloudera.org:8080/#/c/18413/7/be/src/exprs/aggregate-functions-ir.cc
File be/src/exprs/aggregate-functions-ir.cc:

http://gerrit.cloudera.org:8080/#/c/18413/7/be/src/exprs/aggregate-functions-ir.cc@291
PS7, Line 291: //
nit: add a space after "//"


http://gerrit.cloudera.org:8080/#/c/18413/7/be/src/exprs/aggregate-functions-ir.cc@320
PS7, Line 320: if
nit: need a space after "if"


http://gerrit.cloudera.org:8080/#/c/18413/7/be/src/exprs/aggregate-functions-ir.cc@321
PS7, Line 321:     state->sumx += src1.val;
             :     state->sumy += src2.val;
             :     state->sum_squaredx += src1.val * src1.val;
             :     state->sum_squaredy += src2.val * src2.val;
             :     state->prod += src1.val * src2.val;
             :     ++state->count;
We can extract these into an update method, e.g.

  static inline void CorrUpdate(double x, double y, CorrState* state)

and then use it in TimestampCorrUpdate() as well.


http://gerrit.cloudera.org:8080/#/c/18413/7/be/src/exprs/aggregate-functions-ir.cc@336
PS7, Line 336: if
nit: need a space after "if"


http://gerrit.cloudera.org:8080/#/c/18413/7/be/src/exprs/aggregate-functions-ir.cc@337
PS7, Line 337:     state->sumx -= src1.val;
             :     state->sumy -= src2.val;
             :     state->sum_squaredx -= src1.val * src1.val;
             :     state->sum_squaredy -= src2.val * src2.val;
             :     state->prod -= src1.val * src2.val;
             :     --state->count;
We can extract these into a method too.


http://gerrit.cloudera.org:8080/#/c/18413/7/be/src/exprs/aggregate-functions-ir.cc@355
PS7, Line 355: if
nit: need a space after "if"


http://gerrit.cloudera.org:8080/#/c/18413/7/be/src/exprs/aggregate-functions-ir.cc@376
PS7, Line 376: if
nit: need a space after if


http://gerrit.cloudera.org:8080/#/c/18413/7/be/src/exprs/aggregate-functions-ir.cc@401
PS7, Line 401:   ctx->Free(src.ptr);
We should not free it here since we still use 'state' at following lines. I 
think we should free it before each 'return' statement.

Another solution is getting its values into local variables and then free it.


http://gerrit.cloudera.org:8080/#/c/18413/7/be/src/exprs/aggregate-functions-ir.cc@408
PS7, Line 408: if(
nit: add a space after "if"


http://gerrit.cloudera.org:8080/#/c/18413/7/be/src/exprs/aggregate-functions.h
File be/src/exprs/aggregate-functions.h:

http://gerrit.cloudera.org:8080/#/c/18413/7/be/src/exprs/aggregate-functions.h@74
PS7, Line 74:   static const StringVal CorrSerialize(FunctionContext* ctx, 
const StringVal& src);
We can remove this


http://gerrit.cloudera.org:8080/#/c/18413/7/fe/src/main/java/org/apache/impala/catalog/BuiltinsDb.java
File fe/src/main/java/org/apache/impala/catalog/BuiltinsDb.java:

http://gerrit.cloudera.org:8080/#/c/18413/7/fe/src/main/java/org/apache/impala/catalog/BuiltinsDb.java@1322
PS7, Line 1322: Corr
nit: Add a space after "//"


http://gerrit.cloudera.org:8080/#/c/18413/7/testdata/workloads/functional-query/queries/QueryTest/uda.test
File testdata/workloads/functional-query/queries/QueryTest/uda.test:

http://gerrit.cloudera.org:8080/#/c/18413/7/testdata/workloads/functional-query/queries/QueryTest/uda.test@10
PS7, Line 10: select corr(ps_availqty,ps_supplycost) from tpch.partsupp;
Sorry to be back and forth, but we'd better move these tests to 
testdata/workloads/functional-query/queries/QueryTest/aggregation.test
All the builtin aggregate functions are tested there.


http://gerrit.cloudera.org:8080/#/c/18413/7/testdata/workloads/functional-query/queries/QueryTest/uda.test@24
PS7, Line 24: select corr(d,e) from functional.nulltable;
Do we have test coverage on only one side is NULL?


http://gerrit.cloudera.org:8080/#/c/18413/7/testdata/workloads/functional-query/queries/QueryTest/uda.test@41
PS7, Line 41: select corr(utctime,localtime) from functional.alltimezones;
Cool!


http://gerrit.cloudera.org:8080/#/c/18413/7/testdata/workloads/functional-query/queries/QueryTest/uda.test@48
PS7, Line 48: select corr(int_col, int_col)
Could you also add the 'id' column here?

Also please comment why the results are NULLs.


http://gerrit.cloudera.org:8080/#/c/18413/7/testdata/workloads/functional-query/queries/QueryTest/uda.test@69
PS7, Line 69: select corr(double_col,double_col)
Could you add the 'year' column here?



--
To view, visit http://gerrit.cloudera.org:8080/18413
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I32ad627c953ba24d9cde2d5549bdd0d27a9c0d06
Gerrit-Change-Number: 18413
Gerrit-PatchSet: 7
Gerrit-Owner: Anonymous Coward <pranav.lo...@cloudera.com>
Gerrit-Reviewer: Anonymous Coward <pranav.lo...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <impala-public-jenk...@cloudera.com>
Gerrit-Reviewer: Quanlong Huang <huangquanl...@gmail.com>
Gerrit-Comment-Date: Tue, 26 Apr 2022 07:51:17 +0000
Gerrit-HasComments: Yes

Reply via email to