Tim Armstrong has posted comments on this change.

Change subject: IMPALA-1430,IMPALA-4108: codegen all builtin aggregate functions
......................................................................


Patch Set 9:

(8 comments)

Rebased to pick up IMPALA-3884

http://gerrit.cloudera.org:8080/#/c/4655/7/be/src/codegen/codegen-anyval.cc
File be/src/codegen/codegen-anyval.cc:

PS7, Line 459: 
> that
Done


http://gerrit.cloudera.org:8080/#/c/4655/7/be/src/codegen/codegen-anyval.h
File be/src/codegen/codegen-anyval.h:

Line 184:   /// lowered type. This *Val should be non-null. The output variable 
is called 'name'.
> nit: space
Done


http://gerrit.cloudera.org:8080/#/c/4655/7/be/src/codegen/llvm-codegen.h
File be/src/codegen/llvm-codegen.h:

PS7, Line 347: 
             :   /// TODO: Return Status inst
> Avoid cloning if possible to reduce compilation time.
Done


http://gerrit.cloudera.org:8080/#/c/4655/7/be/src/exec/partitioned-aggregation-node.cc
File be/src/exec/partitioned-aggregation-node.cc:

PS7, Line 1660: ond, the
              :     // value of the slot was initialized in the right way in 
InitAggSlots() (e.g. 0 for
              :     // SUM) that we get the right result if UpdateSlot() 
pretends that the NULL bit of
              :     // 'dst' is unse
> Please also see comments above.
I tried to clean this up a bit. My intention was to document the details of the 
initialisation in InitAggSlot() and provide a pointer here.


PS7, Line 1675: dst.SetIsNull(slot_desc->CodegenIsNull(codegen, &builder, 
agg_tuple_arg));
> Why is this not in the old code or was it done somewhere else ?
It wasn't in the old code - the old code didn't handle arbitrary UDAs, just a 
subset of the builtins. E.g. the old code would have given wrong results if we 
changed SUM() so that it returned NULL on overflow.


http://gerrit.cloudera.org:8080/#/c/4655/7/be/src/exec/partitioned-aggregation-node.h
File be/src/exec/partitioned-aggregation-node.h:

PS7, Line 649: 'dst_val' should contain the previous value of the aggregate
             :   /// function, and 'updated_dst_val' is set to the new value 
after the Update or Merge
             :   /// operation is applied.
> Just curious why we separate 'dst_val' and 'result_val'. Doesn't 'result_va
We need two different CodegenAnyVal, since it really represents the value 
(which changes) rather than the memory location. We could use an in-out 
argument instead of separate input and output arguments but this seemed simpler 
to me.

I renamed 'result_val' to 'updated_dst_val' to hopefully make the connection 
clearer (the naming was confusing).


http://gerrit.cloudera.org:8080/#/c/4655/7/be/src/exprs/scalar-fn-call.cc
File be/src/exprs/scalar-fn-call.cc:

PS7, Line 496:  false
> Please feel free to ignore but you can consider setting this second argumen
It seems clearer to make it explicit, given it's not obvious what the "default" 
behaviour should be.


http://gerrit.cloudera.org:8080/#/c/4655/7/tests/query_test/test_aggregation.py
File tests/query_test/test_aggregation.py:

Line 141:       # Verify codegen was enabled for all stages of the aggregation.
> Can remove this restriction now that IMPALA-3884 is in master
Done


-- 
To view, visit http://gerrit.cloudera.org:8080/4655
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Id9dc21d1d676505d3617e1e4f37557397c4fb260
Gerrit-PatchSet: 9
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong <tarmstr...@cloudera.com>
Gerrit-Reviewer: Alex Behm <alex.b...@cloudera.com>
Gerrit-Reviewer: Michael Ho <k...@cloudera.com>
Gerrit-Reviewer: Tim Armstrong <tarmstr...@cloudera.com>
Gerrit-HasComments: Yes

Reply via email to