Tim Armstrong has posted comments on this change.

Change subject: IMPALA-4397,IMPALA-3259: reduce codegen time and memory
......................................................................


Patch Set 4:

(4 comments)

http://gerrit.cloudera.org:8080/#/c/4956/7/be/src/codegen/llvm-codegen.h
File be/src/codegen/llvm-codegen.h:

PS7, Line 602: codegen
> It still doesn't track the memory consumed by the LLVM module (e.g. Instruc
The optimisation memory estimate does include the IR, but only after it's 
constructed, since we don't really know how large it will be until it's 
generated.


http://gerrit.cloudera.org:8080/#/c/4956/7/be/src/exec/hash-table.cc
File be/src/exec/hash-table.cc:

PS7, Line 809: ctxs.size() > Expr::CODEGEN_INLINE_EXPRS_THRESHOLD
> nit: May be worth refactoring it into a singe variable with meaningful name
Done


http://gerrit.cloudera.org:8080/#/c/4956/7/be/src/exec/partitioned-aggregation-node.cc
File be/src/exec/partitioned-aggregation-node.cc:

PS7, Line 1706:  if (aggregate_evaluators_.size() > 
Expr::CODEGEN_INLINE_EXPRS_THRESHOLD) {
> As discussed in person, it may be worth double checking whether min / max /
I experimented with a code change that disabled inlining always, and looked at 
the agg node time. I saw: 25% regression for a single sum(decimal), 3% 
regression for a single ndv(decimal) and a 2% speedup for compute stats on 
lineitem. For sum() of only numeric columns in lineitem (8 columns) I saw a 23% 
slowdown.

It could be worth trying to special-case those aggregate functions, but it 
seems like affected queries would be pretty rare. It's also a little tricky 
since the input expressions could be arbitrarily complex.


http://gerrit.cloudera.org:8080/#/c/4956/4/fe/src/main/java/org/apache/impala/planner/DistributedPlanner.java
File fe/src/main/java/org/apache/impala/planner/DistributedPlanner.java:

PS4, Line 823: if (!hasGrouping) mergeAggNode.setDisableCodegen(true);
> Thanks. Similar comments about the need to spell out the difference between
There wasn't a particularly natural place to document this general diea. I 
chose to clarify in the thrift structure that disabling codegen means that the 
backend should do it if it supports it. That seemed the best place since it's 
the interface between the FE and BE.


-- 
To view, visit http://gerrit.cloudera.org:8080/4956
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Id10015b49da182cb181a653ac8464b4a18b71091
Gerrit-PatchSet: 4
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong <tarmstr...@cloudera.com>
Gerrit-Reviewer: Marcel Kornacker <mar...@cloudera.com>
Gerrit-Reviewer: Michael Ho <k...@cloudera.com>
Gerrit-Reviewer: Tim Armstrong <tarmstr...@cloudera.com>
Gerrit-HasComments: Yes

Reply via email to