Tim Armstrong has posted comments on this change. Change subject: IMPALA-4397,IMPALA-3259: reduce codegen time and memory ......................................................................
Patch Set 4: (4 comments) http://gerrit.cloudera.org:8080/#/c/4956/7/be/src/codegen/llvm-codegen.h File be/src/codegen/llvm-codegen.h: PS7, Line 602: codegen > It still doesn't track the memory consumed by the LLVM module (e.g. Instruc The optimisation memory estimate does include the IR, but only after it's constructed, since we don't really know how large it will be until it's generated. http://gerrit.cloudera.org:8080/#/c/4956/7/be/src/exec/hash-table.cc File be/src/exec/hash-table.cc: PS7, Line 809: ctxs.size() > Expr::CODEGEN_INLINE_EXPRS_THRESHOLD > nit: May be worth refactoring it into a singe variable with meaningful name Done http://gerrit.cloudera.org:8080/#/c/4956/7/be/src/exec/partitioned-aggregation-node.cc File be/src/exec/partitioned-aggregation-node.cc: PS7, Line 1706: if (aggregate_evaluators_.size() > Expr::CODEGEN_INLINE_EXPRS_THRESHOLD) { > As discussed in person, it may be worth double checking whether min / max / I experimented with a code change that disabled inlining always, and looked at the agg node time. I saw: 25% regression for a single sum(decimal), 3% regression for a single ndv(decimal) and a 2% speedup for compute stats on lineitem. For sum() of only numeric columns in lineitem (8 columns) I saw a 23% slowdown. It could be worth trying to special-case those aggregate functions, but it seems like affected queries would be pretty rare. It's also a little tricky since the input expressions could be arbitrarily complex. http://gerrit.cloudera.org:8080/#/c/4956/4/fe/src/main/java/org/apache/impala/planner/DistributedPlanner.java File fe/src/main/java/org/apache/impala/planner/DistributedPlanner.java: PS4, Line 823: if (!hasGrouping) mergeAggNode.setDisableCodegen(true); > Thanks. Similar comments about the need to spell out the difference between There wasn't a particularly natural place to document this general diea. I chose to clarify in the thrift structure that disabling codegen means that the backend should do it if it supports it. That seemed the best place since it's the interface between the FE and BE. -- To view, visit http://gerrit.cloudera.org:8080/4956 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Id10015b49da182cb181a653ac8464b4a18b71091 Gerrit-PatchSet: 4 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Tim Armstrong <tarmstr...@cloudera.com> Gerrit-Reviewer: Marcel Kornacker <mar...@cloudera.com> Gerrit-Reviewer: Michael Ho <k...@cloudera.com> Gerrit-Reviewer: Tim Armstrong <tarmstr...@cloudera.com> Gerrit-HasComments: Yes