Michael Ho has posted comments on this change.

Change subject: IMPALA-4397,IMPALA-3259: reduce codegen time and memory
......................................................................


Patch Set 4:

(15 comments)

http://gerrit.cloudera.org:8080/#/c/4956/4/be/src/codegen/llvm-codegen.cc
File be/src/codegen/llvm-codegen.cc:

Line 952:     RETURN_IF_ERROR(OptimizeModule());
Is there a reason why we don't call DestroyIRModule() when OptimizeModule() 
fails ?


PS4, Line 986:  memory_manager_->bytes_allocated()),
             :         memory_manager_->bytes_allocated()
The code may be easier to read if this factored into a single variable instead.


Line 1039:   if (!mem_tracker_->TryConsume(estimated_memory)) {
I wonder if it's better to skip optimization if we cannot reserve the memory to 
do so than punting back to interpretation ?


PS4, Line 1041:         Substitute("Codegen failed to reserve '$0' bytes for 
optimization",
              :                                               estimated_memory),
              :         estimated_memory);
These line wraps look a bit weird to me but not sure if it's a side effect of 
clang-format ?


http://gerrit.cloudera.org:8080/#/c/4956/4/be/src/codegen/llvm-codegen.h
File be/src/codegen/llvm-codegen.h:

PS4, Line 708: optimiser
nit: optimizer


PS4, Line 709:  512 bytes
Mind documenting how this is derived so we can update it accordingly in case we 
change the optimization level in the future ?


http://gerrit.cloudera.org:8080/#/c/4956/4/be/src/codegen/mcjit-mem-mgr.h
File be/src/codegen/mcjit-mem-mgr.h:

PS4, Line 36: memory
This is for tracking memory consumed by the compiled machine code, right ?


http://gerrit.cloudera.org:8080/#/c/4956/4/be/src/exec/exec-node.h
File be/src/exec/exec-node.h:

PS4, Line 187:  false
true ?


http://gerrit.cloudera.org:8080/#/c/4956/4/be/src/exec/hash-table.cc
File be/src/exec/hash-table.cc:

PS4, Line 762:     if (ctxs.size() > Expr::CODEGEN_INLINE_EXPRS_THRESHOLD) {
             :       // Avoid bloating function by inlining too many exprs into 
it.
             :       expr_fn->addFnAttr(llvm::Attribute::NoInline);
             :     }
I wonder if this can be more generalized by putting it in 
LlvmCodegen::FinalizeFunction() instead and use instruction count to determine 
whether the function is too large to be worth being inlined.

Of course, this may be a problem if we intentionally inline a big function  
into the cross-compiled code and rely on constant folding to eliminate 90% of 
the instructions in the function. In which case, we may need to do a quick 
constant propagation pass first in FinalizeFunction(). Just a thought.


http://gerrit.cloudera.org:8080/#/c/4956/4/be/src/exec/hdfs-scanner.h
File be/src/exec/hdfs-scanner.h:

PS4, Line 190: CODEGEN_WRITE_TUPLES_MAX_COLUMNS = 250;
This seems a bit adhoc to me. May be better to track the size of IR functions 
and don't use it if it gets too large. Empirically, the time to generate the IR 
is rather small compared to the optimization time so the wasted work may not be 
too costly and hopefully, the large IR function is not a common case.


PS4, Line 190:  const
constexpr


http://gerrit.cloudera.org:8080/#/c/4956/4/be/src/exec/topn-node.cc
File be/src/exec/topn-node.cc:

PS4, Line 85: 
Oops. My bad.


http://gerrit.cloudera.org:8080/#/c/4956/4/be/src/exprs/aggregate-functions-ir.cc
File be/src/exprs/aggregate-functions-ir.cc:

PS4, Line 1911: 
              : 
Is it necessary to remove it ? For comparison, please see FirstValUpdate().


http://gerrit.cloudera.org:8080/#/c/4956/4/be/src/exprs/expr.h
File be/src/exprs/expr.h:

PS4, Line 311: const
constexpr ?


PS4, Line 311: CODEGEN_INLINE_EXPRS_THRESHOLD
Wouldn't instruction count be a more precise estimation ?


-- 
To view, visit http://gerrit.cloudera.org:8080/4956
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Id10015b49da182cb181a653ac8464b4a18b71091
Gerrit-PatchSet: 4
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong <tarmstr...@cloudera.com>
Gerrit-Reviewer: Marcel Kornacker <mar...@cloudera.com>
Gerrit-Reviewer: Michael Ho <k...@cloudera.com>
Gerrit-Reviewer: Tim Armstrong <tarmstr...@cloudera.com>
Gerrit-HasComments: Yes

Reply via email to