Yida Wu has posted comments on this change. ( http://gerrit.cloudera.org:8080/19181 )
Change subject: IMPALA-11470: Add Cache For Codegen Functions ...................................................................... Patch Set 4: (14 comments) Thanks Michael for the review. Added checks for abnormal cases in storing the cache entry and added a set of the keys to avoid the simultaneous insertion on the same key. http://gerrit.cloudera.org:8080/#/c/19181/3/be/src/codegen/llvm-codegen-cache.h File be/src/codegen/llvm-codegen-cache.h: http://gerrit.cloudera.org:8080/#/c/19181/3/be/src/codegen/llvm-codegen-cache.h@95 PS3, Line 95: /// It could be in future to construct from a list of string keys. > nit: default constructor/destructor don't need to be explicitly declared. Done http://gerrit.cloudera.org:8080/#/c/19181/3/be/src/codegen/llvm-codegen.h File be/src/codegen/llvm-codegen.h: http://gerrit.cloudera.org:8080/#/c/19181/3/be/src/codegen/llvm-codegen.h@191 PS3, Line 191: void EnableOptimizations(bool enable); > This getter is never used, I'd be tempted to omit it. Leaves us more flexib Done http://gerrit.cloudera.org:8080/#/c/19181/3/be/src/codegen/llvm-codegen.h@807 PS3, Line 807: /// Codegen counters > Not clear why this is no longer const. Done http://gerrit.cloudera.org:8080/#/c/19181/3/be/src/codegen/llvm-codegen.h@933 PS3, Line 933: /// llvm constants to help with code gen verbosity > Why is this a unique_ptr? You could inline the object and it would be impli The CodeGenCache instance is a singleton in the daemon, while the constructor generates the key for each entry in a fragment level. Changed a way to simplify the CodeGenCacheKeyConstructor, because currently it only contains one string as the key, changed to use a static function, so we don't need to consider the member things. For the naming, it looks okay to me to use a name CodeGenCacheKeyConstructor in the LlvmCodeGen, because it generates the key with type CodeGenCacheKey for cache use only, I don't see why we can't have the cache in the name. But if it is too odd, I can change it. http://gerrit.cloudera.org:8080/#/c/19181/4/be/src/codegen/llvm-codegen.cc File be/src/codegen/llvm-codegen.cc: http://gerrit.cloudera.org:8080/#/c/19181/4/be/src/codegen/llvm-codegen.cc@1952 PS4, Line 1952: out << "Fragment Plan: " << apache::thrift::ThriftDebugString(state_->fragment()) > Please make sure you don't output sensitive data with ThriftDebugString(). As mentioned in patch 2, the using of ThriftDebugString here should be fine. http://gerrit.cloudera.org:8080/#/c/19181/3/be/src/exprs/scalar-expr.h File be/src/exprs/scalar-expr.h: http://gerrit.cloudera.org:8080/#/c/19181/3/be/src/exprs/scalar-expr.h@254 PS3, Line 254: bool AllowCodegenCache() const { return allow_codegen_cache_; } > nit: const function. Done http://gerrit.cloudera.org:8080/#/c/19181/3/be/src/runtime/exec-env.h File be/src/runtime/exec-env.h: http://gerrit.cloudera.org:8080/#/c/19181/3/be/src/runtime/exec-env.h@156 PS3, Line 156: bool codegen_cache_enabled() const { return codegen_cache_ != nullptr; } > Should be const. All these getters should be const, and might be a good opp Done http://gerrit.cloudera.org:8080/#/c/19181/3/be/src/runtime/exec-env.cc File be/src/runtime/exec-env.cc: http://gerrit.cloudera.org:8080/#/c/19181/3/be/src/runtime/exec-env.cc@104 PS3, Line 104: "Specify the capacity of the codegen cache. If set to 0, codegen cache is disabled."); > What kind of numeric parsing does this support? Does Impala have something It is parsed by ParseMemSpec() which is used for parsing the size everywhere, it could be numbers or strings that contain the number and the unit. Percentage can be passed by ParseMemSpec(), but in this case, we don't support percentage, after passed, percentage would be 0. http://gerrit.cloudera.org:8080/#/c/19181/3/be/src/runtime/fragment-state.h File be/src/runtime/fragment-state.h: http://gerrit.cloudera.org:8080/#/c/19181/3/be/src/runtime/fragment-state.h@75 PS3, Line 75: bool codegen_cache_enabled() const { > This should be a const function. Done http://gerrit.cloudera.org:8080/#/c/19181/3/be/src/runtime/fragment-state.h@92 PS3, Line 92: bool AllowCodegenCache() const { return allow_codegen_cache_; } > This should be a const function. Done http://gerrit.cloudera.org:8080/#/c/19181/3/be/src/runtime/fragment-state.h@231 PS3, Line 231: /// Indicates whether codegen cache is allowed. If the fragment contains udf > "Normally" implies there are circumstances where a fragment containing udf Done http://gerrit.cloudera.org:8080/#/c/19181/3/be/src/runtime/krpc-data-stream-sender.cc File be/src/runtime/krpc-data-stream-sender.cc: http://gerrit.cloudera.org:8080/#/c/19181/3/be/src/runtime/krpc-data-stream-sender.cc@93 PS3, Line 93: state->CheckAndAddCodegenDisabledMessage(codegen_status_msgs_); > This reverts any improvements from IMPALA-8005. Can we replace query_id wit Thanks for pointing out this. I moved the seed out of the codegen function, passed into the function as an argument. I reran the tpch performance test, and didn't see any significant difference. I am thinking there could be other places using the seed or options in the codegen function that could reduce the hit ratio of the cache, maybe it is the next step to move them out of the codegen if this method works. http://gerrit.cloudera.org:8080/#/c/19181/3/be/src/runtime/query-state.h File be/src/runtime/query-state.h: http://gerrit.cloudera.org:8080/#/c/19181/3/be/src/runtime/query-state.h@154 PS3, Line 154: bool codegen_cache_enabled() const; > This seems like it should be a const function. Done http://gerrit.cloudera.org:8080/#/c/19181/3/be/src/runtime/query-state.h@468 PS3, Line 468: bool disable_codegen_cache_ = false; > We're going to enable the codegen cache by default? This seems like somethi I think it follows the norm of disable_codegen in the query option, even the codegen is enabled by default, it means to me that you need to do something to disable the codegen cache, otherwise it will be enabled. -- To view, visit http://gerrit.cloudera.org:8080/19181 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: If42c78a7f51fd582e5fe331fead494dadf544eb1 Gerrit-Change-Number: 19181 Gerrit-PatchSet: 4 Gerrit-Owner: Yida Wu <wydbaggio...@gmail.com> Gerrit-Reviewer: Abhishek Rawat <ara...@cloudera.com> Gerrit-Reviewer: Daniel Becker <daniel.bec...@cloudera.com> Gerrit-Reviewer: Impala Public Jenkins <impala-public-jenk...@cloudera.com> Gerrit-Reviewer: Joe McDonnell <joemcdonn...@cloudera.com> Gerrit-Reviewer: Michael Smith <michael.sm...@cloudera.com> Gerrit-Reviewer: Yida Wu <wydbaggio...@gmail.com> Gerrit-Comment-Date: Mon, 07 Nov 2022 02:23:23 +0000 Gerrit-HasComments: Yes