Yida Wu has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/19181 )

Change subject: IMPALA-11470: Add Cache For Codegen Functions
......................................................................


Patch Set 4:

(14 comments)

Thanks Michael for the review. Added checks for abnormal cases in storing the 
cache entry and added a set of the keys to avoid the simultaneous insertion on 
the same key.

http://gerrit.cloudera.org:8080/#/c/19181/3/be/src/codegen/llvm-codegen-cache.h
File be/src/codegen/llvm-codegen-cache.h:

http://gerrit.cloudera.org:8080/#/c/19181/3/be/src/codegen/llvm-codegen-cache.h@95
PS3, Line 95:   /// It could be in future to construct from a list of string 
keys.
> nit: default constructor/destructor don't need to be explicitly declared.
Done


http://gerrit.cloudera.org:8080/#/c/19181/3/be/src/codegen/llvm-codegen.h
File be/src/codegen/llvm-codegen.h:

http://gerrit.cloudera.org:8080/#/c/19181/3/be/src/codegen/llvm-codegen.h@191
PS3, Line 191:   void EnableOptimizations(bool enable);
> This getter is never used, I'd be tempted to omit it. Leaves us more flexib
Done


http://gerrit.cloudera.org:8080/#/c/19181/3/be/src/codegen/llvm-codegen.h@807
PS3, Line 807:   /// Codegen counters
> Not clear why this is no longer const.
Done


http://gerrit.cloudera.org:8080/#/c/19181/3/be/src/codegen/llvm-codegen.h@933
PS3, Line 933:   /// llvm constants to help with code gen verbosity
> Why is this a unique_ptr? You could inline the object and it would be impli
The CodeGenCache instance is a singleton in the daemon, while the constructor 
generates the key for each entry in a fragment level.

Changed a way to simplify the CodeGenCacheKeyConstructor, because currently it 
only contains one string as the key, changed to use a static function, so we 
don't need to consider the member things.

For the naming, it looks okay to me to use a name CodeGenCacheKeyConstructor in 
the LlvmCodeGen, because it generates the key with type CodeGenCacheKey for 
cache use only, I don't see why we can't have the cache in the name. But if it 
is too odd, I can change it.


http://gerrit.cloudera.org:8080/#/c/19181/4/be/src/codegen/llvm-codegen.cc
File be/src/codegen/llvm-codegen.cc:

http://gerrit.cloudera.org:8080/#/c/19181/4/be/src/codegen/llvm-codegen.cc@1952
PS4, Line 1952:     out << "Fragment Plan: " << 
apache::thrift::ThriftDebugString(state_->fragment())
> Please make sure you don't output sensitive data with ThriftDebugString().
As mentioned in patch 2, the using of ThriftDebugString here should be fine.


http://gerrit.cloudera.org:8080/#/c/19181/3/be/src/exprs/scalar-expr.h
File be/src/exprs/scalar-expr.h:

http://gerrit.cloudera.org:8080/#/c/19181/3/be/src/exprs/scalar-expr.h@254
PS3, Line 254:   bool AllowCodegenCache() const { return allow_codegen_cache_; }
> nit: const function.
Done


http://gerrit.cloudera.org:8080/#/c/19181/3/be/src/runtime/exec-env.h
File be/src/runtime/exec-env.h:

http://gerrit.cloudera.org:8080/#/c/19181/3/be/src/runtime/exec-env.h@156
PS3, Line 156:   bool codegen_cache_enabled() const { return codegen_cache_ != 
nullptr; }
> Should be const. All these getters should be const, and might be a good opp
Done


http://gerrit.cloudera.org:8080/#/c/19181/3/be/src/runtime/exec-env.cc
File be/src/runtime/exec-env.cc:

http://gerrit.cloudera.org:8080/#/c/19181/3/be/src/runtime/exec-env.cc@104
PS3, Line 104:     "Specify the capacity of the codegen cache. If set to 0, 
codegen cache is disabled.");
> What kind of numeric parsing does this support? Does Impala have something
It is parsed by ParseMemSpec() which is used for parsing the size everywhere, 
it could be numbers or strings that contain the number and the unit. Percentage 
can be passed by ParseMemSpec(), but in this case, we don't support percentage, 
after passed, percentage would be 0.


http://gerrit.cloudera.org:8080/#/c/19181/3/be/src/runtime/fragment-state.h
File be/src/runtime/fragment-state.h:

http://gerrit.cloudera.org:8080/#/c/19181/3/be/src/runtime/fragment-state.h@75
PS3, Line 75:   bool codegen_cache_enabled() const {
> This should be a const function.
Done


http://gerrit.cloudera.org:8080/#/c/19181/3/be/src/runtime/fragment-state.h@92
PS3, Line 92:   bool AllowCodegenCache() const { return allow_codegen_cache_; }
> This should be a const function.
Done


http://gerrit.cloudera.org:8080/#/c/19181/3/be/src/runtime/fragment-state.h@231
PS3, Line 231:   /// Indicates whether codegen cache is allowed. If the 
fragment contains udf
> "Normally" implies there are circumstances where a fragment containing udf
Done


http://gerrit.cloudera.org:8080/#/c/19181/3/be/src/runtime/krpc-data-stream-sender.cc
File be/src/runtime/krpc-data-stream-sender.cc:

http://gerrit.cloudera.org:8080/#/c/19181/3/be/src/runtime/krpc-data-stream-sender.cc@93
PS3, Line 93:   state->CheckAndAddCodegenDisabledMessage(codegen_status_msgs_);
> This reverts any improvements from IMPALA-8005. Can we replace query_id wit
Thanks for pointing out this. I moved the seed out of the codegen function, 
passed into the function as an argument. I reran the tpch performance test, and 
didn't see any significant difference. I am thinking there could be other 
places using the seed or options in the codegen function that could reduce the 
hit ratio of the cache, maybe it is the next step to move them out of the 
codegen if this method works.


http://gerrit.cloudera.org:8080/#/c/19181/3/be/src/runtime/query-state.h
File be/src/runtime/query-state.h:

http://gerrit.cloudera.org:8080/#/c/19181/3/be/src/runtime/query-state.h@154
PS3, Line 154:   bool codegen_cache_enabled() const;
> This seems like it should be a const function.
Done


http://gerrit.cloudera.org:8080/#/c/19181/3/be/src/runtime/query-state.h@468
PS3, Line 468:   bool disable_codegen_cache_ = false;
> We're going to enable the codegen cache by default? This seems like somethi
I think it follows the norm of disable_codegen in the query option, even the 
codegen is enabled by default, it means to me that you need to do something to 
disable the codegen cache, otherwise it will be enabled.



--
To view, visit http://gerrit.cloudera.org:8080/19181
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: If42c78a7f51fd582e5fe331fead494dadf544eb1
Gerrit-Change-Number: 19181
Gerrit-PatchSet: 4
Gerrit-Owner: Yida Wu <wydbaggio...@gmail.com>
Gerrit-Reviewer: Abhishek Rawat <ara...@cloudera.com>
Gerrit-Reviewer: Daniel Becker <daniel.bec...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <impala-public-jenk...@cloudera.com>
Gerrit-Reviewer: Joe McDonnell <joemcdonn...@cloudera.com>
Gerrit-Reviewer: Michael Smith <michael.sm...@cloudera.com>
Gerrit-Reviewer: Yida Wu <wydbaggio...@gmail.com>
Gerrit-Comment-Date: Mon, 07 Nov 2022 02:23:23 +0000
Gerrit-HasComments: Yes

Reply via email to