Yida Wu has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/19181 )

Change subject: IMPALA-11470: Add Cache For Codegen Functions
......................................................................


Patch Set 6:

(10 comments)

Thanks Michael and Joe for the review.
This patch mainly changes the hash code of a CodeGenCacheKey to 128bits to 
reduce the collision chance and fixes a memory leak issue during cache entry 
eviction due to unable to release a const shared pointer in the entry.

http://gerrit.cloudera.org:8080/#/c/19181/5/be/src/codegen/llvm-codegen-cache.h
File be/src/codegen/llvm-codegen-cache.h:

http://gerrit.cloudera.org:8080/#/c/19181/5/be/src/codegen/llvm-codegen-cache.h@70
PS5, Line 70:     size_t operator()(const HashCode& h) const {
> So this takes the first 8 bytes of the key and returns them?
Yes, set the first sizeof(HashCode) bytes of the key as the hash code.


http://gerrit.cloudera.org:8080/#/c/19181/5/be/src/codegen/llvm-codegen-cache.h@81
PS5, Line 81: 
> It looks like everywhere you use this, you convert it to a slice. So maybe
Done


http://gerrit.cloudera.org:8080/#/c/19181/5/be/src/codegen/llvm-codegen-cache.h@92
PS5, Line 92:   void set_key(string key) { key_.swap(key); }
> I'm confused why there's a constructor class at all. Why isn't this the con
I think the main purpose of setting a Constructor is to be able to collect and 
store the data from different objects like fragment-state or query-state to 
finally generate a key via some complex computation, while keeping the 
CodeGenCacheKey as a simpler structure. Right now, only the bitcode string is 
needed for the key, but the next version might include more things to replace 
the bitcode string, so I prefer to keep the Constructor as an entry to 
construct the key for now.


http://gerrit.cloudera.org:8080/#/c/19181/5/be/src/codegen/llvm-codegen-cache.h@195
PS5, Line 195:   void RollbackHitCount() {
> nit: don't need to say "simultaneously" twice. "The purpose of it is to pre
Done


http://gerrit.cloudera.org:8080/#/c/19181/5/be/src/codegen/llvm-codegen-cache.cc
File be/src/codegen/llvm-codegen-cache.cc:

http://gerrit.cloudera.org:8080/#/c/19181/5/be/src/codegen/llvm-codegen-cache.cc@51
PS5, Line 51:   }
> nit: if you want to guarantee a pointer can't be null, pass a reference ins
The codebase convention is to use by pointer instead of by reference for an 
output parameter to make it more obvious. (Even though not all the places did)
https://gerrit.cloudera.org/#/c/16318/24/be/src/runtime/tmp-file-mgr-internal.h@386


http://gerrit.cloudera.org:8080/#/c/19181/5/be/src/codegen/llvm-codegen-cache.cc@54
PS5, Line 54: void CodeGenCacheKeyConstructor::construct(
> This is the only place GeneralElementSize is used, and the name isn't very
It could be useful if we have a key analyzer later. Changed the name to 
ElementLengthType, hope it looks clearer.


http://gerrit.cloudera.org:8080/#/c/19181/5/be/src/codegen/llvm-codegen-cache.cc@66
PS5, Line 66:   string result;
> I think we reduce the randomness in the hash by including 8 null bytes at t
Done


http://gerrit.cloudera.org:8080/#/c/19181/5/be/src/codegen/llvm-codegen-cache.cc@122
PS5, Line 122:         reinterpret_cast<const 
CodeGenCacheEntry*>(cache_->Value(handle).data());
> My biggest concern is what happens during hash collisions. This patch doesn
This is a good idea. Changed to use MurmurHash3_x64_128 and added comments. 
From the previous discussion, we may change to use the full bitcode which is 
debug mode if collision is found or simply disable the cache, but the chance 
should be quite small. I think currently the problem is the bitcode string is 
too large, if we have a smaller key in the next version, we may use that to 
replace hash code in non-debug mode.


http://gerrit.cloudera.org:8080/#/c/19181/5/be/src/codegen/llvm-codegen.cc
File be/src/codegen/llvm-codegen.cc:

http://gerrit.cloudera.org:8080/#/c/19181/5/be/src/codegen/llvm-codegen.cc@1137
PS5, Line 1137:     // Get pointers to all codegen'd functions.
              :     for (int i = 0; i < fns_to_jit_compile_.size(); ++i) {
> Could we fall back to regular codegen path if we encounter this?
Changed to log a warning. But found an issue when using getFunctionAddress with 
a non-existent function name, it will hit assertion "report_fatal_error called 
with success value" within the execution engine , tested it in the old impala 
without cache, same. Looks like a bug in llvm 5 to me, not sure if llvm 16 
would be good. But for the codegen cache, the engine doesn't expect to have 
non-existent function names unless we are getting a wrong cache with the same 
key (hash code). One solution to me is adding function names as part of the key 
to make sure an engine from a cache hit should have all the functions, or 
putting names in the cache entry, double check them before calling 
getFunctionAddress.


http://gerrit.cloudera.org:8080/#/c/19181/4/tests/custom_cluster/test_codegen_cache.py
File tests/custom_cluster/test_codegen_cache.py:

http://gerrit.cloudera.org:8080/#/c/19181/4/tests/custom_cluster/test_codegen_cache.py@27
PS4, Line 27: class TestCodegenCache(CustomClusterTestSuite):
> It would be nice to have some negative test cases for the times when we dis
Seems a bit difficult to detect because it is fragment level disabled instead 
of query level, and for simplest case like "select udf(x)", the udf is called 
and output a const value before reaching the executor, so it doesn't contain 
the udf function in the codegen cache. But from my manual tests on several 
cases, they look good. Will add testcases for this later.



--
To view, visit http://gerrit.cloudera.org:8080/19181
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: If42c78a7f51fd582e5fe331fead494dadf544eb1
Gerrit-Change-Number: 19181
Gerrit-PatchSet: 6
Gerrit-Owner: Yida Wu <wydbaggio...@gmail.com>
Gerrit-Reviewer: Abhishek Rawat <ara...@cloudera.com>
Gerrit-Reviewer: Daniel Becker <daniel.bec...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <impala-public-jenk...@cloudera.com>
Gerrit-Reviewer: Joe McDonnell <joemcdonn...@cloudera.com>
Gerrit-Reviewer: Michael Smith <michael.sm...@cloudera.com>
Gerrit-Reviewer: Yida Wu <wydbaggio...@gmail.com>
Gerrit-Comment-Date: Tue, 15 Nov 2022 18:44:11 +0000
Gerrit-HasComments: Yes

Reply via email to