Joe McDonnell has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/15306 )

Change subject: IMPALA-8690: Add LIRS cache eviction algorithm
......................................................................


Patch Set 9:

(22 comments)

http://gerrit.cloudera.org:8080/#/c/15306/9//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/15306/9//COMMIT_MSG@17
PS9, Line 17: the
> the startup flag
Done


http://gerrit.cloudera.org:8080/#/c/15306/9//COMMIT_MSG@36
PS9, Line 36: Testing
> do we need any test updates for test_data_cache.py, cache-bench, or data-ca
In this upload, I modified data-cache-test and 
custom_cluster/test_data_cache.py to do both LRU and LIRS. The modifications 
were mostly some minor tweaks.

Cache bench now takes a parameter for the eviction policy.


http://gerrit.cloudera.org:8080/#/c/15306/9//COMMIT_MSG@38
PS9, Line 38:  - Hand tested Impala with the data cache and 
data_cache_eviction_policy="LIRS"
> have we done any tpch / tpcds testing?
I ran concurrent TPC-DS with LIRS on my desktop with a small cache size as a 
stress test to hit corner cases.

In a deterministic run, the hit bytes and miss bytes were similar.


http://gerrit.cloudera.org:8080/#/c/15306/9/be/src/util/cache/cache-internal.h
File be/src/util/cache/cache-internal.h:

http://gerrit.cloudera.org:8080/#/c/15306/9/be/src/util/cache/cache-internal.h@205
PS9, Line 205: evictin_policy
> nit: typo
Done


http://gerrit.cloudera.org:8080/#/c/15306/9/be/src/util/cache/cache-test.h
File be/src/util/cache/cache-test.h:

http://gerrit.cloudera.org:8080/#/c/15306/9/be/src/util/cache/cache-test.h@38
PS9, Line 38: class CacheBaseTest : public ::testing::Test,
> would good to add some docs here explaining how these tests working. looks
Added some comments about how this is used


http://gerrit.cloudera.org:8080/#/c/15306/9/be/src/util/cache/cache-test.h@74
PS9, Line 74: SetupWithParameters
> nit: some docs here would be nice
Added a comment


http://gerrit.cloudera.org:8080/#/c/15306/9/be/src/util/cache/cache-test.h@78
PS9, Line 78:   std::vector<int> evicted_keys_;
            :   std::vector<int> evicted_values_;
> not related to this patch, but would be good to document these
Added comments


http://gerrit.cloudera.org:8080/#/c/15306/9/be/src/util/cache/cache.h
File be/src/util/cache/cache.h:

http://gerrit.cloudera.org:8080/#/c/15306/9/be/src/util/cache/cache.h@42
PS9, Line 42: FIFO
> is there a benefit to keeping the code for the FIFO eviction policy?
We will need to decide if this Cache implementation would be used by things 
other than the data cache. Kudu uses it for a variety of things, but that may 
be specific to Kudu. If it is generic, then we might keep FIFO around in case 
we ever want it.

The data cache doesn't care about FIFO, so we could remove it.


http://gerrit.cloudera.org:8080/#/c/15306/9/be/src/util/cache/cache.h@47
PS9, Line 47: interreference
> nit: inter-reference
Done


http://gerrit.cloudera.org:8080/#/c/15306/9/be/src/util/cache/cache.cc
File be/src/util/cache/cache.cc:

http://gerrit.cloudera.org:8080/#/c/15306/9/be/src/util/cache/cache.cc@39
PS9, Line 39: DECLARE_double(cache_memtracker_approximation_ratio);
> can remove this?
Done


http://gerrit.cloudera.org:8080/#/c/15306/9/be/src/util/cache/cache.cc@87
PS9, Line 87:   case Cache::EvictionPolicy::FIFO:
> nit: formatting
Done


http://gerrit.cloudera.org:8080/#/c/15306/9/be/src/util/cache/lirs-cache-test.cc
File be/src/util/cache/lirs-cache-test.cc:

http://gerrit.cloudera.org:8080/#/c/15306/9/be/src/util/cache/lirs-cache-test.cc@40
PS9, Line 40: public:
> formatting
Done


http://gerrit.cloudera.org:8080/#/c/15306/9/be/src/util/cache/lirs-cache-test.cc@50
PS9, Line 50: protected:
> formatting
Done


http://gerrit.cloudera.org:8080/#/c/15306/9/be/src/util/cache/lirs-cache.cc
File be/src/util/cache/lirs-cache.cc:

http://gerrit.cloudera.org:8080/#/c/15306/9/be/src/util/cache/lirs-cache.cc@42
PS9, Line 42: DEFINE_double_hidden(lirs_tombstone_multiple, 2.00,
> I don't see anywhere you validate that this value is reasonable. Might be g
Changed the Cache to have an Init function that must be called. This now 
requires the multiple to be > 0.5. I don't set an upper bound for the 
parameter, but the upper bound doesn't strongly impact performance or 
correctness.


http://gerrit.cloudera.org:8080/#/c/15306/9/be/src/util/cache/lirs-cache.cc@89
PS9, Line 89: // oldest PROTECTED entry is demoted to UNPROTECTED.
> Might be nice to also include the semantics of UNINIT in this paragraph.
Added a description of UNINIT.


http://gerrit.cloudera.org:8080/#/c/15306/9/be/src/util/cache/lirs-cache.cc@93
PS9, Line 93: last entry
> I think some of the comments around the ordering of the lists get a little
I changed from calling it a recency queue and changed it to a recency list. I 
changed the language to talk about the oldest entry/entries, rather than last 
or other positional language.


http://gerrit.cloudera.org:8080/#/c/15306/9/be/src/util/cache/lirs-cache.cc@100
PS9, Line 100: cache
> list
Done


http://gerrit.cloudera.org:8080/#/c/15306/9/be/src/util/cache/lirs-cache.cc@241
PS9, Line 241: DCHECK_LT(cur_state.ref_count, 
std::numeric_limits<uint16_t>::max());
> Is there anything that guarantees this won't be the case, i.e. as written w
There is nothing that stops someone from doing this. I added comments to 
Lookup/Insert to note that callers shouldn't hold handles for long periods of 
time or open multiple references to the same entry.

I'm reluctant to handle this in a non-fatal way. Hitting 65k references is more 
likely a leak than a use case. The cache doesn't enforce capacity correctly if 
there are callers that hold references for a long time and those entries get 
evicted.

For the data cache, each IO thread is holding one handle at a time. So, we 
should have a very limited number of references.

I changed this to a non-debug CHECK_LT.


http://gerrit.cloudera.org:8080/#/c/15306/9/be/src/util/cache/lirs-cache.cc@242
PS9, Line 242: ++
> nit: its impala convention to always prefer this at the beginning, i.e. '++
Fixed this wherever I could find it in these files.


http://gerrit.cloudera.org:8080/#/c/15306/9/be/src/util/cache/lirs-cache.cc@466
PS9, Line 466: MoveToQueueBack
> MoveToRecencyQueueBack
Changed to MoveToRecencyListBack


http://gerrit.cloudera.org:8080/#/c/15306/9/be/src/util/cache/rl-cache-test.cc
File be/src/util/cache/rl-cache-test.cc:

http://gerrit.cloudera.org:8080/#/c/15306/9/be/src/util/cache/rl-cache-test.cc@27
PS9, Line 27: Invalidate
> is this something we plan to support for LIRS in the future?
I think the data cache may want to use it. I'm about to file a JIRA, but the 
data cache will sometimes delete a data cache file without removing the 
entries. It would be nice to go invalidate the cache entries.


http://gerrit.cloudera.org:8080/#/c/15306/9/be/src/util/cache/rl-cache.cc
File be/src/util/cache/rl-cache.cc:

http://gerrit.cloudera.org:8080/#/c/15306/9/be/src/util/cache/rl-cache.cc@49
PS9, Line 49:
> should the class declarations be declared in their own header file?
This structure is a holdover from the original cache implementation. It uses an 
anonymous namespace (see line 46) to hide the implementation from callers so 
they don't use it directly. The anonymous namespace allows declarations that 
can only be used in this file. So, callers are forced to go through the public 
Cache API rather than calling this directly.

I'm not sure we need that type of hiding. I'm thinking I'll keep it for this 
code change, but we could remove it.



--
To view, visit http://gerrit.cloudera.org:8080/15306
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I670fa4b2b7c93998130dc4e8b2546bb93e9a84f8
Gerrit-Change-Number: 15306
Gerrit-PatchSet: 9
Gerrit-Owner: Joe McDonnell <joemcdonn...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <impala-public-jenk...@cloudera.com>
Gerrit-Reviewer: Joe McDonnell <joemcdonn...@cloudera.com>
Gerrit-Reviewer: Sahil Takiar <stak...@cloudera.com>
Gerrit-Reviewer: Thomas Tauber-Marshall <tmarsh...@cloudera.com>
Gerrit-Comment-Date: Fri, 13 Mar 2020 22:47:02 +0000
Gerrit-HasComments: Yes

Reply via email to