Hello Kurt Deschler, Yida Wu, Alexey Serbin, Michael Smith, Impala Public Jenkins,
I'd like you to reexamine a change. Please visit

    http://gerrit.cloudera.org:8080/21171

to look at the new patch set (#9).

Change subject: IMPALA-12905: Disk-based tuple caching
......................................................................

IMPALA-12905: Disk-based tuple caching

This implements on-disk caching for the tuple cache. The TupleCacheNode
uses the TupleFileWriter and TupleFileReader to write and read back
tuples from local files. The file format uses RowBatch's standard
serialization, the same one used for KRPC data streams.

The TupleCacheMgr is the daemon-level structure that coordinates the
state machine for cache entries, including eviction. When a writer is
adding an entry, it inserts an IN_PROGRESS entry before starting to
write data. This does not count toward the cache capacity, because the
total size is not known yet. The IN_PROGRESS entry prevents other
writers from concurrently writing the same entry. If the write is
successful, the entry transitions to the COMPLETE state and updates the
total size of the entry. If the write is unsuccessful and a new
execution might succeed, the entry is removed. If the write is
unsuccessful and won't succeed later (e.g. the total size of the entry
exceeds the maximum size of an entry), it transitions to the TOMBSTONE
state. TOMBSTONE entries avoid the overhead of repeatedly trying to
write entries that are too large.

Given these states, when a TupleCacheNode does its initial Lookup()
call, one of three things can happen:
1. It finds a COMPLETE entry and reads it.
2. It finds an IN_PROGRESS or TOMBSTONE entry, which means it can
   neither read nor write the entry.
3. It finds no entry and inserts its own IN_PROGRESS entry to start a
   write.

The tuple cache is configured using the tuple_cache parameter, which
combines the cache directory and the capacity, similar to the
data_cache parameter. For example, /data/0:100GB uses directory /data/0
for the cache with a total capacity of 100GB.
This currently supports a single directory, but it can be expanded to
multiple directories later if needed. The cache eviction policy can be
specified via the tuple_cache_eviction_policy parameter, which
currently supports LRU or LIRS. The tuple_cache parameter cannot be
specified if allow_tuple_caching=false.

This contains contributions from Michael Smith, Yida Wu, and Joe
McDonnell.

Testing:
 - This adds basic custom cluster tests for the tuple cache.

Change-Id: I13a65c4c0559cad3559d5f714a074dd06e9cc9bf
---
M be/src/exec/CMakeLists.txt
M be/src/exec/parquet/parquet-page-reader.cc
M be/src/exec/tuple-cache-node.cc
M be/src/exec/tuple-cache-node.h
A be/src/exec/tuple-file-read-write-test.cc
A be/src/exec/tuple-file-reader.cc
A be/src/exec/tuple-file-reader.h
A be/src/exec/tuple-file-writer.cc
A be/src/exec/tuple-file-writer.h
M be/src/runtime/CMakeLists.txt
M be/src/runtime/exec-env.cc
M be/src/runtime/exec-env.h
A be/src/runtime/tuple-cache-mgr-test.cc
A be/src/runtime/tuple-cache-mgr.cc
A be/src/runtime/tuple-cache-mgr.h
M bin/start-impala-cluster.py
M common/thrift/metrics.json
A tests/custom_cluster/test_tuple_cache.py
18 files changed, 2,255 insertions(+), 16 deletions(-)

git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/71/21171/9

--
To view, visit http://gerrit.cloudera.org:8080/21171
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I13a65c4c0559cad3559d5f714a074dd06e9cc9bf
Gerrit-Change-Number: 21171
Gerrit-PatchSet: 9
Gerrit-Owner: Joe McDonnell <joemcdonn...@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <ale...@apache.org>
Gerrit-Reviewer: Impala Public Jenkins <impala-public-jenk...@cloudera.com>
Gerrit-Reviewer: Joe McDonnell <joemcdonn...@cloudera.com>
Gerrit-Reviewer: Kurt Deschler <kdesc...@cloudera.com>
Gerrit-Reviewer: Michael Smith <michael.sm...@cloudera.com>
Gerrit-Reviewer: Yida Wu <wydbaggio...@gmail.com>