Hello Marcel Kornacker,

I'd like you to reexamine a change.  Please visit

    http://gerrit.cloudera.org:8080/6478

to look at the new patch set (#19).

Change subject: IMPALA-4623: Enable file handle cache
......................................................................

IMPALA-4623: Enable file handle cache

Currently, every scan range maintains a file handle, even
when multiple scan ranges are accessing the same file.
Opening the file handles causes load on the NameNode, which
can lead to scaling issues.

There are two parts to this change:
1. Enable file handle caching by default for local files
2. Share the file handle between scan ranges from the same
file

Local scan ranges no longer maintain their own Hdfs file
handles. On each read, the io thread will get the Hdfs file
handle from the cache (opening it if necessary) and use
that for the read. This allows multiple scan ranges on the
same file to use the same file handle. Since the file
offsets are no longer consistent for an individual scan
range, all Hdfs reads need to either use hdfsPread or do
a seek before reading. Additionally, since Hdfs read
statistics are maintained on the file handle, the read
statistics must be retrieved and cleared after each read.
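
To illustrate why positional reads matter once a handle is shared, here
is a minimal Python sketch, not the Impala code. It uses os.pread on a
local file descriptor as a stand-in for hdfsPread; read_range and
demo_shared_handle are hypothetical names introduced for illustration.

```python
import os
import tempfile


def read_range(fd, offset, length):
    """Read up to `length` bytes at `offset` using positional reads.

    os.pread does not move the descriptor's file position, so callers
    sharing `fd` cannot disturb each other's offsets.
    """
    chunks = []
    remaining = length
    while remaining > 0:
        chunk = os.pread(fd, remaining, offset)
        if not chunk:  # reached end of file
            break
        chunks.append(chunk)
        offset += len(chunk)
        remaining -= len(chunk)
    return b"".join(chunks)


def demo_shared_handle():
    """Interleave reads from two logical scan ranges on one descriptor."""
    with tempfile.NamedTemporaryFile() as f:
        f.write(b"0123456789abcdef")
        f.flush()
        fd = os.open(f.name, os.O_RDONLY)  # the single shared handle
        try:
            first = read_range(fd, 0, 4)        # scan range A, first read
            second = read_range(fd, 10, 4)      # scan range B
            first_again = read_range(fd, 4, 4)  # scan range A continues
        finally:
            os.close(fd)
    return first, second, first_again
```

Each caller tracks its own offset and passes it on every read, so the
interleaving order of the two ranges does not affect the bytes returned.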

To manage contention, the file handle cache is now
partitioned by a hash of the key into independent
caches with independent locks. The allowed capacity
of the file handle cache is split evenly among the
partitions. File handles are evicted independently
for each partition. The file handle cache maintains
ownership of the file handles at all times, but it
will not evict a file handle that is in use.
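
The partitioning scheme can be sketched roughly as follows. This is a
simplified Python illustration under stated assumptions, not the
structure in disk-io-mgr-handle-cache.h: it keeps one entry per key with
an in-use count, and the names PartitionedHandleCache, acquire, release
and open_fn are hypothetical.

```python
import threading
from collections import OrderedDict


class Partition:
    """One independent slice of the cache with its own lock."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.lock = threading.Lock()
        # key -> [handle, in_use_count], kept in least-recently-used order
        self.entries = OrderedDict()


class PartitionedHandleCache:
    def __init__(self, capacity, num_partitions, open_fn):
        self.open_fn = open_fn
        # Split the allowed capacity evenly among the partitions.
        per_partition = capacity // num_partitions
        self.partitions = [Partition(per_partition)
                           for _ in range(num_partitions)]

    def _partition(self, key):
        return self.partitions[hash(key) % len(self.partitions)]

    def acquire(self, key):
        """Return the handle for `key`, opening it if necessary."""
        p = self._partition(key)
        with p.lock:
            entry = p.entries.get(key)
            if entry is None:
                entry = [self.open_fn(key), 0]
                p.entries[key] = entry
            entry[1] += 1
            p.entries.move_to_end(key)
            self._evict(p)
            return entry[0]

    def release(self, key):
        """Mark one use of `key` as finished; the cache keeps ownership."""
        p = self._partition(key)
        with p.lock:
            p.entries[key][1] -= 1
            self._evict(p)

    def _evict(self, p):
        # Caller holds p.lock. Evict the oldest idle entries while the
        # partition is over capacity; entries in use are never evicted.
        for key in list(p.entries):
            if len(p.entries) <= p.capacity:
                break
            if p.entries[key][1] == 0:
                del p.entries[key]

    def size(self):
        return sum(len(p.entries) for p in self.partitions)
```

In this sketch a partition can temporarily exceed its share of the
capacity when every entry in it is in use; it shrinks back as handles
are released.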

If max_cached_file_handles is set to 0 or the
scan range is accessing data cached by Hdfs or the
scan range is remote, the scan range will get a
file handle from the cache and hold it until the
scan range is closed. This mimics the existing behavior,
except the file handle stays in the cache and is owned
by the cache. Since it is in use, it will not be evicted.

If a file handle in the cache becomes invalid,
it may result in Read() calls failing. Consequently,
if Read() encounters an error using a file handle
from the cache, it will destroy the handle and
retry once with a new file handle. Any subsequent
error is unrelated to the file handle cache and
will be returned.
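
The retry-once policy can be sketched as below. This is an illustrative
Python outline only; SimpleHandleCache, read_with_retry and read_fn are
hypothetical names, and OSError stands in for whatever error the real
Read() path reports.

```python
class SimpleHandleCache:
    """Hypothetical stand-in for the file handle cache."""

    def __init__(self, open_fn):
        self.open_fn = open_fn
        self.handles = {}

    def get(self, key):
        # Open a handle on first use, otherwise reuse the cached one.
        if key not in self.handles:
            self.handles[key] = self.open_fn(key)
        return self.handles[key]

    def destroy(self, key):
        # Drop a handle that is suspected to be invalid.
        self.handles.pop(key, None)


def read_with_retry(cache, key, read_fn):
    """Read via a cached handle, retrying once with a fresh handle."""
    handle = cache.get(key)
    try:
        return read_fn(handle)
    except OSError:
        # The cached handle may be stale: destroy it and retry once.
        cache.destroy(key)
        # An error on this second attempt propagates to the caller.
        return read_fn(cache.get(key))
```

Limiting the retry to a single attempt means an error that persists
with a freshly opened handle is reported rather than masked.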

Tests:
query_test/test_hdfs_fd_caching.py copies the files from
an existing table into a new directory and uses that to
create an external table. It queries the external table,
then uses the hdfs command line to manipulate the hdfs file
(delete, move, etc). It queries again to make sure we
don't crash. Then, it runs "invalidate metadata". It
checks the row counts before the modification and after
"invalidate metadata", but it does not check the results
in between.
custom_cluster/test_hdfs_fd_caching.py starts up a cluster
with a small file handle cache size. It verifies that a
file handle can be reused (i.e. rerunning a query does
not result in more file handles cached). It also verifies
that the cache capacity is enforced.

Change-Id: Ibe5ff60971dd653c3b6a0e13928cfa9fc59d078d
---
M be/src/exec/hdfs-scan-node-base.cc
M be/src/exec/hdfs-scan-node-base.h
M be/src/runtime/buffered-block-mgr.h
A be/src/runtime/disk-io-mgr-handle-cache.h
A be/src/runtime/disk-io-mgr-handle-cache.inline.h
M be/src/runtime/disk-io-mgr-internal.h
M be/src/runtime/disk-io-mgr-reader-context.cc
M be/src/runtime/disk-io-mgr-scan-range.cc
M be/src/runtime/disk-io-mgr.cc
M be/src/runtime/disk-io-mgr.h
M tests/custom_cluster/test_hdfs_fd_caching.py
M tests/metadata/test_refresh_partition.py
M tests/query_test/test_hdfs_fd_caching.py
13 files changed, 861 insertions(+), 418 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/78/6478/19
-- 
To view, visit http://gerrit.cloudera.org:8080/6478
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ibe5ff60971dd653c3b6a0e13928cfa9fc59d078d
Gerrit-PatchSet: 19
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Joe McDonnell <joemcdonn...@cloudera.com>
Gerrit-Reviewer: Dan Hecht <dhe...@cloudera.com>
Gerrit-Reviewer: Joe McDonnell <joemcdonn...@cloudera.com>
Gerrit-Reviewer: Marcel Kornacker <mar...@cloudera.com>
Gerrit-Reviewer: Tim Armstrong <tarmstr...@cloudera.com>
