Hello Todd Lipcon, Joe McDonnell, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

    http://gerrit.cloudera.org:8080/13221

to look at the new patch set (#2).

Change subject: IMPALA-8428: Add support for caching file handles on s3
......................................................................

IMPALA-8428: Add support for caching file handles on s3

This patch is based on work done by Joe McDonnell. This change adds
support for caching file handles from S3. It adds a new configuration
flag 'cache_s3_file_handles' (set to true by default) which controls
whether caching of S3 file handles is enabled.

The S3 file handle cache depends on HADOOP-14747 (S3AInputStream to
implement CanUnbuffer). HADOOP-14747 adds support for hdfsUnbufferFile
to S3A streams. The call to unbuffer closes the underlying S3 object
stream. Without HADOOP-14747 the S3 file handle cache would quickly
cause an impalad to crash because every S3 file handle in the cache
would keep a dangling HTTP(S) connection open to S3.
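
As a rough illustration of why unbuffering matters, here is a minimal
Python sketch (not the actual Impala or Hadoop code). FakeS3Handle,
HandleCache and simulate are hypothetical names made up for this
example; unbuffer() only models the effect described above, where the
object stream is closed but the handle stays reusable.

```python
class FakeS3Handle:
    """Hypothetical stand-in for an S3A file handle (not Impala's real type)."""

    def __init__(self, path):
        self.path = path
        self.stream_open = False  # models the HTTP(S) connection to S3

    def read(self):
        # Reading lazily (re)opens the underlying object stream.
        self.stream_open = True
        return b"data from " + self.path.encode()

    def unbuffer(self):
        # Models the effect of hdfsUnbufferFile on an S3A stream:
        # close the object stream but keep the handle usable.
        self.stream_open = False


class HandleCache:
    """Toy cache keyed by path; assumed behavior, for illustration only."""

    def __init__(self, unbuffer_on_release=True):
        self.unbuffer_on_release = unbuffer_on_release
        self.idle = {}  # path -> list of idle handles

    def acquire(self, path):
        handles = self.idle.get(path)
        if handles:
            return handles.pop()  # cache hit: reuse an idle handle
        return FakeS3Handle(path)  # cache miss: open a new handle

    def release(self, handle):
        if self.unbuffer_on_release:
            handle.unbuffer()
        self.idle.setdefault(handle.path, []).append(handle)

    def open_streams(self):
        # Count idle cached handles that still hold an open stream.
        return sum(h.stream_open for hs in self.idle.values() for h in hs)


def simulate(unbuffer_on_release, num_files=100):
    """Read num_files distinct files once each and return the number of
    streams still held open by idle handles in the cache."""
    cache = HandleCache(unbuffer_on_release)
    for i in range(num_files):
        handle = cache.acquire("s3a://bucket/file%d" % i)
        handle.read()
        cache.release(handle)
    return cache.open_streams()
```

In this model, simulate(True) leaves no open streams in the cache,
while simulate(False) leaves one per cached handle, which is the
connection buildup the unbuffer call is meant to avoid.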

Testing:
* Modified test_hdfs_fd_caching.py so it is enabled for S3 as well as
remote HDFS
* Ran core tests
* Ran TPC-DS on a real cluster and validated that the S3 file handle
cache works as expected
* Ran several test queries on a real cluster with S3Guard enabled and
validated that the S3 file handle cache works as expected

Change-Id: I5b304d37bc724377fbe7955441cce0cec6fb7f19
---
M be/src/runtime/io/disk-io-mgr.cc
M be/src/runtime/io/scan-range.cc
M tests/custom_cluster/test_hdfs_fd_caching.py
3 files changed, 11 insertions(+), 8 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/21/13221/2
--
To view, visit http://gerrit.cloudera.org:8080/13221
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I5b304d37bc724377fbe7955441cce0cec6fb7f19
Gerrit-Change-Number: 13221
Gerrit-PatchSet: 2
Gerrit-Owner: Sahil Takiar <stak...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <impala-public-jenk...@cloudera.com>
Gerrit-Reviewer: Joe McDonnell <joemcdonn...@cloudera.com>
Gerrit-Reviewer: Todd Lipcon <t...@apache.org>
