Hello Philip Zeyliger, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

    http://gerrit.cloudera.org:8080/12552

to look at the new patch set (#2).

Change subject: IMPALA-8178: Disable file handle cache for HDFS erasure coded files
......................................................................

IMPALA-8178: Disable file handle cache for HDFS erasure coded files

Testing on an erasure coded minicluster has revealed that each
file handle for an erasure coded file uses about 3MB of native
memory. This shows up as "java.nio:type=BufferPool,name=direct"
in the /jmx endpoint (here showing the output when 608 handles
are open):

{
  "name": "java.nio:type=BufferPool,name=direct",
  "modelerType": "sun.management.ManagementFactoryHelper$1",
  "Name": "direct",
  "TotalCapacity": 1921048960,
  "MemoryUsed": 1921048961,
  "Count": 633,
  "ObjectName": "java.nio:type=BufferPool,name=direct"
}
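
As a rough sanity check of the "about 3MB" figure, the numbers in the
/jmx output above can be divided out. This is only an illustrative
sketch: it assumes the direct buffer pool is dominated by the erasure
coded file handles, which the output itself does not prove, and the
variable names are made up for the example.

```python
# Figures copied from the /jmx output above.
total_capacity_bytes = 1921048960  # "TotalCapacity"
buffer_count = 633                 # "Count" (direct buffers, not handles)
open_handles = 608                 # open file handles, per the description

MIB = 1024 * 1024

# Assumption: nearly all direct buffer memory belongs to the open handles.
per_handle_mib = total_capacity_bytes / open_handles / MIB
per_buffer_mib = total_capacity_bytes / buffer_count / MIB

print(f"per open handle:   {per_handle_mib:.2f} MiB")
print(f"per direct buffer: {per_buffer_mib:.2f} MiB")
```

Either way of dividing lands close to 3 MiB, which is in line with the
estimate in the description.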

The memory is not released or reduced by a call to unbuffer(),
so these file handles are not suitable for long-term caching.
HDFS-14308 tracks the implementation of unbuffer() for
DFSStripedInputStream. This issue showed up when remote
file handle caching was enabled in IMPALA-7265, as erasure
coded files are always scheduled to be remote (IMPALA-7019).

This disables file handle caching for erasure coded files,
which requires plumbing through the information about which
ScanRanges are accessing erasure coded files.
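
The decision described above can be sketched roughly as follows. This
is not the code in the patch (the real change is in the C++ backend,
the Thrift definitions and the Java planner); the class, field and
function names here are hypothetical and only illustrate the idea of
carrying an "is erasure coded" flag on each scan range and consulting
it before using the file handle cache.

```python
from dataclasses import dataclass


@dataclass
class ScanRangeInfo:
    """Hypothetical stand-in for the per-range metadata plumbed through."""
    path: str
    expected_local: bool
    is_erasure_coded: bool


def should_use_file_handle_cache(scan_range: ScanRangeInfo,
                                 cache_local: bool = True,
                                 cache_remote: bool = True) -> bool:
    """Return True if a cached file handle may be used for this range.

    cache_local / cache_remote are hypothetical switches standing in for
    whatever configuration enables local and remote handle caching.
    """
    # Erasure coded handles hold native memory that unbuffer() does not
    # release, so they are never cached regardless of other settings.
    if scan_range.is_erasure_coded:
        return False
    return cache_local if scan_range.expected_local else cache_remote


ec_range = ScanRangeInfo("/warehouse/t/ec_file", expected_local=False,
                         is_erasure_coded=True)
plain_range = ScanRangeInfo("/warehouse/t/plain_file", expected_local=False,
                            is_erasure_coded=False)
print(should_use_file_handle_cache(ec_range))
print(should_use_file_handle_cache(plain_range))
```

Checking the erasure coded flag first means a remote range, which would
otherwise qualify for remote handle caching, opts out whenever it
belongs to an erasure coded file.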

With this change, core tests pass on an erasure coded system.

Change-Id: I8c761e08aacc952de0033a4c91e07f15c8ec96da
---
M be/src/exec/base-sequence-scanner.cc
M be/src/exec/hdfs-orc-scanner.cc
M be/src/exec/hdfs-scan-node-base.cc
M be/src/exec/hdfs-scan-node-base.h
M be/src/exec/hdfs-scanner.cc
M be/src/exec/hdfs-text-scanner.cc
M be/src/exec/parquet/hdfs-parquet-scanner.cc
M be/src/exec/parquet/parquet-column-readers.cc
M be/src/exec/scanner-context.cc
M be/src/runtime/io/disk-io-mgr-stress.cc
M be/src/runtime/io/disk-io-mgr-test.cc
M be/src/runtime/io/request-ranges.h
M be/src/runtime/io/scan-range.cc
M be/src/runtime/tmp-file-mgr.cc
M be/src/scheduling/scheduler.cc
M common/thrift/PlanNodes.thrift
M fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java
17 files changed, 62 insertions(+), 33 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/52/12552/2
--
To view, visit http://gerrit.cloudera.org:8080/12552
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I8c761e08aacc952de0033a4c91e07f15c8ec96da
Gerrit-Change-Number: 12552
Gerrit-PatchSet: 2
Gerrit-Owner: Joe McDonnell <joemcdonn...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <impala-public-jenk...@cloudera.com>
Gerrit-Reviewer: Joe McDonnell <joemcdonn...@cloudera.com>
Gerrit-Reviewer: Philip Zeyliger <phi...@cloudera.com>
