[ https://issues.apache.org/jira/browse/HDFS-14308?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16960045#comment-16960045 ]
Hudson commented on HDFS-14308:
-------------------------------

SUCCESS: Integrated in Jenkins build Hadoop-trunk-Commit #17575 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/17575/])
HDFS-14308. DFSStripedInputStream curStripeBuf is not freed by (weichiu: rev 30db895b59d250788d029cb2013bb4712ef9b546)
* (edit) hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/DFSStripedInputStream.java
* (edit) hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestDFSStripedInputStream.java


> DFSStripedInputStream curStripeBuf is not freed by unbuffer()
> -------------------------------------------------------------
>
>                 Key: HDFS-14308
>                 URL: https://issues.apache.org/jira/browse/HDFS-14308
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: ec
>    Affects Versions: 3.0.0
>            Reporter: Joe McDonnell
>            Assignee: Zhao Yi Ming
>            Priority: Major
>             Fix For: 3.3.0, 3.1.4, 3.2.2
>
>         Attachments: ec_heap_dump.png
>
>
> Some users of HDFS cache opened HDFS file handles to avoid repeated
> roundtrips to the NameNode. For example, Impala caches up to 20,000 HDFS file
> handles by default. Recent tests on erasure coded files show that the open
> file handles can consume a large amount of memory when not in use.
> For example, here is output from Impala's JMX endpoint when 608 file handles
> are cached:
> {noformat}
> {
>     "name": "java.nio:type=BufferPool,name=direct",
>     "modelerType": "sun.management.ManagementFactoryHelper$1",
>     "Name": "direct",
>     "TotalCapacity": 1921048960,
>     "MemoryUsed": 1921048961,
>     "Count": 633,
>     "ObjectName": "java.nio:type=BufferPool,name=direct"
> },{noformat}
> This shows direct buffer memory usage of 3MB per DFSStripedInputStream.
> Attached is output from Eclipse MAT showing that the direct buffers come from
> DFSStripedInputStream objects.
> Both Impala and HBase call unbuffer() when a file handle is being cached and
> potentially unused for significant chunks of time, yet this shows that the
> memory remains in use.
> To support caching file handles on erasure coded files, DFSStripedInputStream
> should avoid holding buffers after the unbuffer() call. See HDFS-7694.
> "unbuffer()" is intended to move an input stream to a lower memory state to
> support these caching use cases. In particular, the curStripeBuf seems to be
> allocated from the BUFFER_POOL on a resetCurStripeBuffer(true) call. It is
> not freed until close().
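
The behaviour the description asks for (hand the stripe buffer back to the pool on unbuffer() and take it again lazily on the next read) can be sketched as below. This is a minimal, self-contained illustration and not the committed patch. SimpleBufferPool, StripeReaderSketch, readStripe() and holdsBuffer() are hypothetical names invented for the example; only curStripeBuf and unbuffer() are taken from the issue text, and the real DFSStripedInputStream may differ in every detail.

```java
import java.nio.ByteBuffer;
import java.util.ArrayDeque;
import java.util.Deque;

/** Hypothetical stand-in for a shared direct-buffer pool (not Hadoop's class). */
class SimpleBufferPool {
    private final Deque<ByteBuffer> free = new ArrayDeque<>();

    /** Reuse a pooled buffer if it is large enough, else allocate a new one. */
    synchronized ByteBuffer getBuffer(int length) {
        ByteBuffer buf = free.peekFirst();
        if (buf != null && buf.capacity() >= length) {
            free.pollFirst();
            buf.clear();
            return buf;
        }
        return ByteBuffer.allocateDirect(length);
    }

    /** Return a buffer so that other streams can reuse it. */
    synchronized void putBuffer(ByteBuffer buf) {
        free.addFirst(buf);
    }

    synchronized int pooledCount() {
        return free.size();
    }
}

/** Hypothetical reader that holds one stripe buffer only while it is in use. */
class StripeReaderSketch implements AutoCloseable {
    private final SimpleBufferPool pool;
    private final int stripeBytes;
    private ByteBuffer curStripeBuf; // null while in the low-memory state

    StripeReaderSketch(SimpleBufferPool pool, int stripeBytes) {
        this.pool = pool;
        this.stripeBytes = stripeBytes;
    }

    /** Pretend to read a stripe: acquire the buffer lazily, then reset it. */
    int readStripe() {
        if (curStripeBuf == null) {
            curStripeBuf = pool.getBuffer(stripeBytes);
        }
        curStripeBuf.clear();
        return curStripeBuf.remaining();
    }

    /** Drop to the low-memory state; the stream stays usable afterwards. */
    void unbuffer() {
        releaseStripeBuffer();
    }

    @Override
    public void close() {
        releaseStripeBuffer();
    }

    boolean holdsBuffer() {
        return curStripeBuf != null;
    }

    private void releaseStripeBuffer() {
        if (curStripeBuf != null) {
            pool.putBuffer(curStripeBuf);
            curStripeBuf = null;
        }
    }
}
```

In this sketch a cached but idle handle holds no direct buffer after unbuffer(), and the next readStripe() call takes one from the pool again, so callers such as a file handle cache would not need to change how they use the stream.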