[ 
https://issues.apache.org/jira/browse/HDFS-14308?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wei-Chiu Chuang resolved HDFS-14308.
------------------------------------
    Fix Version/s: 3.2.2
                   3.1.4
                   3.3.0
       Resolution: Fixed

Thanks [~zhaoyim] !

> DFSStripedInputStream curStripeBuf is not freed by unbuffer()
> -------------------------------------------------------------
>
>                 Key: HDFS-14308
>                 URL: https://issues.apache.org/jira/browse/HDFS-14308
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: ec
>    Affects Versions: 3.0.0
>            Reporter: Joe McDonnell
>            Assignee: Zhao Yi Ming
>            Priority: Major
>             Fix For: 3.3.0, 3.1.4, 3.2.2
>
>         Attachments: ec_heap_dump.png
>
>
> Some users of HDFS cache opened HDFS file handles to avoid repeated 
> roundtrips to the NameNode. For example, Impala caches up to 20,000 HDFS file 
> handles by default. Recent tests on erasure coded files show that the open 
> file handles can consume a large amount of memory when not in use.
> For example, here is output from Impala's JMX endpoint when 608 file handles 
> are cached
> {noformat}
> {
> "name": "java.nio:type=BufferPool,name=direct",
> "modelerType": "sun.management.ManagementFactoryHelper$1",
> "Name": "direct",
> "TotalCapacity": 1921048960,
> "MemoryUsed": 1921048961,
> "Count": 633,
> "ObjectName": "java.nio:type=BufferPool,name=direct"
> },{noformat}
> This shows direct buffer memory usage of 3MB per DFSStripedInputStream. 
> Attached is output from Eclipse MAT showing that the direct buffers come from 
> DFSStripedInputStream objects. Both Impala and HBase call unbuffer() when a 
> file handle is being cached and potentially unused for significant chunks of 
> time, yet this shows that the memory remains in use.
> To support caching file handles on erasure coded files, DFSStripedInputStream 
> should avoid holding buffers after the unbuffer() call. See HDFS-7694. 
> "unbuffer()" is intended to move an input stream to a lower memory state to 
> support these caching use cases. In particular, the curStripeBuf seems to be 
> allocated from the BUFFER_POOL on a resetCurStripeBuffer(true) call. It is 
> not freed until close().



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

---------------------------------------------------------------------
To unsubscribe, e-mail: hdfs-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-dev-h...@hadoop.apache.org

Reply via email to