[jira] [Resolved] (HDFS-11661) GetContentSummary uses excessive amounts of memory

Wei-Chiu Chuang (JIRA) Wed, 24 May 2017 18:24:21 -0700

     [ 
https://issues.apache.org/jira/browse/HDFS-11661?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]


Wei-Chiu Chuang resolved HDFS-11661.
------------------------------------
       Resolution: Fixed
     Hadoop Flags: Reviewed
    Fix Version/s: 2.8.1
                   3.0.0-alpha3
     Release Note: Reverted HDFS-10797 to fix a scalability regression 
introduced by that commit.

Based on multiple +1s, I reverted the commit from branch-2.8, branch-2 and trunk.

Thanks to [~nroberts] for reporting the issue, and to [~kihwal], 
[~mackrorysd], [~xiaochen], [~djp], [~andrew.wang], [~shahrs87], [~yzhangal] 
and [~daryn] for their comments.

[~daryn] thanks for your effort trying to fix the bug. Please file a new jira 
for your patch. Thanks!

> GetContentSummary uses excessive amounts of memory
> --------------------------------------------------
>
>                 Key: HDFS-11661
>                 URL: https://issues.apache.org/jira/browse/HDFS-11661
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: namenode
>    Affects Versions: 2.8.0, 3.0.0-alpha2
>            Reporter: Nathan Roberts
>            Assignee: Wei-Chiu Chuang
>            Priority: Blocker
>             Fix For: 3.0.0-alpha3, 2.8.1
>
>         Attachments: HDFS-11661.001.patch, HDFs-11661.002.patch, Heap 
> growth.png
>
>
> ContentSummaryComputationContext::nodeIncluded() is being used to keep track 
> of all INodes visited during the current content summary calculation. This 
> can be all of the INodes in the filesystem, making for a VERY large hash 
> table. This simply won't work on large filesystems. 
> We noticed this after an upgrade, when a namenode with ~100 million 
> filesystem objects started spending significantly more time in GC. 
> Fortunately this system had some memory breathing room; other clusters we 
> have will not run with this additional demand on memory.
> This was added as part of HDFS-10797 as a way of keeping track of INodes that 
> have already been accounted for - to avoid double counting.
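
To illustrate the memory pattern described above, here is a minimal sketch in
Java. It is not the HDFS-10797 code: the class VisitedSetSketch, its nested
SummaryContext, and the use of plain long ids in place of INode objects are
all assumptions made for illustration. It only shows why a "have I already
counted this node?" set ends up holding one entry per distinct node visited.

```java
import java.util.HashSet;
import java.util.Set;

/** Hypothetical illustration only; not the actual Hadoop implementation. */
public class VisitedSetSketch {

    /** Stand-in for a per-call summary context that remembers visited nodes. */
    public static final class SummaryContext {
        // One entry per distinct node seen during this summary computation.
        private final Set<Long> includedNodes = new HashSet<>();
        private long nodeCount = 0;

        /** Returns true if the node was already counted; otherwise records it. */
        public boolean nodeIncluded(long inodeId) {
            // Set.add returns false when the element was already present.
            return !includedNodes.add(inodeId);
        }

        /** Counts the node once, however many times it is visited. */
        public void visit(long inodeId) {
            if (!nodeIncluded(inodeId)) {
                nodeCount++;
            }
        }

        public long nodeCount() {
            return nodeCount;
        }

        /** Size of the tracking set: grows with the number of distinct nodes. */
        public int trackedEntries() {
            return includedNodes.size();
        }
    }

    /** Runs one summary pass over the given (assumed) inode ids. */
    public static SummaryContext summarize(long[] inodeIds) {
        SummaryContext ctx = new SummaryContext();
        for (long id : inodeIds) {
            ctx.visit(id);
        }
        return ctx;
    }

    public static void main(String[] args) {
        // Ids 1 and 2 are reached twice, as could happen via two paths.
        SummaryContext ctx = summarize(new long[] {1, 2, 2, 3, 1});
        System.out.println("counted=" + ctx.nodeCount()
            + " tracked=" + ctx.trackedEntries());
    }
}
```

The set avoids double counting, but its size equals the number of distinct
nodes visited. For a summary rooted at "/", that is every object in the
namespace, which matches the heap growth reported in this issue.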



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
