[ 
https://issues.apache.org/jira/browse/YARN-7386?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16222767#comment-16222767
 ] 

Misha Dmitriev commented on YARN-7386:
--------------------------------------

[~rkanter] could you please look at the test failure above? I cannot reproduce 
it locally, and my change only interns some strings, so it is very unlikely to 
alter behavior. I suspect that this is just a flaky test.

> Duplicate Strings in various places in Yarn memory
> --------------------------------------------------
>
>                 Key: YARN-7386
>                 URL: https://issues.apache.org/jira/browse/YARN-7386
>             Project: Hadoop YARN
>          Issue Type: Improvement
>            Reporter: Misha Dmitriev
>            Assignee: Misha Dmitriev
>         Attachments: YARN-7386.01.patch, YARN-7386.02.patch
>
>
> Using jxray (www.jxray.com) I've analyzed a Yarn RM heap dump obtained in a 
> big cluster. The tool uncovered several sources of memory waste. One problem 
> is duplicate strings:
> {code}
> Total strings   Unique strings   Duplicate values   Overhead
> 361,506         86,672           5,928              22,886K (7.6%)
> {code}
> They are spread across a number of locations. The biggest source of waste is 
> the following reference chain:
> {code}
> 7,416K (2.5%), 31292 / 62% dup strings (499 unique), 31292 dup backing arrays:
> ↖{j.u.HashMap}.values
> ↖org.apache.hadoop.yarn.api.records.impl.pb.ContainerLaunchContextPBImpl.environment
> ↖org.apache.hadoop.yarn.api.records.impl.pb.ApplicationSubmissionContextPBImpl.amContainer
> ↖org.apache.hadoop.yarn.server.resourcemanager.rmapp.RMAppImpl.submissionContext
> ↖{java.util.concurrent.ConcurrentHashMap}.values
> ↖org.apache.hadoop.yarn.server.resourcemanager.RMActiveServiceContext.applications
> ↖org.apache.hadoop.yarn.server.resourcemanager.RMContextImpl.activeServiceContext
> ↖org.apache.hadoop.yarn.server.resourcemanager.rmapp.monitor.RMAppLifetimeMonitor.rmContext
> ↖Java Local@3ed9ef820 (org.apache.hadoop.yarn.server.resourcemanager.rmapp.monitor.RMAppLifetimeMonitor)
> {code}
> However, there are also many others, mostly strings in protocol buffer 
> or protocol buffer builder objects. I plan to get rid of at least the worst 
> offenders by inserting String.intern() calls. String.intern() used to consume 
> memory in PermGen and did not scale well until about the early JDK 7 
> versions, but it has improved greatly since then, and I've used it many times 
> without any issues.
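
For illustration, the effect of interning on a map like the environment map in the reference chain above can be sketched as follows. This is a minimal sketch, not the code in the attached patches: the class InternSketch, the method internEntries, and the sample environment values are all hypothetical. Only String.intern() and the java.util collections are real APIs.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper for illustration only; not part of YARN or the patches.
final class InternSketch {
    private InternSketch() {}

    // Returns a copy of the map in which every non-null key and value is
    // replaced by its canonical instance from the JVM string pool, so equal
    // strings held by different maps share a single object.
    static Map<String, String> internEntries(Map<String, String> source) {
        Map<String, String> result = new HashMap<>(source.size() * 2);
        for (Map.Entry<String, String> e : source.entrySet()) {
            String key = e.getKey();
            String value = e.getValue();
            result.put(key == null ? null : key.intern(),
                       value == null ? null : value.intern());
        }
        return result;
    }

    public static void main(String[] args) {
        // new String(...) forces distinct objects with equal contents,
        // mimicking values deserialized separately for each application.
        Map<String, String> env1 = new HashMap<>();
        env1.put("JAVA_HOME", new String("/usr/lib/jvm/java"));
        Map<String, String> env2 = new HashMap<>();
        env2.put("JAVA_HOME", new String("/usr/lib/jvm/java"));

        // Before interning: two separate String objects.
        System.out.println(env1.get("JAVA_HOME") == env2.get("JAVA_HOME")); // false

        Map<String, String> a = internEntries(env1);
        Map<String, String> b = internEntries(env2);
        // After interning: both maps reference the same canonical object.
        System.out.println(a.get("JAVA_HOME") == b.get("JAVA_HOME")); // true
    }
}
```

The duplicate copies become garbage once nothing else references them, which is where the saving comes from. The trade-off is a string pool lookup per call, so interning pays off mainly for values that repeat across many long-lived objects.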



