[jira] [Updated] (YARN-8872) Optimize collections used by Yarn JHS to reduce its memory
[ https://issues.apache.org/jira/browse/YARN-8872?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Misha Dmitriev updated YARN-8872: - Attachment: YARN-8872.02.patch > Optimize collections used by Yarn JHS to reduce its memory > -- > > Key: YARN-8872 > URL: https://issues.apache.org/jira/browse/YARN-8872 > Project: Hadoop YARN > Issue Type: Improvement > Components: yarn >Reporter: Misha Dmitriev >Assignee: Misha Dmitriev >Priority: Major > Attachments: YARN-8872.01.patch, YARN-8872.02.patch, > jhs-bad-collections.png > > > We analyzed, using jxray (www.jxray.com) a heap dump of JHS running with big > heap in a large clusters, handling large MapReduce jobs. The heap is large > (over 32GB) and 21.4% of it is wasted due to various suboptimal Java > collections, mostly maps and lists that are either empty or contain only one > element. In such under-populated collections considerable amount of memory is > still used by just the internal implementation objects. See the attached > excerpt from the jxray report for the details. If certain collections are > almost always empty, they should be initialized lazily. If others almost > always have just 1 or 2 elements, they should be initialized with the > appropriate initial capacity of 1 or 2 (the default capacity is 16 for > HashMap and 10 for ArrayList). > Based on the attached report, we should do the following: > # {{FileSystemCounterGroup.map}} - initialize lazily > # {{CompletedTask.attempts}} - initialize with capacity 2, given most tasks > only have one or two attempts > # {{JobHistoryParser$TaskInfo.attemptsMap}} - initialize with capacity > # {{CompletedTaskAttempt.diagnostics}} - initialize with capacity 1 since it > contains one diagnostic message most of the time > # {{CompletedTask.reportDiagnostics}} - switch to ArrayList (no reason to > use the more wasteful LinkedList here) and initialize with capacity 1. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
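To make the proposal above concrete, here is a minimal, hypothetical sketch of the three kinds of changes it describes: lazy initialization for an almost-always-empty map, a small explicit initial capacity instead of the HashMap/ArrayList defaults, and ArrayList in place of LinkedList. The class and field names are simplified stand-ins, not the code from the attached patches.
{code}
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class CompletedTaskSketch {
  // Most tasks have only one or two attempts, so size the map for 2 entries
  // instead of the HashMap default of 16.
  private final Map<String, Object> attempts = new HashMap<>(2);

  // Usually holds a single diagnostic message: an ArrayList of capacity 1 is
  // cheaper than a LinkedList or a default-capacity (10) ArrayList.
  private final List<String> reportDiagnostics = new ArrayList<>(1);

  // A map that is almost always empty is created lazily, on first use.
  private Map<String, Long> counters;

  void addCounter(String name, long value) {
    if (counters == null) {
      counters = new HashMap<>(2); // allocate only when a counter actually arrives
    }
    counters.put(name, value);
  }
}
{code}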
[jira] [Commented] (YARN-8872) Optimize collections used by Yarn JHS to reduce its memory
[ https://issues.apache.org/jira/browse/YARN-8872?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16648270#comment-16648270 ] Misha Dmitriev commented on YARN-8872: -- [~pbacsko] I think the situation here is the same as before. Both before and after this change, the {{size()}} method can never see {{map}} in a really inconsistent (half-constructed) state, because this object (a {{ConcurrentSkipListMap}}) is first fully constructed, and then the {{map}} reference is set to point to it. You are right that if {{findCounter()}} and {{size()}} run concurrently after that point, then the first method can keep adding objects to {{map}} and the second one may iterate a smaller number of objects (or none at all) and return a smaller size. But the same thing could happen before this change. Note also that since this is a concurrent map implementation, iterating and adding/removing elements concurrently is safe (will not cause exceptions). According to the javadoc of {{ConcurrentSkipListMap.values()}}, "The view's {{iterator}} is a "weakly consistent" iterator that will never throw [{{ConcurrentModificationException}}|https://docs.oracle.com/javase/7/docs/api/java/util/ConcurrentModificationException.html], and guarantees to traverse elements as they existed upon construction of the iterator, and may (but is not guaranteed to) reflect any modifications subsequent to construction." However, making {{size()}} synchronized will still make the code a little more predictable, at least in tests if nothing else. So I can make this change if you would like. > Optimize collections used by Yarn JHS to reduce its memory > -- > > Key: YARN-8872 > URL: https://issues.apache.org/jira/browse/YARN-8872 > Project: Hadoop YARN > Issue Type: Improvement > Components: yarn >Reporter: Misha Dmitriev >Assignee: Misha Dmitriev >Priority: Major > Attachments: YARN-8872.01.patch, jhs-bad-collections.png > > > We analyzed, using jxray (www.jxray.com), a heap dump of JHS running with a big > heap in a large cluster, handling large MapReduce jobs. The heap is large > (over 32GB) and 21.4% of it is wasted due to various suboptimal Java > collections, mostly maps and lists that are either empty or contain only one > element. In such under-populated collections, a considerable amount of memory is > still used by just the internal implementation objects. See the attached > excerpt from the jxray report for the details. If certain collections are > almost always empty, they should be initialized lazily. If others almost > always have just 1 or 2 elements, they should be initialized with the > appropriate initial capacity of 1 or 2 (the default capacity is 16 for > HashMap and 10 for ArrayList). > Based on the attached report, we should do the following: > # {{FileSystemCounterGroup.map}} - initialize lazily > # {{CompletedTask.attempts}} - initialize with capacity 2, given most tasks > only have one or two attempts > # {{JobHistoryParser$TaskInfo.attemptsMap}} - initialize with capacity 2 > # {{CompletedTaskAttempt.diagnostics}} - initialize with capacity 1 since it > contains one diagnostic message most of the time > # {{CompletedTask.reportDiagnostics}} - switch to ArrayList (no reason to > use the more wasteful LinkedList here) and initialize with capacity 1. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
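For readers unfamiliar with the iterator semantics cited above, the following self-contained snippet demonstrates the described behavior: one thread keeps adding entries to a {{ConcurrentSkipListMap}} while another iterates {{values()}}; no {{ConcurrentModificationException}} is thrown, but the iterating thread may count fewer entries than the map eventually holds. This is only an illustration, not code from the patch.
{code}
import java.util.concurrent.ConcurrentSkipListMap;

public class WeaklyConsistentIterationDemo {
  public static void main(String[] args) throws InterruptedException {
    ConcurrentSkipListMap<Integer, String> map = new ConcurrentSkipListMap<>();

    // Writer thread, analogous to findCounter() populating the map.
    Thread writer = new Thread(() -> {
      for (int i = 0; i < 100_000; i++) {
        map.put(i, "counter-" + i);
      }
    });
    writer.start();

    // Reader, analogous to an unsynchronized size(): the weakly consistent
    // iterator never throws, but may see only a prefix of the final contents.
    int observed = 0;
    for (String ignored : map.values()) {
      observed++;
    }
    writer.join();
    System.out.println("observed=" + observed + ", final=" + map.size());
  }
}
{code}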
[jira] [Commented] (YARN-8872) Optimize collections used by Yarn JHS to reduce its memory
[ https://issues.apache.org/jira/browse/YARN-8872?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16648235#comment-16648235 ] Misha Dmitriev commented on YARN-8872: -- I would leave this decision to [~haibochen]. > Optimize collections used by Yarn JHS to reduce its memory > -- > > Key: YARN-8872 > URL: https://issues.apache.org/jira/browse/YARN-8872 > Project: Hadoop YARN > Issue Type: Improvement > Components: yarn >Reporter: Misha Dmitriev >Assignee: Misha Dmitriev >Priority: Major > Attachments: YARN-8872.01.patch, jhs-bad-collections.png > > > We analyzed, using jxray (www.jxray.com) a heap dump of JHS running with big > heap in a large clusters, handling large MapReduce jobs. The heap is large > (over 32GB) and 21.4% of it is wasted due to various suboptimal Java > collections, mostly maps and lists that are either empty or contain only one > element. In such under-populated collections considerable amount of memory is > still used by just the internal implementation objects. See the attached > excerpt from the jxray report for the details. If certain collections are > almost always empty, they should be initialized lazily. If others almost > always have just 1 or 2 elements, they should be initialized with the > appropriate initial capacity of 1 or 2 (the default capacity is 16 for > HashMap and 10 for ArrayList). > Based on the attached report, we should do the following: > # {{FileSystemCounterGroup.map}} - initialize lazily > # {{CompletedTask.attempts}} - initialize with capacity 2, given most tasks > only have one or two attempts > # {{JobHistoryParser$TaskInfo.attemptsMap}} - initialize with capacity > # {{CompletedTaskAttempt.diagnostics}} - initialize with capacity 1 since it > contains one diagnostic message most of the time > # {{CompletedTask.reportDiagnostics}} - switch to ArrayList (no reason to > use the more wasteful LinkedList here) and initialize with capacity 1. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-8872) Optimize collections used by Yarn JHS to reduce its memory
[ https://issues.apache.org/jira/browse/YARN-8872?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16648225#comment-16648225 ] Misha Dmitriev commented on YARN-8872: -- Regarding the problems in the Hadoop QA report above: # No tests are added because this is a performance improvement with no change in functionality. # I believe there is no problem with synchronization in FileSystemCounterGroup.java. The {{map}} object is created lazily in the synchronized method {{findCounter()}}, so according to the Java Memory Model, once it's created, it's visible to all the code, both synchronized and unsynchronized. In other words, the unsynchronized method {{write()}} (line 281, which findbugs complains about) will never think that {{map == null}} if {{map}} has actually been initialized. In other respects it will work the same as before. > Optimize collections used by Yarn JHS to reduce its memory > -- > > Key: YARN-8872 > URL: https://issues.apache.org/jira/browse/YARN-8872 > Project: Hadoop YARN > Issue Type: Improvement > Components: yarn >Reporter: Misha Dmitriev >Assignee: Misha Dmitriev >Priority: Major > Attachments: YARN-8872.01.patch, jhs-bad-collections.png > > > We analyzed, using jxray (www.jxray.com), a heap dump of JHS running with a big > heap in a large cluster, handling large MapReduce jobs. The heap is large > (over 32GB) and 21.4% of it is wasted due to various suboptimal Java > collections, mostly maps and lists that are either empty or contain only one > element. In such under-populated collections, a considerable amount of memory is > still used by just the internal implementation objects. See the attached > excerpt from the jxray report for the details. If certain collections are > almost always empty, they should be initialized lazily. If others almost > always have just 1 or 2 elements, they should be initialized with the > appropriate initial capacity of 1 or 2 (the default capacity is 16 for > HashMap and 10 for ArrayList). > Based on the attached report, we should do the following: > # {{FileSystemCounterGroup.map}} - initialize lazily > # {{CompletedTask.attempts}} - initialize with capacity 2, given most tasks > only have one or two attempts > # {{JobHistoryParser$TaskInfo.attemptsMap}} - initialize with capacity 2 > # {{CompletedTaskAttempt.diagnostics}} - initialize with capacity 1 since it > contains one diagnostic message most of the time > # {{CompletedTask.reportDiagnostics}} - switch to ArrayList (no reason to > use the more wasteful LinkedList here) and initialize with capacity 1. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
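A minimal sketch of the lazy-initialization pattern being debated (field and method names follow the description, but this is not the actual {{FileSystemCounterGroup}} source): the map is only created inside the synchronized {{findCounter()}}, and an unsynchronized reader treats a still-null map as "no counters yet".
{code}
import java.util.Map;
import java.util.concurrent.ConcurrentSkipListMap;

class CounterGroupSketch {
  // Created lazily; stays null for counter groups that never receive a counter.
  private Map<String, Long> map;

  synchronized Long findCounter(String name) {
    if (map == null) {
      // The map is fully constructed before the field is assigned.
      map = new ConcurrentSkipListMap<>();
    }
    return map.computeIfAbsent(name, n -> 0L);
  }

  // Unsynchronized reader, analogous to write()/size(): a null map simply
  // means there is nothing to report, the same answer as before the change.
  int size() {
    Map<String, Long> m = map;
    return m == null ? 0 : m.size();
  }
}
{code}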
[jira] [Updated] (YARN-8872) Optimize collections used by Yarn JHS to reduce its memory
[ https://issues.apache.org/jira/browse/YARN-8872?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Misha Dmitriev updated YARN-8872: - Attachment: YARN-8872.01.patch > Optimize collections used by Yarn JHS to reduce its memory > -- > > Key: YARN-8872 > URL: https://issues.apache.org/jira/browse/YARN-8872 > Project: Hadoop YARN > Issue Type: Improvement > Components: yarn >Reporter: Misha Dmitriev >Assignee: Misha Dmitriev >Priority: Major > Attachments: YARN-8872.01.patch, jhs-bad-collections.png > > > We analyzed, using jxray (www.jxray.com) a heap dump of JHS running with big > heap in a large clusters, handling large MapReduce jobs. The heap is large > (over 32GB) and 21.4% of it is wasted due to various suboptimal Java > collections, mostly maps and lists that are either empty or contain only one > element. In such under-populated collections considerable amount of memory is > still used by just the internal implementation objects. See the attached > excerpt from the jxray report for the details. If certain collections are > almost always empty, they should be initialized lazily. If others almost > always have just 1 or 2 elements, they should be initialized with the > appropriate initial capacity of 1 or 2 (the default capacity is 16 for > HashMap and 10 for ArrayList). > Based on the attached report, we should do the following: > # {{FileSystemCounterGroup.map}} - initialize lazily > # {{CompletedTask.attempts}} - initialize with capacity 2, given most tasks > only have one or two attempts > # {{JobHistoryParser$TaskInfo.attemptsMap}} - initialize with capacity > # {{CompletedTaskAttempt.diagnostics}} - initialize with capacity 1 since it > contains one diagnostic message most of the time > # {{CompletedTask.reportDiagnostics}} - switch to ArrayList (no reason to > use the more wasteful LinkedList here) and initialize with capacity 1. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-8872) Optimize collections used by Yarn JHS to reduce its memory
[ https://issues.apache.org/jira/browse/YARN-8872?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Misha Dmitriev updated YARN-8872: - Description: We analyzed, using jxray (www.jxray.com) a heap dump of JHS running with big heap in a large clusters, handling large MapReduce jobs. The heap is large (over 32GB) and 21.4% of it is wasted due to various suboptimal Java collections, mostly maps and lists that are either empty or contain only one element. In such under-populated collections considerable amount of memory is still used by just the internal implementation objects. See the attached excerpt from the jxray report for the details. If certain collections are almost always empty, they should be initialized lazily. If others almost always have just 1 or 2 elements, they should be initialized with the appropriate initial capacity of 1 or 2 (the default capacity is 16 for HashMap and 10 for ArrayList). Based on the attached report, we should do the following: # {{FileSystemCounterGroup.map}} - initialize lazily # {{CompletedTask.attempts}} - initialize with capacity 2, given most tasks only have one or two attempts # {{JobHistoryParser$TaskInfo.attemptsMap}} - initialize with capacity # {{CompletedTaskAttempt.diagnostics}} - initialize with capacity 1 since it contains one diagnostic message most of the time # {{CompletedTask.reportDiagnostics}} - switch to ArrayList (no reason to use the more wasteful LinkedList here) and initialize with capacity 1. was: We analyzed, using jxray (www.jxray.com) a heap dump of JHS running with big heap in a large clusters, handling large MapReduce jobs. The heap is large (over 32GB) and 21.4% of it is wasted due to various suboptimal Java collections, mostly maps and lists that are either empty or contain only one element. In such under-populated collections considerable amount of memory is still used by just the internal implementation objects. See the attached excerpt from the jxray report for the details. If certain collections are almost always empty, they should be initialized lazily. If others almost always have just 1 or 2 elements, they should be initialized with the appropriate initial capacity, which is much smaller than e.g. the default 16 for HashMap and 10 for ArrayList. Based on the attached report, we should do the following: # {{FileSystemCounterGroup.map}} - initialize lazily # {{CompletedTask.attempts}} - initialize with capacity 2, given most tasks only have one or two attempts # {{JobHistoryParser$TaskInfo.attemptsMap}} - initialize with capacity # {{CompletedTaskAttempt.diagnostics}} - initialize with capacity 1 since it contains one diagnostic message most of the time # {{CompletedTask.reportDiagnostics}} - switch to ArrayList (no reason to use the more wasteful LinkedList here) and initialize with capacity 1. > Optimize collections used by Yarn JHS to reduce its memory > -- > > Key: YARN-8872 > URL: https://issues.apache.org/jira/browse/YARN-8872 > Project: Hadoop YARN > Issue Type: Improvement > Components: yarn >Reporter: Misha Dmitriev >Assignee: Misha Dmitriev >Priority: Major > Attachments: jhs-bad-collections.png > > > We analyzed, using jxray (www.jxray.com) a heap dump of JHS running with big > heap in a large clusters, handling large MapReduce jobs. The heap is large > (over 32GB) and 21.4% of it is wasted due to various suboptimal Java > collections, mostly maps and lists that are either empty or contain only one > element. 
In such under-populated collections considerable amount of memory is > still used by just the internal implementation objects. See the attached > excerpt from the jxray report for the details. If certain collections are > almost always empty, they should be initialized lazily. If others almost > always have just 1 or 2 elements, they should be initialized with the > appropriate initial capacity of 1 or 2 (the default capacity is 16 for > HashMap and 10 for ArrayList). > Based on the attached report, we should do the following: > # {{FileSystemCounterGroup.map}} - initialize lazily > # {{CompletedTask.attempts}} - initialize with capacity 2, given most tasks > only have one or two attempts > # {{JobHistoryParser$TaskInfo.attemptsMap}} - initialize with capacity > # {{CompletedTaskAttempt.diagnostics}} - initialize with capacity 1 since it > contains one diagnostic message most of the time > # {{CompletedTask.reportDiagnostics}} - switch to ArrayList (no reason to > use the more wasteful LinkedList here) and initialize with capacity 1. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For ad
[jira] [Updated] (YARN-8872) Optimize collections used by Yarn JHS to reduce its memory
[ https://issues.apache.org/jira/browse/YARN-8872?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Misha Dmitriev updated YARN-8872: - Description: We analyzed, using jxray (www.jxray.com) a heap dump of JHS running with big heap in a large clusters, handling large MapReduce jobs. The heap is large (over 32GB) and 21.4% of it is wasted due to various suboptimal Java collections, mostly maps and lists that are either empty or contain only one element. In such under-populated collections considerable amount of memory is still used by just the internal implementation objects. See the attached excerpt from the jxray report for the details. If certain collections are almost always empty, they should be initialized lazily. If others almost always have just 1 or 2 elements, they should be initialized with the appropriate initial capacity, which is much smaller than e.g. the default 16 for HashMap and 10 for ArrayList. Based on the attached report, we should do the following: # {{FileSystemCounterGroup.map}} - initialize lazily # {{CompletedTask.attempts}} - initialize with capacity 2, given most tasks only have one or two attempts # {{JobHistoryParser$TaskInfo.attemptsMap}} - initialize with capacity # {{CompletedTaskAttempt.diagnostics}} - initialize with capacity 1 since it contains one diagnostic message most of the time # {{CompletedTask.reportDiagnostics}} - switch to ArrayList (no reason to use the more wasteful LinkedList here) and initialize with capacity 1. was: We analyzed, using jxray (www.jxray.com) a heap dump of JHS running with big heap in a large clusters, handling large MapReduce jobs. The heap is large (over 32GB) and 21.4% of it is wasted due to various suboptimal Java collections, mostly maps and lists that are either empty or contain only one element. In such under-populated collections considerable amount of memory is still used by just the internal implementation objects. See the attached excerpt from the jxray report for the details. If certain collections are almost always empty, they should be initialized lazily. If others almost always have just 1 or 2 elements, they should be initialized with the appropriate initial capacity, which is much smaller than e.g. the default 16 for HashMap and 10 for ArrayList. Based on the attached report, we should do the following: # {{FileSystemCounterGroup.map}} - initialize lazily # {{CompletedTask.attempts}} - initialize with capacity 2, given most tasks only have one or two attempts # {{JobHistoryParser$TaskInfo.attemptsMap}} - initialize with capacity 2 # {{CompletedTaskAttempt.diagnostics}} - initialize with capacity 1 since it contains one diagnostic message most of the time. # {{CompletedTask.reportDiagnostics}} - switch to ArrayList (no reason to use the more wasteful LinkedList here) and initialize with capacity 1. > Optimize collections used by Yarn JHS to reduce its memory > -- > > Key: YARN-8872 > URL: https://issues.apache.org/jira/browse/YARN-8872 > Project: Hadoop YARN > Issue Type: Improvement > Components: yarn >Reporter: Misha Dmitriev >Assignee: Misha Dmitriev >Priority: Major > Attachments: jhs-bad-collections.png > > > We analyzed, using jxray (www.jxray.com) a heap dump of JHS running with big > heap in a large clusters, handling large MapReduce jobs. The heap is large > (over 32GB) and 21.4% of it is wasted due to various suboptimal Java > collections, mostly maps and lists that are either empty or contain only one > element. 
In such under-populated collections considerable amount of memory is > still used by just the internal implementation objects. See the attached > excerpt from the jxray report for the details. If certain collections are > almost always empty, they should be initialized lazily. If others almost > always have just 1 or 2 elements, they should be initialized with the > appropriate initial capacity, which is much smaller than e.g. the default 16 > for HashMap and 10 for ArrayList. > Based on the attached report, we should do the following: > # {{FileSystemCounterGroup.map}} - initialize lazily > # {{CompletedTask.attempts}} - initialize with capacity 2, given most tasks > only have one or two attempts > # {{JobHistoryParser$TaskInfo.attemptsMap}} - initialize with capacity > # {{CompletedTaskAttempt.diagnostics}} - initialize with capacity 1 since it > contains one diagnostic message most of the time > # {{CompletedTask.reportDiagnostics}} - switch to ArrayList (no reason to > use the more wasteful LinkedList here) and initialize with capacity 1. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@ha
[jira] [Created] (YARN-8872) Optimize collections used by Yarn JHS to reduce its memory
Misha Dmitriev created YARN-8872: Summary: Optimize collections used by Yarn JHS to reduce its memory Key: YARN-8872 URL: https://issues.apache.org/jira/browse/YARN-8872 Project: Hadoop YARN Issue Type: Improvement Components: yarn Reporter: Misha Dmitriev Assignee: Misha Dmitriev Attachments: jhs-bad-collections.png We analyzed, using jxray (www.jxray.com), a heap dump of JHS running with a big heap in a large cluster, handling large MapReduce jobs. The heap is large (over 32GB) and 21.4% of it is wasted due to various suboptimal Java collections, mostly maps and lists that are either empty or contain only one element. In such under-populated collections, a considerable amount of memory is still used by just the internal implementation objects. See the attached excerpt from the jxray report for the details. If certain collections are almost always empty, they should be initialized lazily. If others almost always have just 1 or 2 elements, they should be initialized with the appropriate initial capacity, which is much smaller than e.g. the default 16 for HashMap and 10 for ArrayList. Based on the attached report, we should do the following: # {{FileSystemCounterGroup.map}} - initialize lazily # {{CompletedTask.attempts}} - initialize with capacity 2, given most tasks only have one or two attempts # {{JobHistoryParser$TaskInfo.attemptsMap}} - initialize with capacity 2 # {{CompletedTaskAttempt.diagnostics}} - initialize with capacity 1 since it contains one diagnostic message most of the time. # {{CompletedTask.reportDiagnostics}} - switch to ArrayList (no reason to use the more wasteful LinkedList here) and initialize with capacity 1. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-7386) Duplicate Strings in various places in Yarn memory
[ https://issues.apache.org/jira/browse/YARN-7386?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16222767#comment-16222767 ] Misha Dmitriev commented on YARN-7386: -- [~rkanter] could you please look at the test failure above? I cannot reproduce it locally, and in any case my change, which is only about interning some strings, is the safest possible thing. So I suspect that this is just a flaky test. > Duplicate Strings in various places in Yarn memory > -- > > Key: YARN-7386 > URL: https://issues.apache.org/jira/browse/YARN-7386 > Project: Hadoop YARN > Issue Type: Improvement >Reporter: Misha Dmitriev >Assignee: Misha Dmitriev > Attachments: YARN-7386.01.patch, YARN-7386.02.patch > > > Using jxray (www.jxray.com) I've analyzed a Yarn RM heap dump obtained in a > big cluster. The tool uncovered several sources of memory waste. One problem > is duplicate strings: > {code} > Total strings Unique strings Duplicate values > Overhead > 361,506 86,672 5,928 22,886K (7.6%) > {code} > They are spread across a number of locations. The biggest source of waste is > the following reference chain: > {code} > 7,416K (2.5%), 31292 / 62% dup strings (499 unique), 31292 dup backing arrays: > ↖{j.u.HashMap}.values > ↖org.apache.hadoop.yarn.api.records.impl.pb.ContainerLaunchContextPBImpl.environment > ↖org.apache.hadoop.yarn.api.records.impl.pb.ApplicationSubmissionContextPBImpl.amContainer > ↖org.apache.hadoop.yarn.server.resourcemanager.rmapp.RMAppImpl.submissionContext > ↖{java.util.concurrent.ConcurrentHashMap}.values > ↖org.apache.hadoop.yarn.server.resourcemanager.RMActiveServiceContext.applications > ↖org.apache.hadoop.yarn.server.resourcemanager.RMContextImpl.activeServiceContext > ↖org.apache.hadoop.yarn.server.resourcemanager.rmapp.monitor.RMAppLifetimeMonitor.rmContext > ↖Java Local@3ed9ef820 > (org.apache.hadoop.yarn.server.resourcemanager.rmapp.monitor.RMAppLifetimeMonitor) > {code} > However, there are also many others. Mostly they are strings in proto buffer > or proto buffer builder objects. I plan to get rid of at least the worst > offenders by inserting String.intern() calls. String.intern() used to consume > memory in PermGen and was not very scalable up until about the early JDK 7 > versions, but has greatly improved since then, and I've used it many times > without any issues. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-7320) Duplicate LiteralByteStrings in SystemCredentialsForAppsProto.credentialsForApp_
[ https://issues.apache.org/jira/browse/YARN-7320?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16221206#comment-16221206 ] Misha Dmitriev commented on YARN-7320: -- [~rkanter] [~wangda] looks like in ~24 hrs Jenkins still hasn't processed my patch. Could you please check what's going on? > Duplicate LiteralByteStrings in > SystemCredentialsForAppsProto.credentialsForApp_ > > > Key: YARN-7320 > URL: https://issues.apache.org/jira/browse/YARN-7320 > Project: Hadoop YARN > Issue Type: Improvement >Reporter: Misha Dmitriev >Assignee: Misha Dmitriev > Fix For: 3.0.0 > > Attachments: YARN-7320.01.addendum.patch, YARN-7320.01.patch, > YARN-7320.02.patch > > > Using jxray (www.jxray.com) I've analyzed several heap dumps from YARN > Resource Manager running in a big cluster. The tool uncovered several sources > of memory waste. One problem, which results in wasting more than a quarter of > all memory, is a large number of duplicate {{LiteralByteString}} objects > coming from the following reference chain: > {code} > 1,011,810K (26.9%): byte[]: 5416705 / 100% dup arrays (22108 unique) > ↖com.google.protobuf.LiteralByteString.bytes > ↖org.apache.hadoop.yarn.proto.YarnServerCommonServiceProtos$.credentialsForApp_ > ↖{j.u.ArrayList} > ↖j.u.Collections$UnmodifiableRandomAccessList.c > ↖org.apache.hadoop.yarn.proto.YarnServerCommonServiceProtos$NodeHeartbeatResponseProto.systemCredentialsForApps_ > ↖org.apache.hadoop.yarn.server.api.protocolrecords.impl.pb.NodeHeartbeatResponsePBImpl.proto > ↖org.apache.hadoop.yarn.server.resourcemanager.rmnode.RMNodeImpl.latestNodeHeartBeatResponse > ↖org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FSSchedulerNode.rmNode > ... > {code} > That is, collectively reference chains that look as above hold in memory 5.4 > million {{LiteralByteString}} objects, but only ~22 thousand of these objects > are unique. Deduplicating these objects, e.g. using a Google Object Interner > instance, would save ~1GB of memory. > It looks like the main place where the above {{LiteralByteString}}s are > created and attached to the {{SystemCredentialsForAppsProto}} objects is in > {{NodeHeartbeatResponsePBImpl.java}}, method > {{addSystemCredentialsToProto()}}. Probably adding a call to an interner > there will fix the problem. wi -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-7320) Duplicate LiteralByteStrings in SystemCredentialsForAppsProto.credentialsForApp_
[ https://issues.apache.org/jira/browse/YARN-7320?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Misha Dmitriev updated YARN-7320: - Attachment: YARN-7320.01.addendum.patch > Duplicate LiteralByteStrings in > SystemCredentialsForAppsProto.credentialsForApp_ > > > Key: YARN-7320 > URL: https://issues.apache.org/jira/browse/YARN-7320 > Project: Hadoop YARN > Issue Type: Improvement >Reporter: Misha Dmitriev >Assignee: Misha Dmitriev > Fix For: 3.0.0 > > Attachments: YARN-7320.01.addendum.patch, YARN-7320.01.patch, > YARN-7320.02.patch > > > Using jxray (www.jxray.com) I've analyzed several heap dumps from YARN > Resource Manager running in a big cluster. The tool uncovered several sources > of memory waste. One problem, which results in wasting more than a quarter of > all memory, is a large number of duplicate {{LiteralByteString}} objects > coming from the following reference chain: > {code} > 1,011,810K (26.9%): byte[]: 5416705 / 100% dup arrays (22108 unique) > ↖com.google.protobuf.LiteralByteString.bytes > ↖org.apache.hadoop.yarn.proto.YarnServerCommonServiceProtos$.credentialsForApp_ > ↖{j.u.ArrayList} > ↖j.u.Collections$UnmodifiableRandomAccessList.c > ↖org.apache.hadoop.yarn.proto.YarnServerCommonServiceProtos$NodeHeartbeatResponseProto.systemCredentialsForApps_ > ↖org.apache.hadoop.yarn.server.api.protocolrecords.impl.pb.NodeHeartbeatResponsePBImpl.proto > ↖org.apache.hadoop.yarn.server.resourcemanager.rmnode.RMNodeImpl.latestNodeHeartBeatResponse > ↖org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FSSchedulerNode.rmNode > ... > {code} > That is, collectively reference chains that look as above hold in memory 5.4 > million {{LiteralByteString}} objects, but only ~22 thousand of these objects > are unique. Deduplicating these objects, e.g. using a Google Object Interner > instance, would save ~1GB of memory. > It looks like the main place where the above {{LiteralByteString}}s are > created and attached to the {{SystemCredentialsForAppsProto}} objects is in > {{NodeHeartbeatResponsePBImpl.java}}, method > {{addSystemCredentialsToProto()}}. Probably adding a call to an interner > there will fix the problem. wi -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-7320) Duplicate LiteralByteStrings in SystemCredentialsForAppsProto.credentialsForApp_
[ https://issues.apache.org/jira/browse/YARN-7320?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16218017#comment-16218017 ] Misha Dmitriev commented on YARN-7320: -- [~wangda] I confirm the problem. Sorry about this. It is probably a missing null check in my added code or some such. Will submit a new patch shortly. > Duplicate LiteralByteStrings in > SystemCredentialsForAppsProto.credentialsForApp_ > > > Key: YARN-7320 > URL: https://issues.apache.org/jira/browse/YARN-7320 > Project: Hadoop YARN > Issue Type: Improvement >Reporter: Misha Dmitriev >Assignee: Misha Dmitriev > Fix For: 3.0.0 > > Attachments: YARN-7320.01.patch, YARN-7320.02.patch > > > Using jxray (www.jxray.com) I've analyzed several heap dumps from YARN > Resource Manager running in a big cluster. The tool uncovered several sources > of memory waste. One problem, which results in wasting more than a quarter of > all memory, is a large number of duplicate {{LiteralByteString}} objects > coming from the following reference chain: > {code} > 1,011,810K (26.9%): byte[]: 5416705 / 100% dup arrays (22108 unique) > ↖com.google.protobuf.LiteralByteString.bytes > ↖org.apache.hadoop.yarn.proto.YarnServerCommonServiceProtos$.credentialsForApp_ > ↖{j.u.ArrayList} > ↖j.u.Collections$UnmodifiableRandomAccessList.c > ↖org.apache.hadoop.yarn.proto.YarnServerCommonServiceProtos$NodeHeartbeatResponseProto.systemCredentialsForApps_ > ↖org.apache.hadoop.yarn.server.api.protocolrecords.impl.pb.NodeHeartbeatResponsePBImpl.proto > ↖org.apache.hadoop.yarn.server.resourcemanager.rmnode.RMNodeImpl.latestNodeHeartBeatResponse > ↖org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FSSchedulerNode.rmNode > ... > {code} > That is, collectively reference chains that look as above hold in memory 5.4 > million {{LiteralByteString}} objects, but only ~22 thousand of these objects > are unique. Deduplicating these objects, e.g. using a Google Object Interner > instance, would save ~1GB of memory. > It looks like the main place where the above {{LiteralByteString}}s are > created and attached to the {{SystemCredentialsForAppsProto}} objects is in > {{NodeHeartbeatResponsePBImpl.java}}, method > {{addSystemCredentialsToProto()}}. Probably adding a call to an interner > there will fix the problem. wi -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-7386) Duplicate Strings in various places in Yarn memory
[ https://issues.apache.org/jira/browse/YARN-7386?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Misha Dmitriev updated YARN-7386: - Attachment: YARN-7386.02.patch > Duplicate Strings in various places in Yarn memory > -- > > Key: YARN-7386 > URL: https://issues.apache.org/jira/browse/YARN-7386 > Project: Hadoop YARN > Issue Type: Improvement >Reporter: Misha Dmitriev >Assignee: Misha Dmitriev > Attachments: YARN-7386.01.patch, YARN-7386.02.patch > > > Using jxray (www.jxray.com) I've analyzed a Yarn RM heap dump obtained in a > big cluster. The tool uncovered several sources of memory waste. One problem > is duplicate strings: > {code} > Total strings Unique strings Duplicate values > Overhead > 361,506 86,672 5,928 22,886K (7.6%) > {code} > They are spread across a number of locations. The biggest source of waste is > the following reference chain: > {code} > 7,416K (2.5%), 31292 / 62% dup strings (499 unique), 31292 dup backing arrays: > ↖{j.u.HashMap}.values > ↖org.apache.hadoop.yarn.api.records.impl.pb.ContainerLaunchContextPBImpl.environment > ↖org.apache.hadoop.yarn.api.records.impl.pb.ApplicationSubmissionContextPBImpl.amContainer > ↖org.apache.hadoop.yarn.server.resourcemanager.rmapp.RMAppImpl.submissionContext > ↖{java.util.concurrent.ConcurrentHashMap}.values > ↖org.apache.hadoop.yarn.server.resourcemanager.RMActiveServiceContext.applications > ↖org.apache.hadoop.yarn.server.resourcemanager.RMContextImpl.activeServiceContext > ↖org.apache.hadoop.yarn.server.resourcemanager.rmapp.monitor.RMAppLifetimeMonitor.rmContext > ↖Java Local@3ed9ef820 > (org.apache.hadoop.yarn.server.resourcemanager.rmapp.monitor.RMAppLifetimeMonitor) > {code} > However, there are also many others. Mostly they are strings in proto buffer > or proto buffer builder objects. I plan to get rid of at least the worst > offenders by inserting String.intern() calls. String.intern() used to consume > memory in PermGen and was not very scalable up until about the early JDK 7 > versions, but has greatly improved since then, and I've used it many times > without any issues. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-7386) Duplicate Strings in various places in Yarn memory
[ https://issues.apache.org/jira/browse/YARN-7386?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16217934#comment-16217934 ] Misha Dmitriev commented on YARN-7386: -- At least some test failures are real, and happen because .intern() is called on a null String. I will switch to using {{StringInterner.weakIntern()}}. > Duplicate Strings in various places in Yarn memory > -- > > Key: YARN-7386 > URL: https://issues.apache.org/jira/browse/YARN-7386 > Project: Hadoop YARN > Issue Type: Improvement >Reporter: Misha Dmitriev >Assignee: Misha Dmitriev > Attachments: YARN-7386.01.patch > > > Using jxray (www.jxray.com) I've analyzed a Yarn RM heap dump obtained in a > big cluster. The tool uncovered several sources of memory waste. One problem > is duplicate strings: > {code} > Total strings Unique strings Duplicate values > Overhead > 361,506 86,672 5,928 22,886K (7.6%) > {code} > They are spread across a number of locations. The biggest source of waste is > the following reference chain: > {code} > 7,416K (2.5%), 31292 / 62% dup strings (499 unique), 31292 dup backing arrays: > ↖{j.u.HashMap}.values > ↖org.apache.hadoop.yarn.api.records.impl.pb.ContainerLaunchContextPBImpl.environment > ↖org.apache.hadoop.yarn.api.records.impl.pb.ApplicationSubmissionContextPBImpl.amContainer > ↖org.apache.hadoop.yarn.server.resourcemanager.rmapp.RMAppImpl.submissionContext > ↖{java.util.concurrent.ConcurrentHashMap}.values > ↖org.apache.hadoop.yarn.server.resourcemanager.RMActiveServiceContext.applications > ↖org.apache.hadoop.yarn.server.resourcemanager.RMContextImpl.activeServiceContext > ↖org.apache.hadoop.yarn.server.resourcemanager.rmapp.monitor.RMAppLifetimeMonitor.rmContext > ↖Java Local@3ed9ef820 > (org.apache.hadoop.yarn.server.resourcemanager.rmapp.monitor.RMAppLifetimeMonitor) > {code} > However, there are also many others. Mostly they are strings in proto buffer > or proto buffer builder objects. I plan to get rid of at least the worst > offenders by inserting String.intern() calls. String.intern() used to consume > memory in PermGen and was not very scalable up until about the early JDK 7 > versions, but has greatly improved since then, and I've used it many times > without any issues. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
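To illustrate the failure mode mentioned in this comment (class and field names are hypothetical): calling {{intern()}} directly on a value that may be null throws a {{NullPointerException}}, so the interning call has to be null-guarded, which is the behavior the switch to {{StringInterner.weakIntern()}} is meant to provide.
{code}
final class InternUtil {
  private InternUtil() {
  }

  // String.intern() is an instance method, so a null receiver throws NPE;
  // guard before interning.
  static String internOrNull(String s) {
    return s == null ? null : s.intern();
  }
}

class LaunchContextSketch {
  private String queue;

  void setQueue(String queue) {
    // Before: this.queue = queue.intern();  // NPE whenever queue is null
    this.queue = InternUtil.internOrNull(queue);
  }
}
{code}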
[jira] [Updated] (YARN-7386) Duplicate Strings in various places in Yarn memory
[ https://issues.apache.org/jira/browse/YARN-7386?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Misha Dmitriev updated YARN-7386: - Attachment: YARN-7386.01.patch > Duplicate Strings in various places in Yarn memory > -- > > Key: YARN-7386 > URL: https://issues.apache.org/jira/browse/YARN-7386 > Project: Hadoop YARN > Issue Type: Improvement >Reporter: Misha Dmitriev >Assignee: Misha Dmitriev > Attachments: YARN-7386.01.patch > > > Using jxray (www.jxray.com) I've analyzed a Yarn RM heap dump obtained in a > big cluster. The tool uncovered several sources of memory waste. One problem > is duplicate strings: > {code} > Total strings Unique strings Duplicate values > Overhead > 361,506 86,672 5,928 22,886K (7.6%) > {code} > They are spread across a number of locations. The biggest source of waste is > the following reference chain: > {code} > 7,416K (2.5%), 31292 / 62% dup strings (499 unique), 31292 dup backing arrays: > ↖{j.u.HashMap}.values > ↖org.apache.hadoop.yarn.api.records.impl.pb.ContainerLaunchContextPBImpl.environment > ↖org.apache.hadoop.yarn.api.records.impl.pb.ApplicationSubmissionContextPBImpl.amContainer > ↖org.apache.hadoop.yarn.server.resourcemanager.rmapp.RMAppImpl.submissionContext > ↖{java.util.concurrent.ConcurrentHashMap}.values > ↖org.apache.hadoop.yarn.server.resourcemanager.RMActiveServiceContext.applications > ↖org.apache.hadoop.yarn.server.resourcemanager.RMContextImpl.activeServiceContext > ↖org.apache.hadoop.yarn.server.resourcemanager.rmapp.monitor.RMAppLifetimeMonitor.rmContext > ↖Java Local@3ed9ef820 > (org.apache.hadoop.yarn.server.resourcemanager.rmapp.monitor.RMAppLifetimeMonitor) > {code} > However, there are also many others. Mostly they are strings in proto buffer > or proto buffer builder objects. I plan to get rid of at least the worst > offenders by inserting String.intern() calls. String.intern() used to consume > memory in PermGen and was not very scalable up until about the early JDK 7 > versions, but has greatly improved since then, and I've used it many times > without any issues. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Created] (YARN-7386) Duplicate Strings in various places in Yarn memory
Misha Dmitriev created YARN-7386: Summary: Duplicate Strings in various places in Yarn memory Key: YARN-7386 URL: https://issues.apache.org/jira/browse/YARN-7386 Project: Hadoop YARN Issue Type: Improvement Reporter: Misha Dmitriev Assignee: Misha Dmitriev Using jxray (www.jxray.com) I've analyzed a Yarn RM heap dump obtained in a big cluster. The tool uncovered several sources of memory waste. One problem is duplicate strings: {code} Total strings Unique strings Duplicate values Overhead 361,506 86,672 5,928 22,886K (7.6%) {code} They are spread across a number of locations. The biggest source of waste is the following reference chain: {code} 7,416K (2.5%), 31292 / 62% dup strings (499 unique), 31292 dup backing arrays: ↖{j.u.HashMap}.values ↖org.apache.hadoop.yarn.api.records.impl.pb.ContainerLaunchContextPBImpl.environment ↖org.apache.hadoop.yarn.api.records.impl.pb.ApplicationSubmissionContextPBImpl.amContainer ↖org.apache.hadoop.yarn.server.resourcemanager.rmapp.RMAppImpl.submissionContext ↖{java.util.concurrent.ConcurrentHashMap}.values ↖org.apache.hadoop.yarn.server.resourcemanager.RMActiveServiceContext.applications ↖org.apache.hadoop.yarn.server.resourcemanager.RMContextImpl.activeServiceContext ↖org.apache.hadoop.yarn.server.resourcemanager.rmapp.monitor.RMAppLifetimeMonitor.rmContext ↖Java Local@3ed9ef820 (org.apache.hadoop.yarn.server.resourcemanager.rmapp.monitor.RMAppLifetimeMonitor) {code} However, there are also many others. Mostly they are strings in proto buffer or proto buffer builder objects. I plan to get rid of at least the worst offenders by inserting String.intern() calls. String.intern() used to consume memory in PermGen and was not very scalable up until about the early JDK 7 versions, but has greatly improved since then, and I've used it many times without any issues. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
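As a rough, hypothetical illustration of where such {{String.intern()}} calls would go (this is not the actual PBImpl code): when the environment map is materialized from the protobuf message, interning keys and values lets the many identical environment strings across applications share one instance each.
{code}
import java.util.HashMap;
import java.util.Map;

class EnvironmentInterningSketch {
  // 'protoEnv' stands in for the string map read from the generated protobuf
  // object; the real code would obtain it from the PB builder.
  static Map<String, String> internEnvironment(Map<String, String> protoEnv) {
    Map<String, String> env = new HashMap<>(protoEnv.size());
    for (Map.Entry<String, String> e : protoEnv.entrySet()) {
      // Identical variable names and values across thousands of apps collapse
      // to a single shared String instance each.
      env.put(e.getKey().intern(), e.getValue().intern());
    }
    return env;
  }
}
{code}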
[jira] [Commented] (YARN-7320) Duplicate LiteralByteStrings in SystemCredentialsForAppsProto.credentialsForApp_
[ https://issues.apache.org/jira/browse/YARN-7320?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16203955#comment-16203955 ] Misha Dmitriev commented on YARN-7320: -- Checkstyle is a little unhappy, but in the line in question I followed the same pattern that is used in other places in this file when calling methods on a Builder object, e.g. {code} new Builder(). foo(). bar(). baz(); {code} and not {code} new Builder(). foo(). bar(). baz(); {code} > Duplicate LiteralByteStrings in > SystemCredentialsForAppsProto.credentialsForApp_ > > > Key: YARN-7320 > URL: https://issues.apache.org/jira/browse/YARN-7320 > Project: Hadoop YARN > Issue Type: Improvement >Reporter: Misha Dmitriev >Assignee: Misha Dmitriev > Attachments: YARN-7320.01.patch, YARN-7320.02.patch > > > Using jxray (www.jxray.com) I've analyzed several heap dumps from YARN > Resource Manager running in a big cluster. The tool uncovered several sources > of memory waste. One problem, which results in wasting more than a quarter of > all memory, is a large number of duplicate {{LiteralByteString}} objects > coming from the following reference chain: > {code} > 1,011,810K (26.9%): byte[]: 5416705 / 100% dup arrays (22108 unique) > ↖com.google.protobuf.LiteralByteString.bytes > ↖org.apache.hadoop.yarn.proto.YarnServerCommonServiceProtos$.credentialsForApp_ > ↖{j.u.ArrayList} > ↖j.u.Collections$UnmodifiableRandomAccessList.c > ↖org.apache.hadoop.yarn.proto.YarnServerCommonServiceProtos$NodeHeartbeatResponseProto.systemCredentialsForApps_ > ↖org.apache.hadoop.yarn.server.api.protocolrecords.impl.pb.NodeHeartbeatResponsePBImpl.proto > ↖org.apache.hadoop.yarn.server.resourcemanager.rmnode.RMNodeImpl.latestNodeHeartBeatResponse > ↖org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FSSchedulerNode.rmNode > ... > {code} > That is, collectively reference chains that look as above hold in memory 5.4 > million {{LiteralByteString}} objects, but only ~22 thousand of these objects > are unique. Deduplicating these objects, e.g. using a Google Object Interner > instance, would save ~1GB of memory. > It looks like the main place where the above {{LiteralByteString}}s are > created and attached to the {{SystemCredentialsForAppsProto}} objects is in > {{NodeHeartbeatResponsePBImpl.java}}, method > {{addSystemCredentialsToProto()}}. Probably adding a call to an interner > there will fix the problem. wi -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-7320) Duplicate LiteralByteStrings in SystemCredentialsForAppsProto.credentialsForApp_
[ https://issues.apache.org/jira/browse/YARN-7320?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Misha Dmitriev updated YARN-7320: - Attachment: YARN-7320.02.patch Addressed checkstyle comments. > Duplicate LiteralByteStrings in > SystemCredentialsForAppsProto.credentialsForApp_ > > > Key: YARN-7320 > URL: https://issues.apache.org/jira/browse/YARN-7320 > Project: Hadoop YARN > Issue Type: Improvement >Reporter: Misha Dmitriev >Assignee: Misha Dmitriev > Attachments: YARN-7320.01.patch, YARN-7320.02.patch > > > Using jxray (www.jxray.com) I've analyzed several heap dumps from YARN > Resource Manager running in a big cluster. The tool uncovered several sources > of memory waste. One problem, which results in wasting more than a quarter of > all memory, is a large number of duplicate {{LiteralByteString}} objects > coming from the following reference chain: > {code} > 1,011,810K (26.9%): byte[]: 5416705 / 100% dup arrays (22108 unique) > ↖com.google.protobuf.LiteralByteString.bytes > ↖org.apache.hadoop.yarn.proto.YarnServerCommonServiceProtos$.credentialsForApp_ > ↖{j.u.ArrayList} > ↖j.u.Collections$UnmodifiableRandomAccessList.c > ↖org.apache.hadoop.yarn.proto.YarnServerCommonServiceProtos$NodeHeartbeatResponseProto.systemCredentialsForApps_ > ↖org.apache.hadoop.yarn.server.api.protocolrecords.impl.pb.NodeHeartbeatResponsePBImpl.proto > ↖org.apache.hadoop.yarn.server.resourcemanager.rmnode.RMNodeImpl.latestNodeHeartBeatResponse > ↖org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FSSchedulerNode.rmNode > ... > {code} > That is, collectively reference chains that look as above hold in memory 5.4 > million {{LiteralByteString}} objects, but only ~22 thousand of these objects > are unique. Deduplicating these objects, e.g. using a Google Object Interner > instance, would save ~1GB of memory. > It looks like the main place where the above {{LiteralByteString}}s are > created and attached to the {{SystemCredentialsForAppsProto}} objects is in > {{NodeHeartbeatResponsePBImpl.java}}, method > {{addSystemCredentialsToProto()}}. Probably adding a call to an interner > there will fix the problem. wi -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-7320) Duplicate LiteralByteStrings in SystemCredentialsForAppsProto.credentialsForApp_
[ https://issues.apache.org/jira/browse/YARN-7320?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Misha Dmitriev updated YARN-7320: - Attachment: YARN-7320.01.patch > Duplicate LiteralByteStrings in > SystemCredentialsForAppsProto.credentialsForApp_ > > > Key: YARN-7320 > URL: https://issues.apache.org/jira/browse/YARN-7320 > Project: Hadoop YARN > Issue Type: Improvement >Reporter: Misha Dmitriev >Assignee: Misha Dmitriev > Attachments: YARN-7320.01.patch > > > Using jxray (www.jxray.com) I've analyzed several heap dumps from YARN > Resource Manager running in a big cluster. The tool uncovered several sources > of memory waste. One problem, which results in wasting more than a quarter of > all memory, is a large number of duplicate {{LiteralByteString}} objects > coming from the following reference chain: > {code} > 1,011,810K (26.9%): byte[]: 5416705 / 100% dup arrays (22108 unique) > ↖com.google.protobuf.LiteralByteString.bytes > ↖org.apache.hadoop.yarn.proto.YarnServerCommonServiceProtos$.credentialsForApp_ > ↖{j.u.ArrayList} > ↖j.u.Collections$UnmodifiableRandomAccessList.c > ↖org.apache.hadoop.yarn.proto.YarnServerCommonServiceProtos$NodeHeartbeatResponseProto.systemCredentialsForApps_ > ↖org.apache.hadoop.yarn.server.api.protocolrecords.impl.pb.NodeHeartbeatResponsePBImpl.proto > ↖org.apache.hadoop.yarn.server.resourcemanager.rmnode.RMNodeImpl.latestNodeHeartBeatResponse > ↖org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FSSchedulerNode.rmNode > ... > {code} > That is, collectively reference chains that look as above hold in memory 5.4 > million {{LiteralByteString}} objects, but only ~22 thousand of these objects > are unique. Deduplicating these objects, e.g. using a Google Object Interner > instance, would save ~1GB of memory. > It looks like the main place where the above {{LiteralByteString}}s are > created and attached to the {{SystemCredentialsForAppsProto}} objects is in > {{NodeHeartbeatResponsePBImpl.java}}, method > {{addSystemCredentialsToProto()}}. Probably adding a call to an interner > there will fix the problem. wi -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Created] (YARN-7320) Duplicate LiteralByteStrings in SystemCredentialsForAppsProto.credentialsForApp_
Misha Dmitriev created YARN-7320: Summary: Duplicate LiteralByteStrings in SystemCredentialsForAppsProto.credentialsForApp_ Key: YARN-7320 URL: https://issues.apache.org/jira/browse/YARN-7320 Project: Hadoop YARN Issue Type: Improvement Reporter: Misha Dmitriev Using jxray (www.jxray.com), I've analyzed several heap dumps from YARN Resource Manager running in a big cluster. The tool uncovered several sources of memory waste. One problem, which results in wasting more than a quarter of all memory, is a large number of duplicate {{LiteralByteString}} objects coming from the following reference chain: {code} 1,011,810K (26.9%): byte[]: 5416705 / 100% dup arrays (22108 unique) ↖com.google.protobuf.LiteralByteString.bytes ↖org.apache.hadoop.yarn.proto.YarnServerCommonServiceProtos$.credentialsForApp_ ↖{j.u.ArrayList} ↖j.u.Collections$UnmodifiableRandomAccessList.c ↖org.apache.hadoop.yarn.proto.YarnServerCommonServiceProtos$NodeHeartbeatResponseProto.systemCredentialsForApps_ ↖org.apache.hadoop.yarn.server.api.protocolrecords.impl.pb.NodeHeartbeatResponsePBImpl.proto ↖org.apache.hadoop.yarn.server.resourcemanager.rmnode.RMNodeImpl.latestNodeHeartBeatResponse ↖org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FSSchedulerNode.rmNode ... {code} That is, collectively reference chains that look like the one above hold in memory 5.4 million {{LiteralByteString}} objects, but only ~22 thousand of these objects are unique. Deduplicating these objects, e.g. using a Google Object Interner instance, would save ~1GB of memory. It looks like the main place where the above {{LiteralByteString}}s are created and attached to the {{SystemCredentialsForAppsProto}} objects is in {{NodeHeartbeatResponsePBImpl.java}}, method {{addSystemCredentialsToProto()}}. Probably adding a call to an interner there will fix the problem. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
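A hedged sketch of the deduplication idea described above — the interner field and helper shape are assumptions, not the actual {{NodeHeartbeatResponsePBImpl}} change: a weak Guava {{Interner}} canonicalizes the per-app credentials {{ByteString}} before it is attached to the proto, so identical byte payloads are stored once.
{code}
import com.google.common.collect.Interner;
import com.google.common.collect.Interners;
import com.google.protobuf.ByteString;

class CredentialsInternerSketch {
  // Weak interner: canonical ByteStrings are reclaimed once no proto references them.
  private static final Interner<ByteString> BYTE_STRING_INTERNER =
      Interners.newWeakInterner();

  // Hypothetical helper standing in for the conversion done while building
  // SystemCredentialsForAppsProto entries in addSystemCredentialsToProto().
  static ByteString canonicalCredentials(byte[] credentials) {
    // ByteString implements content-based equals()/hashCode(), so identical
    // payloads from different heartbeats map to one shared instance.
    return BYTE_STRING_INTERNER.intern(ByteString.copyFrom(credentials));
  }
}
{code}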
[jira] [Created] (YARN-7302) Configuration.updatingResource map should be initialized lazily
Misha Dmitriev created YARN-7302: Summary: Configuration.updatingResource map should be initialized lazily Key: YARN-7302 URL: https://issues.apache.org/jira/browse/YARN-7302 Project: Hadoop YARN Issue Type: Improvement Reporter: Misha Dmitriev Using jxray (www.jxray.com), I've analyzed a heap dump of YARN RM running in a big cluster. The tool uncovered several inefficiencies in the RM memory. It turns out that one of the biggest sources of memory waste, responsible for almost 1/4 of used memory, is empty ConcurrentHashMap instances in org.apache.hadoop.conf.Configuration.updatingResource: {code} 905,551K (24.0%): java.util.concurrent.ConcurrentHashMap: 22118 / 100% of empty 905,551K (24.0%) ↖org.apache.hadoop.conf.Configuration.updatingResource ↖{j.u.WeakHashMap}.keys ↖Java Static org.apache.hadoop.conf.Configuration.REGISTRY {code} That is, there are 22118 empty ConcurrentHashMaps here, and they collectively waste ~905MB of memory. This is caused by eager initialization of these maps. To address this problem, we should initialize them lazily. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
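A minimal sketch of the lazy-initialization fix proposed here (the field type follows the description, but this is not the actual {{Configuration}} patch): the per-instance map stays null until a property actually needs to record its updating resource, so the thousands of Configuration copies that never touch it allocate nothing.
{code}
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

class ConfigurationSketch {
  // Previously allocated eagerly for every Configuration instance, leaving
  // tens of thousands of empty ConcurrentHashMaps on the heap.
  private volatile Map<String, String[]> updatingResource;

  void recordUpdatingResource(String key, String[] source) {
    if (updatingResource == null) {
      synchronized (this) {
        if (updatingResource == null) {
          updatingResource = new ConcurrentHashMap<>();
        }
      }
    }
    updatingResource.put(key, source);
  }

  String[] getUpdatingResource(String key) {
    Map<String, String[]> map = updatingResource;
    return map == null ? null : map.get(key);
  }
}
{code}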