[jira] [Updated] (YARN-8872) Optimize collections used by Yarn JHS to reduce its memory

2018-10-12 Thread Misha Dmitriev (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-8872?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Misha Dmitriev updated YARN-8872:
-
Attachment: YARN-8872.02.patch

> Optimize collections used by Yarn JHS to reduce its memory
> --
>
> Key: YARN-8872
> URL: https://issues.apache.org/jira/browse/YARN-8872
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: yarn
>Reporter: Misha Dmitriev
>Assignee: Misha Dmitriev
>Priority: Major
> Attachments: YARN-8872.01.patch, YARN-8872.02.patch, 
> jhs-bad-collections.png
>
>
> We analyzed, using jxray (www.jxray.com), a heap dump of the JHS running with a 
> big heap in a large cluster, handling large MapReduce jobs. The heap is large 
> (over 32GB) and 21.4% of it is wasted due to various suboptimal Java 
> collections, mostly maps and lists that are either empty or contain only one 
> element. In such under-populated collections, a considerable amount of memory is 
> still used by just the internal implementation objects. See the attached 
> excerpt from the jxray report for the details. If certain collections are 
> almost always empty, they should be initialized lazily. If others almost 
> always have just 1 or 2 elements, they should be initialized with the 
> appropriate initial capacity of 1 or 2 (the default capacity is 16 for 
> HashMap and 10 for ArrayList).
> Based on the attached report, we should do the following:
>  # {{FileSystemCounterGroup.map}} - initialize lazily
>  # {{CompletedTask.attempts}} - initialize with capacity 2, given most tasks 
> only have one or two attempts
>  # {{JobHistoryParser$TaskInfo.attemptsMap}} - initialize with capacity 2
>  # {{CompletedTaskAttempt.diagnostics}} - initialize with capacity 1 since it 
> contains one diagnostic message most of the time
>  # {{CompletedTask.reportDiagnostics}} - switch to ArrayList (no reason to 
> use the more wasteful LinkedList here) and initialize with capacity 1.
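
For illustration only, here is a minimal sketch of the two remedies proposed in the list above, lazy initialization and right-sized initial capacities (class and field names are hypothetical, not the actual patch):

{code}
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class TaskRecordSketch {
  // Most tasks have only one or two attempts, so start with capacity 2
  // instead of HashMap's default 16.
  private final Map<String, String> attempts = new HashMap<>(2);

  // Usually holds a single diagnostic message, so capacity 1 instead of
  // ArrayList's default 10.
  private final List<String> diagnostics = new ArrayList<>(1);

  // Almost always empty, so allocate it lazily on first use.
  private Map<String, Long> counters;

  synchronized long incrementCounter(String name) {
    if (counters == null) {
      counters = new HashMap<>(2);
    }
    return counters.merge(name, 1L, Long::sum);
  }
}
{code}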






[jira] [Commented] (YARN-8872) Optimize collections used by Yarn JHS to reduce its memory

2018-10-12 Thread Misha Dmitriev (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8872?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16648270#comment-16648270
 ] 

Misha Dmitriev commented on YARN-8872:
--

[~pbacsko] I think the situation here is the same as before. Both before and 
after this change, the {{size()}} method can never see {{map}} in a really 
inconsistent (half-constructed) state, because this object (a 
{{ConcurrentSkipListMap}}) is first fully constructed, and then the {{map}} 
reference is set to point to it. You are right that if {{findCounter()}} and 
{{size()}} run concurrently after that point, then the first method can keep 
adding objects to {{map}} and the second one may iterate a smaller number of 
objects (or none at all) and return a smaller size. But the same thing could 
happen before this change.

Note also that since this is a concurrent map implementation, iterating and 
adding/removing elements concurrently is safe (will not cause exceptions). 
According to the javadoc of {{ConcurrentSkipListMap.values()}}, "The view's 
{{iterator}} is a "weakly consistent" iterator that will never throw 
[{{ConcurrentModificationException}}|https://docs.oracle.com/javase/7/docs/api/java/util/ConcurrentModificationException.html],
 and guarantees to traverse elements as they existed upon construction of the 
iterator, and may (but is not guaranteed to) reflect any modifications 
subsequent to construction."

However, making {{size()}} synchronized will still make the code a little more 
predictable, at least in tests if nothing else. So I can make this change if 
you would like.
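
For reference, a tiny self-contained demonstration (not the JHS code) of the "weakly consistent" iteration described above: iterating {{values()}} while another thread keeps inserting never throws, but the iterator may see fewer entries than the map finally holds.

{code}
import java.util.concurrent.ConcurrentSkipListMap;

public class WeaklyConsistentIterationDemo {
  public static void main(String[] args) throws InterruptedException {
    ConcurrentSkipListMap<String, Integer> map = new ConcurrentSkipListMap<>();
    map.put("seed", 0);

    // Writer keeps adding entries while the main thread iterates.
    Thread writer = new Thread(() -> {
      for (int i = 0; i < 100_000; i++) {
        map.put("key-" + i, i);
      }
    });
    writer.start();

    // No ConcurrentModificationException is ever thrown here, but the count
    // can be smaller than the final map size, just like size() racing with
    // findCounter().
    int seen = 0;
    for (Integer ignored : map.values()) {
      seen++;
    }
    writer.join();
    System.out.println("iterated " + seen + " of " + map.size() + " entries");
  }
}
{code}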

> Optimize collections used by Yarn JHS to reduce its memory
> --
>
> Key: YARN-8872
> URL: https://issues.apache.org/jira/browse/YARN-8872
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: yarn
>Reporter: Misha Dmitriev
>Assignee: Misha Dmitriev
>Priority: Major
> Attachments: YARN-8872.01.patch, jhs-bad-collections.png
>
>
> We analyzed, using jxray (www.jxray.com), a heap dump of the JHS running with a 
> big heap in a large cluster, handling large MapReduce jobs. The heap is large 
> (over 32GB) and 21.4% of it is wasted due to various suboptimal Java 
> collections, mostly maps and lists that are either empty or contain only one 
> element. In such under-populated collections, a considerable amount of memory is 
> still used by just the internal implementation objects. See the attached 
> excerpt from the jxray report for the details. If certain collections are 
> almost always empty, they should be initialized lazily. If others almost 
> always have just 1 or 2 elements, they should be initialized with the 
> appropriate initial capacity of 1 or 2 (the default capacity is 16 for 
> HashMap and 10 for ArrayList).
> Based on the attached report, we should do the following:
>  # {{FileSystemCounterGroup.map}} - initialize lazily
>  # {{CompletedTask.attempts}} - initialize with capacity 2, given most tasks 
> only have one or two attempts
>  # {{JobHistoryParser$TaskInfo.attemptsMap}} - initialize with capacity 2
>  # {{CompletedTaskAttempt.diagnostics}} - initialize with capacity 1 since it 
> contains one diagnostic message most of the time
>  # {{CompletedTask.reportDiagnostics}} - switch to ArrayList (no reason to 
> use the more wasteful LinkedList here) and initialize with capacity 1.






[jira] [Commented] (YARN-8872) Optimize collections used by Yarn JHS to reduce its memory

2018-10-12 Thread Misha Dmitriev (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8872?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16648235#comment-16648235
 ] 

Misha Dmitriev commented on YARN-8872:
--

I would leave this decision to [~haibochen].

> Optimize collections used by Yarn JHS to reduce its memory
> --
>
> Key: YARN-8872
> URL: https://issues.apache.org/jira/browse/YARN-8872
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: yarn
>Reporter: Misha Dmitriev
>Assignee: Misha Dmitriev
>Priority: Major
> Attachments: YARN-8872.01.patch, jhs-bad-collections.png
>
>
> We analyzed, using jxray (www.jxray.com), a heap dump of the JHS running with a 
> big heap in a large cluster, handling large MapReduce jobs. The heap is large 
> (over 32GB) and 21.4% of it is wasted due to various suboptimal Java 
> collections, mostly maps and lists that are either empty or contain only one 
> element. In such under-populated collections, a considerable amount of memory is 
> still used by just the internal implementation objects. See the attached 
> excerpt from the jxray report for the details. If certain collections are 
> almost always empty, they should be initialized lazily. If others almost 
> always have just 1 or 2 elements, they should be initialized with the 
> appropriate initial capacity of 1 or 2 (the default capacity is 16 for 
> HashMap and 10 for ArrayList).
> Based on the attached report, we should do the following:
>  # {{FileSystemCounterGroup.map}} - initialize lazily
>  # {{CompletedTask.attempts}} - initialize with capacity 2, given most tasks 
> only have one or two attempts
>  # {{JobHistoryParser$TaskInfo.attemptsMap}} - initialize with capacity 2
>  # {{CompletedTaskAttempt.diagnostics}} - initialize with capacity 1 since it 
> contains one diagnostic message most of the time
>  # {{CompletedTask.reportDiagnostics}} - switch to ArrayList (no reason to 
> use the more wasteful LinkedList here) and initialize with capacity 1.






[jira] [Commented] (YARN-8872) Optimize collections used by Yarn JHS to reduce its memory

2018-10-12 Thread Misha Dmitriev (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-8872?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16648225#comment-16648225
 ] 

Misha Dmitriev commented on YARN-8872:
--

Regarding the problems in the Hadoop QA report above:
 # No tests are added because this is a performance improvement with no change in 
functionality.
 # I believe there is no problem with synchronization in 
FileSystemCounterGroup.java. The {{map}} object is created lazily in the 
synchronized method {{findCounter()}}, so according to the Java Memory Model, 
once it's created, it's visible to all the code, both synchronized and 
unsynchronized. In other words, the unsynchronized method {{write()}} (line 281, 
which findbugs complains about) will never think that {{map == null}} if {{map}} 
has actually been initialized. In other respects it will work the same as before 
(a simplified sketch of this lazy-initialization pattern follows below).
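
A minimal, hypothetical sketch of the shape under discussion (simplified names, not the actual FileSystemCounterGroup code; the field is marked volatile only to keep this standalone example conservative):

{code}
import java.util.concurrent.ConcurrentSkipListMap;

class CounterGroupSketch {
  // Created lazily on first use instead of eagerly in the constructor.
  private volatile ConcurrentSkipListMap<String, Long> map;

  synchronized Long findCounter(String name) {
    if (map == null) {
      // The map is fully constructed before the reference is published.
      map = new ConcurrentSkipListMap<>();
    }
    return map.computeIfAbsent(name, k -> 0L);
  }

  // Unsynchronized reader, analogous to write(): must tolerate a null map.
  int size() {
    ConcurrentSkipListMap<String, Long> m = map;
    return m == null ? 0 : m.size();
  }
}
{code}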

> Optimize collections used by Yarn JHS to reduce its memory
> --
>
> Key: YARN-8872
> URL: https://issues.apache.org/jira/browse/YARN-8872
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: yarn
>Reporter: Misha Dmitriev
>Assignee: Misha Dmitriev
>Priority: Major
> Attachments: YARN-8872.01.patch, jhs-bad-collections.png
>
>
> We analyzed, using jxray (www.jxray.com), a heap dump of the JHS running with a 
> big heap in a large cluster, handling large MapReduce jobs. The heap is large 
> (over 32GB) and 21.4% of it is wasted due to various suboptimal Java 
> collections, mostly maps and lists that are either empty or contain only one 
> element. In such under-populated collections, a considerable amount of memory is 
> still used by just the internal implementation objects. See the attached 
> excerpt from the jxray report for the details. If certain collections are 
> almost always empty, they should be initialized lazily. If others almost 
> always have just 1 or 2 elements, they should be initialized with the 
> appropriate initial capacity of 1 or 2 (the default capacity is 16 for 
> HashMap and 10 for ArrayList).
> Based on the attached report, we should do the following:
>  # {{FileSystemCounterGroup.map}} - initialize lazily
>  # {{CompletedTask.attempts}} - initialize with capacity 2, given most tasks 
> only have one or two attempts
>  # {{JobHistoryParser$TaskInfo.attemptsMap}} - initialize with capacity 2
>  # {{CompletedTaskAttempt.diagnostics}} - initialize with capacity 1 since it 
> contains one diagnostic message most of the time
>  # {{CompletedTask.reportDiagnostics}} - switch to ArrayList (no reason to 
> use the more wasteful LinkedList here) and initialize with capacity 1.






[jira] [Updated] (YARN-8872) Optimize collections used by Yarn JHS to reduce its memory

2018-10-11 Thread Misha Dmitriev (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-8872?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Misha Dmitriev updated YARN-8872:
-
Attachment: YARN-8872.01.patch

> Optimize collections used by Yarn JHS to reduce its memory
> --
>
> Key: YARN-8872
> URL: https://issues.apache.org/jira/browse/YARN-8872
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: yarn
>Reporter: Misha Dmitriev
>Assignee: Misha Dmitriev
>Priority: Major
> Attachments: YARN-8872.01.patch, jhs-bad-collections.png
>
>
> We analyzed, using jxray (www.jxray.com), a heap dump of the JHS running with a 
> big heap in a large cluster, handling large MapReduce jobs. The heap is large 
> (over 32GB) and 21.4% of it is wasted due to various suboptimal Java 
> collections, mostly maps and lists that are either empty or contain only one 
> element. In such under-populated collections, a considerable amount of memory is 
> still used by just the internal implementation objects. See the attached 
> excerpt from the jxray report for the details. If certain collections are 
> almost always empty, they should be initialized lazily. If others almost 
> always have just 1 or 2 elements, they should be initialized with the 
> appropriate initial capacity of 1 or 2 (the default capacity is 16 for 
> HashMap and 10 for ArrayList).
> Based on the attached report, we should do the following:
>  # {{FileSystemCounterGroup.map}} - initialize lazily
>  # {{CompletedTask.attempts}} - initialize with capacity 2, given most tasks 
> only have one or two attempts
>  # {{JobHistoryParser$TaskInfo.attemptsMap}} - initialize with capacity 2
>  # {{CompletedTaskAttempt.diagnostics}} - initialize with capacity 1 since it 
> contains one diagnostic message most of the time
>  # {{CompletedTask.reportDiagnostics}} - switch to ArrayList (no reason to 
> use the more wasteful LinkedList here) and initialize with capacity 1.






[jira] [Updated] (YARN-8872) Optimize collections used by Yarn JHS to reduce its memory

2018-10-11 Thread Misha Dmitriev (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-8872?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Misha Dmitriev updated YARN-8872:
-
Description: 
We analyzed, using jxray (www.jxray.com), a heap dump of the JHS running with a 
big heap in a large cluster, handling large MapReduce jobs. The heap is large 
(over 32GB) and 21.4% of it is wasted due to various suboptimal Java 
collections, mostly maps and lists that are either empty or contain only one 
element. In such under-populated collections, a considerable amount of memory is 
still used by just the internal implementation objects. See the attached 
excerpt from the jxray report for the details. If certain collections are 
almost always empty, they should be initialized lazily. If others almost always 
have just 1 or 2 elements, they should be initialized with the appropriate 
initial capacity of 1 or 2 (the default capacity is 16 for HashMap and 10 for 
ArrayList).

Based on the attached report, we should do the following:
 # {{FileSystemCounterGroup.map}} - initialize lazily
 # {{CompletedTask.attempts}} - initialize with capacity 2, given most tasks 
only have one or two attempts
 # {{JobHistoryParser$TaskInfo.attemptsMap}} - initialize with capacity 2
 # {{CompletedTaskAttempt.diagnostics}} - initialize with capacity 1 since it 
contains one diagnostic message most of the time
 # {{CompletedTask.reportDiagnostics}} - switch to ArrayList (no reason to use 
the more wasteful LinkedList here) and initialize with capacity 1.

  was:
We analyzed, using jxray (www.jxray.com), a heap dump of the JHS running with a 
big heap in a large cluster, handling large MapReduce jobs. The heap is large 
(over 32GB) and 21.4% of it is wasted due to various suboptimal Java 
collections, mostly maps and lists that are either empty or contain only one 
element. In such under-populated collections, a considerable amount of memory is 
still used by just the internal implementation objects. See the attached 
excerpt from the jxray report for the details. If certain collections are 
almost always empty, they should be initialized lazily. If others almost always 
have just 1 or 2 elements, they should be initialized with the appropriate 
initial capacity, which is much smaller than e.g. the default 16 for HashMap 
and 10 for ArrayList.

Based on the attached report, we should do the following:
 # {{FileSystemCounterGroup.map}} - initialize lazily
 # {{CompletedTask.attempts}} - initialize with capacity 2, given most tasks 
only have one or two attempts
 # {{JobHistoryParser$TaskInfo.attemptsMap}} - initialize with capacity 2
 # {{CompletedTaskAttempt.diagnostics}} - initialize with capacity 1 since it 
contains one diagnostic message most of the time
 # {{CompletedTask.reportDiagnostics}} - switch to ArrayList (no reason to use 
the more wasteful LinkedList here) and initialize with capacity 1.


> Optimize collections used by Yarn JHS to reduce its memory
> --
>
> Key: YARN-8872
> URL: https://issues.apache.org/jira/browse/YARN-8872
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: yarn
>Reporter: Misha Dmitriev
>Assignee: Misha Dmitriev
>Priority: Major
> Attachments: jhs-bad-collections.png
>
>
> We analyzed, using jxray (www.jxray.com), a heap dump of the JHS running with a 
> big heap in a large cluster, handling large MapReduce jobs. The heap is large 
> (over 32GB) and 21.4% of it is wasted due to various suboptimal Java 
> collections, mostly maps and lists that are either empty or contain only one 
> element. In such under-populated collections, a considerable amount of memory is 
> still used by just the internal implementation objects. See the attached 
> excerpt from the jxray report for the details. If certain collections are 
> almost always empty, they should be initialized lazily. If others almost 
> always have just 1 or 2 elements, they should be initialized with the 
> appropriate initial capacity of 1 or 2 (the default capacity is 16 for 
> HashMap and 10 for ArrayList).
> Based on the attached report, we should do the following:
>  # {{FileSystemCounterGroup.map}} - initialize lazily
>  # {{CompletedTask.attempts}} - initialize with capacity 2, given most tasks 
> only have one or two attempts
>  # {{JobHistoryParser$TaskInfo.attemptsMap}} - initialize with capacity 2
>  # {{CompletedTaskAttempt.diagnostics}} - initialize with capacity 1 since it 
> contains one diagnostic message most of the time
>  # {{CompletedTask.reportDiagnostics}} - switch to ArrayList (no reason to 
> use the more wasteful LinkedList here) and initialize with capacity 1.




[jira] [Updated] (YARN-8872) Optimize collections used by Yarn JHS to reduce its memory

2018-10-11 Thread Misha Dmitriev (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-8872?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Misha Dmitriev updated YARN-8872:
-
Description: 
We analyzed, using jxray (www.jxray.com), a heap dump of the JHS running with a 
big heap in a large cluster, handling large MapReduce jobs. The heap is large 
(over 32GB) and 21.4% of it is wasted due to various suboptimal Java 
collections, mostly maps and lists that are either empty or contain only one 
element. In such under-populated collections, a considerable amount of memory is 
still used by just the internal implementation objects. See the attached 
excerpt from the jxray report for the details. If certain collections are 
almost always empty, they should be initialized lazily. If others almost always 
have just 1 or 2 elements, they should be initialized with the appropriate 
initial capacity, which is much smaller than e.g. the default 16 for HashMap 
and 10 for ArrayList.

Based on the attached report, we should do the following:
 # {{FileSystemCounterGroup.map}} - initialize lazily
 # {{CompletedTask.attempts}} - initialize with capacity 2, given most tasks 
only have one or two attempts
 # {{JobHistoryParser$TaskInfo.attemptsMap}} - initialize with capacity 2
 # {{CompletedTaskAttempt.diagnostics}} - initialize with capacity 1 since it 
contains one diagnostic message most of the time
 # {{CompletedTask.reportDiagnostics}} - switch to ArrayList (no reason to use 
the more wasteful LinkedList here) and initialize with capacity 1.

  was:
We analyzed, using jxray (www.jxray.com), a heap dump of the JHS running with a 
big heap in a large cluster, handling large MapReduce jobs. The heap is large 
(over 32GB) and 21.4% of it is wasted due to various suboptimal Java 
collections, mostly maps and lists that are either empty or contain only one 
element. In such under-populated collections, a considerable amount of memory is 
still used by just the internal implementation objects. See the attached 
excerpt from the jxray report for the details. If certain collections are 
almost always empty, they should be initialized lazily. If others almost always 
have just 1 or 2 elements, they should be initialized with the appropriate 
initial capacity, which is much smaller than e.g. the default 16 for HashMap 
and 10 for ArrayList.

Based on the attached report, we should do the following:
 # {{FileSystemCounterGroup.map}} - initialize lazily
 # {{CompletedTask.attempts}} - initialize with capacity 2, given most tasks 
only have one or two attempts
 # {{JobHistoryParser$TaskInfo.attemptsMap}} - initialize with capacity 2
 # {{CompletedTaskAttempt.diagnostics}} - initialize with capacity 1 since it 
contains one diagnostic message most of the time.
 # {{CompletedTask.reportDiagnostics}} - switch to ArrayList (no reason to use 
the more wasteful LinkedList here) and initialize with capacity 1.


> Optimize collections used by Yarn JHS to reduce its memory
> --
>
> Key: YARN-8872
> URL: https://issues.apache.org/jira/browse/YARN-8872
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: yarn
>Reporter: Misha Dmitriev
>Assignee: Misha Dmitriev
>Priority: Major
> Attachments: jhs-bad-collections.png
>
>
> We analyzed, using jxray (www.jxray.com), a heap dump of the JHS running with a 
> big heap in a large cluster, handling large MapReduce jobs. The heap is large 
> (over 32GB) and 21.4% of it is wasted due to various suboptimal Java 
> collections, mostly maps and lists that are either empty or contain only one 
> element. In such under-populated collections, a considerable amount of memory is 
> still used by just the internal implementation objects. See the attached 
> excerpt from the jxray report for the details. If certain collections are 
> almost always empty, they should be initialized lazily. If others almost 
> always have just 1 or 2 elements, they should be initialized with the 
> appropriate initial capacity, which is much smaller than e.g. the default 16 
> for HashMap and 10 for ArrayList.
> Based on the attached report, we should do the following:
>  # {{FileSystemCounterGroup.map}} - initialize lazily
>  # {{CompletedTask.attempts}} - initialize with capacity 2, given most tasks 
> only have one or two attempts
>  # {{JobHistoryParser$TaskInfo.attemptsMap}} - initialize with capacity 2
>  # {{CompletedTaskAttempt.diagnostics}} - initialize with capacity 1 since it 
> contains one diagnostic message most of the time
>  # {{CompletedTask.reportDiagnostics}} - switch to ArrayList (no reason to 
> use the more wasteful LinkedList here) and initialize with capacity 1.




[jira] [Created] (YARN-8872) Optimize collections used by Yarn JHS to reduce its memory

2018-10-11 Thread Misha Dmitriev (JIRA)
Misha Dmitriev created YARN-8872:


 Summary: Optimize collections used by Yarn JHS to reduce its memory
 Key: YARN-8872
 URL: https://issues.apache.org/jira/browse/YARN-8872
 Project: Hadoop YARN
  Issue Type: Improvement
  Components: yarn
Reporter: Misha Dmitriev
Assignee: Misha Dmitriev
 Attachments: jhs-bad-collections.png

We analyzed, using jxray (www.jxray.com), a heap dump of the JHS running with a 
big heap in a large cluster, handling large MapReduce jobs. The heap is large 
(over 32GB) and 21.4% of it is wasted due to various suboptimal Java 
collections, mostly maps and lists that are either empty or contain only one 
element. In such under-populated collections, a considerable amount of memory is 
still used by just the internal implementation objects. See the attached 
excerpt from the jxray report for the details. If certain collections are 
almost always empty, they should be initialized lazily. If others almost always 
have just 1 or 2 elements, they should be initialized with the appropriate 
initial capacity, which is much smaller than e.g. the default 16 for HashMap 
and 10 for ArrayList.

Based on the attached report, we should do the following:
 # {{FileSystemCounterGroup.map}} - initialize lazily
 # {{CompletedTask.attempts}} - initialize with capacity 2, given most tasks 
only have one or two attempts
 # {{JobHistoryParser$TaskInfo.attemptsMap}} - initialize with capacity 2
 # {{CompletedTaskAttempt.diagnostics}} - initialize with capacity 1 since it 
contains one diagnostic message most of the time.
 # {{CompletedTask.reportDiagnostics}} - switch to ArrayList (no reason to use 
the more wasteful LinkedList here) and initialize with capacity 1.






[jira] [Commented] (YARN-7386) Duplicate Strings in various places in Yarn memory

2017-10-27 Thread Misha Dmitriev (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-7386?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16222767#comment-16222767
 ] 

Misha Dmitriev commented on YARN-7386:
--

[~rkanter] could you please look at the test failure above? I cannot reproduce 
it locally, and in any case my change, which is only about interning some 
strings, is the safest possible thing. So I suspect that this is just a flaky 
test.

> Duplicate Strings in various places in Yarn memory
> --
>
> Key: YARN-7386
> URL: https://issues.apache.org/jira/browse/YARN-7386
> Project: Hadoop YARN
>  Issue Type: Improvement
>Reporter: Misha Dmitriev
>Assignee: Misha Dmitriev
> Attachments: YARN-7386.01.patch, YARN-7386.02.patch
>
>
> Using jxray (www.jxray.com) I've analyzed a Yarn RM heap dump obtained in a 
> big cluster. The tool uncovered several sources of memory waste. One problem 
> is duplicate strings:
> {code}
> Total strings   Unique strings   Duplicate values   Overhead
>  361,506         86,672           5,928              22,886K (7.6%)
> {code}
> They are spread across a number of locations. The biggest source of waste is 
> the following reference chain:
> {code}
> 7,416K (2.5%), 31292 / 62% dup strings (499 unique), 31292 dup backing arrays:
> ↖{j.u.HashMap}.values
> ↖org.apache.hadoop.yarn.api.records.impl.pb.ContainerLaunchContextPBImpl.environment
> ↖org.apache.hadoop.yarn.api.records.impl.pb.ApplicationSubmissionContextPBImpl.amContainer
> ↖org.apache.hadoop.yarn.server.resourcemanager.rmapp.RMAppImpl.submissionContext
> ↖{java.util.concurrent.ConcurrentHashMap}.values
> ↖org.apache.hadoop.yarn.server.resourcemanager.RMActiveServiceContext.applications
> ↖org.apache.hadoop.yarn.server.resourcemanager.RMContextImpl.activeServiceContext
> ↖org.apache.hadoop.yarn.server.resourcemanager.rmapp.monitor.RMAppLifetimeMonitor.rmContext
> ↖Java Local@3ed9ef820 
> (org.apache.hadoop.yarn.server.resourcemanager.rmapp.monitor.RMAppLifetimeMonitor)
> {code}
> However, there are also many others. Mostly they are strings in proto buffer 
> or proto buffer builder objects. I plan to get rid of at least the worst 
> offenders by inserting String.intern() calls. String.intern() used to consume 
> memory in PermGen and was not very scalable up until about the early JDK 7 
> versions, but has greatly improved since then, and I've used it many times 
> without any issues.






[jira] [Commented] (YARN-7320) Duplicate LiteralByteStrings in SystemCredentialsForAppsProto.credentialsForApp_

2017-10-26 Thread Misha Dmitriev (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-7320?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16221206#comment-16221206
 ] 

Misha Dmitriev commented on YARN-7320:
--

[~rkanter] [~wangda] looks like in ~24 hrs Jenkins still hasn't processed my 
patch. Could you please check what's going on?

> Duplicate LiteralByteStrings in 
> SystemCredentialsForAppsProto.credentialsForApp_
> 
>
> Key: YARN-7320
> URL: https://issues.apache.org/jira/browse/YARN-7320
> Project: Hadoop YARN
>  Issue Type: Improvement
>Reporter: Misha Dmitriev
>Assignee: Misha Dmitriev
> Fix For: 3.0.0
>
> Attachments: YARN-7320.01.addendum.patch, YARN-7320.01.patch, 
> YARN-7320.02.patch
>
>
> Using jxray (www.jxray.com) I've analyzed several heap dumps from YARN 
> Resource Manager running in a big cluster. The tool uncovered several sources 
> of memory waste. One problem, which results in wasting more than a quarter of 
> all memory, is a large number of duplicate {{LiteralByteString}} objects 
> coming from the following reference chain:
> {code}
> 1,011,810K (26.9%): byte[]: 5416705 / 100% dup arrays (22108 unique)
> ↖com.google.protobuf.LiteralByteString.bytes
> ↖org.apache.hadoop.yarn.proto.YarnServerCommonServiceProtos$.credentialsForApp_
> ↖{j.u.ArrayList}
> ↖j.u.Collections$UnmodifiableRandomAccessList.c
> ↖org.apache.hadoop.yarn.proto.YarnServerCommonServiceProtos$NodeHeartbeatResponseProto.systemCredentialsForApps_
> ↖org.apache.hadoop.yarn.server.api.protocolrecords.impl.pb.NodeHeartbeatResponsePBImpl.proto
> ↖org.apache.hadoop.yarn.server.resourcemanager.rmnode.RMNodeImpl.latestNodeHeartBeatResponse
> ↖org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FSSchedulerNode.rmNode
> ...
> {code}
> That is, reference chains like the one above collectively hold in memory 5.4 
> million {{LiteralByteString}} objects, but only ~22 thousand of these objects 
> are unique. Deduplicating these objects, e.g. using a Google Object Interner 
> instance, would save ~1GB of memory.
> It looks like the main place where the above {{LiteralByteString}}s are 
> created and attached to the {{SystemCredentialsForAppsProto}} objects is in 
> {{NodeHeartbeatResponsePBImpl.java}}, method 
> {{addSystemCredentialsToProto()}}. Probably adding a call to an interner 
> there will fix the problem.






[jira] [Updated] (YARN-7320) Duplicate LiteralByteStrings in SystemCredentialsForAppsProto.credentialsForApp_

2017-10-25 Thread Misha Dmitriev (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-7320?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Misha Dmitriev updated YARN-7320:
-
Attachment: YARN-7320.01.addendum.patch

> Duplicate LiteralByteStrings in 
> SystemCredentialsForAppsProto.credentialsForApp_
> 
>
> Key: YARN-7320
> URL: https://issues.apache.org/jira/browse/YARN-7320
> Project: Hadoop YARN
>  Issue Type: Improvement
>Reporter: Misha Dmitriev
>Assignee: Misha Dmitriev
> Fix For: 3.0.0
>
> Attachments: YARN-7320.01.addendum.patch, YARN-7320.01.patch, 
> YARN-7320.02.patch
>
>
> Using jxray (www.jxray.com) I've analyzed several heap dumps from YARN 
> Resource Manager running in a big cluster. The tool uncovered several sources 
> of memory waste. One problem, which results in wasting more than a quarter of 
> all memory, is a large number of duplicate {{LiteralByteString}} objects 
> coming from the following reference chain:
> {code}
> 1,011,810K (26.9%): byte[]: 5416705 / 100% dup arrays (22108 unique)
> ↖com.google.protobuf.LiteralByteString.bytes
> ↖org.apache.hadoop.yarn.proto.YarnServerCommonServiceProtos$.credentialsForApp_
> ↖{j.u.ArrayList}
> ↖j.u.Collections$UnmodifiableRandomAccessList.c
> ↖org.apache.hadoop.yarn.proto.YarnServerCommonServiceProtos$NodeHeartbeatResponseProto.systemCredentialsForApps_
> ↖org.apache.hadoop.yarn.server.api.protocolrecords.impl.pb.NodeHeartbeatResponsePBImpl.proto
> ↖org.apache.hadoop.yarn.server.resourcemanager.rmnode.RMNodeImpl.latestNodeHeartBeatResponse
> ↖org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FSSchedulerNode.rmNode
> ...
> {code}
> That is, reference chains like the one above collectively hold in memory 5.4 
> million {{LiteralByteString}} objects, but only ~22 thousand of these objects 
> are unique. Deduplicating these objects, e.g. using a Google Object Interner 
> instance, would save ~1GB of memory.
> It looks like the main place where the above {{LiteralByteString}}s are 
> created and attached to the {{SystemCredentialsForAppsProto}} objects is in 
> {{NodeHeartbeatResponsePBImpl.java}}, method 
> {{addSystemCredentialsToProto()}}. Probably adding a call to an interner 
> there will fix the problem.






[jira] [Commented] (YARN-7320) Duplicate LiteralByteStrings in SystemCredentialsForAppsProto.credentialsForApp_

2017-10-24 Thread Misha Dmitriev (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-7320?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16218017#comment-16218017
 ] 

Misha Dmitriev commented on YARN-7320:
--

[~wangda] I confirm the problem. Sorry about this. It is probably a missing 
null check in my added code or some such. Will submit a new patch shortly.

> Duplicate LiteralByteStrings in 
> SystemCredentialsForAppsProto.credentialsForApp_
> 
>
> Key: YARN-7320
> URL: https://issues.apache.org/jira/browse/YARN-7320
> Project: Hadoop YARN
>  Issue Type: Improvement
>Reporter: Misha Dmitriev
>Assignee: Misha Dmitriev
> Fix For: 3.0.0
>
> Attachments: YARN-7320.01.patch, YARN-7320.02.patch
>
>
> Using jxray (www.jxray.com) I've analyzed several heap dumps from YARN 
> Resource Manager running in a big cluster. The tool uncovered several sources 
> of memory waste. One problem, which results in wasting more than a quarter of 
> all memory, is a large number of duplicate {{LiteralByteString}} objects 
> coming from the following reference chain:
> {code}
> 1,011,810K (26.9%): byte[]: 5416705 / 100% dup arrays (22108 unique)
> ↖com.google.protobuf.LiteralByteString.bytes
> ↖org.apache.hadoop.yarn.proto.YarnServerCommonServiceProtos$.credentialsForApp_
> ↖{j.u.ArrayList}
> ↖j.u.Collections$UnmodifiableRandomAccessList.c
> ↖org.apache.hadoop.yarn.proto.YarnServerCommonServiceProtos$NodeHeartbeatResponseProto.systemCredentialsForApps_
> ↖org.apache.hadoop.yarn.server.api.protocolrecords.impl.pb.NodeHeartbeatResponsePBImpl.proto
> ↖org.apache.hadoop.yarn.server.resourcemanager.rmnode.RMNodeImpl.latestNodeHeartBeatResponse
> ↖org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FSSchedulerNode.rmNode
> ...
> {code}
> That is, reference chains like the one above collectively hold in memory 5.4 
> million {{LiteralByteString}} objects, but only ~22 thousand of these objects 
> are unique. Deduplicating these objects, e.g. using a Google Object Interner 
> instance, would save ~1GB of memory.
> It looks like the main place where the above {{LiteralByteString}}s are 
> created and attached to the {{SystemCredentialsForAppsProto}} objects is in 
> {{NodeHeartbeatResponsePBImpl.java}}, method 
> {{addSystemCredentialsToProto()}}. Probably adding a call to an interner 
> there will fix the problem.






[jira] [Updated] (YARN-7386) Duplicate Strings in various places in Yarn memory

2017-10-24 Thread Misha Dmitriev (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-7386?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Misha Dmitriev updated YARN-7386:
-
Attachment: YARN-7386.02.patch

> Duplicate Strings in various places in Yarn memory
> --
>
> Key: YARN-7386
> URL: https://issues.apache.org/jira/browse/YARN-7386
> Project: Hadoop YARN
>  Issue Type: Improvement
>Reporter: Misha Dmitriev
>Assignee: Misha Dmitriev
> Attachments: YARN-7386.01.patch, YARN-7386.02.patch
>
>
> Using jxray (www.jxray.com) I've analyzed a Yarn RM heap dump obtained in a 
> big cluster. The tool uncovered several sources of memory waste. One problem 
> is duplicate strings:
> {code}
> Total strings   Unique strings   Duplicate values   Overhead
>  361,506         86,672           5,928              22,886K (7.6%)
> {code}
> They are spread across a number of locations. The biggest source of waste is 
> the following reference chain:
> {code}
> 7,416K (2.5%), 31292 / 62% dup strings (499 unique), 31292 dup backing arrays:
> ↖{j.u.HashMap}.values
> ↖org.apache.hadoop.yarn.api.records.impl.pb.ContainerLaunchContextPBImpl.environment
> ↖org.apache.hadoop.yarn.api.records.impl.pb.ApplicationSubmissionContextPBImpl.amContainer
> ↖org.apache.hadoop.yarn.server.resourcemanager.rmapp.RMAppImpl.submissionContext
> ↖{java.util.concurrent.ConcurrentHashMap}.values
> ↖org.apache.hadoop.yarn.server.resourcemanager.RMActiveServiceContext.applications
> ↖org.apache.hadoop.yarn.server.resourcemanager.RMContextImpl.activeServiceContext
> ↖org.apache.hadoop.yarn.server.resourcemanager.rmapp.monitor.RMAppLifetimeMonitor.rmContext
> ↖Java Local@3ed9ef820 
> (org.apache.hadoop.yarn.server.resourcemanager.rmapp.monitor.RMAppLifetimeMonitor)
> {code}
> However, there are also many others. Mostly they are strings in proto buffer 
> or proto buffer builder objects. I plan to get rid of at least the worst 
> offenders by inserting String.intern() calls. String.intern() used to consume 
> memory in PermGen and was not very scalable up until about the early JDK 7 
> versions, but has greatly improved since then, and I've used it many times 
> without any issues.






[jira] [Commented] (YARN-7386) Duplicate Strings in various places in Yarn memory

2017-10-24 Thread Misha Dmitriev (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-7386?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16217934#comment-16217934
 ] 

Misha Dmitriev commented on YARN-7386:
--

At least some test failures are real, and happen because .intern() is called on 
a null String. I will switch to using {{StringInterner.weakIntern()}}.
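
For context, a minimal illustration of the failure mode and of the null guard that a tolerant interner provides (a hypothetical helper, not the Hadoop StringInterner source):

{code}
final class NullSafeIntern {
  private NullSafeIntern() {}

  static String intern(String s) {
    // Calling intern() directly on a null reference throws
    // NullPointerException, which is what broke the tests; guarding
    // against null keeps the call site safe.
    return s == null ? null : s.intern();
  }
}
{code}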

> Duplicate Strings in various places in Yarn memory
> --
>
> Key: YARN-7386
> URL: https://issues.apache.org/jira/browse/YARN-7386
> Project: Hadoop YARN
>  Issue Type: Improvement
>Reporter: Misha Dmitriev
>Assignee: Misha Dmitriev
> Attachments: YARN-7386.01.patch
>
>
> Using jxray (www.jxray.com) I've analyzed a Yarn RM heap dump obtained in a 
> big cluster. The tool uncovered several sources of memory waste. One problem 
> is duplicate strings:
> {code}
> Total strings   Unique strings   Duplicate values   Overhead
>  361,506         86,672           5,928              22,886K (7.6%)
> {code}
> They are spread across a number of locations. The biggest source of waste is 
> the following reference chain:
> {code}
> 7,416K (2.5%), 31292 / 62% dup strings (499 unique), 31292 dup backing arrays:
> ↖{j.u.HashMap}.values
> ↖org.apache.hadoop.yarn.api.records.impl.pb.ContainerLaunchContextPBImpl.environment
> ↖org.apache.hadoop.yarn.api.records.impl.pb.ApplicationSubmissionContextPBImpl.amContainer
> ↖org.apache.hadoop.yarn.server.resourcemanager.rmapp.RMAppImpl.submissionContext
> ↖{java.util.concurrent.ConcurrentHashMap}.values
> ↖org.apache.hadoop.yarn.server.resourcemanager.RMActiveServiceContext.applications
> ↖org.apache.hadoop.yarn.server.resourcemanager.RMContextImpl.activeServiceContext
> ↖org.apache.hadoop.yarn.server.resourcemanager.rmapp.monitor.RMAppLifetimeMonitor.rmContext
> ↖Java Local@3ed9ef820 
> (org.apache.hadoop.yarn.server.resourcemanager.rmapp.monitor.RMAppLifetimeMonitor)
> {code}
> However, there are also many others. Mostly they are strings in proto buffer 
> or proto buffer builder objects. I plan to get rid of at least the worst 
> offenders by inserting String.intern() calls. String.intern() used to consume 
> memory in PermGen and was not very scalable up until about the early JDK 7 
> versions, but has greatly improved since then, and I've used it many times 
> without any issues.






[jira] [Updated] (YARN-7386) Duplicate Strings in various places in Yarn memory

2017-10-24 Thread Misha Dmitriev (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-7386?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Misha Dmitriev updated YARN-7386:
-
Attachment: YARN-7386.01.patch

> Duplicate Strings in various places in Yarn memory
> --
>
> Key: YARN-7386
> URL: https://issues.apache.org/jira/browse/YARN-7386
> Project: Hadoop YARN
>  Issue Type: Improvement
>Reporter: Misha Dmitriev
>Assignee: Misha Dmitriev
> Attachments: YARN-7386.01.patch
>
>
> Using jxray (www.jxray.com) I've analyzed a Yarn RM heap dump obtained in a 
> big cluster. The tool uncovered several sources of memory waste. One problem 
> is duplicate strings:
> {code}
> Total strings   Unique strings   Duplicate values   Overhead
>  361,506         86,672           5,928              22,886K (7.6%)
> {code}
> They are spread across a number of locations. The biggest source of waste is 
> the following reference chain:
> {code}
> 7,416K (2.5%), 31292 / 62% dup strings (499 unique), 31292 dup backing arrays:
> ↖{j.u.HashMap}.values
> ↖org.apache.hadoop.yarn.api.records.impl.pb.ContainerLaunchContextPBImpl.environment
> ↖org.apache.hadoop.yarn.api.records.impl.pb.ApplicationSubmissionContextPBImpl.amContainer
> ↖org.apache.hadoop.yarn.server.resourcemanager.rmapp.RMAppImpl.submissionContext
> ↖{java.util.concurrent.ConcurrentHashMap}.values
> ↖org.apache.hadoop.yarn.server.resourcemanager.RMActiveServiceContext.applications
> ↖org.apache.hadoop.yarn.server.resourcemanager.RMContextImpl.activeServiceContext
> ↖org.apache.hadoop.yarn.server.resourcemanager.rmapp.monitor.RMAppLifetimeMonitor.rmContext
> ↖Java Local@3ed9ef820 
> (org.apache.hadoop.yarn.server.resourcemanager.rmapp.monitor.RMAppLifetimeMonitor)
> {code}
> However, there are also many others. Mostly they are strings in proto buffer 
> or proto buffer builder objects. I plan to get rid of at least the worst 
> offenders by inserting String.intern() calls. String.intern() used to consume 
> memory in PermGen and was not very scalable up until about the early JDK 7 
> versions, but has greatly improved since then, and I've used it many times 
> without any issues.






[jira] [Created] (YARN-7386) Duplicate Strings in various places in Yarn memory

2017-10-24 Thread Misha Dmitriev (JIRA)
Misha Dmitriev created YARN-7386:


 Summary: Duplicate Strings in various places in Yarn memory
 Key: YARN-7386
 URL: https://issues.apache.org/jira/browse/YARN-7386
 Project: Hadoop YARN
  Issue Type: Improvement
Reporter: Misha Dmitriev
Assignee: Misha Dmitriev


Using jxray (www.jxray.com) I've analyzed a Yarn RM heap dump obtained in a big 
cluster. The tool uncovered several sources of memory waste. One problem is 
duplicate strings:

{code}
Total strings   Unique strings   Duplicate values   Overhead
 361,506         86,672           5,928              22,886K (7.6%)
{code}

They are spread across a number of locations. The biggest source of waste is 
the following reference chain:

{code}

7,416K (2.5%), 31292 / 62% dup strings (499 unique), 31292 dup backing arrays:
↖{j.u.HashMap}.values
↖org.apache.hadoop.yarn.api.records.impl.pb.ContainerLaunchContextPBImpl.environment
↖org.apache.hadoop.yarn.api.records.impl.pb.ApplicationSubmissionContextPBImpl.amContainer
↖org.apache.hadoop.yarn.server.resourcemanager.rmapp.RMAppImpl.submissionContext
↖{java.util.concurrent.ConcurrentHashMap}.values
↖org.apache.hadoop.yarn.server.resourcemanager.RMActiveServiceContext.applications
↖org.apache.hadoop.yarn.server.resourcemanager.RMContextImpl.activeServiceContext
↖org.apache.hadoop.yarn.server.resourcemanager.rmapp.monitor.RMAppLifetimeMonitor.rmContext
↖Java Local@3ed9ef820 
(org.apache.hadoop.yarn.server.resourcemanager.rmapp.monitor.RMAppLifetimeMonitor)
{code}

However, there are also many others. Mostly they are strings in proto buffer or 
proto buffer builder objects. I plan to get rid of at least the worst offenders 
by inserting String.intern() calls. String.intern() used to consume memory in 
PermGen and was not very scalable up until about the early JDK 7 versions, but 
has greatly improved since then, and I've used it many times without any issues.
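
As a small standalone demonstration of the deduplication idea (not the patch itself): two equal strings built at runtime are distinct objects, each with its own backing array, until intern() maps them to one canonical copy.

{code}
public class InternDemo {
  public static void main(String[] args) {
    String a = new StringBuilder("container_").append(42).toString();
    String b = new StringBuilder("container_").append(42).toString();

    System.out.println(a == b);                   // false: two copies in memory
    System.out.println(a.intern() == b.intern()); // true: one canonical copy
  }
}
{code}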






[jira] [Commented] (YARN-7320) Duplicate LiteralByteStrings in SystemCredentialsForAppsProto.credentialsForApp_

2017-10-13 Thread Misha Dmitriev (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-7320?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16203955#comment-16203955
 ] 

Misha Dmitriev commented on YARN-7320:
--

Checkstyle is a little unhappy, but in the line in question I followed the same 
pattern that is used in other places in this file when calling methods on a 
Builder object, e.g.

{code}
new Builder().
  foo().
  bar().
  baz();
{code}

and not

{code}
new Builder().
  foo().
bar().
  baz();
{code}

> Duplicate LiteralByteStrings in 
> SystemCredentialsForAppsProto.credentialsForApp_
> 
>
> Key: YARN-7320
> URL: https://issues.apache.org/jira/browse/YARN-7320
> Project: Hadoop YARN
>  Issue Type: Improvement
>Reporter: Misha Dmitriev
>Assignee: Misha Dmitriev
> Attachments: YARN-7320.01.patch, YARN-7320.02.patch
>
>
> Using jxray (www.jxray.com) I've analyzed several heap dumps from YARN 
> Resource Manager running in a big cluster. The tool uncovered several sources 
> of memory waste. One problem, which results in wasting more than a quarter of 
> all memory, is a large number of duplicate {{LiteralByteString}} objects 
> coming from the following reference chain:
> {code}
> 1,011,810K (26.9%): byte[]: 5416705 / 100% dup arrays (22108 unique)
> ↖com.google.protobuf.LiteralByteString.bytes
> ↖org.apache.hadoop.yarn.proto.YarnServerCommonServiceProtos$.credentialsForApp_
> ↖{j.u.ArrayList}
> ↖j.u.Collections$UnmodifiableRandomAccessList.c
> ↖org.apache.hadoop.yarn.proto.YarnServerCommonServiceProtos$NodeHeartbeatResponseProto.systemCredentialsForApps_
> ↖org.apache.hadoop.yarn.server.api.protocolrecords.impl.pb.NodeHeartbeatResponsePBImpl.proto
> ↖org.apache.hadoop.yarn.server.resourcemanager.rmnode.RMNodeImpl.latestNodeHeartBeatResponse
> ↖org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FSSchedulerNode.rmNode
> ...
> {code}
> That is, reference chains like the one above collectively hold in memory 5.4 
> million {{LiteralByteString}} objects, but only ~22 thousand of these objects 
> are unique. Deduplicating these objects, e.g. using a Google Object Interner 
> instance, would save ~1GB of memory.
> It looks like the main place where the above {{LiteralByteString}}s are 
> created and attached to the {{SystemCredentialsForAppsProto}} objects is in 
> {{NodeHeartbeatResponsePBImpl.java}}, method 
> {{addSystemCredentialsToProto()}}. Probably adding a call to an interner 
> there will fix the problem.






[jira] [Updated] (YARN-7320) Duplicate LiteralByteStrings in SystemCredentialsForAppsProto.credentialsForApp_

2017-10-12 Thread Misha Dmitriev (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-7320?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Misha Dmitriev updated YARN-7320:
-
Attachment: YARN-7320.02.patch

Addressed checkstyle comments.

> Duplicate LiteralByteStrings in 
> SystemCredentialsForAppsProto.credentialsForApp_
> 
>
> Key: YARN-7320
> URL: https://issues.apache.org/jira/browse/YARN-7320
> Project: Hadoop YARN
>  Issue Type: Improvement
>Reporter: Misha Dmitriev
>Assignee: Misha Dmitriev
> Attachments: YARN-7320.01.patch, YARN-7320.02.patch
>
>
> Using jxray (www.jxray.com) I've analyzed several heap dumps from YARN 
> Resource Manager running in a big cluster. The tool uncovered several sources 
> of memory waste. One problem, which results in wasting more than a quarter of 
> all memory, is a large number of duplicate {{LiteralByteString}} objects 
> coming from the following reference chain:
> {code}
> 1,011,810K (26.9%): byte[]: 5416705 / 100% dup arrays (22108 unique)
> ↖com.google.protobuf.LiteralByteString.bytes
> ↖org.apache.hadoop.yarn.proto.YarnServerCommonServiceProtos$.credentialsForApp_
> ↖{j.u.ArrayList}
> ↖j.u.Collections$UnmodifiableRandomAccessList.c
> ↖org.apache.hadoop.yarn.proto.YarnServerCommonServiceProtos$NodeHeartbeatResponseProto.systemCredentialsForApps_
> ↖org.apache.hadoop.yarn.server.api.protocolrecords.impl.pb.NodeHeartbeatResponsePBImpl.proto
> ↖org.apache.hadoop.yarn.server.resourcemanager.rmnode.RMNodeImpl.latestNodeHeartBeatResponse
> ↖org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FSSchedulerNode.rmNode
> ...
> {code}
> That is, reference chains like the one above collectively hold in memory 5.4 
> million {{LiteralByteString}} objects, but only ~22 thousand of these objects 
> are unique. Deduplicating these objects, e.g. using a Google Object Interner 
> instance, would save ~1GB of memory.
> It looks like the main place where the above {{LiteralByteString}}s are 
> created and attached to the {{SystemCredentialsForAppsProto}} objects is in 
> {{NodeHeartbeatResponsePBImpl.java}}, method 
> {{addSystemCredentialsToProto()}}. Probably adding a call to an interner 
> there will fix the problem.






[jira] [Updated] (YARN-7320) Duplicate LiteralByteStrings in SystemCredentialsForAppsProto.credentialsForApp_

2017-10-12 Thread Misha Dmitriev (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-7320?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Misha Dmitriev updated YARN-7320:
-
Attachment: YARN-7320.01.patch

> Duplicate LiteralByteStrings in 
> SystemCredentialsForAppsProto.credentialsForApp_
> 
>
> Key: YARN-7320
> URL: https://issues.apache.org/jira/browse/YARN-7320
> Project: Hadoop YARN
>  Issue Type: Improvement
>Reporter: Misha Dmitriev
>Assignee: Misha Dmitriev
> Attachments: YARN-7320.01.patch
>
>
> Using jxray (www.jxray.com) I've analyzed several heap dumps from YARN 
> Resource Manager running in a big cluster. The tool uncovered several sources 
> of memory waste. One problem, which results in wasting more than a quarter of 
> all memory, is a large number of duplicate {{LiteralByteString}} objects 
> coming from the following reference chain:
> {code}
> 1,011,810K (26.9%): byte[]: 5416705 / 100% dup arrays (22108 unique)
> ↖com.google.protobuf.LiteralByteString.bytes
> ↖org.apache.hadoop.yarn.proto.YarnServerCommonServiceProtos$.credentialsForApp_
> ↖{j.u.ArrayList}
> ↖j.u.Collections$UnmodifiableRandomAccessList.c
> ↖org.apache.hadoop.yarn.proto.YarnServerCommonServiceProtos$NodeHeartbeatResponseProto.systemCredentialsForApps_
> ↖org.apache.hadoop.yarn.server.api.protocolrecords.impl.pb.NodeHeartbeatResponsePBImpl.proto
> ↖org.apache.hadoop.yarn.server.resourcemanager.rmnode.RMNodeImpl.latestNodeHeartBeatResponse
> ↖org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FSSchedulerNode.rmNode
> ...
> {code}
> That is, reference chains like the one above collectively hold in memory 5.4 
> million {{LiteralByteString}} objects, but only ~22 thousand of these objects 
> are unique. Deduplicating these objects, e.g. using a Google Object Interner 
> instance, would save ~1GB of memory.
> It looks like the main place where the above {{LiteralByteString}}s are 
> created and attached to the {{SystemCredentialsForAppsProto}} objects is in 
> {{NodeHeartbeatResponsePBImpl.java}}, method 
> {{addSystemCredentialsToProto()}}. Probably adding a call to an interner 
> there will fix the problem.






[jira] [Created] (YARN-7320) Duplicate LiteralByteStrings in SystemCredentialsForAppsProto.credentialsForApp_

2017-10-12 Thread Misha Dmitriev (JIRA)
Misha Dmitriev created YARN-7320:


 Summary: Duplicate LiteralByteStrings in 
SystemCredentialsForAppsProto.credentialsForApp_
 Key: YARN-7320
 URL: https://issues.apache.org/jira/browse/YARN-7320
 Project: Hadoop YARN
  Issue Type: Improvement
Reporter: Misha Dmitriev


Using jxray (www.jxray.com) I've analyzed several heap dumps from YARN Resource 
Manager running in a big cluster. The tool uncovered several sources of memory 
waste. One problem, which results in wasting more than a quarter of all memory, 
is a large number of duplicate {{LiteralByteString}} objects coming from the 
following reference chain:

{code}
1,011,810K (26.9%): byte[]: 5416705 / 100% dup arrays (22108 unique)
↖com.google.protobuf.LiteralByteString.bytes
↖org.apache.hadoop.yarn.proto.YarnServerCommonServiceProtos$.credentialsForApp_
↖{j.u.ArrayList}
↖j.u.Collections$UnmodifiableRandomAccessList.c
↖org.apache.hadoop.yarn.proto.YarnServerCommonServiceProtos$NodeHeartbeatResponseProto.systemCredentialsForApps_
↖org.apache.hadoop.yarn.server.api.protocolrecords.impl.pb.NodeHeartbeatResponsePBImpl.proto
↖org.apache.hadoop.yarn.server.resourcemanager.rmnode.RMNodeImpl.latestNodeHeartBeatResponse
↖org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FSSchedulerNode.rmNode
...
{code}

That is, reference chains like the one above collectively hold in memory 5.4 
million {{LiteralByteString}} objects, but only ~22 thousand of these objects 
are unique. Deduplicating these objects, e.g. using a Google Object Interner 
instance, would save ~1GB of memory.

It looks like the main place where the above {{LiteralByteString}}s are created 
and attached to the {{SystemCredentialsForAppsProto}} objects is in 
{{NodeHeartbeatResponsePBImpl.java}}, method {{addSystemCredentialsToProto()}}. 
Probably adding a call to an interner there will fix the problem.
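
A minimal sketch of the interner idea mentioned above, using Guava's weak interner to collapse equal {{ByteString}} values into a single shared instance (names are illustrative, not the actual patch):

{code}
import com.google.common.collect.Interner;
import com.google.common.collect.Interners;
import com.google.protobuf.ByteString;

final class ByteStringDedup {
  // Weak interner: canonical instances remain eligible for garbage
  // collection once nothing else references them.
  private static final Interner<ByteString> INTERNER =
      Interners.newWeakInterner();

  static ByteString dedup(ByteString value) {
    // Equal byte contents map to one shared instance; guard against null
    // because Interner.intern() rejects null arguments.
    return value == null ? null : INTERNER.intern(value);
  }
}
{code}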






[jira] [Created] (YARN-7302) Configuration.updatingResource map should be initialized lazily

2017-10-09 Thread Misha Dmitriev (JIRA)
Misha Dmitriev created YARN-7302:


 Summary: Configuration.updatingResource map should be initialized 
lazily
 Key: YARN-7302
 URL: https://issues.apache.org/jira/browse/YARN-7302
 Project: Hadoop YARN
  Issue Type: Improvement
Reporter: Misha Dmitriev


Using jxray (www.jxray.com), I've analyzed a heap dump of YARN RM running in a 
big cluster. The tool uncovered several inefficiencies in the RM memory. It 
turns out that one of the biggest sources of memory waste, responsible for 
almost 1/4 of used memory, is empty ConcurrentHashMap instances in 
org.apache.hadoop.conf.Configuration.updatingResource:

{code}
905,551K (24.0%): java.util.concurrent.ConcurrentHashMap: 22118 / 100% of empty 
905,551K (24.0%)
↖org.apache.hadoop.conf.Configuration.updatingResource
↖{j.u.WeakHashMap}.keys
↖Java Static org.apache.hadoop.conf.Configuration.REGISTRY
{code}

That is, there are 22118 empty ConcurrentHashMaps here, and they collectively 
waste ~905MB of memory. This is caused by eager initialization of these maps. 
To address this problem, we should initialize them lazily.
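
One possible shape of that lazy initialization, as a simplified, hypothetical sketch (field and method names are illustrative, not the actual Configuration code):

{code}
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

class ConfSketch {
  // Eager version (what the report flags): every instance pays for an empty
  // ConcurrentHashMap even if it never records an updating resource.
  //   private final Map<String, String[]> updatingResource = new ConcurrentHashMap<>();

  // Lazy version: allocate only when the first entry is recorded.
  private volatile Map<String, String[]> updatingResource;

  void recordUpdatingResource(String key, String[] sources) {
    Map<String, String[]> m = updatingResource;
    if (m == null) {
      synchronized (this) {
        if (updatingResource == null) {
          updatingResource = new ConcurrentHashMap<>();
        }
        m = updatingResource;
      }
    }
    m.put(key, sources);
  }

  String[] getUpdatingResource(String key) {
    Map<String, String[]> m = updatingResource;
    return m == null ? null : m.get(key);
  }
}
{code}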


