[jira] [Commented] (YARN-1529) Add Localization overhead metrics to NM

2020-07-31 Thread Gera Shegalov (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-1529?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17169137#comment-17169137
 ] 

Gera Shegalov commented on YARN-1529:
-

I am glad this is still useful. Thanks for committing, [~Jim_Brennan] [~epayne]!

> Add Localization overhead metrics to NM
> ---
>
> Key: YARN-1529
> URL: https://issues.apache.org/jira/browse/YARN-1529
> Project: Hadoop YARN
> Issue Type: Improvement
> Components: nodemanager
> Reporter: Gera Shegalov
> Assignee: Jim Brennan
> Priority: Major
> Fix For: 3.2.2, 2.10.1, 3.4.0, 3.3.1, 3.1.5
>
> Attachments: YARN-1529-branch-2.10.001.patch, YARN-1529.005.patch, 
> YARN-1529.006.patch, YARN-1529.v01.patch, YARN-1529.v02.patch, 
> YARN-1529.v03.patch, YARN-1529.v04.patch
>
>
> Users are often unaware of the localization cost that their jobs incur. To 
> measure the effectiveness of localization caches, it is necessary to expose 
> the overhead in the form of metrics.
> We propose adding the following metrics to NodeManagerMetrics.
> When a container is about to launch, its set of LocalResources has to be 
> fetched from a central location, typically on HDFS, which results in a number 
> of download requests for the files missing from the caches.
> LocalizedFilesMissed: total files (requests) downloaded from DFS. Cache 
> misses.
> LocalizedFilesCached: total localization requests that were served from local 
> caches. Cache hits.
> LocalizedBytesMissed: total bytes downloaded from DFS due to cache misses.
> LocalizedBytesCached: total bytes satisfied from local caches.
> Localized(Files|Bytes)CachedRatio: percentage of localized (files|bytes) that 
> were served out of cache: ratio = 100 * cached / (cached + missed)
> LocalizationDownloadNanos: total elapsed time in nanoseconds for a container 
> to go from ResourceRequestTransition to LocalizedTransition
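
The cached-ratio formula quoted above can be illustrated with a minimal sketch. The class and method names here are hypothetical and are not taken from the patch, and returning 0 when nothing has been localized yet is an assumed convention for the zero-denominator case.

```java
/** Hypothetical sketch of the Localized(Files|Bytes)CachedRatio formula. */
final class LocalizationCacheRatio {

  /**
   * Percentage served from cache: 100 * cached / (cached + missed).
   * Returns 0 when the total is 0 (assumed convention, not from the patch).
   */
  static long cachedRatio(long cached, long missed) {
    long total = cached + missed;
    return total == 0 ? 0 : (100 * cached) / total;
  }

  public static void main(String[] args) {
    // Illustrative values only: 30 cache hits and 10 misses.
    long filesCached = 30;
    long filesMissed = 10;
    // Illustrative values only: 750,000 bytes cached, 250,000 bytes downloaded.
    long bytesCached = 750_000L;
    long bytesMissed = 250_000L;
    System.out.println("LocalizedFilesCachedRatio = "
        + cachedRatio(filesCached, filesMissed));
    System.out.println("LocalizedBytesCachedRatio = "
        + cachedRatio(bytesCached, bytesMissed));
  }
}
```

The same helper serves both the file and byte variants because the formula is identical; only the counters differ.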



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-1529) Add Localization overhead metrics to NM

2020-07-30 Thread Jim Brennan (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-1529?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17168213#comment-17168213
 ] 

Jim Brennan commented on YARN-1529:
---

Thanks [~epayne]!




[jira] [Commented] (YARN-1529) Add Localization overhead metrics to NM

2020-07-30 Thread Hadoop QA (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-1529?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17168212#comment-17168212
 ] 

Hadoop QA commented on YARN-1529:
-

| (x) *-1 overall* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 11m 23s | Docker mode activated. |
|| || || || Prechecks ||
| +1 | dupname | 0m 0s | No case conflicting files found. |
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
| +1 | test4tests | 0m 0s | The patch appears to include 3 new or modified test files. |
|| || || || branch-2.10 Compile Tests ||
| 0 | mvndep | 2m 20s | Maven dependency ordering for branch |
| +1 | mvninstall | 12m 45s | branch-2.10 passed |
| -1 | compile | 3m 57s | hadoop-yarn in branch-2.10 failed with JDK Oracle Corporation-1.7.0_95-b00. |
| +1 | compile | 7m 7s | branch-2.10 passed with JDK Private Build-1.8.0_252-8u252-b09-1~16.04-b09 |
| +1 | checkstyle | 1m 8s | branch-2.10 passed |
| +1 | mvnsite | 1m 33s | branch-2.10 passed |
| -1 | javadoc | 0m 29s | hadoop-yarn-api in branch-2.10 failed with JDK Oracle Corporation-1.7.0_95-b00. |
| -1 | javadoc | 0m 29s | hadoop-yarn-server-nodemanager in branch-2.10 failed with JDK Oracle Corporation-1.7.0_95-b00. |
| +1 | javadoc | 1m 7s | branch-2.10 passed with JDK Private Build-1.8.0_252-8u252-b09-1~16.04-b09 |
| 0 | spotbugs | 1m 16s | Used deprecated FindBugs config; considering switching to SpotBugs. |
| +1 | findbugs | 2m 55s | branch-2.10 passed |
|| || || || Patch Compile Tests ||
| 0 | mvndep | 0m 20s | Maven dependency ordering for patch |
| +1 | mvninstall | 1m 4s | the patch passed |
| +1 | compile | 8m 46s | the patch passed with JDK Oracle Corporation-1.7.0_95-b00 |
| +1 | javac | 8m 46s | the patch passed |
| +1 | compile | 7m 20s | the patch passed with JDK Private Build-1.8.0_252-8u252-b09-1~16.04-b09 |
| +1 | javac | 7m 20s | the patch passed |
| -0 | checkstyle | 1m 14s | hadoop-yarn-project/hadoop-yarn: The patch generated 2 new + 535 unchanged - 0 fixed = 537 total (was 535) |
| +1 | mvnsite | 1m 35s | the patch passed |
| +1 | whitespace | 0m 0s | The patch has no whitespace issues. |
| -1 | javadoc | 0m 37s | hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-nodemanager-jdkOracleCorporation-1.7.0_95-b00 with JDK Oracle Corporation-1.7.0_95-b00 generated 2 new + 0 unchanged - 0 fixed = 2 total (was 0) |
| +1 | javadoc | 1m 2s | the patch passed with JDK Private Build-1.8.0_252-8u252-b09-1~16.04-b09 |
| +1 | findbugs | 3m 25s | the patch passed |
|| || 

[jira] [Commented] (YARN-1529) Add Localization overhead metrics to NM

2020-07-30 Thread Hadoop QA (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-1529?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17168204#comment-17168204
 ] 

Hadoop QA commented on YARN-1529:
-

| (/) *+1 overall* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 0m 58s | Docker mode activated. |
|| || || || Prechecks ||
| +1 | dupname | 0m 0s | No case conflicting files found. |
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
| +1 | test4tests | 0m 0s | The patch appears to include 3 new or modified test files. |
|| || || || branch-2.10 Compile Tests ||
| 0 | mvndep | 2m 18s | Maven dependency ordering for branch |
| +1 | mvninstall | 14m 50s | branch-2.10 passed |
| +1 | compile | 7m 15s | branch-2.10 passed with JDK Oracle Corporation-1.7.0_95-b00 |
| +1 | compile | 6m 4s | branch-2.10 passed with JDK Private Build-1.8.0_252-8u252-b09-1~16.04-b09 |
| +1 | checkstyle | 1m 12s | branch-2.10 passed |
| +1 | mvnsite | 1m 36s | branch-2.10 passed |
| +1 | javadoc | 1m 20s | branch-2.10 passed with JDK Oracle Corporation-1.7.0_95-b00 |
| +1 | javadoc | 1m 10s | branch-2.10 passed with JDK Private Build-1.8.0_252-8u252-b09-1~16.04-b09 |
| 0 | spotbugs | 1m 13s | Used deprecated FindBugs config; considering switching to SpotBugs. |
| +1 | findbugs | 2m 47s | branch-2.10 passed |
|| || || || Patch Compile Tests ||
| 0 | mvndep | 0m 20s | Maven dependency ordering for patch |
| +1 | mvninstall | 1m 0s | the patch passed |
| +1 | compile | 6m 34s | the patch passed with JDK Oracle Corporation-1.7.0_95-b00 |
| +1 | javac | 6m 34s | the patch passed |
| +1 | compile | 5m 57s | the patch passed with JDK Private Build-1.8.0_252-8u252-b09-1~16.04-b09 |
| +1 | javac | 5m 57s | the patch passed |
| -0 | checkstyle | 1m 5s | hadoop-yarn-project/hadoop-yarn: The patch generated 2 new + 535 unchanged - 0 fixed = 537 total (was 535) |
| +1 | mvnsite | 1m 25s | the patch passed |
| +1 | whitespace | 0m 0s | The patch has no whitespace issues. |
| +1 | javadoc | 1m 12s | the patch passed with JDK Oracle Corporation-1.7.0_95-b00 |
| +1 | javadoc | 1m 3s | the patch passed with JDK Private Build-1.8.0_252-8u252-b09-1~16.04-b09 |
| +1 | findbugs | 2m 56s | the patch passed |
|| || || || Other Tests ||
| +1 | unit | 0m 46s | hadoop-yarn-api in the patch passed. |
| +1 | unit | 15m 16s | hadoop-yarn-server-nodemanager in the patch passed. |
| 

[jira] [Commented] (YARN-1529) Add Localization overhead metrics to NM

2020-07-30 Thread Jim Brennan (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-1529?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17168161#comment-17168161
 ] 

Jim Brennan commented on YARN-1529:
---

[~epayne] I have uploaded a patch for branch-2.10.  Incidentally, the 
compilation error was related to the fact that [YARN-7677] has not been pulled 
back to branch-2.10.  We might want to consider doing that.





[jira] [Commented] (YARN-1529) Add Localization overhead metrics to NM

2020-07-30 Thread Jim Brennan (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-1529?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17168136#comment-17168136
 ] 

Jim Brennan commented on YARN-1529:
---

Thanks [~epayne]!  I will put up a patch for branch-2.10.




[jira] [Commented] (YARN-1529) Add Localization overhead metrics to NM

2020-07-30 Thread Eric Payne (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-1529?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17168122#comment-17168122
 ] 

Eric Payne commented on YARN-1529:
--

I don't know why 2 pre-commit builds were kicked off. The first was fine but 
the second one had several unit test failures. Those unit tests all succeed for 
me locally.

I have committed this to trunk and backported it down through branch-3.1.

However, although there were no merge conflicts in backporting to 2.10, the 
following code does not compile:
{code:title=ContainerLaunch#sanitizeEnv}
addToEnvMap(environment, nmVars, Environment.LOCALIZATION_COUNTERS.name(),
 container.localizationCountersAsString());
{code}
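
As a rough illustration of how a launched container might consume the environment variable set in the snippet above, here is a minimal sketch. The format of the string returned by `localizationCountersAsString()` is not shown in this thread; a comma-separated list of long values is assumed here, and the class name `LocalizationCountersReader` is hypothetical. Only the variable name `LOCALIZATION_COUNTERS` follows from `Environment.LOCALIZATION_COUNTERS.name()`.

```java
import java.util.Arrays;

/** Hypothetical reader for the LOCALIZATION_COUNTERS environment variable. */
final class LocalizationCountersReader {

  /**
   * Parses a comma-separated list of longs (assumed format, not confirmed
   * by this thread). Returns an empty array for null or blank input.
   */
  static long[] parse(String value) {
    if (value == null || value.trim().isEmpty()) {
      return new long[0];
    }
    return Arrays.stream(value.split(","))
        .map(String::trim)
        .mapToLong(Long::parseLong)
        .toArray();
  }

  public static void main(String[] args) {
    // System.getenv returns null when the variable is not set.
    String raw = System.getenv("LOCALIZATION_COUNTERS");
    System.out.println(Arrays.toString(parse(raw)));
  }
}
```

The order and meaning of the individual counters would have to be taken from the committed code; this sketch deliberately does not name them.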




[jira] [Commented] (YARN-1529) Add Localization overhead metrics to NM

2020-07-30 Thread Hudson (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-1529?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17168064#comment-17168064
 ] 

Hudson commented on YARN-1529:
--

SUCCESS: Integrated in Jenkins build Hadoop-trunk-Commit #18481 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/18481/])
YARN-1529: Add Localization overhead metrics to NM. Contributed by (ericp: rev 
e0c9653166df48a47267dbc81d124ab78267e039)
* (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/container/Container.java
* (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/container/ContainerResourceLocalizedEvent.java
* (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/localizer/LocalizedResource.java
* (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/webapp/MockContainer.java
* (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/launcher/ContainerLaunch.java
* (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/launcher/TestContainerLaunch.java
* (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/container/ContainerImpl.java
* (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/metrics/NodeManagerMetrics.java
* (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/TestContainerManager.java
* (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/ContainerManagerImpl.java
* (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/java/org/apache/hadoop/yarn/api/ApplicationConstants.java





[jira] [Commented] (YARN-1529) Add Localization overhead metrics to NM

2020-07-29 Thread Hadoop QA (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-1529?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17167554#comment-17167554
 ] 

Hadoop QA commented on YARN-1529:
-

| (x) *-1 overall* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 1m 23s | Docker mode activated. |
|| || || || Prechecks ||
| +1 | dupname | 0m 0s | No case conflicting files found. |
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
| +1 | test4tests | 0m 0s | The patch appears to include 3 new or modified test files. |
|| || || || trunk Compile Tests ||
| 0 | mvndep | 1m 12s | Maven dependency ordering for branch |
| +1 | mvninstall | 26m 55s | trunk passed |
| +1 | compile | 12m 4s | trunk passed |
| +1 | checkstyle | 1m 56s | trunk passed |
| +1 | mvnsite | 2m 6s | trunk passed |
| +1 | shadedclient | 23m 11s | branch has no errors when building and testing our client artifacts. |
| +1 | javadoc | 1m 28s | trunk passed |
| 0 | spotbugs | 1m 44s | Used deprecated FindBugs config; considering switching to SpotBugs. |
| +1 | findbugs | 4m 5s | trunk passed |
|| || || || Patch Compile Tests ||
| 0 | mvndep | 0m 30s | Maven dependency ordering for patch |
| +1 | mvninstall | 1m 25s | the patch passed |
| +1 | compile | 10m 1s | the patch passed |
| +1 | javac | 10m 1s | the patch passed |
| -0 | checkstyle | 1m 58s | hadoop-yarn-project/hadoop-yarn: The patch generated 2 new + 468 unchanged - 0 fixed = 470 total (was 468) |
| +1 | mvnsite | 1m 54s | the patch passed |
| +1 | whitespace | 0m 0s | The patch has no whitespace issues. |
| +1 | shadedclient | 18m 53s | patch has no errors when building and testing our client artifacts. |
| +1 | javadoc | 1m 25s | the patch passed |
| +1 | findbugs | 4m 14s | the patch passed |
|| || || || Other Tests ||
| +1 | unit | 1m 3s | hadoop-yarn-api in the patch passed. |
| -1 | unit | 27m 35s | hadoop-yarn-server-nodemanager in the patch passed. |
| +1 | asflicense | 0m 46s | The patch does not generate ASF License warnings. |
| | | 142m 43s | |
\\
\\
|| Reason || Tests ||
| Failed junit tests | hadoop.yarn.server.nodemanager.containermanager.linux.runtime.docker.TestDockerClient |
| | hadoop.yarn.server.nodemanager.containermanager.monitor.TestContainersMonitor |
| | hadoop.yarn.server.nodemanager.webapp.TestNMWebServices |
| | hadoop.yarn.server.nodemanager.TestNodeManagerShutdown |
| | hadoop.yarn.server.nodemanager.TestLinuxContainerExecutorWithMocks |
|   | 

[jira] [Commented] (YARN-1529) Add Localization overhead metrics to NM

2020-07-29 Thread Hadoop QA (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-1529?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17167547#comment-17167547
 ] 

Hadoop QA commented on YARN-1529:
-

| (/) *+1 overall* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 1m 7s | Docker mode activated. |
|| || || || Prechecks ||
| +1 | dupname | 0m 0s | No case conflicting files found. |
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
| +1 | test4tests | 0m 0s | The patch appears to include 3 new or modified test files. |
|| || || || trunk Compile Tests ||
| 0 | mvndep | 3m 14s | Maven dependency ordering for branch |
| +1 | mvninstall | 27m 45s | trunk passed |
| +1 | compile | 8m 14s | trunk passed |
| +1 | checkstyle | 1m 31s | trunk passed |
| +1 | mvnsite | 1m 36s | trunk passed |
| +1 | shadedclient | 19m 12s | branch has no errors when building and testing our client artifacts. |
| +1 | javadoc | 1m 17s | trunk passed |
| 0 | spotbugs | 1m 39s | Used deprecated FindBugs config; considering switching to SpotBugs. |
| +1 | findbugs | 3m 50s | trunk passed |
|| || || || Patch Compile Tests ||
| 0 | mvndep | 0m 25s | Maven dependency ordering for patch |
| +1 | mvninstall | 1m 24s | the patch passed |
| +1 | compile | 10m 46s | the patch passed |
| +1 | javac | 10m 46s | the patch passed |
| -0 | checkstyle | 1m 43s | hadoop-yarn-project/hadoop-yarn: The patch generated 2 new + 469 unchanged - 0 fixed = 471 total (was 469) |
| +1 | mvnsite | 1m 41s | the patch passed |
| +1 | whitespace | 0m 0s | The patch has no whitespace issues. |
| +1 | shadedclient | 15m 47s | patch has no errors when building and testing our client artifacts. |
| +1 | javadoc | 1m 7s | the patch passed |
| +1 | findbugs | 4m 3s | the patch passed |
|| || || || Other Tests ||
| +1 | unit | 1m 4s | hadoop-yarn-api in the patch passed. |
| +1 | unit | 22m 58s | hadoop-yarn-server-nodemanager in the patch passed. |
| +1 | asflicense | 0m 43s | The patch does not generate ASF License warnings. |
| | | 128m 27s | |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | ClientAPI=1.40 ServerAPI=1.40 base: https://ci-hadoop.apache.org/job/PreCommit-YARN-Build/23/artifact/out/Dockerfile |
| JIRA Issue | YARN-1529 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/13008719/YARN-1529.006.patch |
| Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle |
| uname | Linux f37ddd218c00 

[jira] [Commented] (YARN-1529) Add Localization overhead metrics to NM

2020-07-29 Thread Jim Brennan (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-1529?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17167516#comment-17167516
 ] 

Jim Brennan commented on YARN-1529:
---

[~epayne] I trust you to resolve the minor conflicts in the other branches.

 

> Add Localization overhead metrics to NM
> ---
>
> Key: YARN-1529
> URL: https://issues.apache.org/jira/browse/YARN-1529
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: nodemanager
>Reporter: Gera Shegalov
>Assignee: Jim Brennan
>Priority: Major
> Attachments: YARN-1529.005.patch, YARN-1529.006.patch, 
> YARN-1529.v01.patch, YARN-1529.v02.patch, YARN-1529.v03.patch, 
> YARN-1529.v04.patch
>
>
> Users are often unaware of localization cost that their jobs incur. To 
> measure effectiveness of localization caches it is necessary to expose the 
> overhead in the form of metrics.
> We propose addition of the following metrics to NodeManagerMetrics.
> When a container is about to launch, its set of LocalResources has to be 
> fetched from a central location, typically on HDFS, that results in a number 
> of download requests for the files missing in caches.
> LocalizedFilesMissed: total files (requests) downloaded from DFS. Cache 
> misses.
> LocalizedFilesCached: total localization requests that were served from local 
> caches. Cache hits.
> LocalizedBytesMissed: total bytes downloaded from DFS due to cache misses.
> LocalizedBytesCached: total bytes satisfied from local caches.
> Localized(Files|Bytes)CachedRatio: percentage of localized (files|bytes) that 
> were served out of cache: ratio = 100 * caches / (caches + misses)
> LocalizationDownloadNanos: total elapsed time in nanoseconds for a container 
> to go from ResourceRequestTransition to LocalizedTransition
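
The cached-ratio formula in the description can be sketched in Java as follows. This is only an illustration, not code from any of the attached patches: the class and method names are hypothetical, and returning 0 when nothing has been localized yet is an assumption, since the description does not say what to report in that case.

```java
/** Hypothetical helper illustrating ratio = 100 * caches / (caches + misses). */
final class LocalizationRatioSketch {
  private LocalizationRatioSketch() { }

  /**
   * Percentage of localized items (files or bytes) served out of cache.
   * Assumption: report 0 when there have been no requests at all, which
   * also avoids a division by zero. Integer division truncates the result.
   */
  static long cachedRatio(long cached, long missed) {
    long total = cached + missed;
    return total == 0 ? 0 : (100 * cached) / total;
  }

  public static void main(String[] args) {
    // Example: 3 files served from cache, 1 downloaded from DFS.
    System.out.println(cachedRatio(3, 1));
    // Example: no localization requests observed yet.
    System.out.println(cachedRatio(0, 0));
  }
}
```

The same method would apply to both the files and the bytes variants of the metric, since each is defined from a hit counter and a miss counter.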



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-1529) Add Localization overhead metrics to NM

2020-07-29 Thread Eric Payne (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-1529?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17167513#comment-17167513
 ] 

Eric Payne commented on YARN-1529:
--

[~Jim_Brennan], one thing I should point out is that the backports aren't 100% 
clean, but the conflicts are fairly minor. If you trust me to resolve them 
myself, I can just do it as part of the commit process. If you'd prefer, you 
can create separate patches for branch-3.2 and branch-2.10.

> Add Localization overhead metrics to NM
> ---
>
> Key: YARN-1529
> URL: https://issues.apache.org/jira/browse/YARN-1529
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: nodemanager
>Reporter: Gera Shegalov
>Assignee: Jim Brennan
>Priority: Major
> Attachments: YARN-1529.005.patch, YARN-1529.006.patch, 
> YARN-1529.v01.patch, YARN-1529.v02.patch, YARN-1529.v03.patch, 
> YARN-1529.v04.patch
>
>



[jira] [Commented] (YARN-1529) Add Localization overhead metrics to NM

2020-07-29 Thread Eric Payne (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-1529?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17167510#comment-17167510
 ] 

Eric Payne commented on YARN-1529:
--

Thanks for the updated patch, [~Jim_Brennan]. The changes LGTM.

+1

I will commit tomorrow if all looks well with the pre-commit build and there 
are no objections.

> Add Localization overhead metrics to NM
> ---
>
> Key: YARN-1529
> URL: https://issues.apache.org/jira/browse/YARN-1529
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: nodemanager
>Reporter: Gera Shegalov
>Assignee: Jim Brennan
>Priority: Major
> Attachments: YARN-1529.005.patch, YARN-1529.006.patch, 
> YARN-1529.v01.patch, YARN-1529.v02.patch, YARN-1529.v03.patch, 
> YARN-1529.v04.patch
>
>



[jira] [Commented] (YARN-1529) Add Localization overhead metrics to NM

2020-07-29 Thread Jim Brennan (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-1529?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17167499#comment-17167499
 ] 

Jim Brennan commented on YARN-1529:
---

[~epayne], for the checkstyle issues:
{quote}./hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/java/org/apache/hadoop/yarn/api/ApplicationConstants.java:227: /**: First sentence should end with a period. [JavadocStyle]
{quote}
I did not fix this because the added code follows the convention for the file.
{quote}./hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/container/Container.java:145: * {@link org.apache.hadoop.yarn.api.ApplicationConstants.Environment#LOCALIZATION_COUNTERS} : Line is longer than 80 characters (found 94). [LineLength]
{quote}
I did not fix this because it would require breaking up the link string.
{quote}./hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/container/ContainerImpl.java:104: private static enum LocalizationCounter {:11: Redundant 'static' modifier. [RedundantModifier]
{quote}
I fixed this one.

I also fixed the unit test and whitespace issues.  I am putting up patch 006 
with these fixes.
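
For context on the RedundantModifier warning: a nested enum in Java is implicitly static, so writing `static` on it changes nothing. A minimal sketch of this, where the class name and enum constants are illustrative and not the actual contents of ContainerImpl:

```java
/** Illustrative only: mirrors the shape of a private nested enum. */
final class NestedEnumSketch {
  // No 'static' keyword here, yet the enum is still a static member type.
  private enum Counter { FIRST, SECOND }

  /** Reports whether the nested enum carries the static modifier. */
  static boolean counterIsStatic() {
    return java.lang.reflect.Modifier.isStatic(Counter.class.getModifiers());
  }

  public static void main(String[] args) {
    System.out.println(counterIsStatic());
  }
}
```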

> Add Localization overhead metrics to NM
> ---
>
> Key: YARN-1529
> URL: https://issues.apache.org/jira/browse/YARN-1529
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: nodemanager
>Reporter: Gera Shegalov
>Assignee: Jim Brennan
>Priority: Major
> Attachments: YARN-1529.005.patch, YARN-1529.v01.patch, 
> YARN-1529.v02.patch, YARN-1529.v03.patch, YARN-1529.v04.patch
>
>



[jira] [Commented] (YARN-1529) Add Localization overhead metrics to NM

2020-07-29 Thread Hadoop QA (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-1529?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17167495#comment-17167495
 ] 

Hadoop QA commented on YARN-1529:
-

| (x) *-1 overall* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 0m 55s | Docker mode activated. |
|| || || || Prechecks ||
| +1 | dupname | 0m 0s | No case conflicting files found. |
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
| +1 | test4tests | 0m 0s | The patch appears to include 2 new or modified test files. |
|| || || || trunk Compile Tests ||
| 0 | mvndep | 3m 20s | Maven dependency ordering for branch |
| +1 | mvninstall | 27m 41s | trunk passed |
| +1 | compile | 8m 34s | trunk passed |
| +1 | checkstyle | 1m 39s | trunk passed |
| +1 | mvnsite | 1m 48s | trunk passed |
| +1 | shadedclient | 17m 58s | branch has no errors when building and testing our client artifacts. |
| +1 | javadoc | 1m 22s | trunk passed |
| 0 | spotbugs | 1m 33s | Used deprecated FindBugs config; considering switching to SpotBugs. |
| +1 | findbugs | 3m 35s | trunk passed |
|| || || || Patch Compile Tests ||
| 0 | mvndep | 0m 30s | Maven dependency ordering for patch |
| +1 | mvninstall | 1m 17s | the patch passed |
| +1 | compile | 7m 51s | the patch passed |
| +1 | javac | 7m 51s | the patch passed |
| -0 | checkstyle | 1m 29s | hadoop-yarn-project/hadoop-yarn: The patch generated 3 new + 373 unchanged - 0 fixed = 376 total (was 373) |
| +1 | mvnsite | 1m 47s | the patch passed |
| -1 | whitespace | 0m 0s | The patch has 2 line(s) that end in whitespace. Use git apply --whitespace=fix <>. Refer https://git-scm.com/docs/git-apply |
| +1 | shadedclient | 13m 47s | patch has no errors when building and testing our client artifacts. |
| +1 | javadoc | 1m 22s | the patch passed |
| +1 | findbugs | 3m 47s | the patch passed |
|| || || || Other Tests ||
| +1 | unit | 1m 4s | hadoop-yarn-api in the patch passed. |
| -1 | unit | 22m 41s | hadoop-yarn-server-nodemanager in the patch passed. |
| +1 | asflicense | 0m 49s | The patch does not generate ASF License warnings. |
| | | 122m 25s | |
\\
\\
|| Reason || Tests ||
| Failed junit tests | hadoop.yarn.server.nodemanager.containermanager.launcher.TestContainerLaunch |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | ClientAPI=1.40 ServerAPI=1.40 base: https://ci-hadoop.apache.org/job/PreCommit-YARN-Build/19/artifact/out/Dockerfile |
| JIRA Issue | YARN-1529 |
| JIRA Patch URL | 

[jira] [Commented] (YARN-1529) Add Localization overhead metrics to NM

2020-07-29 Thread Eric Payne (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-1529?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17167469#comment-17167469
 ] 

Eric Payne commented on YARN-1529:
--

bq. The TestContainerLaunch failures look like they are relevant. I will 
investigate.
Thanks [~Jim_Brennan]. Also, as long as you're at it, can you please look at 
the whitespace and checkstyle warnings?

> Add Localization overhead metrics to NM
> ---
>
> Key: YARN-1529
> URL: https://issues.apache.org/jira/browse/YARN-1529
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: nodemanager
>Reporter: Gera Shegalov
>Assignee: Jim Brennan
>Priority: Major
> Attachments: YARN-1529.005.patch, YARN-1529.v01.patch, 
> YARN-1529.v02.patch, YARN-1529.v03.patch, YARN-1529.v04.patch
>
>



[jira] [Commented] (YARN-1529) Add Localization overhead metrics to NM

2020-07-29 Thread Jim Brennan (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-1529?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17167465#comment-17167465
 ] 

Jim Brennan commented on YARN-1529:
---

The TestContainerLaunch failures look like they are relevant. I will 
investigate.


> Add Localization overhead metrics to NM
> ---
>
> Key: YARN-1529
> URL: https://issues.apache.org/jira/browse/YARN-1529
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: nodemanager
>Reporter: Gera Shegalov
>Assignee: Jim Brennan
>Priority: Major
> Attachments: YARN-1529.005.patch, YARN-1529.v01.patch, 
> YARN-1529.v02.patch, YARN-1529.v03.patch, YARN-1529.v04.patch
>
>



[jira] [Commented] (YARN-1529) Add Localization overhead metrics to NM

2020-07-29 Thread Hadoop QA (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-1529?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17167429#comment-17167429
 ] 

Hadoop QA commented on YARN-1529:
-

| (x) *-1 overall* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 0m 36s | Docker mode activated. |
|| || || || Prechecks ||
| +1 | dupname | 0m 0s | No case conflicting files found. |
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
| +1 | test4tests | 0m 0s | The patch appears to include 2 new or modified test files. |
|| || || || trunk Compile Tests ||
| 0 | mvndep | 1m 12s | Maven dependency ordering for branch |
| +1 | mvninstall | 21m 21s | trunk passed |
| +1 | compile | 8m 19s | trunk passed |
| +1 | checkstyle | 1m 38s | trunk passed |
| +1 | mvnsite | 1m 49s | trunk passed |
| +1 | shadedclient | 17m 50s | branch has no errors when building and testing our client artifacts. |
| +1 | javadoc | 1m 29s | trunk passed |
| 0 | spotbugs | 1m 31s | Used deprecated FindBugs config; considering switching to SpotBugs. |
| +1 | findbugs | 3m 28s | trunk passed |
|| || || || Patch Compile Tests ||
| 0 | mvndep | 0m 26s | Maven dependency ordering for patch |
| +1 | mvninstall | 1m 14s | the patch passed |
| +1 | compile | 7m 11s | the patch passed |
| +1 | javac | 7m 11s | the patch passed |
| -0 | checkstyle | 1m 32s | hadoop-yarn-project/hadoop-yarn: The patch generated 3 new + 374 unchanged - 0 fixed = 377 total (was 374) |
| +1 | mvnsite | 1m 42s | the patch passed |
| -1 | whitespace | 0m 0s | The patch has 2 line(s) that end in whitespace. Use git apply --whitespace=fix <>. Refer https://git-scm.com/docs/git-apply |
| +1 | shadedclient | 16m 8s | patch has no errors when building and testing our client artifacts. |
| +1 | javadoc | 1m 17s | the patch passed |
| +1 | findbugs | 3m 38s | the patch passed |
|| || || || Other Tests ||
| +1 | unit | 0m 56s | hadoop-yarn-api in the patch passed. |
| -1 | unit | 22m 25s | hadoop-yarn-server-nodemanager in the patch passed. |
| +1 | asflicense | 0m 51s | The patch does not generate ASF License warnings. |
| | | 113m 59s | |
\\
\\
|| Reason || Tests ||
| Failed junit tests | hadoop.yarn.server.nodemanager.containermanager.launcher.TestContainerLaunch |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | ClientAPI=1.40 ServerAPI=1.40 base: https://builds.apache.org/job/PreCommit-YARN-Build/26321/artifact/out/Dockerfile |
| JIRA Issue | YARN-1529 |
| JIRA Patch URL | 

[jira] [Commented] (YARN-1529) Add Localization overhead metrics to NM

2020-04-23 Thread Eric Payne (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-1529?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17090843#comment-17090843
 ] 

Eric Payne commented on YARN-1529:
--

bq.  It might be worth splitting the patch so the less controversial NM-level 
metrics can go in earlier and we can discuss the per-container metrics API in 
another. 
+1 for this idea. We would like to see the NM metrics piece integrated.

> Add Localization overhead metrics to NM
> ---
>
> Key: YARN-1529
> URL: https://issues.apache.org/jira/browse/YARN-1529
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: nodemanager
>Reporter: Gera Shegalov
>Assignee: Chris Trezzo
>Priority: Major
> Attachments: YARN-1529.v01.patch, YARN-1529.v02.patch, 
> YARN-1529.v03.patch, YARN-1529.v04.patch
>
>



[jira] [Commented] (YARN-1529) Add Localization overhead metrics to NM

2016-08-18 Thread Jason Lowe (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1529?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15427301#comment-15427301
 ] 

Jason Lowe commented on YARN-1529:
--

bq. One comment that I have is we are adding a new API, albeit a small one, for 
YARN application developers.

That's a great point, and actually I'd be perfectly happy if this JIRA simply 
added the NM-level metric source and skipped the container API part for now.  
If we're moving towards doing this via the ATS anyway, we may not want/need the 
env variable API.  It might be worth splitting the patch so the less 
controversial NM-level metrics can go in earlier and we can discuss the 
per-container metrics API in another.  If the consensus is that this patch 
should include the per-container metrics API via the container env as well then 
I'm OK with that too.  I also agree that hiding the implementation details of 
that API would be important, whether that's in this JIRA or another.

Either way the patch needs an update, and please feel free to do so.

> Add Localization overhead metrics to NM
> ---
>
> Key: YARN-1529
> URL: https://issues.apache.org/jira/browse/YARN-1529
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: nodemanager
>Reporter: Gera Shegalov
>Assignee: Chris Trezzo
> Attachments: YARN-1529.v01.patch, YARN-1529.v02.patch, 
> YARN-1529.v03.patch, YARN-1529.v04.patch
>
>



[jira] [Commented] (YARN-1529) Add Localization overhead metrics to NM

2016-08-18 Thread Chris Trezzo (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1529?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15427222#comment-15427222
 ] 

Chris Trezzo commented on YARN-1529:


Thanks [~jlowe] for the rebased patch! I agree that it would be nice to not tie 
these localization metrics to ATS so that more people can leverage them earlier.

One comment that I have is we are adding a new API, albeit a small one, for 
YARN application developers. This API is the serialized data we put into the 
environment variable (LOCALIZATION_COUNTERS) to communicate the localization 
statistics to the application-level container. Currently, if a YARN developer 
wants to leverage these metrics, they have to figure out how information is 
serialized into this env var and hope it doesn't change. What do you think 
about adding a small class/method that defines this a little more formally and 
contains the deserialization logic? That way if another application, let's say 
Tez, wants to leverage this data, it can just call the new deserialize method.

If you think this is a good idea, I can post another patch with the added 
class. Thanks!
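
To illustrate the kind of small class described above, here is a minimal sketch. The class name, its fields, and in particular the wire format (five comma-separated longs) are assumptions made for illustration; the thread does not state how the patch actually serializes LOCALIZATION_COUNTERS.

```java
/**
 * Hypothetical helper, not part of any posted patch. The assumed format is
 * "filesMissed,filesCached,bytesMissed,bytesCached,downloadNanos".
 */
final class LocalizationCounters {
  /** Environment variable name mentioned in this thread. */
  static final String ENV_NAME = "LOCALIZATION_COUNTERS";

  final long filesMissed;
  final long filesCached;
  final long bytesMissed;
  final long bytesCached;
  final long downloadNanos;

  private LocalizationCounters(long[] v) {
    this.filesMissed = v[0];
    this.filesCached = v[1];
    this.bytesMissed = v[2];
    this.bytesCached = v[3];
    this.downloadNanos = v[4];
  }

  /** Parses the assumed five-field form; rejects anything else. */
  static LocalizationCounters deserialize(String serialized) {
    if (serialized == null) {
      throw new IllegalArgumentException(ENV_NAME + " is not set");
    }
    String[] parts = serialized.trim().split(",");
    if (parts.length != 5) {
      throw new IllegalArgumentException(
          "expected 5 fields, got " + parts.length);
    }
    long[] values = new long[5];
    for (int i = 0; i < parts.length; i++) {
      // NumberFormatException is a subclass of IllegalArgumentException.
      values[i] = Long.parseLong(parts[i].trim());
    }
    return new LocalizationCounters(values);
  }

  /** Convenience for code running inside a container. */
  static LocalizationCounters fromEnvironment() {
    return deserialize(System.getenv(ENV_NAME));
  }
}
```

With something along these lines, a framework would call `LocalizationCounters.fromEnvironment()` once at container start and never need to know the serialized form.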

> Add Localization overhead metrics to NM
> ---
>
> Key: YARN-1529
> URL: https://issues.apache.org/jira/browse/YARN-1529
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: nodemanager
>Reporter: Gera Shegalov
>Assignee: Chris Trezzo
> Attachments: YARN-1529.v01.patch, YARN-1529.v02.patch, 
> YARN-1529.v03.patch, YARN-1529.v04.patch



[jira] [Commented] (YARN-1529) Add Localization overhead metrics to NM

2016-07-22 Thread Chris Trezzo (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1529?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15389871#comment-15389871
 ] 

Chris Trezzo commented on YARN-1529:


I can take a crack at rebasing this patch and adjusting it so that it writes to 
ATS.

> Add Localization overhead metrics to NM
> ---
>
> Key: YARN-1529
> URL: https://issues.apache.org/jira/browse/YARN-1529
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: nodemanager
>Reporter: Gera Shegalov
>Assignee: Gera Shegalov
> Attachments: YARN-1529.v01.patch, YARN-1529.v02.patch, 
> YARN-1529.v03.patch



[jira] [Commented] (YARN-1529) Add Localization overhead metrics to NM

2016-07-22 Thread Chris Trezzo (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1529?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15389855#comment-15389855
 ] 

Chris Trezzo commented on YARN-1529:


Scratch that last comment. If metrics are written to ATS the application 
doesn't have to be aware of it at all. It is just surfaced through the ATS UI.

> Add Localization overhead metrics to NM
> ---
>
> Key: YARN-1529
> URL: https://issues.apache.org/jira/browse/YARN-1529
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: nodemanager
>Reporter: Gera Shegalov
>Assignee: Gera Shegalov
> Attachments: YARN-1529.v01.patch, YARN-1529.v02.patch, 
> YARN-1529.v03.patch



[jira] [Commented] (YARN-1529) Add Localization overhead metrics to NM

2016-07-22 Thread Chris Trezzo (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1529?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15389841#comment-15389841
 ] 

Chris Trezzo commented on YARN-1529:


[~mingma] that makes total sense. [~sjlee0] Is there anything that would 
prevent an application-level process running in a container from querying ATS 
for framework level metrics about the container itself while the container is 
running?

As a side note, one interesting thing about these particular metrics is that, as they 
stand now, they do not change once the container is up and running (i.e. all 
localization for the container is done).

> Add Localization overhead metrics to NM
> ---
>
> Key: YARN-1529
> URL: https://issues.apache.org/jira/browse/YARN-1529
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: nodemanager
>Reporter: Gera Shegalov
>Assignee: Gera Shegalov
> Attachments: YARN-1529.v01.patch, YARN-1529.v02.patch, 
> YARN-1529.v03.patch



[jira] [Commented] (YARN-1529) Add Localization overhead metrics to NM

2016-07-21 Thread Ming Ma (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1529?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15388826#comment-15388826
 ] 

Ming Ma commented on YARN-1529:
---

With ATS v2 in trunk and other frameworks such as Tez wanting such a feature, I 
wonder if there is a way to implement it completely in YARN (without the 
MR change in MAPREDUCE-5696) by having YARN write framework-independent 
application metrics directly to ATS.

> Add Localization overhead metrics to NM
> ---
>
> Key: YARN-1529
> URL: https://issues.apache.org/jira/browse/YARN-1529
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: nodemanager
>Reporter: Gera Shegalov
>Assignee: Gera Shegalov
> Attachments: YARN-1529.v01.patch, YARN-1529.v02.patch, 
> YARN-1529.v03.patch



[jira] [Commented] (YARN-1529) Add Localization overhead metrics to NM

2015-01-06 Thread Andy Schlaikjer (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1529?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14267068#comment-14267068
 ] 

Andy Schlaikjer commented on YARN-1529:
---

Any update on this? These new metrics look valuable.

 Add Localization overhead metrics to NM
 ---

 Key: YARN-1529
 URL: https://issues.apache.org/jira/browse/YARN-1529
 Project: Hadoop YARN
  Issue Type: Improvement
  Components: nodemanager
Reporter: Gera Shegalov
Assignee: Gera Shegalov
 Attachments: YARN-1529.v01.patch, YARN-1529.v02.patch, 
 YARN-1529.v03.patch




[jira] [Commented] (YARN-1529) Add Localization overhead metrics to NM

2015-01-06 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1529?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14267075#comment-14267075
 ] 

Hadoop QA commented on YARN-1529:
-

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12621292/YARN-1529.v03.patch
  against trunk revision 788ee35.

{color:red}-1 patch{color}.  The patch command could not apply the patch.

Console output: https://builds.apache.org/job/PreCommit-YARN-Build/6260//console

This message is automatically generated.

 Add Localization overhead metrics to NM
 ---

 Key: YARN-1529
 URL: https://issues.apache.org/jira/browse/YARN-1529
 Project: Hadoop YARN
  Issue Type: Improvement
  Components: nodemanager
Reporter: Gera Shegalov
Assignee: Gera Shegalov
 Attachments: YARN-1529.v01.patch, YARN-1529.v02.patch, 
 YARN-1529.v03.patch




[jira] [Commented] (YARN-1529) Add Localization overhead metrics to NM

2014-01-03 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1529?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13861403#comment-13861403
 ] 

Hadoop QA commented on YARN-1529:
-

{color:green}+1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12621292/YARN-1529.v03.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 2 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  The javadoc tool did not generate any 
warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager.

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-YARN-Build/2791//testReport/
Console output: https://builds.apache.org/job/PreCommit-YARN-Build/2791//console

This message is automatically generated.

 Add Localization overhead metrics to NM
 ---

 Key: YARN-1529
 URL: https://issues.apache.org/jira/browse/YARN-1529
 Project: Hadoop YARN
  Issue Type: Improvement
  Components: nodemanager
Reporter: Gera Shegalov
Assignee: Gera Shegalov
 Attachments: YARN-1529.v01.patch, YARN-1529.v02.patch, 
 YARN-1529.v03.patch





--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (YARN-1529) Add Localization overhead metrics to NM

2014-01-02 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1529?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13861167#comment-13861167
 ] 

Hadoop QA commented on YARN-1529:
-

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12621228/YARN-1529.v02.patch
  against trunk revision .

{color:red}-1 patch{color}.  The patch command could not apply the patch.

Console output: https://builds.apache.org/job/PreCommit-YARN-Build/2783//console

This message is automatically generated.

 Add Localization overhead metrics to NM
 ---

 Key: YARN-1529
 URL: https://issues.apache.org/jira/browse/YARN-1529
 Project: Hadoop YARN
  Issue Type: Improvement
  Components: nodemanager
Reporter: Gera Shegalov
Assignee: Gera Shegalov
 Attachments: YARN-1529.v01.patch, YARN-1529.v02.patch




[jira] [Commented] (YARN-1529) Add Localization overhead metrics to NM

2014-01-02 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1529?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13861211#comment-13861211
 ] 

Hadoop QA commented on YARN-1529:
-

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12621247/YARN-1529.v02.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 2 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:red}-1 javadoc{color}.  The javadoc tool appears to have generated 5 
warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager.

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-YARN-Build/2787//testReport/
Console output: https://builds.apache.org/job/PreCommit-YARN-Build/2787//console

This message is automatically generated.

 Add Localization overhead metrics to NM
 ---

 Key: YARN-1529
 URL: https://issues.apache.org/jira/browse/YARN-1529
 Project: Hadoop YARN
  Issue Type: Improvement
  Components: nodemanager
Reporter: Gera Shegalov
Assignee: Gera Shegalov
 Attachments: YARN-1529.v01.patch, YARN-1529.v02.patch




[jira] [Commented] (YARN-1529) Add Localization overhead metrics to NM

2013-12-24 Thread Hitesh Shah (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1529?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13856421#comment-13856421
 ] 

Hitesh Shah commented on YARN-1529:
---

bq. I am preparing a patch that exposes this information MR counters for MRv2. 
Is there a better way to achieve this in an application-agnostic manner such 
that it is visible in the webUI.
Also, is there an MR jira for the per job stats? Furthermore, shouldn't the 
per-application implementation be such that all applications on YARN can leverage 
it, as opposed to just an MR-specific implementation?

bq. Currently all resource types are lumped together. We can have a discussion 
whether it's helpful to expose a finer break down at the NM level or the 
app-level.
Is there any comment/doc that describes the overall plan/approach that you are 
trying to implement? I am not sure how these metrics translate into any 
actionable insights for a cluster admin to act upon. 



 Add Localization overhead metrics to NM
 ---

 Key: YARN-1529
 URL: https://issues.apache.org/jira/browse/YARN-1529
 Project: Hadoop YARN
  Issue Type: Improvement
  Components: nodemanager
Reporter: Gera Shegalov
Assignee: Gera Shegalov
 Attachments: YARN-1529.v01.patch




[jira] [Commented] (YARN-1529) Add Localization overhead metrics to NM

2013-12-24 Thread Gera Shegalov (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1529?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13856439#comment-13856439
 ] 

Gera Shegalov commented on YARN-1529:
-

bq. Also, is there an MR jira for the per job stats? 

I linked MAPREDUCE-5696 to this JIRA. 

bq. Furthermore, shouldn't the per application implementation be such that all 
applications on YARN can leverage it as compared to just an MR specific 
implementation.

Ideally, yes. As stated in the previous comment, I am open to suggestions. As of now 
there seem to be no common application metrics. I expose localization cost to 
containers as an environment variable (LOCALIZATION_COUNTERS) in MAPREDUCE-5696. 
MR containers add the counters as TaskCounter. We can also include them in 
MRAppMetrics. Other applications can use this variable in some other way.

bq. Is there any comment/doc that describes the overall plan/approach that you 
are trying to implement?

The background is in YARN-1492

bq.  I am not sure how these metrics translate into any actionable insights for 
a cluster admin to act upon.

Users will see how localization overhead (shipping computation to data) 
compares to their container execution times. It should help them reconsider 
build/packaging strategies, encourage making better use of DistributedCache, 
etc. Admins will be able to better dissect network utilization in the cluster.  
Our particular goal is to clearly demo the usefulness of YARN-1492.
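
As a rough illustration of that comparison, the sketch below computes the cached ratio from the issue description (ratio = 100 * caches / (caches + misses)) and the share of a container's time spent localizing. The class and method names are hypothetical, and returning 0 when there were no requests is an assumption, not something the patch specifies.

```java
/** Hypothetical helper for interpreting the proposed metrics. */
final class LocalizationOverhead {
  private LocalizationOverhead() {}

  /**
   * Percentage served from cache: 100 * cached / (cached + missed).
   * Works for either file counts or byte counts.
   * Returns 0 when nothing was requested (assumption).
   */
  static double cachedRatioPercent(long cached, long missed) {
    double total = (double) cached + (double) missed;
    return total == 0.0 ? 0.0 : 100.0 * cached / total;
  }

  /**
   * Fraction of (localization + execution) time spent localizing,
   * with both arguments in nanoseconds.
   */
  static double localizationShare(long localizationNanos,
                                  long executionNanos) {
    double total = (double) localizationNanos + (double) executionNanos;
    return total == 0.0 ? 0.0 : localizationNanos / total;
  }
}
```

For example, 7 cached and 3 missed files give a ratio of 70 percent, and a container that localizes for 2 s and then runs for 8 s spends a fifth of its time on localization.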


 Add Localization overhead metrics to NM
 ---

 Key: YARN-1529
 URL: https://issues.apache.org/jira/browse/YARN-1529
 Project: Hadoop YARN
  Issue Type: Improvement
  Components: nodemanager
Reporter: Gera Shegalov
Assignee: Gera Shegalov
 Attachments: YARN-1529.v01.patch




[jira] [Commented] (YARN-1529) Add Localization overhead metrics to NM

2013-12-23 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1529?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13856014#comment-13856014
 ] 

Hadoop QA commented on YARN-1529:
-

{color:green}+1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12620280/YARN-1529.v01.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 1 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  The javadoc tool did not generate any 
warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager.

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-YARN-Build/2718//testReport/
Console output: https://builds.apache.org/job/PreCommit-YARN-Build/2718//console

This message is automatically generated.

 Add Localization overhead metrics to NM
 ---

 Key: YARN-1529
 URL: https://issues.apache.org/jira/browse/YARN-1529
 Project: Hadoop YARN
  Issue Type: Bug
  Components: nodemanager
Reporter: Gera Shegalov
Assignee: Gera Shegalov
 Attachments: YARN-1529.v01.patch


 Users are often unaware of localization cost that their jobs incur. To 
 measure effectiveness of localization caches it is necessary to expose the 
 overhead in the form of metrics.
 We propose addition of the following metrics to NodeManagerMetrics.
 When a container is about to launch, its set of LocalResources has to be 
 fetched from a central location, typically on HDFS, that results in a number 
 of download requests for the files missing in caches.
 LocalizedFilesMissed: total files (requests) downloaded from DFS. Cache 
 misses.
 LocalizedFilesCached: total localization requests that were served from local 
 caches. Cache hits.
 LocalizedBytesMissed: total bytes downloaded from DFS due to cache misses.
 LocalizedBytesCached: total bytes satisfied from local caches.
 Localized(Files|Bytes)CachedRatio: percentage of localized (files|bytes) that 
 were served out of cache: ratio = 100 * caches / (caches + misses)
 LocalizationDownloadNanos: total elapsed time in nanoseconds for a container 
 to go from ResourceRequestTransition to LocalizedTransition
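
A minimal sketch of the Localized(Files|Bytes)CachedRatio formula above, for
illustration only. The class and method names are hypothetical and are not
taken from the attached patch; returning 0 when nothing has been localized yet
is also an assumption, since the description does not say how that case is
handled.

```java
// Hypothetical helper illustrating ratio = 100 * caches / (caches + misses).
// Not part of the YARN-1529 patch.
final class LocalizationCacheRatio {
  private LocalizationCacheRatio() {}

  /**
   * Percentage of localized items (files or bytes) served out of cache.
   * Assumption: reports 0 when there have been no localization requests.
   */
  static double cachedRatio(long cached, long missed) {
    long total = cached + missed;
    if (total == 0) {
      return 0.0;
    }
    return 100.0 * cached / total;
  }

  public static void main(String[] args) {
    // 30 cache hits and 10 cache misses
    System.out.println(cachedRatio(30, 10));
  }
}
```

The same method applies to both the file counters (LocalizedFilesCached,
LocalizedFilesMissed) and the byte counters (LocalizedBytesCached,
LocalizedBytesMissed).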





[jira] [Commented] (YARN-1529) Add Localization overhead metrics to NM

2013-12-23 Thread Hitesh Shah (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1529?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13856023#comment-13856023
 ] 

Hitesh Shah commented on YARN-1529:
---

[~jira.shegalov] Could you add more details on how users should interpret these 
new metrics? Does the cache ratio account for local resource visibility, 
i.e. are public cache misses more important than cache misses for application 
visibility? I assume LocalizationDownloadNanos is an average per 
container? How does an average help when there are numerous application types 
with different numbers of resources and each container faces a different cache 
hit ratio? Is this something that needs to be added to the container status 
rather than a general NM metric? For that matter, what is the better option: 
tracking localization metrics at the NM level or tracking them at a per 
container/per app level? 

Further thoughts:
 - Shouldn't there be a metric that tracks the actual size of the local 
resource cache on disk?
 - How are public/private/application caches being considered?
 - What about different resource types - file/archive/pattern? 







[jira] [Commented] (YARN-1529) Add Localization overhead metrics to NM

2013-12-23 Thread Gera Shegalov (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1529?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13856042#comment-13856042
 ] 

Gera Shegalov commented on YARN-1529:
-

Hi [~hitesh] thanks for chiming in!

> Does the cache ratio account for the local resource visibility i.e. public 
> cache misses are more important than cache misses for application visibility?

The current patch does not differentiate between cache visibilities. I am open 
to suggestions on whether a finer breakdown of cache misses would be helpful. 
The goal of this and a follow-up MAPREDUCE issue is to raise awareness at the 
aggregate level that shipping computation to data is not free.

> I assume the LocalizationDownloadNanos is an average per container? How 
> does an average help when there are numerous application types with diff no. 
> of resources and each container facing a different cache hit ratio? Is this 
> something which needs to be augmented into the container status and not a 
> general NM metric? 

LocalizationDownloadNanos is the total container launch delay due to 
localization, summed over all containers. An average can be obtained as 
{code}LocalizationDownloadNanos / ContainersLaunched{code}.
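
For illustration, a minimal sketch of that average, converted to milliseconds 
for readability. The class and method names are hypothetical and not part of 
the patch; the two arguments stand for the LocalizationDownloadNanos and 
ContainersLaunched metric values, and returning 0 when no container has been 
launched is an assumption.

```java
import java.util.concurrent.TimeUnit;

// Hypothetical helper, not part of the YARN-1529 patch.
final class LocalizationDelay {
  private LocalizationDelay() {}

  /**
   * Average localization delay per launched container, in milliseconds.
   * Assumption: reports 0 when no containers have been launched.
   */
  static double averageMillisPerContainer(long localizationDownloadNanos,
                                          long containersLaunched) {
    if (containersLaunched <= 0) {
      return 0.0;
    }
    double avgNanos = (double) localizationDownloadNanos / containersLaunched;
    return avgNanos / TimeUnit.MILLISECONDS.toNanos(1);
  }

  public static void main(String[] args) {
    // 6 seconds of total localization delay spread over 4 containers
    System.out.println(averageMillisPerContainer(6_000_000_000L, 4));
  }
}
```

Because this is an NM-wide average, it says nothing about how the delay is 
distributed across individual containers or applications.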

> For that matter, what is the better option - tracking localization metrics on 
> the NM level or tracking them on a per container/per app level?

I am preparing a patch that exposes this information as MR counters for MRv2. 
Is there a better way to achieve this in an application-agnostic manner such 
that it is visible in the webUI?

> Shouldn't there be a metric that tracks the actual size of the local resource 
> cache on disk?

This is a very good idea in my opinion.

> What about different resource types - file/archive/pattern?

Currently all resource types are lumped together. We can have a discussion on 
whether it's helpful to expose a finer breakdown at the NM level or the 
app level.



