[jira] [Commented] (YARN-3464) Race condition in LocalizerRunner kills localizer before localizing all resources

2015-04-26 Thread Karthik Kambatla (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3464?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14513122#comment-14513122
 ] 

Karthik Kambatla commented on YARN-3464:


Just committed to trunk and branch-2. 

Thanks [~zxu] for the patch, and [~jlowe] for your inputs.

 Race condition in LocalizerRunner kills localizer before localizing all 
 resources
 -

 Key: YARN-3464
 URL: https://issues.apache.org/jira/browse/YARN-3464
 Project: Hadoop YARN
  Issue Type: Bug
  Components: nodemanager
Reporter: zhihai xu
Assignee: zhihai xu
Priority: Critical
 Attachments: YARN-3464.000.patch, YARN-3464.001.patch


 Race condition in LocalizerRunner causes container localization timeout.
 Currently, LocalizerRunner kills the ContainerLocalizer when the pending list 
 of LocalizerResourceRequestEvents is empty.
 {code}
   } else if (pending.isEmpty()) {
     action = LocalizerAction.DIE;
   }
 {code}
 If a LocalizerResourceRequestEvent is added after LocalizerRunner kills the 
 ContainerLocalizer because of an empty pending list, that 
 LocalizerResourceRequestEvent will never be handled.
 Without a ContainerLocalizer, LocalizerRunner#update will never be called.
 The container will stay in the LOCALIZING state until it is killed by the 
 AM due to TASK_TIMEOUT.
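
 The race described above can be illustrated with a small, deterministic model. 
 This is only a sketch and not the NodeManager code: ToyLocalizerRunner, 
 ToyAction, heartbeat, addResource and strandedRequests are hypothetical names 
 chosen for the illustration. Only the pending list and the DIE action are 
 taken from the description. The model shows that a request queued after the 
 localizer has been told to die is never picked up.

```java
import java.util.ArrayDeque;
import java.util.Queue;

// Hypothetical model of the race; these are not the actual NodeManager classes.
enum ToyAction { LIVE, DIE }

class ToyLocalizerRunner {
  private final Queue<String> pending = new ArrayDeque<>();
  private boolean localizerAlive = true;

  // Called when a new resource request arrives for this container.
  void addResource(String resource) {
    pending.add(resource);
  }

  // Models one heartbeat from the localizer process: hand out one pending
  // resource, or tell the localizer to die when nothing is pending.
  ToyAction heartbeat() {
    if (!localizerAlive) {
      throw new IllegalStateException("no localizer left to heartbeat");
    }
    if (pending.isEmpty()) {
      localizerAlive = false;
      return ToyAction.DIE;
    }
    pending.poll();
    return ToyAction.LIVE;
  }

  // Requests that can no longer be served because the localizer is gone.
  int strandedRequests() {
    return localizerAlive ? 0 : pending.size();
  }
}

class RaceDemo {
  static int run() {
    ToyLocalizerRunner runner = new ToyLocalizerRunner();
    runner.addResource("job.jar");
    runner.heartbeat();             // localizes job.jar
    runner.heartbeat();             // pending is empty, so the localizer is told to DIE
    runner.addResource("job.xml");  // arrives too late: nobody will heartbeat again
    return runner.strandedRequests();
  }
}
```

 In this model RaceDemo.run() leaves one request stranded, which corresponds 
 to the container staying in LOCALIZING until the timeout.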



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-3464) Race condition in LocalizerRunner kills localizer before localizing all resources

2015-04-26 Thread Karthik Kambatla (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-3464?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Karthik Kambatla updated YARN-3464:
---
Summary: Race condition in LocalizerRunner kills localizer before 
localizing all resources  (was: Race condition in LocalizerRunner causes 
container localization timeout.)



[jira] [Commented] (YARN-3464) Race condition in LocalizerRunner causes container localization timeout.

2015-04-26 Thread Karthik Kambatla (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3464?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14513106#comment-14513106
 ] 

Karthik Kambatla commented on YARN-3464:


+1



[jira] [Commented] (YARN-3464) Race condition in LocalizerRunner kills localizer before localizing all resources

2015-04-26 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3464?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14513116#comment-14513116
 ] 

Hudson commented on YARN-3464:
--

FAILURE: Integrated in Hadoop-trunk-Commit #7679 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/7679/])
YARN-3464. Race condition in LocalizerRunner kills localizer before localizing 
all resources. (Zhihai Xu via kasha) (kasha: rev 
47279c3228185548ed09c36579b420225e4894f5)
* hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/container/ContainerImpl.java
* hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/localizer/event/LocalizationEventType.java
* hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/localizer/TestResourceLocalizationService.java
* hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/localizer/ResourceLocalizationService.java
* hadoop-yarn-project/CHANGES.txt




[jira] [Commented] (YARN-3464) Race condition in LocalizerRunner kills localizer before localizing all resources

2015-04-26 Thread Gera Shegalov (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3464?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14513247#comment-14513247
 ] 

Gera Shegalov commented on YARN-3464:
-

We might need to tweak the checkstyle rules. There are a bunch of 80-column-limit 
violations that seem to come from the import statements.

 Race condition in LocalizerRunner kills localizer before localizing all 
 resources
 -

 Key: YARN-3464
 URL: https://issues.apache.org/jira/browse/YARN-3464
 Project: Hadoop YARN
  Issue Type: Bug
  Components: nodemanager
Reporter: zhihai xu
Assignee: zhihai xu
Priority: Critical
 Fix For: 2.8.0

 Attachments: YARN-3464.000.patch, YARN-3464.001.patch




[jira] [Updated] (YARN-3458) CPU resource monitoring in Windows

2015-04-26 Thread Inigo Goiri (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-3458?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Inigo Goiri updated YARN-3458:
--
Attachment: YARN-3458-5.patch

Adding unit tests.

 CPU resource monitoring in Windows
 --

 Key: YARN-3458
 URL: https://issues.apache.org/jira/browse/YARN-3458
 Project: Hadoop YARN
  Issue Type: New Feature
  Components: nodemanager
Affects Versions: 2.7.0
 Environment: Windows
Reporter: Inigo Goiri
Assignee: Inigo Goiri
Priority: Minor
  Labels: containers, metrics, windows
 Attachments: YARN-3458-1.patch, YARN-3458-2.patch, YARN-3458-3.patch, 
 YARN-3458-4.patch, YARN-3458-5.patch

   Original Estimate: 168h
  Remaining Estimate: 168h

 The current implementation of getCpuUsagePercent() for 
 WindowsBasedProcessTree is left as unavailable. Attached a proposal of how to 
 do it. I reused the CpuTimeTracker using 1 jiffy=1ms.
 This was left open by YARN-3122.
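
 The idea of deriving a CPU usage percentage from cumulative CPU time with 
 1 jiffy = 1 ms can be sketched as follows. This is a minimal illustration 
 and not Hadoop's CpuTimeTracker: the class ToyCpuTracker, its methods and 
 the -1 sentinel for "no data yet" are assumptions made for the example. 
 Usage is the CPU time consumed between two samples divided by the elapsed 
 wall-clock time.

```java
// Hypothetical sketch; this is not Hadoop's CpuTimeTracker.
class ToyCpuTracker {
  static final float UNAVAILABLE = -1.0f;  // assumed sentinel for "no data yet"

  private final long jiffyLengthMs;
  private long lastCpuJiffies = -1;
  private long lastSampleTimeMs = -1;
  private float cpuUsagePercent = UNAVAILABLE;

  ToyCpuTracker(long jiffyLengthMs) {
    this.jiffyLengthMs = jiffyLengthMs;
  }

  // Record the cumulative CPU time (in jiffies) observed at wall-clock time nowMs.
  void update(long cumulativeCpuJiffies, long nowMs) {
    if (lastSampleTimeMs >= 0 && nowMs > lastSampleTimeMs) {
      long cpuMs = (cumulativeCpuJiffies - lastCpuJiffies) * jiffyLengthMs;
      long wallMs = nowMs - lastSampleTimeMs;
      cpuUsagePercent = 100.0f * cpuMs / wallMs;
    }
    lastCpuJiffies = cumulativeCpuJiffies;
    lastSampleTimeMs = nowMs;
  }

  // Stays UNAVAILABLE until two samples have been recorded.
  float getCpuUsagePercent() {
    return cpuUsagePercent;
  }
}
```

 For example, a process tree that accumulates 500 ms of CPU time over a 
 1000 ms interval reports 50%. With several cores the value can exceed 100%.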





[jira] [Commented] (YARN-3464) Race condition in LocalizerRunner kills localizer before localizing all resources

2015-04-26 Thread zhihai xu (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3464?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14513153#comment-14513153
 ] 

zhihai xu commented on YARN-3464:
-

Thanks [~kasha] for reviewing and committing the patch, and thanks [~jlowe] for 
the valuable feedback.



[jira] [Commented] (YARN-3458) CPU resource monitoring in Windows

2015-04-26 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3458?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14513199#comment-14513199
 ] 

Hadoop QA commented on YARN-3458:
-

\\
\\
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | pre-patch |  14m 32s | Pre-patch trunk compilation is healthy. |
| {color:green}+1{color} | @author |   0m  0s | The patch does not contain any @author tags. |
| {color:green}+1{color} | tests included |   0m  0s | The patch appears to include 1 new or modified test files. |
| {color:green}+1{color} | whitespace |   0m  0s | The patch has no lines that end in whitespace. |
| {color:green}+1{color} | javac |   7m 32s | There were no new javac warning messages. |
| {color:green}+1{color} | javadoc |   9m 43s | There were no new javadoc warning messages. |
| {color:green}+1{color} | release audit |   0m 23s | The applied patch does not increase the total number of release audit warnings. |
| {color:red}-1{color} | checkstyle |   5m 27s | The applied patch generated  1 additional checkstyle issues. |
| {color:green}+1{color} | install |   1m 37s | mvn install still works. |
| {color:green}+1{color} | eclipse:eclipse |   0m 35s | The patch built with eclipse:eclipse. |
| {color:green}+1{color} | findbugs |   1m 29s | The patch does not introduce any new Findbugs (version 2.0.3) warnings. |
| {color:green}+1{color} | yarn tests |   2m  0s | Tests passed in hadoop-yarn-common. |
| | |  43m 21s | |
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | http://issues.apache.org/jira/secure/attachment/12728247/YARN-3458-5.patch |
| Optional Tests | javadoc javac unit findbugs checkstyle |
| git revision | trunk / 8b69c82 |
| checkstyle | https://builds.apache.org/job/PreCommit-YARN-Build/7504/artifact/patchprocess/checkstyle-result-diff.txt |
| hadoop-yarn-common test log | https://builds.apache.org/job/PreCommit-YARN-Build/7504/artifact/patchprocess/testrun_hadoop-yarn-common.txt |
| Test Results | https://builds.apache.org/job/PreCommit-YARN-Build/7504/testReport/ |
| Console output | https://builds.apache.org/job/PreCommit-YARN-Build/7504/console |


This message was automatically generated.



[jira] [Commented] (YARN-3484) Fix up yarn top shell code

2015-04-26 Thread Allen Wittenauer (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3484?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14513135#comment-14513135
 ] 

Allen Wittenauer commented on YARN-3484:


* Variables that are local to a function should be declared local.
* Avoid using mixed case, as per the shell programming guidelines.
* yarnTopArgs is effectively a global. It should either get renamed to 
YARN_foo or similar so that it does not pollute the shell name space, or, as 
another approach, set_yarn_top_args could be processed as a subshell, reading 
its input directly, to avoid the global entirely.
* set_yarn_top_args should be hadoop_ something so as to not pollute the shell 
name space.
* nit: technically, TERM isn't guaranteed to be set on all OSes under all 
workable modes, since it is the login process' responsibility to set it. 
However, almost all modern systems do set it and it's fairly reliable. I think 
it's OK to leave the check, but I wanted to make this comment here for future 
readers in case they hit the situation where TERM wasn't set for their 
particular system. Yes, that situation was thought about, but honestly, 
upgrade.

 Fix up yarn top shell code
 --

 Key: YARN-3484
 URL: https://issues.apache.org/jira/browse/YARN-3484
 Project: Hadoop YARN
  Issue Type: Bug
  Components: scripts
Affects Versions: 3.0.0
Reporter: Allen Wittenauer
Assignee: Varun Vasudev
 Attachments: YARN-3484.001.patch


 We need to do some work on yarn top's shell code.
 a) Just checking for TERM isn't good enough.  We really need to check the 
 return on tput, especially since the output will not be a number but an error 
 string which will likely blow up the java code in horrible ways.
 b) All the single bracket tests should be double brackets to force the bash 
 built-in.
 c) I think I'd rather see the shell portion in a function since it's rather 
 large.  This will allow args, etc., to get local'ized and will clean up the 
 case statement.





[jira] [Comment Edited] (YARN-3172) MR-279: Write a simple Java application

2015-04-26 Thread Allen Wittenauer (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3172?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14513367#comment-14513367
 ] 

Allen Wittenauer edited comment on YARN-3172 at 4/27/15 12:48 AM:
--

Welp, I'm committing this to trunk if test-patch says it is still good to go.


was (Author: aw):
Welp, I'm committing this to trunk.

 MR-279: Write a simple Java application
 ---

 Key: YARN-3172
 URL: https://issues.apache.org/jira/browse/YARN-3172
 Project: Hadoop YARN
  Issue Type: New Feature
Reporter: Sharad Agarwal
Assignee: Devaraj K
 Attachments: MAPREDUCE-2720.patch


 Currently, for isolation purposes, many simple Java applications run in the 
 cluster as a 1-map-only job (e.g., Oozie). This is not really required with 
 nextgen Hadoop (MRv2), where *non-MR* apps are first class and easy to write.
 A simple Hadoop Java app can be written that runs in the cluster in user 
 space.





[jira] [Commented] (YARN-3172) MR-279: Write a simple Java application

2015-04-26 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3172?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14513391#comment-14513391
 ] 

Hadoop QA commented on YARN-3172:
-

\\
\\
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | pre-patch |  14m 37s | Pre-patch trunk compilation is healthy. |
| {color:green}+1{color} | @author |   0m  0s | The patch does not contain any @author tags. |
| {color:green}+1{color} | tests included |   0m  0s | The patch appears to include 1 new or modified test files. |
| {color:red}-1{color} | whitespace |   0m  0s | The patch has 8  line(s) that end in whitespace. |
| {color:green}+1{color} | javac |   7m 33s | There were no new javac warning messages. |
| {color:green}+1{color} | javadoc |   9m 37s | There were no new javadoc warning messages. |
| {color:green}+1{color} | release audit |   0m 23s | The applied patch does not increase the total number of release audit warnings. |
| {color:green}+1{color} | checkstyle |   5m 27s | There were no new checkstyle issues. |
| {color:blue}0{color} | shellcheck |   5m 27s | Shellcheck was not available. |
| {color:green}+1{color} | install |   1m 35s | mvn install still works. |
| {color:green}+1{color} | eclipse:eclipse |   0m 32s | The patch built with eclipse:eclipse. |
| {color:green}+1{color} | findbugs |   0m  0s | The patch does not introduce any new Findbugs (version 2.0.3) warnings. |
| | |  39m 47s | |
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | http://issues.apache.org/jira/secure/attachment/12523089/MAPREDUCE-2720.patch |
| Optional Tests | javadoc javac unit findbugs checkstyle shellcheck |
| git revision | trunk / 884 |
| whitespace | https://builds.apache.org/job/PreCommit-YARN-Build/7505/artifact/patchprocess/whitespace.txt |
| Test Results | https://builds.apache.org/job/PreCommit-YARN-Build/7505/testReport/ |
| Java | 1.7.0_55 |
| uname | Linux asf905.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Console output | https://builds.apache.org/job/PreCommit-YARN-Build/7505/console |


This message was automatically generated.



[jira] [Updated] (YARN-3363) add localization and container launch time to ContainerMetrics at NM to show these timing information for each active container.

2015-04-26 Thread zhihai xu (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-3363?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

zhihai xu updated YARN-3363:

Attachment: YARN-3363.001.patch

 add localization and container launch time to ContainerMetrics at NM to show 
 these timing information for each active container.
 

 Key: YARN-3363
 URL: https://issues.apache.org/jira/browse/YARN-3363
 Project: Hadoop YARN
  Issue Type: Improvement
  Components: nodemanager
Reporter: zhihai xu
Assignee: zhihai xu
  Labels: metrics, supportability
 Attachments: YARN-3363.000.patch, YARN-3363.001.patch


 Add localization and container launch time to ContainerMetrics at the NM to 
 show this timing information for each active container.
 Currently, ContainerMetrics has the container's actual memory usage (YARN-2984), 
 actual CPU usage (YARN-3122), resource, and pid (YARN-3022). It would be better 
 to also have localization and container launch time in ContainerMetrics for 
 each active container.





[jira] [Updated] (YARN-3363) add localization and container launch time to ContainerMetrics at NM to show these timing information for each active container.

2015-04-26 Thread zhihai xu (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-3363?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

zhihai xu updated YARN-3363:

Attachment: (was: YARN-3363.001.patch)

 add localization and container launch time to ContainerMetrics at NM to show 
 these timing information for each active container.
 

 Key: YARN-3363
 URL: https://issues.apache.org/jira/browse/YARN-3363
 Project: Hadoop YARN
  Issue Type: Improvement
  Components: nodemanager
Reporter: zhihai xu
Assignee: zhihai xu
  Labels: metrics, supportability
 Attachments: YARN-3363.000.patch




[jira] [Commented] (YARN-3172) MR-279: Write a simple Java application

2015-04-26 Thread Allen Wittenauer (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3172?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14513367#comment-14513367
 ] 

Allen Wittenauer commented on YARN-3172:


Welp, I'm committing this to trunk.



[jira] [Updated] (YARN-3458) CPU resource monitoring in Windows

2015-04-26 Thread Inigo Goiri (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-3458?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Inigo Goiri updated YARN-3458:
--
Attachment: YARN-3458-6.patch

Fixing checkstyle errors. Personally, I think this is very strict but I just 
followed it.

 CPU resource monitoring in Windows
 --

 Key: YARN-3458
 URL: https://issues.apache.org/jira/browse/YARN-3458
 Project: Hadoop YARN
  Issue Type: New Feature
  Components: nodemanager
Affects Versions: 2.7.0
 Environment: Windows
Reporter: Inigo Goiri
Assignee: Inigo Goiri
Priority: Minor
  Labels: containers, metrics, windows
 Attachments: YARN-3458-1.patch, YARN-3458-2.patch, YARN-3458-3.patch, 
 YARN-3458-4.patch, YARN-3458-5.patch, YARN-3458-6.patch

   Original Estimate: 168h
  Remaining Estimate: 168h



[jira] [Updated] (YARN-3448) Add Rolling Time To Lives Level DB Plugin Capabilities

2015-04-26 Thread Jonathan Eagles (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-3448?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Eagles updated YARN-3448:
--
Attachment: YARN-3448.13.patch

 Add Rolling Time To Lives Level DB Plugin Capabilities
 --

 Key: YARN-3448
 URL: https://issues.apache.org/jira/browse/YARN-3448
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: timelineserver
Reporter: Jonathan Eagles
Assignee: Jonathan Eagles
 Attachments: YARN-3448.1.patch, YARN-3448.10.patch, 
 YARN-3448.12.patch, YARN-3448.13.patch, YARN-3448.2.patch, YARN-3448.3.patch, 
 YARN-3448.4.patch, YARN-3448.5.patch, YARN-3448.7.patch, YARN-3448.8.patch, 
 YARN-3448.9.patch


 For large applications, the majority of the time in LeveldbTimelineStore is 
 spent deleting old entities one record at a time. An exclusive write lock is 
 held during the entire deletion phase, which in practice can take hours. If we 
 relax some of the consistency constraints, other performance-enhancing 
 techniques can be employed to maximize throughput and minimize locking 
 time.
 Split the 5 sections of the leveldb database (domain, owner, start time, 
 entity, index) into 5 separate databases. This allows each database to 
 maximize read cache effectiveness based on its own usage patterns. 
 With 5 separate databases, each lookup is much faster. Placing the entity and 
 index databases on separate disks can also help with I/O.
 Rolling DBs for entity and index DBs. 99.9% of the data is in these two 
 sections, at a 4:1 ratio (index to entity), at least for Tez. We can replace 
 DB record removal with file system removal if we create a rolling set of 
 databases that age out and can be efficiently removed. To do this, we must 
 add a constraint to always place an entity's events into its correct rolling 
 DB instance based on start time. This allows us to stitch the data back 
 together while reading and performing artificial paging.
 Relax the synchronous write constraints. If we are willing to accept losing 
 some records that were not flushed by the operating system during a crash, we 
 can use async writes, which can be much faster.
 Prefer sequential writes. Sequential writes can be several times faster than 
 random writes. Spend some small effort arranging the writes in such a way 
 that they trend towards sequential write performance over random write 
 performance.
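
 The rolling-database idea above can be sketched as simple bucket 
 bookkeeping. This is only an illustration under assumptions: ToyRollingDbs, 
 its methods, the "entity-" naming and the period and TTL parameters are 
 hypothetical, and no LevelDB calls are made. Each entity is routed to a 
 bucket chosen by its start time, and expired buckets are dropped as whole 
 units instead of deleting records one at a time.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeMap;

// Hypothetical sketch of rolling-bucket bookkeeping; no LevelDB calls are made.
class ToyRollingDbs {
  private final long periodMs;
  private final long ttlMs;
  // bucket start time -> name of the database directory for that bucket
  private final TreeMap<Long, String> buckets = new TreeMap<>();

  ToyRollingDbs(long periodMs, long ttlMs) {
    this.periodMs = periodMs;
    this.ttlMs = ttlMs;
  }

  // All events of an entity go to the bucket chosen by the entity's start time.
  String dbFor(long entityStartTimeMs) {
    long bucketStart = entityStartTimeMs - Math.floorMod(entityStartTimeMs, periodMs);
    return buckets.computeIfAbsent(bucketStart, s -> "entity-" + s);
  }

  // Drop whole buckets whose newest possible entry is older than the TTL.
  List<String> evictExpired(long nowMs) {
    long cutoff = nowMs - ttlMs;
    List<String> removed = new ArrayList<>();
    while (!buckets.isEmpty() && buckets.firstKey() + periodMs <= cutoff) {
      removed.add(buckets.pollFirstEntry().getValue());
    }
    return removed;
  }

  int liveBuckets() {
    return buckets.size();
  }
}
```

 In a real store, each removed name would correspond to a database directory 
 that can be deleted from the file system in one operation.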





[jira] [Commented] (YARN-3491) PublicLocalizer#addResource is too slow.

2015-04-26 Thread Gera Shegalov (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3491?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14513509#comment-14513509
 ] 

Gera Shegalov commented on YARN-3491:
-

We should switch to {{io.nativeio.NativeIO.POSIX#getFstat}} as the implementation 
in {{RawLocalFileSystem}} to get rid of the shell-based implementation for 
FileStatus.

 PublicLocalizer#addResource is too slow.
 

 Key: YARN-3491
 URL: https://issues.apache.org/jira/browse/YARN-3491
 Project: Hadoop YARN
  Issue Type: Improvement
  Components: nodemanager
Affects Versions: 2.7.0
Reporter: zhihai xu
Assignee: zhihai xu
Priority: Critical
 Attachments: YARN-3491.000.patch, YARN-3491.001.patch, 
 YARN-3491.002.patch


 Based on profiling, the bottleneck in PublicLocalizer#addResource is 
 getInitializedLocalDirs, which calls checkLocalDir.
 checkLocalDir is very slow, taking 10+ ms per call.
 The total delay is approximately (number of local dirs) * 10+ ms.
 This delay is added to each public resource localization.
 Because PublicLocalizer#addResource is slow, the thread pool can't be fully 
 utilized. Instead of running in parallel (multithreaded), public resource 
 localization is serialized most of the time.
 Also, PublicLocalizer#addResource runs in the Dispatcher thread, 
 so the Dispatcher thread is blocked by PublicLocalizer#addResource for a 
 long time.
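
 The delay estimate above can be written out as a small calculation. This is 
 only a restatement of the arithmetic: the class and method names are 
 hypothetical, and the inputs used below (12 local dirs, 10 ms per check, 100 
 resources) are assumed numbers for illustration, not measurements from the 
 issue.

```java
// Hypothetical helper that only restates the arithmetic from the description.
class LocalizationDelayEstimate {
  // Estimated time spent in checkLocalDir for one public resource.
  static long perResourceDelayMs(int numLocalDirs, long checkLocalDirMs) {
    return (long) numLocalDirs * checkLocalDirMs;
  }

  // If addResource runs serially on one thread, the per-resource delays add up.
  static long serializedDelayMs(int numResources, int numLocalDirs, long checkLocalDirMs) {
    return numResources * perResourceDelayMs(numLocalDirs, checkLocalDirMs);
  }
}
```

 With the assumed inputs, each resource costs 12 * 10 = 120 ms, and 100 
 resources handled one after another keep the calling thread busy for about 
 12 seconds.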





[jira] [Commented] (YARN-3363) add localization and container launch time to ContainerMetrics at NM to show these timing information for each active container.

2015-04-26 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3363?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14513455#comment-14513455
 ] 

Hadoop QA commented on YARN-3363:
-

\\
\\
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | pre-patch |  14m 38s | Pre-patch trunk compilation is healthy. |
| {color:green}+1{color} | @author |   0m  0s | The patch does not contain any @author tags. |
| {color:green}+1{color} | tests included |   0m  0s | The patch appears to include 1 new or modified test files. |
| {color:green}+1{color} | whitespace |   0m  0s | The patch has no lines that end in whitespace. |
| {color:green}+1{color} | javac |   7m 32s | There were no new javac warning messages. |
| {color:green}+1{color} | javadoc |   9m 37s | There were no new javadoc warning messages. |
| {color:green}+1{color} | release audit |   0m 23s | The applied patch does not increase the total number of release audit warnings. |
| {color:red}-1{color} | checkstyle |   5m 22s | The applied patch generated  3 additional checkstyle issues. |
| {color:green}+1{color} | install |   1m 32s | mvn install still works. |
| {color:green}+1{color} | eclipse:eclipse |   0m 33s | The patch built with eclipse:eclipse. |
| {color:green}+1{color} | findbugs |   1m  1s | The patch does not introduce any new Findbugs (version 2.0.3) warnings. |
| {color:green}+1{color} | yarn tests |   5m 49s | Tests passed in hadoop-yarn-server-nodemanager. |
| | |  46m 30s | |
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | http://issues.apache.org/jira/secure/attachment/12728288/YARN-3363.001.patch |
| Optional Tests | javadoc javac unit findbugs checkstyle |
| git revision | trunk / 1a2459b |
| checkstyle | https://builds.apache.org/job/PreCommit-YARN-Build/7506/artifact/patchprocess/checkstyle-result-diff.txt |
| hadoop-yarn-server-nodemanager test log | https://builds.apache.org/job/PreCommit-YARN-Build/7506/artifact/patchprocess/testrun_hadoop-yarn-server-nodemanager.txt |
| Test Results | https://builds.apache.org/job/PreCommit-YARN-Build/7506/testReport/ |
| Java | 1.7.0_55 |
| uname | Linux asf905.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Console output | https://builds.apache.org/job/PreCommit-YARN-Build/7506/console |


This message was automatically generated.

 add localization and container launch time to ContainerMetrics at NM to show 
 these timing information for each active container.
 

 Key: YARN-3363
 URL: https://issues.apache.org/jira/browse/YARN-3363
 Project: Hadoop YARN
  Issue Type: Improvement
  Components: nodemanager
Reporter: zhihai xu
Assignee: zhihai xu
  Labels: metrics, supportability
 Attachments: YARN-3363.000.patch, YARN-3363.001.patch


 Add localization and container launch time to ContainerMetrics at the NM to show 
 this timing information for each active container.
 Currently ContainerMetrics has the container's actual memory usage (YARN-2984), 
 actual CPU usage (YARN-3122), and resource and pid (YARN-3022). It would be better 
 to also have localization and container launch time in ContainerMetrics for each 
 active container.
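
 A minimal sketch of the kind of per-container timing the description asks for. 
 This is not the attached patch: the class ContainerTimingSketch and all of its 
 method names are hypothetical, and it assumes both durations are measured from 
 the moment the container is requested on the NM.

```java
// Hypothetical illustration only; none of these names come from Hadoop.
class ContainerTimingSketch {
    private long requestedAtMs = -1;  // container request seen by the NM
    private long localizedAtMs = -1;  // all resources localized
    private long launchedAtMs = -1;   // container process launched

    void containerRequested(long nowMs) { requestedAtMs = nowMs; }
    void localizationDone(long nowMs)   { localizedAtMs = nowMs; }
    void containerLaunched(long nowMs)  { launchedAtMs = nowMs; }

    /** Milliseconds from request to end of localization, or -1 if not known yet. */
    long localizationDurationMs() {
        return (requestedAtMs < 0 || localizedAtMs < 0) ? -1 : localizedAtMs - requestedAtMs;
    }

    /** Milliseconds from request to launch (assumed definition), or -1 if not known yet. */
    long launchDurationMs() {
        return (requestedAtMs < 0 || launchedAtMs < 0) ? -1 : launchedAtMs - requestedAtMs;
    }

    public static void main(String[] args) {
        ContainerTimingSketch t = new ContainerTimingSketch();
        t.containerRequested(1000);
        t.localizationDone(1400);
        t.containerLaunched(1650);
        System.out.println("localization ms: " + t.localizationDurationMs());
        System.out.println("launch ms: " + t.launchDurationMs());
    }
}
```

 In a real NodeManager these two values would presumably be published as gauges 
 next to the existing memory and CPU metrics; that wiring is not shown here.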





[jira] [Commented] (YARN-3458) CPU resource monitoring in Windows

2015-04-26 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3458?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14513463#comment-14513463
 ] 

Hadoop QA commented on YARN-3458:
-

\\
\\
| (/) *{color:green}+1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | pre-patch |  14m 36s | Pre-patch trunk compilation is healthy. |
| {color:green}+1{color} | @author |   0m  0s | The patch does not contain any @author tags. |
| {color:green}+1{color} | tests included |   0m  0s | The patch appears to include 1 new or modified test files. |
| {color:green}+1{color} | whitespace |   0m  0s | The patch has no lines that end in whitespace. |
| {color:green}+1{color} | javac |   7m 33s | There were no new javac warning messages. |
| {color:green}+1{color} | javadoc |   9m 33s | There were no new javadoc warning messages. |
| {color:green}+1{color} | release audit |   0m 23s | The applied patch does not increase the total number of release audit warnings. |
| {color:green}+1{color} | checkstyle |   5m 31s | There were no new checkstyle issues. |
| {color:green}+1{color} | install |   1m 36s | mvn install still works. |
| {color:green}+1{color} | eclipse:eclipse |   0m 33s | The patch built with eclipse:eclipse. |
| {color:green}+1{color} | findbugs |   1m 23s | The patch does not introduce any new Findbugs (version 2.0.3) warnings. |
| {color:green}+1{color} | yarn tests |   2m  6s | Tests passed in hadoop-yarn-common. |
| | |  43m 17s | |
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | http://issues.apache.org/jira/secure/attachment/12728291/YARN-3458-6.patch |
| Optional Tests | javadoc javac unit findbugs checkstyle |
| git revision | trunk / 1a2459b |
| hadoop-yarn-common test log | https://builds.apache.org/job/PreCommit-YARN-Build/7507/artifact/patchprocess/testrun_hadoop-yarn-common.txt |
| Test Results | https://builds.apache.org/job/PreCommit-YARN-Build/7507/testReport/ |
| Java | 1.7.0_55 |
| uname | Linux asf908.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Console output | https://builds.apache.org/job/PreCommit-YARN-Build/7507/console |


This message was automatically generated.

 CPU resource monitoring in Windows
 --

 Key: YARN-3458
 URL: https://issues.apache.org/jira/browse/YARN-3458
 Project: Hadoop YARN
  Issue Type: New Feature
  Components: nodemanager
Affects Versions: 2.7.0
 Environment: Windows
Reporter: Inigo Goiri
Assignee: Inigo Goiri
Priority: Minor
  Labels: containers, metrics, windows
 Attachments: YARN-3458-1.patch, YARN-3458-2.patch, YARN-3458-3.patch, 
 YARN-3458-4.patch, YARN-3458-5.patch, YARN-3458-6.patch

   Original Estimate: 168h
  Remaining Estimate: 168h

 The current implementation of getCpuUsagePercent() for 
 WindowsBasedProcessTree is left as unavailable. Attached is a proposal for how 
 to do it. I reused the CpuTimeTracker, using 1 jiffy = 1 ms.
 This was left open by YARN-3122.
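
 To illustrate the idea of deriving a usage percentage from cumulative CPU time 
 with 1 jiffy = 1 ms, here is a small standalone sketch. It is not the attached 
 patch and does not use Hadoop's CpuTimeTracker; the class CpuUsageSketch, its 
 fields, and the UNAVAILABLE sentinel are hypothetical names chosen for this 
 example.

```java
// Hypothetical illustration only; this is not Hadoop's CpuTimeTracker.
class CpuUsageSketch {
    // Assumption taken from the description: one "jiffy" is treated as 1 ms.
    static final long JIFFY_LENGTH_MS = 1L;
    // Sentinel returned until two valid samples exist.
    static final float UNAVAILABLE = -1f;

    private long lastCpuMs = -1;
    private long lastSampleMs = -1;

    /**
     * Records a sample and returns the CPU usage percent since the previous
     * sample, or UNAVAILABLE if there is no usable previous sample.
     * Values above 100 are possible when more than one core is busy.
     */
    float update(long cumulativeJiffies, long sampleTimeMs) {
        long cpuMs = cumulativeJiffies * JIFFY_LENGTH_MS;
        float percent = UNAVAILABLE;
        if (lastSampleMs >= 0 && sampleTimeMs > lastSampleMs && cpuMs >= lastCpuMs) {
            percent = (cpuMs - lastCpuMs) * 100f / (sampleTimeMs - lastSampleMs);
        }
        lastCpuMs = cpuMs;
        lastSampleMs = sampleTimeMs;
        return percent;
    }

    public static void main(String[] args) {
        CpuUsageSketch tracker = new CpuUsageSketch();
        tracker.update(1000, 10_000);                        // first sample: no percentage yet
        System.out.println(tracker.update(1500, 11_000));    // 500 ms of CPU over 1000 ms of wall time
    }
}
```

 The first call can only store a baseline, which is why a tracker of this kind 
 reports "unavailable" until a second sample arrives.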





[jira] [Commented] (YARN-3172) MR-279: Write a simple Java application

2015-04-26 Thread Allen Wittenauer (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-3172?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14513480#comment-14513480
 ] 

Allen Wittenauer commented on YARN-3172:


OK, looks like it needs to get rebased because this was before the 900th 
re-arrangement of the dir structure. :(

 MR-279: Write a simple Java application
 ---

 Key: YARN-3172
 URL: https://issues.apache.org/jira/browse/YARN-3172
 Project: Hadoop YARN
  Issue Type: New Feature
Reporter: Sharad Agarwal
Assignee: Devaraj K
 Attachments: MAPREDUCE-2720.patch


 Currently, for isolation purposes, many simple Java applications (e.g. Oozie) run 
 in the cluster as a 1-map-only job. This is not really required with nextgen 
 Hadoop (MRv2), where *non-MR* apps are first class and easy to write.
 A simple Hadoop Java app can be written that runs in the cluster in user 
 space.





[jira] [Updated] (YARN-3172) MR-279: Write a simple Java application

2015-04-26 Thread Allen Wittenauer (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-3172?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Allen Wittenauer updated YARN-3172:
---
Labels: newbie  (was: )

 MR-279: Write a simple Java application
 ---

 Key: YARN-3172
 URL: https://issues.apache.org/jira/browse/YARN-3172
 Project: Hadoop YARN
  Issue Type: New Feature
Reporter: Sharad Agarwal
Assignee: Devaraj K
  Labels: newbie
 Attachments: MAPREDUCE-2720.patch


 Currently, for isolation purposes, many simple Java applications (e.g. Oozie) run 
 in the cluster as a 1-map-only job. This is not really required with nextgen 
 Hadoop (MRv2), where *non-MR* apps are first class and easy to write.
 A simple Hadoop Java app can be written that runs in the cluster in user 
 space.


