[jira] [Commented] (YARN-417) Add a poller that allows the AM to receive notifications when it is assigned containers
[ https://issues.apache.org/jira/browse/YARN-417?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13589296#comment-13589296 ] Bikas Saha commented on YARN-417: -
I think if ContainerExitCodes needs to be added then it should be its own jira, because it's an addition to the YARN API and should be kept distinct from this jira. This jira could be marked dependent on that jira. It's also missing out-of-memory and preemption, from what I see in the patch. ContainerRequest is something that's tightly coupled with the AMRMClient, and hence I had put it inside AMRMClient. It's expected to be used in other places, and that's why it's public.
The helper function would have helped because containers contain information set by two entities - the RM and the NM. And its "status" is a combination of containerState and containerExitCode. E.g. the state could be running, in which case exit codes don't matter. The state could be completed, in which case the exit code can tell us whether it was killed or not. The exit code may not be enough because the RM could preempt a container before it's launched and hence may not have a real exit code. Exit codes are also not portable across platforms (e.g. Windows and Linux). The helper function lets the library hide all this and present a single status value for the user to look at: whether the container is allocated, running, completed_with_success, killed, preempted, out of memory, etc. At some point this could move into YARN, but as it evolves, the library might be a good place to house it. Does that help clarify its utility?
Why is client.start() being called in init() when client.stop() is being called in stop()?
{code}
+  @Override
+  public void init(Configuration conf) {
+    super.init(conf);
+    client.init(conf);
+    client.start();
+  }
{code}
Not waiting for the thread to join()? Why interrupt()? The thread needs to be stopped first so that it stops calling into the client; otherwise it can call into a client that has already stopped.
{code}
+  @Override
+  public void stop() {
+    client.stop();
+    keepRunning = false;
+    thread.interrupt();
+  }
{code}
I am wary of calling back on the heartbeat thread itself. If you notice the interface patch I had uploaded, I left some comments on moving this to its own thread. This is important because the callback code can be arbitrary and may not complete in time for our heartbeat, especially with thousands of containers. We cannot let our heartbeat rate be dependent on app code performance.
> Add a poller that allows the AM to receive notifications when it is assigned > containers > --- > > Key: YARN-417 > URL: https://issues.apache.org/jira/browse/YARN-417 > Project: Hadoop YARN > Issue Type: Sub-task > Components: api, applications >Affects Versions: 2.0.3-alpha >Reporter: Sandy Ryza >Assignee: Sandy Ryza > Attachments: AMRMClientAsync-1.java, AMRMClientAsync.java, > YARN-417-1.patch, YARN-417.patch, YarnAppMaster.java, > YarnAppMasterListener.java > > > Writing AMs would be easier for some if they did not have to handle > heartbeating to the RM on their own. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
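To illustrate the lifecycle and threading points above, here is a minimal sketch (hypothetical field names, not taken from the posted patch): user callbacks run on a dedicated handler thread fed by a queue, and stop() halts and joins both threads before stopping the underlying client, so nothing can call into a client that has already stopped.
{code}
// Minimal sketch (hypothetical names, not from the posted patch).
// The heartbeat thread only enqueues allocate responses; user callbacks
// run on a dedicated handler thread that drains the queue.
private volatile boolean keepRunning = true;
private Thread heartbeatThread;
private Thread handlerThread;
private final BlockingQueue<AllocateResponse> responseQueue =
    new LinkedBlockingQueue<AllocateResponse>();

@Override
public void stop() {
  // Halt our own threads first so nothing can call into a stopped client.
  keepRunning = false;
  heartbeatThread.interrupt();
  handlerThread.interrupt();
  try {
    heartbeatThread.join();
    handlerThread.join();
  } catch (InterruptedException e) {
    Thread.currentThread().interrupt();
  }
  // Only now is it safe to stop the underlying AMRMClient and the service.
  client.stop();
  super.stop();
}
{code}
With this split, a slow callback only backs up the handler thread's queue; it cannot delay the heartbeat itself.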
[jira] [Commented] (YARN-417) Add a poller that allows the AM to receive notifications when it is assigned containers
[ https://issues.apache.org/jira/browse/YARN-417?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13589185#comment-13589185 ] Hadoop QA commented on YARN-417: {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12571354/YARN-417-1.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:red}-1 tests included{color}. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color:red}-1 javac{color}. The patch appears to cause the build to fail. Console output: https://builds.apache.org/job/PreCommit-YARN-Build/445//console This message is automatically generated. > Add a poller that allows the AM to receive notifications when it is assigned > containers > --- > > Key: YARN-417 > URL: https://issues.apache.org/jira/browse/YARN-417 > Project: Hadoop YARN > Issue Type: Sub-task > Components: api, applications >Affects Versions: 2.0.3-alpha >Reporter: Sandy Ryza >Assignee: Sandy Ryza > Attachments: AMRMClientAsync-1.java, AMRMClientAsync.java, > YARN-417-1.patch, YARN-417.patch, YarnAppMaster.java, > YarnAppMasterListener.java > > > Writing AMs would be easier for some if they did not have to handle > heartbeating to the RM on their own. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (YARN-323) Yarn CLI commands prints classpath
[ https://issues.apache.org/jira/browse/YARN-323?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Abhishek Kapoor updated YARN-323: - Priority: Trivial (was: Minor) > Yarn CLI commands prints classpath > -- > > Key: YARN-323 > URL: https://issues.apache.org/jira/browse/YARN-323 > Project: Hadoop YARN > Issue Type: Bug >Affects Versions: 2.0.1-alpha >Reporter: Nishan Shetty >Priority: Trivial > > Execute ./yarn commands. It will print the classpath to the console -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (YARN-417) Add a poller that allows the AM to receive notifications when it is assigned containers
[ https://issues.apache.org/jira/browse/YARN-417?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13589182#comment-13589182 ] Sandy Ryza commented on YARN-417: - Uploaded a second cut with what was discussed above. One more thought: would it make sense to take ContainerRequest out of AMRMClient, as it's now used in places where AMRMClient is not? > Add a poller that allows the AM to receive notifications when it is assigned > containers > --- > > Key: YARN-417 > URL: https://issues.apache.org/jira/browse/YARN-417 > Project: Hadoop YARN > Issue Type: Sub-task > Components: api, applications >Affects Versions: 2.0.3-alpha >Reporter: Sandy Ryza >Assignee: Sandy Ryza > Attachments: AMRMClientAsync-1.java, AMRMClientAsync.java, > YARN-417-1.patch, YARN-417.patch, YarnAppMaster.java, > YarnAppMasterListener.java > > > Writing AMs would be easier for some if they did not have to handle > heartbeating to the RM on their own. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (YARN-323) Yarn CLI commands prints classpath
[ https://issues.apache.org/jira/browse/YARN-323?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13589183#comment-13589183 ] Abhishek Kapoor commented on YARN-323: -- I don't see the classpath being printed on the console. Please confirm, or close the issue. Thanks Abhishek > Yarn CLI commands prints classpath > -- > > Key: YARN-323 > URL: https://issues.apache.org/jira/browse/YARN-323 > Project: Hadoop YARN > Issue Type: Bug >Affects Versions: 2.0.1-alpha >Reporter: Nishan Shetty >Priority: Minor > > Execute ./yarn commands. It will print the classpath to the console -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (YARN-417) Add a poller that allows the AM to receive notifications when it is assigned containers
[ https://issues.apache.org/jira/browse/YARN-417?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sandy Ryza updated YARN-417: Attachment: YARN-417-1.patch > Add a poller that allows the AM to receive notifications when it is assigned > containers > --- > > Key: YARN-417 > URL: https://issues.apache.org/jira/browse/YARN-417 > Project: Hadoop YARN > Issue Type: Sub-task > Components: api, applications >Affects Versions: 2.0.3-alpha >Reporter: Sandy Ryza >Assignee: Sandy Ryza > Attachments: AMRMClientAsync-1.java, AMRMClientAsync.java, > YARN-417-1.patch, YARN-417.patch, YarnAppMaster.java, > YarnAppMasterListener.java > > > Writing AMs would be easier for some if they did not have to handle > heartbeating to the RM on their own. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (YARN-417) Add a poller that allows the AM to receive notifications when it is assigned containers
[ https://issues.apache.org/jira/browse/YARN-417?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13589177#comment-13589177 ] Sandy Ryza commented on YARN-417: - That sounds right, Chris. Will include that in the class's doc. I've thought a little more about the ContainerCompletionReason, and I'm not sure it's necessary, as there are already constants in YarnConfiguration for the special exit codes, and there are only two: ABORTED_CONTAINER_EXIT_STATUS and DISK_FAILED. As these don't really have to do with configuration, it might make sense to move them to a ContainerExitCodes class, and just point to that class in the doc for ContainerStatus#getExitCode and CallbackHandler#onContainersCompleted. > Add a poller that allows the AM to receive notifications when it is assigned > containers > --- > > Key: YARN-417 > URL: https://issues.apache.org/jira/browse/YARN-417 > Project: Hadoop YARN > Issue Type: Sub-task > Components: api, applications >Affects Versions: 2.0.3-alpha >Reporter: Sandy Ryza >Assignee: Sandy Ryza > Attachments: AMRMClientAsync-1.java, AMRMClientAsync.java, > YARN-417.patch, YarnAppMaster.java, YarnAppMasterListener.java > > > Writing AMs would be easier for some if they did not have to handle > heartbeating to the RM on their own. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
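As a rough illustration of the suggestion above, such a holder class might look like the following (a sketch only - the constant values shown are illustrative placeholders, not confirmed by the patch):
{code}
// Sketch of the proposed ContainerExitCodes holder class. Values are
// placeholders for illustration; the real constants live in
// YarnConfiguration today.
public class ContainerExitCodes {
  public static final int SUCCESS = 0;
  // Container was aborted by the framework, e.g. released or preempted
  // before it completed.
  public static final int ABORTED = -100;
  // Container failed because the disks backing it failed.
  public static final int DISKS_FAILED = -101;
}
{code}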
[jira] [Commented] (YARN-376) Apps that have completed can appear as RUNNING on the NM UI
[ https://issues.apache.org/jira/browse/YARN-376?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13589175#comment-13589175 ] Hadoop QA commented on YARN-376: {color:green}+1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12571301/YARN-376.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 2 new or modified test files. {color:green}+1 tests included appear to have a timeout.{color} {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. The javadoc tool did not generate any warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 core tests{color}. The patch passed unit tests in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager. {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-YARN-Build/444//testReport/ Console output: https://builds.apache.org/job/PreCommit-YARN-Build/444//console This message is automatically generated. > Apps that have completed can appear as RUNNING on the NM UI > --- > > Key: YARN-376 > URL: https://issues.apache.org/jira/browse/YARN-376 > Project: Hadoop YARN > Issue Type: Bug > Components: resourcemanager >Affects Versions: 2.0.3-alpha, 0.23.6 >Reporter: Jason Lowe >Assignee: Jason Lowe >Priority: Blocker > Attachments: YARN-376.patch, YARN-376.patch, YARN-376.patch > > > On a busy cluster we've noticed a growing number of applications appear as > RUNNING on nodemanager web pages but the applications have long since > finished. Looking at the NM logs, it appears the RM never told the > nodemanager that the application had finished. This is also reflected in a > jstack of the NM process, since many more log aggregation threads are running > than one would expect from the number of actively running applications. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (YARN-376) Apps that have completed can appear as RUNNING on the NM UI
[ https://issues.apache.org/jira/browse/YARN-376?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13589136#comment-13589136 ] Jason Lowe commented on YARN-376: - The eclipse failure appears to be unrelated, as it builds fine for me locally. Also, I can't see how this change would affect the eclipse:eclipse build, which is failing in hadoop-common. > Apps that have completed can appear as RUNNING on the NM UI > --- > > Key: YARN-376 > URL: https://issues.apache.org/jira/browse/YARN-376 > Project: Hadoop YARN > Issue Type: Bug > Components: resourcemanager >Affects Versions: 2.0.3-alpha, 0.23.6 >Reporter: Jason Lowe >Assignee: Jason Lowe >Priority: Blocker > Attachments: YARN-376.patch, YARN-376.patch, YARN-376.patch > > > On a busy cluster we've noticed a growing number of applications appear as > RUNNING on nodemanager web pages but the applications have long since > finished. Looking at the NM logs, it appears the RM never told the > nodemanager that the application had finished. This is also reflected in a > jstack of the NM process, since many more log aggregation threads are running > than one would expect from the number of actively running applications. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (YARN-376) Apps that have completed can appear as RUNNING on the NM UI
[ https://issues.apache.org/jira/browse/YARN-376?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13588936#comment-13588936 ] Hadoop QA commented on YARN-376: {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12571301/YARN-376.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 2 new or modified test files. {color:green}+1 tests included appear to have a timeout.{color} {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. The javadoc tool did not generate any warning messages. {color:red}-1 eclipse:eclipse{color}. The patch failed to build with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 core tests{color}. The patch passed unit tests in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager. {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-YARN-Build/443//testReport/ Console output: https://builds.apache.org/job/PreCommit-YARN-Build/443//console This message is automatically generated. > Apps that have completed can appear as RUNNING on the NM UI > --- > > Key: YARN-376 > URL: https://issues.apache.org/jira/browse/YARN-376 > Project: Hadoop YARN > Issue Type: Bug > Components: resourcemanager >Affects Versions: 2.0.3-alpha, 0.23.6 >Reporter: Jason Lowe >Assignee: Jason Lowe >Priority: Blocker > Attachments: YARN-376.patch, YARN-376.patch, YARN-376.patch > > > On a busy cluster we've noticed a growing number of applications appear as > RUNNING on nodemanager web pages but the applications have long since > finished. Looking at the NM logs, it appears the RM never told the > nodemanager that the application had finished. This is also reflected in a > jstack of the NM process, since many more log aggregation threads are running > than one would expect from the number of actively running applications. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (YARN-380) yarn node -status prints Last-Last-Health-Update
[ https://issues.apache.org/jira/browse/YARN-380?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13588930#comment-13588930 ] Hadoop QA commented on YARN-380: {color:green}+1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12571303/issues-yarn-380.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 1 new or modified test files. {color:green}+1 tests included appear to have a timeout.{color} {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. The javadoc tool did not generate any warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 core tests{color}. The patch passed unit tests in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client. {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-YARN-Build/442//testReport/ Console output: https://builds.apache.org/job/PreCommit-YARN-Build/442//console This message is automatically generated. > yarn node -status prints Last-Last-Health-Update > > > Key: YARN-380 > URL: https://issues.apache.org/jira/browse/YARN-380 > Project: Hadoop YARN > Issue Type: Bug > Components: client >Affects Versions: 2.0.3-alpha >Reporter: Thomas Graves >Assignee: omkar vinit joshi > Labels: usability > Attachments: issues-yarn-380.patch > > > I assume the Last-Last-Health-Update is a typo and it should just be > Last-Health-Update. > $ yarn node -status foo.com:8041 > Node Report : > Node-Id : foo.com:8041 > Rack : /10.10.10.0 > Node-State : RUNNING > Node-Http-Address : foo.com:8042 > Health-Status(isNodeHealthy) : true > Last-Last-Health-Update : 1360118400219 > Health-Report : > Containers : 0 > Memory-Used : 0M > Memory-Capacity : 24576 -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (YARN-380) yarn node -status prints Last-Last-Health-Update
[ https://issues.apache.org/jira/browse/YARN-380?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] omkar vinit joshi updated YARN-380: --- Attachment: (was: issue-yarn-380.patch) > yarn node -status prints Last-Last-Health-Update > > > Key: YARN-380 > URL: https://issues.apache.org/jira/browse/YARN-380 > Project: Hadoop YARN > Issue Type: Bug > Components: client >Affects Versions: 2.0.3-alpha >Reporter: Thomas Graves >Assignee: omkar vinit joshi > Labels: usability > Attachments: issues-yarn-380.patch > > > I assume the Last-Last-Health-Update is a typo and it should just be > Last-Health-Update. > $ yarn node -status foo.com:8041 > Node Report : > Node-Id : foo.com:8041 > Rack : /10.10.10.0 > Node-State : RUNNING > Node-Http-Address : foo.com:8042 > Health-Status(isNodeHealthy) : true > Last-Last-Health-Update : 1360118400219 > Health-Report : > Containers : 0 > Memory-Used : 0M > Memory-Capacity : 24576 -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (YARN-380) yarn node -status prints Last-Last-Health-Update
[ https://issues.apache.org/jira/browse/YARN-380?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13588907#comment-13588907 ] omkar vinit joshi commented on YARN-380: Yes - that output is ISO-8601 format, where the 'T' separates the date and time and the trailing -08:00 is the timezone offset (PST). I am making the output more readable. > yarn node -status prints Last-Last-Health-Update > > > Key: YARN-380 > URL: https://issues.apache.org/jira/browse/YARN-380 > Project: Hadoop YARN > Issue Type: Bug > Components: client >Affects Versions: 2.0.3-alpha >Reporter: Thomas Graves >Assignee: omkar vinit joshi > Labels: usability > Attachments: issues-yarn-380.patch > > > I assume the Last-Last-Health-Update is a typo and it should just be > Last-Health-Update. > $ yarn node -status foo.com:8041 > Node Report : > Node-Id : foo.com:8041 > Rack : /10.10.10.0 > Node-State : RUNNING > Node-Http-Address : foo.com:8042 > Health-Status(isNodeHealthy) : true > Last-Last-Health-Update : 1360118400219 > Health-Report : > Containers : 0 > Memory-Used : 0M > Memory-Capacity : 24576 -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
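For example, one readable rendering of the raw millisecond timestamp could look like this (illustrative only - the pattern below is an example, not necessarily what the patch uses):
{code}
import java.text.SimpleDateFormat;
import java.util.Date;

// Illustrative formatting of the health-update timestamp.
long lastHealthUpdateMillis = 1360118400219L; // raw value from the report
SimpleDateFormat fmt = new SimpleDateFormat("EEE dd/MMM/yy hh:mm:ss a zzz");
String readable = fmt.format(new Date(lastHealthUpdateMillis));
// readable -> "Tue 05/Feb/13 04:00:00 PM PST" (in the Pacific timezone)
{code}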
[jira] [Updated] (YARN-380) yarn node -status prints Last-Last-Health-Update
[ https://issues.apache.org/jira/browse/YARN-380?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] omkar vinit joshi updated YARN-380: --- Attachment: issues-yarn-380.patch > yarn node -status prints Last-Last-Health-Update > > > Key: YARN-380 > URL: https://issues.apache.org/jira/browse/YARN-380 > Project: Hadoop YARN > Issue Type: Bug > Components: client >Affects Versions: 2.0.3-alpha >Reporter: Thomas Graves >Assignee: omkar vinit joshi > Labels: usability > Attachments: issues-yarn-380.patch > > > I assume the Last-Last-Health-Update is a typo and it should just be > Last-Health-Update. > $ yarn node -status foo.com:8041 > Node Report : > Node-Id : foo.com:8041 > Rack : /10.10.10.0 > Node-State : RUNNING > Node-Http-Address : foo.com:8042 > Health-Status(isNodeHealthy) : true > Last-Last-Health-Update : 1360118400219 > Health-Report : > Containers : 0 > Memory-Used : 0M > Memory-Capacity : 24576 -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (YARN-376) Apps that have completed can appear as RUNNING on the NM UI
[ https://issues.apache.org/jira/browse/YARN-376?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jason Lowe updated YARN-376: Attachment: YARN-376.patch Thanks for the review, Sidd. I originally had it update the heartbeat since the RMNode interface already knew about the heartbeat type, and it's more efficient (no extra copy of the app list is needed, and the write lock is grabbed only once instead of twice). Updated to change get*ToCleanup to pull*ToCleanup; the test no longer needs the heartbeat response since it no longer updates it directly. > Apps that have completed can appear as RUNNING on the NM UI > --- > > Key: YARN-376 > URL: https://issues.apache.org/jira/browse/YARN-376 > Project: Hadoop YARN > Issue Type: Bug > Components: resourcemanager >Affects Versions: 2.0.3-alpha, 0.23.6 >Reporter: Jason Lowe >Assignee: Jason Lowe >Priority: Blocker > Attachments: YARN-376.patch, YARN-376.patch, YARN-376.patch > > > On a busy cluster we've noticed a growing number of applications appear as > RUNNING on nodemanager web pages but the applications have long since > finished. Looking at the NM logs, it appears the RM never told the > nodemanager that the application had finished. This is also reflected in a > jstack of the NM process, since many more log aggregation threads are running > than one would expect from the number of actively running applications. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
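The get-to-pull rename suggests destructive-read semantics: the caller drains the pending list and each entry is delivered exactly once. A minimal sketch of that pattern (hypothetical field names, not the actual patch code):
{code}
// Sketch of a "pull" accessor: atomically copy and clear the pending
// cleanup list under the write lock so each entry is seen only once.
public List<ApplicationId> pullAppsToCleanup() {
  writeLock.lock();
  try {
    List<ApplicationId> apps = new ArrayList<ApplicationId>(appsToCleanup);
    appsToCleanup.clear();
    return apps;
  } finally {
    writeLock.unlock();
  }
}
{code}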
[jira] [Created] (YARN-433) When RM is catching up with node updates then it should not expire acquired containers
Bikas Saha created YARN-433: --- Summary: When RM is catching up with node updates then it should not expire acquired containers Key: YARN-433 URL: https://issues.apache.org/jira/browse/YARN-433 Project: Hadoop YARN Issue Type: Sub-task Reporter: Bikas Saha Assignee: Xuan Gong The RM expires containers that are not launched within some time of being allocated; the default is 10 minutes. When the RM is not keeping up with node updates, it may not be aware of newly launched containers. If the expiry thread fires for such containers, the RM can expire them even though they may already have launched. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (YARN-237) Refreshing the RM page forgets how many rows I had in my Datatables
[ https://issues.apache.org/jira/browse/YARN-237?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=1351#comment-1351 ] Hadoop QA commented on YARN-237: {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12571286/YARN-237.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:red}-1 tests included{color}. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. The javadoc tool did not generate any warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 core tests{color}. The patch passed unit tests in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common. {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-YARN-Build/441//testReport/ Console output: https://builds.apache.org/job/PreCommit-YARN-Build/441//console This message is automatically generated. > Refreshing the RM page forgets how many rows I had in my Datatables > --- > > Key: YARN-237 > URL: https://issues.apache.org/jira/browse/YARN-237 > Project: Hadoop YARN > Issue Type: Improvement > Components: resourcemanager >Affects Versions: 2.0.2-alpha, 0.23.4, 3.0.0 >Reporter: Ravi Prakash >Assignee: jian he > Labels: usability > Attachments: YARN-237.patch > > > If I choose 100 rows, and then refresh the page, DataTables goes back to > showing me 20 rows. > This user preference should be stored in a cookie. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (YARN-237) Refreshing the RM page forgets how many rows I had in my Datatables
[ https://issues.apache.org/jira/browse/YARN-237?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] jian he updated YARN-237: - Attachment: YARN-237.patch Fix the problem of state saving on the RM page. > Refreshing the RM page forgets how many rows I had in my Datatables > --- > > Key: YARN-237 > URL: https://issues.apache.org/jira/browse/YARN-237 > Project: Hadoop YARN > Issue Type: Improvement > Components: resourcemanager >Affects Versions: 2.0.2-alpha, 0.23.4, 3.0.0 >Reporter: Ravi Prakash >Assignee: jian he > Labels: usability > Attachments: YARN-237.patch > > > If I choose 100 rows, and then refresh the page, DataTables goes back to > showing me 20 rows. > This user preference should be stored in a cookie. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (YARN-365) Each NM heartbeat should not generate an event for the Scheduler
[ https://issues.apache.org/jira/browse/YARN-365?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13588861#comment-13588861 ] Xuan Gong commented on YARN-365: bq. Do we need to worry about there being overlap between the 2 lists, i.e. a newlyLaunchedContainer also got completed by the time the slow RM handled the NM updates? Thanks for the comments. I think we are fine here. The way we handle newlyLaunchedContainers is to submit a LAUNCHED event to RMContainerImpl, and RMContainerImpl will unregister (remove) this container from the containerAllocationExpirer list. That is how we handle the newlyLaunchedContainers; it does not actually launch the container, it just tells the RM that this container is being used right now. > Each NM heartbeat should not generate an event for the Scheduler > > > Key: YARN-365 > URL: https://issues.apache.org/jira/browse/YARN-365 > Project: Hadoop YARN > Issue Type: Sub-task > Components: resourcemanager, scheduler >Affects Versions: 0.23.5 >Reporter: Siddharth Seth >Assignee: Xuan Gong > Fix For: 2.0.4-beta > > Attachments: Prototype2.txt, Prototype3.txt, YARN-365.10.patch, > YARN-365.1.patch, YARN-365.2.patch, YARN-365.3.patch, YARN-365.4.patch, > YARN-365.5.patch, YARN-365.6.patch, YARN-365.7.patch, YARN-365.8.patch, > YARN-365.9.patch > > > Follow up from YARN-275 > https://issues.apache.org/jira/secure/attachment/12567075/Prototype.txt -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
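A simplified sketch of the behavior described (assuming a BaseTransition helper like the one in RMContainerImpl; the real transition differs in detail):
{code}
// Simplified sketch: on a LAUNCHED event the container is unregistered
// from the allocation expirer, so the RM no longer counts it as
// "allocated but never launched". Not the literal RMContainerImpl code.
private static final class LaunchedTransition extends BaseTransition {
  @Override
  public void transition(RMContainerImpl container, RMContainerEvent event) {
    container.containerAllocationExpirer.unregister(container.getContainerId());
  }
}
{code}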
[jira] [Commented] (YARN-198) If we are navigating to Nodemanager UI from Resourcemanager,then there is not link to navigate back to Resource manager
[ https://issues.apache.org/jira/browse/YARN-198?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13588856#comment-13588856 ] Hadoop QA commented on YARN-198: {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12571280/YARN-198.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:red}-1 tests included{color}. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. The javadoc tool did not generate any warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 core tests{color}. The patch passed unit tests in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager. {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-YARN-Build/440//testReport/ Console output: https://builds.apache.org/job/PreCommit-YARN-Build/440//console This message is automatically generated. > If we are navigating to Nodemanager UI from Resourcemanager,then there is not > link to navigate back to Resource manager > --- > > Key: YARN-198 > URL: https://issues.apache.org/jira/browse/YARN-198 > Project: Hadoop YARN > Issue Type: Improvement > Components: nodemanager >Reporter: Ramgopal N >Assignee: jian he >Priority: Minor > Labels: usability > Attachments: YARN-198.patch > > > If we are navigating to Nodemanager by clicking on the node link in RM, there > is no link provided on the NM to navigate back to RM. > If there is a link to navigate back to RM, it would be good -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Assigned] (YARN-430) Add HDFS based store for RM
[ https://issues.apache.org/jira/browse/YARN-430?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] jian he reassigned YARN-430: Assignee: jian he (was: Bikas Saha) > Add HDFS based store for RM > --- > > Key: YARN-430 > URL: https://issues.apache.org/jira/browse/YARN-430 > Project: Hadoop YARN > Issue Type: Sub-task > Components: resourcemanager >Reporter: Bikas Saha >Assignee: jian he > > There is a generic FileSystem store but it does not take advantage of HDFS > features like directories, replication, DFSClient advanced settings for HA, > retries etc. Writing a store that's optimized for HDFS would be good. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (YARN-380) yarn node -status prints Last-Last-Health-Update
[ https://issues.apache.org/jira/browse/YARN-380?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13588840#comment-13588840 ] Vinod Kumar Vavilapalli commented on YARN-380: -- Looked at the patch. One comment: - When I tried to print the output by modifying the test itself, it says "Last-Health-Update : 1969-12-31T16:00:00-08:00", not sure if you are seeing the extraneous T character or not. Please verify. If it is indeed like that, we will need to fix it. > yarn node -status prints Last-Last-Health-Update > > > Key: YARN-380 > URL: https://issues.apache.org/jira/browse/YARN-380 > Project: Hadoop YARN > Issue Type: Bug > Components: client >Affects Versions: 2.0.3-alpha >Reporter: Thomas Graves >Assignee: omkar vinit joshi > Labels: usability > Attachments: issue-yarn-380.patch > > > I assume the Last-Last-Health-Update is a typo and it should just be > Last-Health-Update. > $ yarn node -status foo.com:8041 > Node Report : > Node-Id : foo.com:8041 > Rack : /10.10.10.0 > Node-State : RUNNING > Node-Http-Address : foo.com:8042 > Health-Status(isNodeHealthy) : true > Last-Last-Health-Update : 1360118400219 > Health-Report : > Containers : 0 > Memory-Used : 0M > Memory-Capacity : 24576 -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (YARN-198) If we are navigating to Nodemanager UI from Resourcemanager,then there is not link to navigate back to Resource manager
[ https://issues.apache.org/jira/browse/YARN-198?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] jian he updated YARN-198: - Attachment: YARN-198.patch Add a link on the NM page to navigate back to the RM page. > If we are navigating to Nodemanager UI from Resourcemanager,then there is not > link to navigate back to Resource manager > --- > > Key: YARN-198 > URL: https://issues.apache.org/jira/browse/YARN-198 > Project: Hadoop YARN > Issue Type: Improvement > Components: nodemanager >Reporter: Ramgopal N >Assignee: jian he >Priority: Minor > Labels: usability > Attachments: YARN-198.patch > > > If we are navigating to Nodemanager by clicking on the node link in RM, there > is no link provided on the NM to navigate back to RM. > If there is a link to navigate back to RM, it would be good -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (YARN-365) Each NM heartbeat should not generate an event for the Scheduler
[ https://issues.apache.org/jira/browse/YARN-365?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13588818#comment-13588818 ] Bikas Saha commented on YARN-365: -
Do we need to worry about there being overlap between the 2 lists, i.e. a newlyLaunchedContainer also got completed by the time the slow RM handled the NM updates?
{code}
+  private synchronized void nodeUpdate(RMNode nm) {
     if (LOG.isDebugEnabled()) {
       LOG.debug("nodeUpdate: " + nm + " clusterResources: " + clusterResource);
     }
-
-    FiCaSchedulerNode node = getNode(nm.getNodeID());
+    FiCaSchedulerNode node = getNode(nm.getNodeID());
+    List<UpdatedContainerInfo> containerInfoList = nm.pullContainerUpdates();
+    List<ContainerStatus> newlyLaunchedContainers = new ArrayList<ContainerStatus>();
+    List<ContainerStatus> completedContainers = new ArrayList<ContainerStatus>();
+    for (UpdatedContainerInfo containerInfo : containerInfoList) {
+      newlyLaunchedContainers.addAll(containerInfo.getNewlyLaunchedContainers());
+      completedContainers.addAll(containerInfo.getCompletedContainers());
+    }
+
{code}
Note that this problem (if it is a problem) exists regardless of this change, because a container may start and complete within the NM heartbeat interval. However, chances of hitting it are low before this change, because the heartbeat interval is short and so the RM never sees a node update in which the same container both launches and completes. After this change, with a slow RM, this can easily happen, especially because we are simply concatenating both sub-lists.
> Each NM heartbeat should not generate an event for the Scheduler > > > Key: YARN-365 > URL: https://issues.apache.org/jira/browse/YARN-365 > Project: Hadoop YARN > Issue Type: Sub-task > Components: resourcemanager, scheduler >Affects Versions: 0.23.5 >Reporter: Siddharth Seth >Assignee: Xuan Gong > Fix For: 2.0.4-beta > > Attachments: Prototype2.txt, Prototype3.txt, YARN-365.10.patch, > YARN-365.1.patch, YARN-365.2.patch, YARN-365.3.patch, YARN-365.4.patch, > YARN-365.5.patch, YARN-365.6.patch, YARN-365.7.patch, YARN-365.8.patch, > YARN-365.9.patch > > > Follow up from YARN-275 > https://issues.apache.org/jira/secure/attachment/12567075/Prototype.txt -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
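One way to guard against the overlap Bikas raises would be to drop the launch notification for any container that also appears in the completed list of the same update - a sketch only, not taken from any posted patch:
{code}
// Sketch: filter out containers that both launched and completed within
// the window covered by this node update, so the scheduler never
// processes a LAUNCHED event for an already-finished container.
// (Uses java.util.HashSet/Iterator and the YARN ContainerStatus record.)
Set<ContainerId> completedIds = new HashSet<ContainerId>();
for (ContainerStatus status : completedContainers) {
  completedIds.add(status.getContainerId());
}
Iterator<ContainerStatus> it = newlyLaunchedContainers.iterator();
while (it.hasNext()) {
  if (completedIds.contains(it.next().getContainerId())) {
    it.remove();
  }
}
{code}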
[jira] [Updated] (YARN-432) Documentation for Log Aggregation and log retrieval.
[ https://issues.apache.org/jira/browse/YARN-432?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinod Kumar Vavilapalli updated YARN-432: - Issue Type: Sub-task (was: Bug) Parent: YARN-431 > Documentation for Log Aggregation and log retrieval. > > > Key: YARN-432 > URL: https://issues.apache.org/jira/browse/YARN-432 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Mahadev konar >Assignee: Siddharth Seth > > Retrieving logs in 0.23 is very different from what 0.20.* does. This is a > very new feature which will require good documentation for users to get used > to it. Let's make sure we have some solid documentation for this. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Moved] (YARN-432) Documentation for Log Aggregation and log retrieval.
[ https://issues.apache.org/jira/browse/YARN-432?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinod Kumar Vavilapalli moved MAPREDUCE-3743 to YARN-432: - Component/s: (was: mrv2) Affects Version/s: (was: 0.23.0) Key: YARN-432 (was: MAPREDUCE-3743) Project: Hadoop YARN (was: Hadoop Map/Reduce) > Documentation for Log Aggregation and log retrieval. > > > Key: YARN-432 > URL: https://issues.apache.org/jira/browse/YARN-432 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Mahadev konar >Assignee: Siddharth Seth > > Retrieving logs in 0.23 is very different from what 0.20.* does. This is a > very new feature which will require good documentation for users to get used > to it. Let's make sure we have some solid documentation for this. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (YARN-395) RM should have a way to disable scheduling to a set of nodes
[ https://issues.apache.org/jira/browse/YARN-395?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinod Kumar Vavilapalli updated YARN-395: - Issue Type: Sub-task (was: Improvement) Parent: YARN-397 > RM should have a way to disable scheduling to a set of nodes > > > Key: YARN-395 > URL: https://issues.apache.org/jira/browse/YARN-395 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Bikas Saha >Assignee: Arun C Murthy > > There should be a way to say schedule to A, B and C but never to D. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (YARN-173) Page navigation support for container logs page
[ https://issues.apache.org/jira/browse/YARN-173?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinod Kumar Vavilapalli updated YARN-173: - Labels: usability (was: ) > Page navigation support for container logs page > --- > > Key: YARN-173 > URL: https://issues.apache.org/jira/browse/YARN-173 > Project: Hadoop YARN > Issue Type: Improvement > Components: nodemanager >Affects Versions: 2.0.2-alpha, 0.23.3 >Reporter: Jason Lowe > Labels: usability > > ContainerLogsPage and AggregatedLogsBlock both support {{start}} and {{end}} > parameters which are a big help when trying to sift through a huge log. > However it's annoying to have to manually edit the URL to go through a giant > log page-by-page. It would be very handy if the web page also provided page > navigation links so flipping to the next/previous/first/last chunk of log is > a simple click away. Bonus points for providing a way to easily change the > size of the log chunk shown per page. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (YARN-200) yarn log does not output all needed information, and is in a binary format
[ https://issues.apache.org/jira/browse/YARN-200?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinod Kumar Vavilapalli updated YARN-200: - Issue Type: Sub-task (was: Bug) Parent: YARN-431 > yarn log does not output all needed information, and is in a binary format > -- > > Key: YARN-200 > URL: https://issues.apache.org/jira/browse/YARN-200 > Project: Hadoop YARN > Issue Type: Sub-task >Affects Versions: 0.23.5 >Reporter: Robert Joseph Evans > Labels: usability > > yarn logs does not output attemptid, nodename, or container-id. Missing > these makes it very difficult to look through the logs for failed containers > and tie them back to actual tasks and task attempts. > Also the output currently includes several binary characters. This is OK for > being machine readable, but difficult for being human readable, or even for > using standard tools like grep. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (YARN-200) yarn log does not output all needed information, and is in a binary format
[ https://issues.apache.org/jira/browse/YARN-200?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinod Kumar Vavilapalli updated YARN-200: - Labels: usability (was: ) > yarn log does not output all needed information, and is in a binary format > -- > > Key: YARN-200 > URL: https://issues.apache.org/jira/browse/YARN-200 > Project: Hadoop YARN > Issue Type: Bug >Affects Versions: 0.23.5 >Reporter: Robert Joseph Evans > Labels: usability > > yarn logs does not output attemptid, nodename, or container-id. Missing > these makes it very difficult to look through the logs for failed containers > and tie them back to actual tasks and task attempts. > Also the output currently includes several binary characters. This is OK for > being machine readable, but difficult for being human readable, or even for > using standard tools like grep. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (YARN-196) Nodemanager if started before starting Resource manager is getting shutdown.But if both RM and NM are started and then after if RM is going down,NM is retrying for the RM
[ https://issues.apache.org/jira/browse/YARN-196?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13588696#comment-13588696 ] Hitesh Shah commented on YARN-196: -- Also, description in yarn-default.xml should mention the value is specified in seconds. > Nodemanager if started before starting Resource manager is getting > shutdown.But if both RM and NM are started and then after if RM is going > down,NM is retrying for the RM. > --- > > Key: YARN-196 > URL: https://issues.apache.org/jira/browse/YARN-196 > Project: Hadoop YARN > Issue Type: Bug > Components: nodemanager >Affects Versions: 3.0.0, 2.0.0-alpha >Reporter: Ramgopal N >Assignee: Xuan Gong > Attachments: MAPREDUCE-3676.patch, YARN-196.1.patch, > YARN-196.2.patch, YARN-196.3.patch, YARN-196.4.patch, YARN-196.5.patch > > > If NM is started before starting the RM ,NM is shutting down with the > following error > {code} > ERROR org.apache.hadoop.yarn.service.CompositeService: Error starting > services org.apache.hadoop.yarn.server.nodemanager.NodeManager > org.apache.avro.AvroRuntimeException: > java.lang.reflect.UndeclaredThrowableException > at > org.apache.hadoop.yarn.server.nodemanager.NodeStatusUpdaterImpl.start(NodeStatusUpdaterImpl.java:149) > at > org.apache.hadoop.yarn.service.CompositeService.start(CompositeService.java:68) > at > org.apache.hadoop.yarn.server.nodemanager.NodeManager.start(NodeManager.java:167) > at > org.apache.hadoop.yarn.server.nodemanager.NodeManager.main(NodeManager.java:242) > Caused by: java.lang.reflect.UndeclaredThrowableException > at > org.apache.hadoop.yarn.server.api.impl.pb.client.ResourceTrackerPBClientImpl.registerNodeManager(ResourceTrackerPBClientImpl.java:66) > at > org.apache.hadoop.yarn.server.nodemanager.NodeStatusUpdaterImpl.registerWithRM(NodeStatusUpdaterImpl.java:182) > at > org.apache.hadoop.yarn.server.nodemanager.NodeStatusUpdaterImpl.start(NodeStatusUpdaterImpl.java:145) > ... 3 more > Caused by: com.google.protobuf.ServiceException: java.net.ConnectException: > Call From HOST-10-18-52-230/10.18.52.230 to HOST-10-18-52-250:8025 failed on > connection exception: java.net.ConnectException: Connection refused; For more > details see: http://wiki.apache.org/hadoop/ConnectionRefused > at > org.apache.hadoop.yarn.ipc.ProtoOverHadoopRpcEngine$Invoker.invoke(ProtoOverHadoopRpcEngine.java:131) > at $Proxy23.registerNodeManager(Unknown Source) > at > org.apache.hadoop.yarn.server.api.impl.pb.client.ResourceTrackerPBClientImpl.registerNodeManager(ResourceTrackerPBClientImpl.java:59) > ... 5 more > Caused by: java.net.ConnectException: Call From > HOST-10-18-52-230/10.18.52.230 to HOST-10-18-52-250:8025 failed on connection > exception: java.net.ConnectException: Connection refused; For more details > see: http://wiki.apache.org/hadoop/ConnectionRefused > at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:857) > at org.apache.hadoop.ipc.Client.call(Client.java:1141) > at org.apache.hadoop.ipc.Client.call(Client.java:1100) > at > org.apache.hadoop.yarn.ipc.ProtoOverHadoopRpcEngine$Invoker.invoke(ProtoOverHadoopRpcEngine.java:128) > ... 
7 more > Caused by: java.net.ConnectException: Connection refused > at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method) > at > sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:574) > at > org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:206) > at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:659) > at > org.apache.hadoop.ipc.Client$Connection.setupConnection(Client.java:469) > at > org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:563) > at org.apache.hadoop.ipc.Client$Connection.access$2000(Client.java:211) > at org.apache.hadoop.ipc.Client.getConnection(Client.java:1247) > at org.apache.hadoop.ipc.Client.call(Client.java:1117) > ... 9 more > 2012-01-16 15:04:13,336 WARN org.apache.hadoop.yarn.event.AsyncDispatcher: > AsyncDispatcher thread interrupted > java.lang.InterruptedException > at > java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.reportInterruptAfterWait(AbstractQueuedSynchronizer.java:1899) > at > java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:1934) > at > java.util.concurrent.LinkedBlockingQueue.take(LinkedBlockingQueue.java:358) > at > org.apache.hadoop.yarn.event.AsyncDispatcher$1.run(AsyncDispatcher.java:76) > at java.l
[jira] [Resolved] (YARN-324) Provide way to preserve container directories
[ https://issues.apache.org/jira/browse/YARN-324?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinod Kumar Vavilapalli resolved YARN-324. -- Resolution: Invalid Lohit, as Jason mentioned, yarn.nodemanager.delete.debug-delay-sec should work for you. Please reopen this ticket if you disagree. > Provide way to preserve container directories > - > > Key: YARN-324 > URL: https://issues.apache.org/jira/browse/YARN-324 > Project: Hadoop YARN > Issue Type: Bug > Components: nodemanager, resourcemanager >Affects Versions: 2.0.3-alpha >Reporter: Lohit Vijayarenu > > There should be a way to preserve container directories (along with > filecache/appcache) for offline debugging. As of today, if a container > completes (either success or failure) it would get cleaned up. In case of > failure it becomes very hard to debug to find out what the cause of the > failure is. Having the ability to preserve container directories will enable > one to log into the machine and debug further for failures. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
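For reference, the property Jason and Vinod point to is set in yarn-site.xml; a sample entry (the 3600 below is just an example value):
{code}
<!-- Keep finished containers' local directories around so they can be
     inspected offline. Value is in seconds; 3600 (one hour) is only an
     example. -->
<property>
  <name>yarn.nodemanager.delete.debug-delay-sec</name>
  <value>3600</value>
</property>
{code}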
[jira] [Commented] (YARN-196) Nodemanager if started before starting Resource manager is getting shutdown.But if both RM and NM are started and then after if RM is going down,NM is retrying for the RM
[ https://issues.apache.org/jira/browse/YARN-196?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13588694#comment-13588694 ] Hitesh Shah commented on YARN-196: --
{code}
+  public static final int DEFAULT_RESOURCEMANAGER_CONNECT_WAIT_SECS =
+      15*60*1000;
+  public static final long DEFAULT_RESOURCEMANAGER_CONNECT_RETRY_INTERVAL_SECS =
+      30*1000;
{code}
The variable says seconds but the value is still in milliseconds? Likewise for yarn-default.xml.
{code}
+    long rmConnectWaitMS =
+        conf.getInt(
+            YarnConfiguration.RESOURCEMANAGER_CONNECT_WAIT_SECS,
+            YarnConfiguration.DEFAULT_RESOURCEMANAGER_CONNECT_WAIT_SECS);
+    long rmConnectionRetryIntervalMS =
+        conf.getLong(
+            YarnConfiguration.RESOURCEMANAGER_CONNECT_RETRY_INTERVAL_SECS,
+            YarnConfiguration.DEFAULT_RESOURCEMANAGER_CONNECT_RETRY_INTERVAL_SECS);
{code}
The above variables could be set using *1000 to keep the code clean. Special handling is needed for -1.
> Nodemanager if started before starting Resource manager is getting > shutdown.But if both RM and NM are started and then after if RM is going > down,NM is retrying for the RM. > --- > > Key: YARN-196 > URL: https://issues.apache.org/jira/browse/YARN-196 > Project: Hadoop YARN > Issue Type: Bug > Components: nodemanager >Affects Versions: 3.0.0, 2.0.0-alpha >Reporter: Ramgopal N >Assignee: Xuan Gong > Attachments: MAPREDUCE-3676.patch, YARN-196.1.patch, > YARN-196.2.patch, YARN-196.3.patch, YARN-196.4.patch, YARN-196.5.patch > > > If NM is started before starting the RM ,NM is shutting down with the > following error > {code} > ERROR org.apache.hadoop.yarn.service.CompositeService: Error starting > services org.apache.hadoop.yarn.server.nodemanager.NodeManager > org.apache.avro.AvroRuntimeException: > java.lang.reflect.UndeclaredThrowableException > at > org.apache.hadoop.yarn.server.nodemanager.NodeStatusUpdaterImpl.start(NodeStatusUpdaterImpl.java:149) > at > org.apache.hadoop.yarn.service.CompositeService.start(CompositeService.java:68) > at > org.apache.hadoop.yarn.server.nodemanager.NodeManager.start(NodeManager.java:167) > at > org.apache.hadoop.yarn.server.nodemanager.NodeManager.main(NodeManager.java:242) > Caused by: java.lang.reflect.UndeclaredThrowableException > at > org.apache.hadoop.yarn.server.api.impl.pb.client.ResourceTrackerPBClientImpl.registerNodeManager(ResourceTrackerPBClientImpl.java:66) > at > org.apache.hadoop.yarn.server.nodemanager.NodeStatusUpdaterImpl.registerWithRM(NodeStatusUpdaterImpl.java:182) > at > org.apache.hadoop.yarn.server.nodemanager.NodeStatusUpdaterImpl.start(NodeStatusUpdaterImpl.java:145) > ... 3 more > Caused by: com.google.protobuf.ServiceException: java.net.ConnectException: > Call From HOST-10-18-52-230/10.18.52.230 to HOST-10-18-52-250:8025 failed on > connection exception: java.net.ConnectException: Connection refused; For more > details see: http://wiki.apache.org/hadoop/ConnectionRefused > at > org.apache.hadoop.yarn.ipc.ProtoOverHadoopRpcEngine$Invoker.invoke(ProtoOverHadoopRpcEngine.java:131) > at $Proxy23.registerNodeManager(Unknown Source) > at > org.apache.hadoop.yarn.server.api.impl.pb.client.ResourceTrackerPBClientImpl.registerNodeManager(ResourceTrackerPBClientImpl.java:59) > ... 
5 more > Caused by: java.net.ConnectException: Call From > HOST-10-18-52-230/10.18.52.230 to HOST-10-18-52-250:8025 failed on connection > exception: java.net.ConnectException: Connection refused; For more details > see: http://wiki.apache.org/hadoop/ConnectionRefused > at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:857) > at org.apache.hadoop.ipc.Client.call(Client.java:1141) > at org.apache.hadoop.ipc.Client.call(Client.java:1100) > at > org.apache.hadoop.yarn.ipc.ProtoOverHadoopRpcEngine$Invoker.invoke(ProtoOverHadoopRpcEngine.java:128) > ... 7 more > Caused by: java.net.ConnectException: Connection refused > at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method) > at > sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:574) > at > org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:206) > at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:659) > at > org.apache.hadoop.ipc.Client$Connection.setupConnection(Client.java:469) > at > org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:563) > at org.apache.hadoop.ipc.Client$Connection.access$2000(Client.java:211) > at org.apache.hadoop.ipc.Client.getConnection(Client.java:1
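To make the units consistent, the settings could be read in seconds and converted once, with -1 treated as "retry forever" - a sketch only, reusing the constant names from Hitesh's snippet above:
{code}
// Sketch: read both settings in seconds and convert to milliseconds in
// one place; -1 means "no overall wait limit" and must not be multiplied.
long waitSecs = conf.getLong(
    YarnConfiguration.RESOURCEMANAGER_CONNECT_WAIT_SECS,
    YarnConfiguration.DEFAULT_RESOURCEMANAGER_CONNECT_WAIT_SECS);
long rmConnectWaitMS = (waitSecs < 0) ? -1 : waitSecs * 1000;
long rmConnectionRetryIntervalMS = conf.getLong(
    YarnConfiguration.RESOURCEMANAGER_CONNECT_RETRY_INTERVAL_SECS,
    YarnConfiguration.DEFAULT_RESOURCEMANAGER_CONNECT_RETRY_INTERVAL_SECS) * 1000;
{code}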
[jira] [Updated] (YARN-226) Log aggregation should not assume an AppMaster will have containerId 1
[ https://issues.apache.org/jira/browse/YARN-226?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinod Kumar Vavilapalli updated YARN-226: - Issue Type: Sub-task (was: Bug) Parent: YARN-431 > Log aggregation should not assume an AppMaster will have containerId 1 > -- > > Key: YARN-226 > URL: https://issues.apache.org/jira/browse/YARN-226 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Siddharth Seth > > In case of reservations, etc., AppMasters may not get container id 1. We > likely need additional info in the CLC / tokens indicating whether a > container is an AM or not. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (YARN-126) yarn rmadmin help message contains reference to hadoop cli and JT
[ https://issues.apache.org/jira/browse/YARN-126?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinod Kumar Vavilapalli updated YARN-126: - Labels: usability (was: ) > yarn rmadmin help message contains reference to hadoop cli and JT > - > > Key: YARN-126 > URL: https://issues.apache.org/jira/browse/YARN-126 > Project: Hadoop YARN > Issue Type: Bug > Components: client > Reporter: Thomas Graves > Labels: usability >
> It has an option to specify a job tracker, and the last line of the general command line syntax reads "bin/hadoop command [genericOptions] [commandOptions]".
> Ran "yarn rmadmin" to get the usage:
> RMAdmin
> Usage: java RMAdmin
>   [-refreshQueues]
>   [-refreshNodes]
>   [-refreshUserToGroupsMappings]
>   [-refreshSuperUserGroupsConfiguration]
>   [-refreshAdminAcls]
>   [-refreshServiceAcl]
>   [-help [cmd]]
> Generic options supported are
>   -conf specify an application configuration file
>   -D use value for given property
>   -fs specify a namenode
>   -jt specify a job tracker
>   -files specify comma separated files to be copied to the map reduce cluster
>   -libjars specify comma separated jar files to include in the classpath.
>   -archives specify comma separated archives to be unarchived on the compute machines.
> The general command line syntax is
> bin/hadoop command [genericOptions] [commandOptions] -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (YARN-171) NodeManager should serve logs directly if log-aggregation is not enabled
[ https://issues.apache.org/jira/browse/YARN-171?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinod Kumar Vavilapalli updated YARN-171: - Issue Type: Sub-task (was: Bug) Parent: YARN-431 > NodeManager should serve logs directly if log-aggregation is not enabled > > > Key: YARN-171 > URL: https://issues.apache.org/jira/browse/YARN-171 > Project: Hadoop YARN > Issue Type: Sub-task > Components: nodemanager >Affects Versions: 0.23.3 >Reporter: Vinod Kumar Vavilapalli >Assignee: Siddharth Seth > Attachments: YARN171_WIP.txt > > > NodeManagers never serve logs for completed applications. If log-aggregation > is not enabled, in the interim, due to bugs like YARN-162, this is a serious > problem for users, as logs are simply not available. > We should let nodes serve logs directly if > YarnConfiguration.LOG_AGGREGATION_ENABLED is not set. This should be okay as > NonAggregatingLogHandler can retain logs up to > YarnConfiguration.NM_LOG_RETAIN_SECONDS. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Created] (YARN-431) [Umbrella] Complete/Stabilize YARN application log-aggregation
Vinod Kumar Vavilapalli created YARN-431: Summary: [Umbrella] Complete/Stabilize YARN application log-aggregation Key: YARN-431 URL: https://issues.apache.org/jira/browse/YARN-431 Project: Hadoop YARN Issue Type: Bug Reporter: Vinod Kumar Vavilapalli -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (YARN-85) Allow per job log aggregation configuration
[ https://issues.apache.org/jira/browse/YARN-85?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinod Kumar Vavilapalli updated YARN-85: Issue Type: Sub-task (was: Improvement) Parent: YARN-431 > Allow per job log aggregation configuration > --- > > Key: YARN-85 > URL: https://issues.apache.org/jira/browse/YARN-85 > Project: Hadoop YARN > Issue Type: Sub-task > Components: nodemanager >Reporter: Siddharth Seth >Assignee: Siddharth Seth >Priority: Critical > > Currently, if log aggregation is enabled for a cluster, logs for all jobs > will be aggregated, leading to a whole bunch of files on HDFS that users > may not want. > Users should be able to control this along with the aggregation policy: > failed only, all, etc. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (YARN-239) Make link in "Aggregation is not enabled. Try the nodemanager at"
[ https://issues.apache.org/jira/browse/YARN-239?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinod Kumar Vavilapalli updated YARN-239: - Labels: usability (was: ) > Make link in "Aggregation is not enabled. Try the nodemanager at" > - > > Key: YARN-239 > URL: https://issues.apache.org/jira/browse/YARN-239 > Project: Hadoop YARN > Issue Type: Improvement > Components: nodemanager >Affects Versions: 2.0.0-alpha >Reporter: Radim Kolar >Priority: Trivial > Labels: usability > > If log aggregation is disabled, this message is displayed: > *Aggregation is not enabled. Try the nodemanager at reavers.com:9006* > It would be helpful to make the link to the nodemanager clickable. > This message is located in > /hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/webapp/log/AggregatedLogsBlock.java, > but I could not figure out how to make a link in the hamlet framework. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (YARN-417) Add a poller that allows the AM to receive notifications when it is assigned containers
[ https://issues.apache.org/jira/browse/YARN-417?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13588671#comment-13588671 ] Chris Riccomini commented on YARN-417: -- Hey Guys, I also agree with the comments and nits that Sandy/Karthik pointed out. Just so I'm clear, the way this would be used from a main thread would be:
1. instantiate
2. call init
3. call start
4. call register
5. make the initial container request
6. wait until containers complete
I think what I'd probably end up doing for #6 is just using a countdown latch that I wait on, with the callback decrementing it whenever a container completes (a minimal sketch follows this message). Probably good enough. Cheers, Chris > Add a poller that allows the AM to receive notifications when it is assigned > containers > --- > > Key: YARN-417 > URL: https://issues.apache.org/jira/browse/YARN-417 > Project: Hadoop YARN > Issue Type: Sub-task > Components: api, applications >Affects Versions: 2.0.3-alpha >Reporter: Sandy Ryza >Assignee: Sandy Ryza > Attachments: AMRMClientAsync-1.java, AMRMClientAsync.java, > YARN-417.patch, YarnAppMaster.java, YarnAppMasterListener.java > > > Writing AMs would be easier for some if they did not have to handle > heartbeating to the RM on their own. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
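[Editor's note] A minimal sketch of the latch pattern Chris describes for step 6, assuming a hypothetical listener interface with an onContainersCompleted callback; the names are illustrative stand-ins, not the AMRMClientAsync API attached to this JIRA:
{code}
import java.util.List;
import java.util.concurrent.CountDownLatch;

// Illustrative only: block the AM's main thread until all requested
// containers have completed. CallbackHandler and ContainerStatus are
// hypothetical stand-ins for the async client's listener and the YARN
// record type.
public class WaitForContainers {

  interface CallbackHandler {
    void onContainersCompleted(List<ContainerStatus> statuses);
  }

  static class ContainerStatus {} // stand-in for the YARN record

  public static void main(String[] args) throws InterruptedException {
    final int numContainers = 10; // containers requested in step 5
    final CountDownLatch done = new CountDownLatch(numContainers);

    // Step 6: the callback decrements the latch once per completed container.
    CallbackHandler handler = statuses -> {
      for (ContainerStatus status : statuses) {
        done.countDown();
      }
    };

    // ... instantiate/init/start/register the async client with `handler`
    // and submit the container requests (steps 1-5), then block:
    done.await();
  }
}
{code}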
[jira] [Assigned] (YARN-198) If we are navigating to Nodemanager UI from Resourcemanager, then there is no link to navigate back to Resource manager
[ https://issues.apache.org/jira/browse/YARN-198?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] jian he reassigned YARN-198: Assignee: jian he (was: Senthil V Kumar) Hey Senthil, I have a patch for this, mind if I take this over? > If we are navigating to Nodemanager UI from Resourcemanager, then there is no > link to navigate back to Resource manager > --- > > Key: YARN-198 > URL: https://issues.apache.org/jira/browse/YARN-198 > Project: Hadoop YARN > Issue Type: Improvement > Components: nodemanager >Reporter: Ramgopal N >Assignee: jian he >Priority: Minor > Labels: usability > > If we navigate to the Nodemanager by clicking on the node link in the RM, there > is no link provided on the NM to navigate back to the RM. > A link to navigate back to the RM would be good. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Assigned] (YARN-237) Refreshing the RM page forgets how many rows I had in my Datatables
[ https://issues.apache.org/jira/browse/YARN-237?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] jian he reassigned YARN-237: Assignee: jian he > Refreshing the RM page forgets how many rows I had in my Datatables > --- > > Key: YARN-237 > URL: https://issues.apache.org/jira/browse/YARN-237 > Project: Hadoop YARN > Issue Type: Improvement > Components: resourcemanager >Affects Versions: 2.0.2-alpha, 0.23.4, 3.0.0 >Reporter: Ravi Prakash >Assignee: jian he > Labels: usability > > If I choose 100 rows and then refresh the page, DataTables goes back to > showing me 20 rows. > This user preference should be stored in a cookie. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Created] (YARN-430) Add HDFS based store for RM
Bikas Saha created YARN-430: --- Summary: Add HDFS based store for RM Key: YARN-430 URL: https://issues.apache.org/jira/browse/YARN-430 Project: Hadoop YARN Issue Type: Sub-task Reporter: Bikas Saha Assignee: Bikas Saha There is a generic FileSystem store, but it does not take advantage of HDFS features like directories, replication, and DFSClient advanced settings for HA, retries, etc. Writing a store that's optimized for HDFS would be good. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (YARN-417) Add a poller that allows the AM to receive notifications when it is assigned containers
[ https://issues.apache.org/jira/browse/YARN-417?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sandy Ryza updated YARN-417: Attachment: AMRMClientAsync-1.java > Add a poller that allows the AM to receive notifications when it is assigned > containers > --- > > Key: YARN-417 > URL: https://issues.apache.org/jira/browse/YARN-417 > Project: Hadoop YARN > Issue Type: Sub-task > Components: api, applications >Affects Versions: 2.0.3-alpha >Reporter: Sandy Ryza >Assignee: Sandy Ryza > Attachments: AMRMClientAsync-1.java, AMRMClientAsync.java, > YARN-417.patch, YarnAppMaster.java, YarnAppMasterListener.java > > > Writing AMs would be easier for some if they did not have to handle > heartbeating to the RM on their own. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (YARN-417) Add a poller that allows the AM to receive notifications when it is assigned containers
[ https://issues.apache.org/jira/browse/YARN-417?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13588552#comment-13588552 ] Sandy Ryza commented on YARN-417: - Thanks for the interface proposal, Bikas. It looks good to me. Having a separate method invocation for each completed container shouldn't have a significant performance impact, as Java inlines even virtual methods when it needs to (http://www.quora.com/How-many-CPU-instructions-are-typical-for-Java-method-call-overhead), but the single call with the list doesn't seem any worse to me. Nits:
* missing an onReboot method
* ContainerCompletionStatus describes a cause more than a state, so a name like ContainerCompletionReason fits a little better to me
* agree with Karthik that it would be much more intuitive for getContainerCompletionStatus to be in ContainerStatus. Is there a strong reason against this?
Attaching an updated proposal (a rough sketch of the interface under discussion follows this message). > Add a poller that allows the AM to receive notifications when it is assigned > containers > --- > > Key: YARN-417 > URL: https://issues.apache.org/jira/browse/YARN-417 > Project: Hadoop YARN > Issue Type: Sub-task > Components: api, applications >Affects Versions: 2.0.3-alpha >Reporter: Sandy Ryza >Assignee: Sandy Ryza > Attachments: AMRMClientAsync.java, YARN-417.patch, > YarnAppMaster.java, YarnAppMasterListener.java > > > Writing AMs would be easier for some if they did not have to handle > heartbeating to the RM on their own. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
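[Editor's note] For readers following the thread, a rough sketch of the kind of callback surface being debated, folding in the nits above (an onReboot method, the ContainerCompletionReason naming). This is an assumption-laden approximation, not the AMRMClientAsync.java attachment:
{code}
import java.util.List;

// Rough sketch only: approximates the listener interface under discussion,
// not the attached proposal. Container and ContainerStatus are stand-ins
// for org.apache.hadoop.yarn.api.records types.
public interface AMCallbackHandler {

  /** A cause rather than a state, per the naming nit above. */
  enum ContainerCompletionReason {
    SUCCEEDED, FAILED, KILLED, PREEMPTED
  }

  /** Called with all containers newly allocated on a heartbeat. */
  void onContainersAllocated(List<Container> containers);

  /** Called with all containers completed since the last heartbeat. */
  void onContainersCompleted(List<ContainerStatus> statuses);

  /** The missing callback noted above: the RM asked the AM to reboot/resync. */
  void onReboot();

  // Hypothetical stand-ins so the sketch is self-contained:
  final class Container {}
  final class ContainerStatus {}
}
{code}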
[jira] [Commented] (YARN-417) Add a poller that allows the AM to receive notifications when it is assigned containers
[ https://issues.apache.org/jira/browse/YARN-417?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13588501#comment-13588501 ] Karthik Kambatla commented on YARN-417: --- Thanks Bikas, the interface looks good. One comment though - shouldn't we move {{ContainerCompletionStatus}} and {{getContainerCompletionStatus}} to {{ContainerStatus}}? > Add a poller that allows the AM to receive notifications when it is assigned > containers > --- > > Key: YARN-417 > URL: https://issues.apache.org/jira/browse/YARN-417 > Project: Hadoop YARN > Issue Type: Sub-task > Components: api, applications >Affects Versions: 2.0.3-alpha >Reporter: Sandy Ryza >Assignee: Sandy Ryza > Attachments: AMRMClientAsync.java, YARN-417.patch, > YarnAppMaster.java, YarnAppMasterListener.java > > > Writing AMs would be easier for some if they did not have to handle > heartbeating to the RM on their own. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (YARN-426) Failure to download a public resource on a node prevents further downloads of the resource from that node
[ https://issues.apache.org/jira/browse/YARN-426?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13588437#comment-13588437 ] Hudson commented on YARN-426: - Integrated in Hadoop-trunk-Commit #3390 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/3390/]) YARN-426. Failure to download a public resource prevents further downloads (Jason Lowe via bobby) (Revision 1450807) Result = SUCCESS bobby : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1450807 Files : * /hadoop/common/trunk/hadoop-yarn-project/CHANGES.txt * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/localizer/ResourceLocalizationService.java * /hadoop/common/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/localizer/TestResourceLocalizationService.java > Failure to download a public resource on a node prevents further downloads of > the resource from that node > - > > Key: YARN-426 > URL: https://issues.apache.org/jira/browse/YARN-426 > Project: Hadoop YARN > Issue Type: Bug > Components: nodemanager >Affects Versions: 2.0.3-alpha, 0.23.6 >Reporter: Jason Lowe >Assignee: Jason Lowe >Priority: Critical > Fix For: 3.0.0, 0.23.7, 2.0.4-beta > > Attachments: YARN-426.patch > > > If the NM encounters an error while downloading a public resource, it fails > to empty the list of request events corresponding to the resource request in > {{attempts}}. If the same public resource is subsequently requested on that > node, {{PublicLocalizer.addResource}} will skip the download since it will > mistakenly believe a download of that resource is already in progress. At > that point any container that requests the public resource will just hang in > the {{LOCALIZING}} state. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
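[Editor's note] The description above boils down to a stale entry in a pending-request map. A generic, simplified sketch of the failure mode and the fix (illustrative names; this is not the actual ResourceLocalizationService code):
{code}
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Generic illustration of the YARN-426 failure mode: if a failed download
// never clears its pending entry, every later request for the same resource
// sees "already in progress" and waits forever.
public class PendingDownloads<K, R> {

  private final Map<K, List<R>> attempts = new HashMap<>();

  /** Returns true if the caller should start the download now. */
  public synchronized boolean addResource(K resource, R request) {
    List<R> pending = attempts.get(resource);
    if (pending != null) {
      pending.add(request); // a download appears to be in progress already
      return false;
    }
    List<R> fresh = new ArrayList<>();
    fresh.add(request);
    attempts.put(resource, fresh);
    return true;
  }

  /** The fix: on failure, drop the entry so future requests can retry. */
  public synchronized List<R> downloadFailed(K resource) {
    return attempts.remove(resource);
  }
}
{code}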
[jira] [Commented] (YARN-426) Failure to download a public resource on a node prevents further downloads of the resource from that node
[ https://issues.apache.org/jira/browse/YARN-426?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13588424#comment-13588424 ] Robert Joseph Evans commented on YARN-426: -- The patch looks good to me. +1 I'll check it in. > Failure to download a public resource on a node prevents further downloads of > the resource from that node > - > > Key: YARN-426 > URL: https://issues.apache.org/jira/browse/YARN-426 > Project: Hadoop YARN > Issue Type: Bug > Components: nodemanager >Affects Versions: 2.0.3-alpha, 0.23.6 >Reporter: Jason Lowe >Assignee: Jason Lowe >Priority: Critical > Attachments: YARN-426.patch > > > If the NM encounters an error while downloading a public resource, it fails > to empty the list of request events corresponding to the resource request in > {{attempts}}. If the same public resource is subsequently requested on that > node, {{PublicLocalizer.addResource}} will skip the download since it will > mistakenly believe a download of that resource is already in progress. At > that point any container that requests the public resource will just hang in > the {{LOCALIZING}} state. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (YARN-417) Add a poller that allows the AM to receive notifications when it is assigned containers
[ https://issues.apache.org/jira/browse/YARN-417?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bikas Saha updated YARN-417: Attachment: AMRMClientAsync.java Attaching an interface proposal > Add a poller that allows the AM to receive notifications when it is assigned > containers > --- > > Key: YARN-417 > URL: https://issues.apache.org/jira/browse/YARN-417 > Project: Hadoop YARN > Issue Type: Sub-task > Components: api, applications >Affects Versions: 2.0.3-alpha >Reporter: Sandy Ryza >Assignee: Sandy Ryza > Attachments: AMRMClientAsync.java, YARN-417.patch, > YarnAppMaster.java, YarnAppMasterListener.java > > > Writing AMs would be easier for some if they did not have to handle > heartbeating to the RM on their own. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira