date:20160911

[jira] [Commented] (YARN-5620) Core changes in NodeManager to support for upgrade and rollback of Containers

2016-09-11 Thread Jian He (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-5620?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15483274#comment-15483274
 ] 

Jian He commented on YARN-5620:
---

bq. the container should be killable explicitly via an external signal.
IIUC,  in this case, the ContainerImpl will receive the KILL event first and 
move to the KILLING state, and the CONTAINER_KILLED_ON_REQUEST will be sent to 
the container at KILLING state.  

> Core changes in NodeManager to support for upgrade and rollback of Containers
> -
>
> Key: YARN-5620
> URL: https://issues.apache.org/jira/browse/YARN-5620
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Arun Suresh
>Assignee: Arun Suresh
> Attachments: YARN-5620.001.patch, YARN-5620.002.patch, 
> YARN-5620.003.patch, YARN-5620.004.patch, YARN-5620.005.patch, 
> YARN-5620.006.patch, YARN-5620.007.patch, YARN-5620.008.patch
>
>
> JIRA proposes to modify the ContainerManager (and other core classes) to 
> support upgrade of a running container with a new {{ContainerLaunchContext}} 
> as well as the ability to rollback the upgrade if the container is not able 
> to restart using the new launch Context. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org

[jira] [Commented] (YARN-5631) Missing refreshClusterMaxPriority usage in rmadmin help message

2016-09-11 Thread Sunil G (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-5631?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15483276#comment-15483276
 ] 

Sunil G commented on YARN-5631:
---

Looks good for me.

> Missing refreshClusterMaxPriority usage in rmadmin help message
> ---
>
> Key: YARN-5631
> URL: https://issues.apache.org/jira/browse/YARN-5631
> Project: Hadoop YARN
>  Issue Type: Bug
>Affects Versions: 3.0.0-alpha2
>Reporter: Kai Sasaki
>Assignee: Kai Sasaki
>Priority: Minor
> Attachments: YARN-5631.01.patch, YARN-5631.02.patch
>
>
> {{rmadmin -help}} does not show {{-refreshClusterMaxPriority}} option in 
> usage line.
> {code}
> $ bin/yarn rmadmin -help
> rmadmin is the command to execute YARN administrative commands.
> The full syntax is:
> yarn rmadmin [-refreshQueues] [-refreshNodes [-g|graceful [timeout in 
> seconds] -client|server]] [-refreshNodesResources] 
> [-refreshSuperUserGroupsConfiguration] [-refreshUserToGroupsMappings] 
> [-refreshAdminAcls] [-refreshServiceAcl] [-getGroup [username]] 
> [-addToClusterNodeLabels 
> <"label1(exclusive=true),label2(exclusive=false),label3">] 
> [-removeFromClusterNodeLabels ] [-replaceLabelsOnNode 
> <"node1[:port]=label1,label2 node2[:port]=label1">] 
> [-directlyAccessNodeLabelStore] [-updateNodeResource [NodeID] [MemSize] 
> [vCores] ([OvercommitTimeout]) [-help [cmd]]
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org

[jira] [Commented] (YARN-4855) Should check if node exists when replace nodelabels

2016-09-11 Thread Sunil G (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-4855?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15483269#comment-15483269
 ] 

Sunil G commented on YARN-4855:
---

HI [~Tao Jie] and [~Naganarasimha Garla]

Thanks for the work on this item. I have doubts on the patch.

- Thinking out node, {{replaceLabelsOnNodes}} seems a little complex for me. 
Could we add a server side API to check whether set of nodes are registered or 
not? {{ConcurrentMap getRMNodes()}} is available at server 
side. So it will be faster and better approach.
- If we are sticking with client side impl, then we could try make use of 
{{NodeId#compareTo}} instead of {{isNodeSame}}


> Should check if node exists when replace nodelabels
> ---
>
> Key: YARN-4855
> URL: https://issues.apache.org/jira/browse/YARN-4855
> Project: Hadoop YARN
>  Issue Type: Improvement
>Affects Versions: 2.6.0
>Reporter: Tao Jie
>Assignee: Tao Jie
>Priority: Minor
> Attachments: YARN-4855.001.patch, YARN-4855.002.patch, 
> YARN-4855.003.patch, YARN-4855.004.patch, YARN-4855.005.patch, 
> YARN-4855.006.patch, YARN-4855.007.patch, YARN-4855.008.patch
>
>
> Today when we add nodelabels to nodes, it would succeed even if nodes are not 
> existing NodeManger in cluster without any message.
> It could be like this:
> When we use *yarn rmadmin -replaceLabelsOnNode --fail-on-unkown-nodes 
> "node1=label1"* , it would be denied if node is unknown.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org

[jira] [Updated] (YARN-5587) Add support for resource profiles

2016-09-11 Thread Varun Vasudev (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-5587?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Varun Vasudev updated YARN-5587:

Attachment: YARN-5587-YARN-3926.001.patch

> Add support for resource profiles
> -
>
> Key: YARN-5587
> URL: https://issues.apache.org/jira/browse/YARN-5587
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: nodemanager, resourcemanager
>Reporter: Varun Vasudev
>Assignee: Varun Vasudev
> Attachments: YARN-5587-YARN-3926.001.patch
>
>
> Add support for resource profiles on the RM side to allow users to use 
> shorthands to specify resource requirements.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org

[jira] [Commented] (YARN-5561) [Atsv2] : Support for ability to retrieve apps/app-attempt/containers and entities via REST

2016-09-11 Thread Varun Saxena (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-5561?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15483258#comment-15483258
 ] 

Varun Saxena commented on YARN-5561:


bq.  are looking for a complete applications page where all applications which 
were running/completed had to be listed. For this purpose, I think we need the 
api as suggested by Rohith. Being said this, We will also be showing hierarchy 
from flows too. 
So you plan to have 2 app pages. One from a specific flow run and other a list 
of all the apps in a cluster. Right ?
[~rohithsharma], how do you plan to support fetching all apps within a cluster ?
Probably you can adopt the approach I had suggested. Because otherwise it would 
lead to full table scan.

bq. New API's required. Thoughts?
We should have them for the sake of completeness.

> [Atsv2] : Support for ability to retrieve apps/app-attempt/containers and 
> entities via REST
> ---
>
> Key: YARN-5561
> URL: https://issues.apache.org/jira/browse/YARN-5561
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: timelinereader
>Reporter: Rohith Sharma K S
>Assignee: Rohith Sharma K S
> Attachments: YARN-5561.patch, YARN-5561.v0.patch
>
>
> ATSv2 model lacks retrieval of {{list-of-all-apps}}, 
> {{list-of-all-app-attempts}} and {{list-of-all-containers-per-attempt}} via 
> REST API's. And also it is required to know about all the entities in an 
> applications.
> It is pretty much highly required these URLs for Web  UI.
> New REST URL would be 
> # GET {{/ws/v2/timeline/apps}}
> # GET {{/ws/v2/timeline/apps/\{app-id\}/appattempts}}.
> # GET 
> {{/ws/v2/timeline/apps/\{app-id\}/appattempts/\{attempt-id\}/containers}}
> # GET {{/ws/v2/timeline/apps/\{app id\}/entities}} should display list of 
> entities that can be queried.  



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org

[jira] [Commented] (YARN-5545) App submit failure on queue with label when default queue partition capacity is zero

2016-09-11 Thread Bibin A Chundatt (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-5545?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15483254#comment-15483254
 ] 

Bibin A Chundatt commented on YARN-5545:


[~Naganarasimha Garla] and [~sunilg]

{quote}
 Also consider the cases when the accessibility is * and new partitions are 
added without refreshing, this configuration will be wrong as its static.
{quote}
Thank you for pointing out will check the same. But [~Naganarasimha Garla]  
when ever we reconfigure capacity scheduler xml this limits also will get 
refreshed.

{quote}
Would it be better to set the default value of 
yarn.scheduler.capacity.maximum-applications.accessible-node-labels. to 
that of yarn.scheduler.capacity.maximum-applications
{quote}
Will use  {{yarn.scheduler.capacity.maximum-applications}} itself.

{quote}
IIUC you seem to adopt the approach little different than what you mention in 
your comment, though we are having per partition level max app limit, we just 
sum up max limits of all partitions under a queue and check against 
ApplicationLimit.getAllMaxApplication()
{quote}
This was added since application per partition we can't consider for app limit 
IIUC we have to check max apps to queue from all partitions.

Documentation will add for the same.





> App submit failure on queue with label when default queue partition capacity 
> is zero
> 
>
> Key: YARN-5545
> URL: https://issues.apache.org/jira/browse/YARN-5545
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Bibin A Chundatt
>Assignee: Bibin A Chundatt
> Attachments: YARN-5545.0001.patch, YARN-5545.0002.patch, 
> YARN-5545.0003.patch, capacity-scheduler.xml
>
>
> Configure capacity scheduler 
> yarn.scheduler.capacity.root.default.capacity=0
> yarn.scheduler.capacity.root.queue1.accessible-node-labels.labelx.capacity=50
> yarn.scheduler.capacity.root.default.accessible-node-labels.labelx.capacity=50
> Submit application as below
> ./yarn jar 
> ../share/hadoop/mapreduce/hadoop-mapreduce-client-jobclient-3.0.0-alpha2-SNAPSHOT-tests.jar
>  sleep -Dmapreduce.job.node-label-expression=labelx 
> -Dmapreduce.job.queuename=default -m 1 -r 1 -mt 1000 -rt 1
> {noformat}
> 2016-08-21 18:21:31,375 INFO mapreduce.JobSubmitter: Cleaning up the staging 
> area /tmp/hadoop-yarn/staging/root/.staging/job_1471670113386_0001
> java.io.IOException: org.apache.hadoop.yarn.exceptions.YarnException: Failed 
> to submit application_1471670113386_0001 to YARN : 
> org.apache.hadoop.security.AccessControlException: Queue root.default already 
> has 0 applications, cannot accept submission of application: 
> application_1471670113386_0001
>   at org.apache.hadoop.mapred.YARNRunner.submitJob(YARNRunner.java:316)
>   at 
> org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:255)
>   at org.apache.hadoop.mapreduce.Job$11.run(Job.java:1344)
>   at org.apache.hadoop.mapreduce.Job$11.run(Job.java:1341)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:422)
>   at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1790)
>   at org.apache.hadoop.mapreduce.Job.submit(Job.java:1341)
>   at org.apache.hadoop.mapreduce.Job.waitForCompletion(Job.java:1362)
>   at org.apache.hadoop.mapreduce.SleepJob.run(SleepJob.java:273)
>   at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:76)
>   at org.apache.hadoop.mapreduce.SleepJob.main(SleepJob.java:194)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:497)
>   at 
> org.apache.hadoop.util.ProgramDriver$ProgramDescription.invoke(ProgramDriver.java:71)
>   at org.apache.hadoop.util.ProgramDriver.run(ProgramDriver.java:144)
>   at 
> org.apache.hadoop.test.MapredTestDriver.run(MapredTestDriver.java:136)
>   at 
> org.apache.hadoop.test.MapredTestDriver.main(MapredTestDriver.java:144)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:497)
>   at org.apache.hadoop.util.RunJar.run(RunJar.java:239)
>   at org.apache.hadoop.util.RunJar.main(RunJar.java:153)
> Caused by: org.apache.hadoop.yarn.exceptions.YarnException: Failed to submit 
> application_1471670113386_0001 to YARN :

[jira] [Commented] (YARN-5148) [YARN-3368] Add page to new YARN UI to view server side configurations/logs/JVM-metrics

2016-09-11 Thread Sunil G (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-5148?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15483222#comment-15483222
 ] 

Sunil G commented on YARN-5148:
---

Thanks [~kaisasak] for the update.

Some comments from UI screen shot

1. Could we categorize with labels like YARN, MR etc?
2. I think logs page has to be thought about. If we are planning to show last N 
number of line, and then download full log, we can try with a better ui look. 
Could you please share a screen shot. Similarly for JMX.

> [YARN-3368] Add page to new YARN UI to view server side 
> configurations/logs/JVM-metrics
> ---
>
> Key: YARN-5148
> URL: https://issues.apache.org/jira/browse/YARN-5148
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Wangda Tan
>Assignee: Kai Sasaki
> Attachments: Screen Shot 2016-09-11 at 23.28.31.png, 
> YARN-5148-YARN-3368.01.patch, YARN-5148-YARN-3368.02.patch, yarn-conf.png, 
> yarn-tools.png
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org

[jira] [Commented] (YARN-5586) Update the Resources class to consider all resource types

2016-09-11 Thread Varun Vasudev (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-5586?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15483198#comment-15483198
 ] 

Varun Vasudev commented on YARN-5586:
-

Thanks for the review and commit [~rohithsharma]

> Update the Resources class to consider all resource types
> -
>
> Key: YARN-5586
> URL: https://issues.apache.org/jira/browse/YARN-5586
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: nodemanager, resourcemanager
>Reporter: Varun Vasudev
>Assignee: Varun Vasudev
> Fix For: YARN-3926
>
> Attachments: YARN-5586-YARN-3926.001.patch, 
> YARN-5586-YARN-3926.002.patch
>
>
> The Resources class provides a bunch of useful functions like clone, addTo, 
> etc. These need to be updated to consider all resource types instead of just 
> memory and cpu.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org

[jira] [Commented] (YARN-5620) Core changes in NodeManager to support for upgrade and rollback of Containers

2016-09-11 Thread Arun Suresh (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-5620?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15483158#comment-15483158
 ] 

Arun Suresh commented on YARN-5620:
---

Thanks [~jianhe]..

Will update patch shortly with your suggestions. But, with regard to this :
bq. Looks like we don’t need the killedForReInitialization flag in 
ContainerLaunch, because container_killed event can already be distinguished 
based on whether container is at Reinit or running state.
This we actually do need the flag.. Since even while the container is 
re-initing (while resource is localizing but before the container_cleanup 
signal is sent), the container should be killable explicitly via an external 
signal. The {{ContainerLaunch}} should be able to know the difference and after 
killing, it should either {{ContainerEventType.CONTAINER_KILLED_ON_REQUEST}} or 
{{ContainerEventType.CONTAINER_KILLED_FOR_REINIT}}. That reminds me... I have 
to add the following transition as well:
{code}
.addTransition(ContainerState.REINITIALIZING,
ContainerState.EXITED_WITH_FAILURE,
ContainerEventType.CONTAINER_KILLED_ON_REQUEST,
new KilledExternallyTransition())
{code}

> Core changes in NodeManager to support for upgrade and rollback of Containers
> -
>
> Key: YARN-5620
> URL: https://issues.apache.org/jira/browse/YARN-5620
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Arun Suresh
>Assignee: Arun Suresh
> Attachments: YARN-5620.001.patch, YARN-5620.002.patch, 
> YARN-5620.003.patch, YARN-5620.004.patch, YARN-5620.005.patch, 
> YARN-5620.006.patch, YARN-5620.007.patch, YARN-5620.008.patch
>
>
> JIRA proposes to modify the ContainerManager (and other core classes) to 
> support upgrade of a running container with a new {{ContainerLaunchContext}} 
> as well as the ability to rollback the upgrade if the container is not able 
> to restart using the new launch Context. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org

[jira] [Commented] (YARN-5545) App submit failure on queue with label when default queue partition capacity is zero

2016-09-11 Thread Sunil G (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-5545?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15483137#comment-15483137
 ] 

Sunil G commented on YARN-5545:
---

[~Naganarasimha Garla] and [~bibinchundatt]

I earlier suggested to have "maximum-applications" per label. And as mentioned 
by Naga in the last summary, it is one of the option to control apps for 
labels. 
However it may be an added hurdle for admins to set it correctly per-label. 
Also I had discussed with [~leftnoteasy] earlier.I think its better if we have  
{{maximum-applications per  }} in cluster-wise (as mentioned in option2 
with slight difference) to that of 
{{yarn.scheduler.capacity.maximum-applications}}. May be we need not have to 
expose this as a new config. Rather we can adopt it from 
{{yarn.scheduler.capacity.maximum-applications}} itself. It could be documented 
to explain this. Thoughts?

> App submit failure on queue with label when default queue partition capacity 
> is zero
> 
>
> Key: YARN-5545
> URL: https://issues.apache.org/jira/browse/YARN-5545
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Bibin A Chundatt
>Assignee: Bibin A Chundatt
> Attachments: YARN-5545.0001.patch, YARN-5545.0002.patch, 
> YARN-5545.0003.patch, capacity-scheduler.xml
>
>
> Configure capacity scheduler 
> yarn.scheduler.capacity.root.default.capacity=0
> yarn.scheduler.capacity.root.queue1.accessible-node-labels.labelx.capacity=50
> yarn.scheduler.capacity.root.default.accessible-node-labels.labelx.capacity=50
> Submit application as below
> ./yarn jar 
> ../share/hadoop/mapreduce/hadoop-mapreduce-client-jobclient-3.0.0-alpha2-SNAPSHOT-tests.jar
>  sleep -Dmapreduce.job.node-label-expression=labelx 
> -Dmapreduce.job.queuename=default -m 1 -r 1 -mt 1000 -rt 1
> {noformat}
> 2016-08-21 18:21:31,375 INFO mapreduce.JobSubmitter: Cleaning up the staging 
> area /tmp/hadoop-yarn/staging/root/.staging/job_1471670113386_0001
> java.io.IOException: org.apache.hadoop.yarn.exceptions.YarnException: Failed 
> to submit application_1471670113386_0001 to YARN : 
> org.apache.hadoop.security.AccessControlException: Queue root.default already 
> has 0 applications, cannot accept submission of application: 
> application_1471670113386_0001
>   at org.apache.hadoop.mapred.YARNRunner.submitJob(YARNRunner.java:316)
>   at 
> org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:255)
>   at org.apache.hadoop.mapreduce.Job$11.run(Job.java:1344)
>   at org.apache.hadoop.mapreduce.Job$11.run(Job.java:1341)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:422)
>   at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1790)
>   at org.apache.hadoop.mapreduce.Job.submit(Job.java:1341)
>   at org.apache.hadoop.mapreduce.Job.waitForCompletion(Job.java:1362)
>   at org.apache.hadoop.mapreduce.SleepJob.run(SleepJob.java:273)
>   at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:76)
>   at org.apache.hadoop.mapreduce.SleepJob.main(SleepJob.java:194)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:497)
>   at 
> org.apache.hadoop.util.ProgramDriver$ProgramDescription.invoke(ProgramDriver.java:71)
>   at org.apache.hadoop.util.ProgramDriver.run(ProgramDriver.java:144)
>   at 
> org.apache.hadoop.test.MapredTestDriver.run(MapredTestDriver.java:136)
>   at 
> org.apache.hadoop.test.MapredTestDriver.main(MapredTestDriver.java:144)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:497)
>   at org.apache.hadoop.util.RunJar.run(RunJar.java:239)
>   at org.apache.hadoop.util.RunJar.main(RunJar.java:153)
> Caused by: org.apache.hadoop.yarn.exceptions.YarnException: Failed to submit 
> application_1471670113386_0001 to YARN : 
> org.apache.hadoop.security.AccessControlException: Queue root.default already 
> has 0 applications, cannot accept submission of application: 
> application_1471670113386_0001
>   at 
> org.apache.hadoop.yarn.client.api.impl.YarnClientImpl.submitApplication(YarnClientImpl.java:286)
>   at 
> org.apache.hadoop.mapred.ResourceMgrDelegate.submitApplication(ResourceMgrDelegate.java:

[jira] [Commented] (YARN-5620) Core changes in NodeManager to support for upgrade and rollback of Containers

2016-09-11 Thread Jian He (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-5620?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15483135#comment-15483135
 ] 

Jian He commented on YARN-5620:
---

Arun, thanks for updating ! looks good to me overall, few more comments:
- Unused parameter container in createReInitContext
- We can use the ResourceSet#getAllResourcesByVisibility method instead, and so 
the getLocalPendingRequests method and the new constructor in 
ContainerLocalizationRequestEvent is not needed
{code}
Collection reqs = getLocalPendingRequests(event);
if (!reqs.isEmpty()) {
  container.dispatcher.getEventHandler().handle( 
  new ContainerLocalizationRequestEvent(container, reqs));
{code}
- In ResourceLocalizedWhileReInitTransition, why is checkAndUpdatePending 
needed ? why do we need to pass in a hashset to store the links ? I think the 
symlinks should be always distinct. I wonder we can can just call 
“resourceSet.resourceLocalized”
- The newly added interfaces 
(startReInitialization/completeReInitialization/isReInitializing) in NMContext. 
Thinking if we can have a flag in ContainerImpl for this.. the advantage is 
that it won’t cause object leak for any reason we forgot to remove the 
container from the set.  Else, we can expose a single 
getReinitializingContainers() in the NMContext instead of three, as a getter 
API looks conforming more with the other APIs in the NMContext.
- we lost the tracking of the failed resource if we set the reInitContext to 
null? probably we should add the failed resource to ContainerImpl.resourceSet
{code}
LOG.error("Container [" + container.getContainerId() + "] Re-init" +
    " failed !! Resource [" + failedEvent.getResource() + "] could" +
    " not be localized !!"); 
container.reInitContext = null;
{code}
- Looks like we don’t need the killedForReInitialization flag in 
ContainerLaunch, because container_killed event can already be distinguished 
based on whether container is at Reinit or running state.
- nit: fix the format of the new imports in ContainerManagerImpl
- the preUpgradeCheck method can also be reused by the localize API.

Tests looks good, only very minor things:
- could you add comments to prepareInitialContainer/prepareContainerUpgrade 
method about what’s it doing, that helps people to understand without reading 
through the code
- {{Wait for new processStartfile to be creases}} typo : to be created
bq. Do you think we should cleanup the old script file ? If the upgrade uses 
the same script file name, it will be overwritten right ? the token file is 
anyway overwritten right ?
I agree we don't need to clean up old files, if it can be overwritten.

> Core changes in NodeManager to support for upgrade and rollback of Containers
> -
>
> Key: YARN-5620
> URL: https://issues.apache.org/jira/browse/YARN-5620
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Arun Suresh
>Assignee: Arun Suresh
> Attachments: YARN-5620.001.patch, YARN-5620.002.patch, 
> YARN-5620.003.patch, YARN-5620.004.patch, YARN-5620.005.patch, 
> YARN-5620.006.patch, YARN-5620.007.patch, YARN-5620.008.patch
>
>
> JIRA proposes to modify the ContainerManager (and other core classes) to 
> support upgrade of a running container with a new {{ContainerLaunchContext}} 
> as well as the ability to rollback the upgrade if the container is not able 
> to restart using the new launch Context. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org

[jira] [Commented] (YARN-5561) [Atsv2] : Support for ability to retrieve apps/app-attempt/containers and entities via REST

2016-09-11 Thread Rohith Sharma K S (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-5561?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15483122#comment-15483122
 ] 

Rohith Sharma K S commented on YARN-5561:
-

Thanks [~sunilg] for quick response. I will update patch accordingly. 

Along with below API's additionally 2 more API require I think. Thoughts?
{noformat}
2. (cluster - )app - app_attempt - container sequence (6)

  - /clusters/{clusterid}/apps/{appid}/appattempts/{appattemptid}/containers
  - /clusters/{clusterid}/apps/{appid}/appattempts
  - /clusters/{clusterid}/apps/{appid}/
  - /apps/{appid}/appattempts/{appattemptid}/containers
  - /apps/{appid}/appattempts
  - /apps/{appid}/

New API's required. Thoughts?
  - /clusters/{clusterid}/apps/{appid}/appattempts/{appattemptid}
  - 
/clusters/{clusterid}/apps/{appid}/appattempts/{appattemptid}/containers/{contianer-id}
{noformat}


> [Atsv2] : Support for ability to retrieve apps/app-attempt/containers and 
> entities via REST
> ---
>
> Key: YARN-5561
> URL: https://issues.apache.org/jira/browse/YARN-5561
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: timelinereader
>Reporter: Rohith Sharma K S
>Assignee: Rohith Sharma K S
> Attachments: YARN-5561.patch, YARN-5561.v0.patch
>
>
> ATSv2 model lacks retrieval of {{list-of-all-apps}}, 
> {{list-of-all-app-attempts}} and {{list-of-all-containers-per-attempt}} via 
> REST API's. And also it is required to know about all the entities in an 
> applications.
> It is pretty much highly required these URLs for Web  UI.
> New REST URL would be 
> # GET {{/ws/v2/timeline/apps}}
> # GET {{/ws/v2/timeline/apps/\{app-id\}/appattempts}}.
> # GET 
> {{/ws/v2/timeline/apps/\{app-id\}/appattempts/\{attempt-id\}/containers}}
> # GET {{/ws/v2/timeline/apps/\{app id\}/entities}} should display list of 
> entities that can be queried.  



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org

[jira] [Commented] (YARN-3692) Allow REST API to set a user generated message when killing an application

2016-09-11 Thread Sunil G (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-3692?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15483103#comment-15483103
 ] 

Sunil G commented on YARN-3692:
---

I think for trunk, patch looks fine. Few minor nits.

1. KillApplicationRequestPBImpl.java
{{getDiagnostics}} might not need the below null check.
{code}
if (!p.hasDiagnostics()) {
  return null;
}
{code}

2. RMWebServices#updateAppState doesn't seem to have test coverage.Hence new 
change is not validated in this. I think a separate test jira could be added to 
handle the same.
3. One nit
May be in ResourceMgrDelegate#killApplication, diagnosis ==> diagnostics

As mentioned by you, more compatible approach is needed for branch-2.

> Allow REST API to set a user generated message when killing an application
> --
>
> Key: YARN-3692
> URL: https://issues.apache.org/jira/browse/YARN-3692
> Project: Hadoop YARN
>  Issue Type: Improvement
>Reporter: Rajat Jain
>Assignee: Rohith Sharma K S
> Attachments: 0001-YARN-3692.patch, 0002-YARN-3692.patch, 
> 0003-YARN-3692.patch, 0004-YARN-3692.patch
>
>
> Currently YARN's REST API supports killing an application without setting a 
> diagnostic message. It would be good to provide that support.
> *Use Case*
> Usually this helps in workflow management in a multi-tenant environment when 
> the workflow scheduler (or the hadoop admin) wants to kill a job - and let 
> the user know the reason why the job was killed. Killing the job by setting a 
> diagnostic message is a very good solution for that. Ideally, we can set the 
> diagnostic message on all such interface:
> yarn kill -applicationId ... -diagnosticMessage "some message added by 
> admin/workflow"
> REST API { 'state': 'KILLED', 'diagnosticMessage': 'some message added by 
> admin/workflow'}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org

[jira] [Comment Edited] (YARN-5561) [Atsv2] : Support for ability to retrieve apps/app-attempt/containers and entities via REST

2016-09-11 Thread Rohith Sharma K S (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-5561?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15482923#comment-15482923
 ] 

Rohith Sharma K S edited comment on YARN-5561 at 9/12/16 4:26 AM:
--

In *Li lu*'s REST endpoint list, there is one end point which is missed i.e 
listing all the apps for given flow name 
*/users/\{userid\}/flows/\{flowname\}/apps/*. This also use full API

bq. And based on use case of Rohith maybe list all apps within a cluster as 
well.
This is one of the objective of this JIRA when I reported. But there is another 
end point from which apps can be retrieved i.e at flow name, the client can get 
all the flows in the cluster and for each flow, client can retrieve the apps 
using above specified path. So, let me confirm with Sunil does new UI has such 
use case to get all the apps per cluster. If so, I will refresh the patch. 
cc:/ [~sunilg]


was (Author: rohithsharma):
In *Li lu*'s REST endpoint list, there is one end point which is missed i.e 
listing all the apps for given flow name 
*/users/{userid}/flows/{flowname}/apps/*. 

bq. And based on use case of Rohith maybe list all apps within a cluster as 
well.
This is one of the objective of this JIRA when I reported. But there is another 
end point from which apps can be retrieved i.e at flow name, the client can get 
all the flows in the cluster and for each flow, client can retrieve the apps 
using above specified path. So, let me confirm with Sunil does new UI has such 
use case to get all the apps per cluster. If so, I will refresh the patch. 
cc:/ [~sunilg]

> [Atsv2] : Support for ability to retrieve apps/app-attempt/containers and 
> entities via REST
> ---
>
> Key: YARN-5561
> URL: https://issues.apache.org/jira/browse/YARN-5561
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: timelinereader
>Reporter: Rohith Sharma K S
>Assignee: Rohith Sharma K S
> Attachments: YARN-5561.patch, YARN-5561.v0.patch
>
>
> ATSv2 model lacks retrieval of {{list-of-all-apps}}, 
> {{list-of-all-app-attempts}} and {{list-of-all-containers-per-attempt}} via 
> REST API's. And also it is required to know about all the entities in an 
> applications.
> It is pretty much highly required these URLs for Web  UI.
> New REST URL would be 
> # GET {{/ws/v2/timeline/apps}}
> # GET {{/ws/v2/timeline/apps/\{app-id\}/appattempts}}.
> # GET 
> {{/ws/v2/timeline/apps/\{app-id\}/appattempts/\{attempt-id\}/containers}}
> # GET {{/ws/v2/timeline/apps/\{app id\}/entities}} should display list of 
> entities that can be queried.  



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org

[jira] [Commented] (YARN-5561) [Atsv2] : Support for ability to retrieve apps/app-attempt/containers and entities via REST

2016-09-11 Thread Sunil G (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-5561?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15482969#comment-15482969
 ] 

Sunil G commented on YARN-5561:
---

Hi [~rohithsharma]

Yes. We are looking for a complete applications page where all applications 
which were running/completed had to be listed. For this purpose, I think we 
need the api as suggested by Rohith. Being said this, We will also be showing 
hierarchy from flows too. Once end use land on this applications page, various 
filters/views could be derived. Hence we could also cover or show details of 
flows.

> [Atsv2] : Support for ability to retrieve apps/app-attempt/containers and 
> entities via REST
> ---
>
> Key: YARN-5561
> URL: https://issues.apache.org/jira/browse/YARN-5561
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: timelinereader
>Reporter: Rohith Sharma K S
>Assignee: Rohith Sharma K S
> Attachments: YARN-5561.patch, YARN-5561.v0.patch
>
>
> ATSv2 model lacks retrieval of {{list-of-all-apps}}, 
> {{list-of-all-app-attempts}} and {{list-of-all-containers-per-attempt}} via 
> REST API's. And also it is required to know about all the entities in an 
> applications.
> It is pretty much highly required these URLs for Web  UI.
> New REST URL would be 
> # GET {{/ws/v2/timeline/apps}}
> # GET {{/ws/v2/timeline/apps/\{app-id\}/appattempts}}.
> # GET 
> {{/ws/v2/timeline/apps/\{app-id\}/appattempts/\{attempt-id\}/containers}}
> # GET {{/ws/v2/timeline/apps/\{app id\}/entities}} should display list of 
> entities that can be queried.  



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org

[jira] [Updated] (YARN-3692) Allow REST API to set a user generated message when killing an application

2016-09-11 Thread Rohith Sharma K S (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-3692?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rohith Sharma K S updated YARN-3692:

Attachment: 0004-YARN-3692.patch

Updated the patch fixing compilation errors. Request for review

> Allow REST API to set a user generated message when killing an application
> --
>
> Key: YARN-3692
> URL: https://issues.apache.org/jira/browse/YARN-3692
> Project: Hadoop YARN
>  Issue Type: Improvement
>Reporter: Rajat Jain
>Assignee: Rohith Sharma K S
> Attachments: 0001-YARN-3692.patch, 0002-YARN-3692.patch, 
> 0003-YARN-3692.patch, 0004-YARN-3692.patch
>
>
> Currently YARN's REST API supports killing an application without setting a 
> diagnostic message. It would be good to provide that support.
> *Use Case*
> Usually this helps in workflow management in a multi-tenant environment when 
> the workflow scheduler (or the hadoop admin) wants to kill a job - and let 
> the user know the reason why the job was killed. Killing the job by setting a 
> diagnostic message is a very good solution for that. Ideally, we can set the 
> diagnostic message on all such interface:
> yarn kill -applicationId ... -diagnosticMessage "some message added by 
> admin/workflow"
> REST API { 'state': 'KILLED', 'diagnosticMessage': 'some message added by 
> admin/workflow'}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org

[jira] [Commented] (YARN-5621) Support LinuxContainerExecutor to create symlinks for continuously localized resources

2016-09-11 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-5621?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15482929#comment-15482929
 ] 

Hadoop QA commented on YARN-5621:
-

| (/) *{color:green}+1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 23s 
{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s 
{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 
0s {color} | {color:green} The patch appears to include 2 new or modified test 
files. {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 8m 
39s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 34s 
{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 
23s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 37s 
{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
16s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 0m 
52s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 22s 
{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 
29s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 31s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} cc {color} | {color:green} 0m 31s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 31s 
{color} | {color:green} 
hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-nodemanager
 generated 0 new + 15 unchanged - 2 fixed = 15 total (was 17) {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 
22s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 33s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
13s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 
0s {color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 0m 
57s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 18s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 14m 32s 
{color} | {color:green} hadoop-yarn-server-nodemanager in the patch passed. 
{color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 
18s {color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 31m 11s {color} 
| {color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker |  Image:yetus/hadoop:9560f25 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12827955/YARN-5621.5.patch |
| JIRA Issue | YARN-5621 |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  
unit  findbugs  checkstyle  cc  |
| uname | Linux 63f480d6d115 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed 
Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh 
|
| git revision | trunk / cc01ed70 |
| Default Java | 1.8.0_101 |
| findbugs | v3.0.0 |
|  Test Results | 
https://builds.apache.org/job/PreCommit-YARN-Build/13077/testReport/ |
| modules | C: 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager
 U: 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager
 |
| Console output | 
https://builds.apache.org/job/PreCommit-YARN-Build/13077/console |
| Powered by | Apache Yetus 0.3.0   http://yetus.apache.org |


This message was automatically generated.



> Support LinuxContainerExecutor to create symlinks for continuously localized 
> resources
> --
>
> Key: YARN-5621
> URL: https://issues.apa

[jira] [Commented] (YARN-5561) [Atsv2] : Support for ability to retrieve apps/app-attempt/containers and entities via REST

2016-09-11 Thread Rohith Sharma K S (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-5561?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15482923#comment-15482923
 ] 

Rohith Sharma K S commented on YARN-5561:
-

In *Li lu*'s REST endpoint list, there is one end point which is missed i.e 
listing all the apps for given flow name 
*/users/{userid}/flows/{flowname}/apps/*. 

bq. And based on use case of Rohith maybe list all apps within a cluster as 
well.
This is one of the objective of this JIRA when I reported. But there is another 
end point from which apps can be retrieved i.e at flow name, the client can get 
all the flows in the cluster and for each flow, client can retrieve the apps 
using above specified path. So, let me confirm with Sunil does new UI has such 
use case to get all the apps per cluster. If so, I will refresh the patch. 
cc:/ [~sunilg]

> [Atsv2] : Support for ability to retrieve apps/app-attempt/containers and 
> entities via REST
> ---
>
> Key: YARN-5561
> URL: https://issues.apache.org/jira/browse/YARN-5561
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: timelinereader
>Reporter: Rohith Sharma K S
>Assignee: Rohith Sharma K S
> Attachments: YARN-5561.patch, YARN-5561.v0.patch
>
>
> ATSv2 model lacks retrieval of {{list-of-all-apps}}, 
> {{list-of-all-app-attempts}} and {{list-of-all-containers-per-attempt}} via 
> REST API's. And also it is required to know about all the entities in an 
> applications.
> It is pretty much highly required these URLs for Web  UI.
> New REST URL would be 
> # GET {{/ws/v2/timeline/apps}}
> # GET {{/ws/v2/timeline/apps/\{app-id\}/appattempts}}.
> # GET 
> {{/ws/v2/timeline/apps/\{app-id\}/appattempts/\{attempt-id\}/containers}}
> # GET {{/ws/v2/timeline/apps/\{app id\}/entities}} should display list of 
> entities that can be queried.  



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org

[jira] [Updated] (YARN-5621) Support LinuxContainerExecutor to create symlinks for continuously localized resources

2016-09-11 Thread Jian He (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-5621?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jian He updated YARN-5621:
--
Attachment: YARN-5621.5.patch

fixed some warnings

> Support LinuxContainerExecutor to create symlinks for continuously localized 
> resources
> --
>
> Key: YARN-5621
> URL: https://issues.apache.org/jira/browse/YARN-5621
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Jian He
>Assignee: Jian He
> Attachments: YARN-5621.1.patch, YARN-5621.2.patch, YARN-5621.3.patch, 
> YARN-5621.4.patch, YARN-5621.5.patch
>
>
> When new resources are localized, new symlink needs to be created for the 
> localized resource. This is the change for the LinuxContainerExecutor to 
> create the symlinks.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org

[jira] [Created] (YARN-5636) Support reserving resources on certain nodes for certain applications

2016-09-11 Thread Tao Jie (JIRA)

Tao Jie created YARN-5636:
-

 Summary: Support reserving resources on certain nodes for certain 
applications
 Key: YARN-5636
 URL: https://issues.apache.org/jira/browse/YARN-5636
 Project: Hadoop YARN
  Issue Type: Improvement
  Components: scheduler
Reporter: Tao Jie


We have met such circumstance:
We are trying to run storm&kafka on yarn by Slider, and Storm&Kafka writes data 
to local disk on node. If some containers or the application fails, we expect 
that those containers would restart on the same node as they run before, 
otherwise data written on local would lost.
For slider, it will trying to ensure restarted container on same nodes as 
before. However for yarn, resource may be assigned to other applications when 
former long-running application is down.
As a result we'd better to have a mechanism that reserve some resource for 
certain long-running applications on certain nodes for a period of time. Does 
it make sense?



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org

[jira] [Commented] (YARN-5545) App submit failure on queue with label when default queue partition capacity is zero

2016-09-11 Thread Naganarasimha G R (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-5545?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15482168#comment-15482168
 ] 

Naganarasimha G R commented on YARN-5545:
-

Thanks [~bibinchundatt], for the patch .
Few points to discuss on the approach
# Would it be good to have a separate queue partition based max application 
limit similar to {{yarn.scheduler.capacity..maximum-applications}} 
so that there is finer control on logical partitions similar to default 
partition ?
# Would it be better to set the default value of 
{{yarn.scheduler.capacity.maximum-applications.accessible-node-labels.}} 
to that of {{yarn.scheduler.capacity.maximum-applications}}, it will make the 
work of the admin much easier. similarly we can decide the same for the 
previous point if we plan to adopt it.
# IIUC you seem to adopt the approach little different than what you mention in 
your 
[comment|https://issues.apache.org/jira/browse/YARN-5545?focusedCommentId=15453163&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-15453163],
 though we are having per partition level max app limit, we just sum up max 
limits of all partitions under a queue and check against 
{{ApplicationLimit.getAllMaxApplication()}}. If we were to not actually 
validate against per Queue's PartitionLevelMaxApps then why need to come up 
with a new configuration? Also consider the cases when the accessibility is * 
and new partitions are added {{without refreshing}}, this configuration will be 
wrong as its static.
# Need to take care of documentation which i think is missed for 
*MaximumAMResourcePercentPerPartition* too, May be can be handled in a 
different jira 

> App submit failure on queue with label when default queue partition capacity 
> is zero
> 
>
> Key: YARN-5545
> URL: https://issues.apache.org/jira/browse/YARN-5545
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Bibin A Chundatt
>Assignee: Bibin A Chundatt
> Attachments: YARN-5545.0001.patch, YARN-5545.0002.patch, 
> YARN-5545.0003.patch, capacity-scheduler.xml
>
>
> Configure capacity scheduler 
> yarn.scheduler.capacity.root.default.capacity=0
> yarn.scheduler.capacity.root.queue1.accessible-node-labels.labelx.capacity=50
> yarn.scheduler.capacity.root.default.accessible-node-labels.labelx.capacity=50
> Submit application as below
> ./yarn jar 
> ../share/hadoop/mapreduce/hadoop-mapreduce-client-jobclient-3.0.0-alpha2-SNAPSHOT-tests.jar
>  sleep -Dmapreduce.job.node-label-expression=labelx 
> -Dmapreduce.job.queuename=default -m 1 -r 1 -mt 1000 -rt 1
> {noformat}
> 2016-08-21 18:21:31,375 INFO mapreduce.JobSubmitter: Cleaning up the staging 
> area /tmp/hadoop-yarn/staging/root/.staging/job_1471670113386_0001
> java.io.IOException: org.apache.hadoop.yarn.exceptions.YarnException: Failed 
> to submit application_1471670113386_0001 to YARN : 
> org.apache.hadoop.security.AccessControlException: Queue root.default already 
> has 0 applications, cannot accept submission of application: 
> application_1471670113386_0001
>   at org.apache.hadoop.mapred.YARNRunner.submitJob(YARNRunner.java:316)
>   at 
> org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:255)
>   at org.apache.hadoop.mapreduce.Job$11.run(Job.java:1344)
>   at org.apache.hadoop.mapreduce.Job$11.run(Job.java:1341)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:422)
>   at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1790)
>   at org.apache.hadoop.mapreduce.Job.submit(Job.java:1341)
>   at org.apache.hadoop.mapreduce.Job.waitForCompletion(Job.java:1362)
>   at org.apache.hadoop.mapreduce.SleepJob.run(SleepJob.java:273)
>   at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:76)
>   at org.apache.hadoop.mapreduce.SleepJob.main(SleepJob.java:194)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:497)
>   at 
> org.apache.hadoop.util.ProgramDriver$ProgramDescription.invoke(ProgramDriver.java:71)
>   at org.apache.hadoop.util.ProgramDriver.run(ProgramDriver.java:144)
>   at 
> org.apache.hadoop.test.MapredTestDriver.run(MapredTestDriver.java:136)
>   at 
> org.apache.hadoop.test.MapredTestDriver.main(MapredTestDriver.java:144)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:

[jira] [Commented] (YARN-4232) TopCLI console support for HA mode

2016-09-11 Thread Naganarasimha G R (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-4232?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15481988#comment-15481988
 ] 

Naganarasimha G R commented on YARN-4232:
-

Thanks for the patch and also additionally supporting for fetching "uptime" 
from active RM. 
Overall the patch LGTM. If no other comments will commit it tomorrow. 
[~vvasudev], wanted to take a final look at the patch ?
 

> TopCLI console support for HA mode
> --
>
> Key: YARN-4232
> URL: https://issues.apache.org/jira/browse/YARN-4232
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Bibin A Chundatt
>Assignee: Bibin A Chundatt
>Priority: Minor
> Attachments: 0001-YARN-4232.patch, 0002-YARN-4232.patch, 
> YARN-4232.003.patch, YARN-4232.004.patch, YARN-4232.005.patch
>
>
> *Steps to reproduce*
> Start Top command in YARN in HA mode
> ./yarn top
> {noformat}
> usage: yarn top
>  -cols  Number of columns on the terminal
>  -delay The refresh delay(in seconds), default is 3 seconds
>  -help   Print usage; for help while the tool is running press 'h'
>  + Enter
>  -queuesComma separated list of queues to restrict applications
>  -rows  Number of rows on the terminal
>  -types Comma separated list of types to restrict applications,
>  case sensitive(though the display is lower case)
>  -users Comma separated list of users to restrict applications
> {noformat}
> Execute *for help while the tool is running press 'h'  + Enter* while top 
> tool is running
> Exception is thrown in console continuously
> {noformat}
> 15/10/07 14:59:28 ERROR cli.TopCLI: Could not fetch RM start time
> java.net.ConnectException: Connection refused
> at java.net.PlainSocketImpl.socketConnect(Native Method)
> at 
> java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:345)
> at 
> java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:204)
> at 
> java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:188)
> at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392)
> at java.net.Socket.connect(Socket.java:589)
> at java.net.Socket.connect(Socket.java:538)
> at sun.net.NetworkClient.doConnect(NetworkClient.java:180)
> at sun.net.www.http.HttpClient.openServer(HttpClient.java:432)
> at sun.net.www.http.HttpClient.openServer(HttpClient.java:527)
> at sun.net.www.http.HttpClient.(HttpClient.java:211)
> at sun.net.www.http.HttpClient.New(HttpClient.java:308)
> at sun.net.www.http.HttpClient.New(HttpClient.java:326)
> at 
> sun.net.www.protocol.http.HttpURLConnection.getNewHttpClient(HttpURLConnection.java:1168)
> at 
> sun.net.www.protocol.http.HttpURLConnection.plainConnect0(HttpURLConnection.java:1104)
> at 
> sun.net.www.protocol.http.HttpURLConnection.plainConnect(HttpURLConnection.java:998)
> at 
> sun.net.www.protocol.http.HttpURLConnection.connect(HttpURLConnection.java:932)
> at 
> org.apache.hadoop.yarn.client.cli.TopCLI.getRMStartTime(TopCLI.java:742)
> at org.apache.hadoop.yarn.client.cli.TopCLI.run(TopCLI.java:467)
> at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
> at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:84)
> at org.apache.hadoop.yarn.client.cli.TopCLI.main(TopCLI.java:420)
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org

[jira] [Comment Edited] (YARN-4232) TopCLI console support for HA mode

2016-09-11 Thread Naganarasimha G R (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-4232?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15481988#comment-15481988
 ] 

Naganarasimha G R edited comment on YARN-4232 at 9/11/16 4:17 PM:
--

Thanks for the patch and also additionally supporting for fetching "uptime" 
from active RM. 
Overall the patch LGTM. If no other comments will commit it tomorrow. 
[~vvasudev], want to take a final look at the patch ?
 


was (Author: naganarasimha):
Thanks for the patch and also additionally supporting for fetching "uptime" 
from active RM. 
Overall the patch LGTM. If no other comments will commit it tomorrow. 
[~vvasudev], wanted to take a final look at the patch ?
 

> TopCLI console support for HA mode
> --
>
> Key: YARN-4232
> URL: https://issues.apache.org/jira/browse/YARN-4232
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Bibin A Chundatt
>Assignee: Bibin A Chundatt
>Priority: Minor
> Attachments: 0001-YARN-4232.patch, 0002-YARN-4232.patch, 
> YARN-4232.003.patch, YARN-4232.004.patch, YARN-4232.005.patch
>
>
> *Steps to reproduce*
> Start Top command in YARN in HA mode
> ./yarn top
> {noformat}
> usage: yarn top
>  -cols  Number of columns on the terminal
>  -delay The refresh delay(in seconds), default is 3 seconds
>  -help   Print usage; for help while the tool is running press 'h'
>  + Enter
>  -queuesComma separated list of queues to restrict applications
>  -rows  Number of rows on the terminal
>  -types Comma separated list of types to restrict applications,
>  case sensitive(though the display is lower case)
>  -users Comma separated list of users to restrict applications
> {noformat}
> Execute *for help while the tool is running press 'h'  + Enter* while top 
> tool is running
> Exception is thrown in console continuously
> {noformat}
> 15/10/07 14:59:28 ERROR cli.TopCLI: Could not fetch RM start time
> java.net.ConnectException: Connection refused
> at java.net.PlainSocketImpl.socketConnect(Native Method)
> at 
> java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:345)
> at 
> java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:204)
> at 
> java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:188)
> at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392)
> at java.net.Socket.connect(Socket.java:589)
> at java.net.Socket.connect(Socket.java:538)
> at sun.net.NetworkClient.doConnect(NetworkClient.java:180)
> at sun.net.www.http.HttpClient.openServer(HttpClient.java:432)
> at sun.net.www.http.HttpClient.openServer(HttpClient.java:527)
> at sun.net.www.http.HttpClient.(HttpClient.java:211)
> at sun.net.www.http.HttpClient.New(HttpClient.java:308)
> at sun.net.www.http.HttpClient.New(HttpClient.java:326)
> at 
> sun.net.www.protocol.http.HttpURLConnection.getNewHttpClient(HttpURLConnection.java:1168)
> at 
> sun.net.www.protocol.http.HttpURLConnection.plainConnect0(HttpURLConnection.java:1104)
> at 
> sun.net.www.protocol.http.HttpURLConnection.plainConnect(HttpURLConnection.java:998)
> at 
> sun.net.www.protocol.http.HttpURLConnection.connect(HttpURLConnection.java:932)
> at 
> org.apache.hadoop.yarn.client.cli.TopCLI.getRMStartTime(TopCLI.java:742)
> at org.apache.hadoop.yarn.client.cli.TopCLI.run(TopCLI.java:467)
> at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
> at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:84)
> at org.apache.hadoop.yarn.client.cli.TopCLI.main(TopCLI.java:420)
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org

[jira] [Commented] (YARN-5567) Fix script exit code checking in NodeHealthScriptRunner#reportHealthStatus

2016-09-11 Thread Naganarasimha G R (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-5567?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15481979#comment-15481979
 ] 

Naganarasimha G R commented on YARN-5567:
-

[~aw], I understand that with typo in the health check script can bring down 
the whole cluster hence we need to revert this, but at the same time with 
erroneous script there could be possibility that the script missed to detect 
some health check failures on the node ?
Should we think of some other state which could warn the admin about this(which 
is captured in webui/Rest) ?

> Fix script exit code checking in NodeHealthScriptRunner#reportHealthStatus
> --
>
> Key: YARN-5567
> URL: https://issues.apache.org/jira/browse/YARN-5567
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: nodemanager
>Affects Versions: 2.8.0, 3.0.0-alpha1
>Reporter: Yufei Gu
>Assignee: Yufei Gu
> Fix For: 3.0.0-alpha1
>
> Attachments: YARN-5567.001.patch
>
>
> In case of FAILED_WITH_EXIT_CODE, health status should be false.
> {code}
>   case FAILED_WITH_EXIT_CODE:
> setHealthStatus(true, "", now);
> break;
> {code}
> should be 
> {code}
>   case FAILED_WITH_EXIT_CODE:
> setHealthStatus(false, "", now);
> break;
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org

[jira] [Updated] (YARN-5148) [YARN-3368] Add page to new YARN UI to view server side configurations/logs/JVM-metrics

2016-09-11 Thread Kai Sasaki (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-5148?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kai Sasaki updated YARN-5148:
-
Attachment: Screen Shot 2016-09-11 at 23.28.31.png

> [YARN-3368] Add page to new YARN UI to view server side 
> configurations/logs/JVM-metrics
> ---
>
> Key: YARN-5148
> URL: https://issues.apache.org/jira/browse/YARN-5148
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Wangda Tan
>Assignee: Kai Sasaki
> Attachments: Screen Shot 2016-09-11 at 23.28.31.png, 
> YARN-5148-YARN-3368.01.patch, YARN-5148-YARN-3368.02.patch, yarn-conf.png, 
> yarn-tools.png
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org

[jira] [Updated] (YARN-5148) [YARN-3368] Add page to new YARN UI to view server side configurations/logs/JVM-metrics

2016-09-11 Thread Kai Sasaki (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-5148?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kai Sasaki updated YARN-5148:
-
Attachment: YARN-5148-YARN-3368.02.patch

> [YARN-3368] Add page to new YARN UI to view server side 
> configurations/logs/JVM-metrics
> ---
>
> Key: YARN-5148
> URL: https://issues.apache.org/jira/browse/YARN-5148
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Wangda Tan
>Assignee: Kai Sasaki
> Attachments: Screen Shot 2016-09-11 at 23.28.31.png, 
> YARN-5148-YARN-3368.01.patch, YARN-5148-YARN-3368.02.patch, yarn-conf.png, 
> yarn-tools.png
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org

[jira] [Commented] (YARN-5567) Fix script exit code checking in NodeHealthScriptRunner#reportHealthStatus

2016-09-11 Thread Allen Wittenauer (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-5567?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15481755#comment-15481755
 ] 

Allen Wittenauer commented on YARN-5567:


I'm going to mark this as fixed so that the release notes for alpha1 reflect 
that this change is present in it.  I've open and closed YARN-5635 so that 
alpha2's release notes reflect this change being reverted.

> Fix script exit code checking in NodeHealthScriptRunner#reportHealthStatus
> --
>
> Key: YARN-5567
> URL: https://issues.apache.org/jira/browse/YARN-5567
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: nodemanager
>Affects Versions: 2.8.0, 3.0.0-alpha1
>Reporter: Yufei Gu
>Assignee: Yufei Gu
> Fix For: 3.0.0-alpha1
>
> Attachments: YARN-5567.001.patch
>
>
> In case of FAILED_WITH_EXIT_CODE, health status should be false.
> {code}
>   case FAILED_WITH_EXIT_CODE:
> setHealthStatus(true, "", now);
> break;
> {code}
> should be 
> {code}
>   case FAILED_WITH_EXIT_CODE:
> setHealthStatus(false, "", now);
> break;
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org

[jira] [Comment Edited] (YARN-5567) Fix script exit code checking in NodeHealthScriptRunner#reportHealthStatus

2016-09-11 Thread Allen Wittenauer (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-5567?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15481755#comment-15481755
 ] 

Allen Wittenauer edited comment on YARN-5567 at 9/11/16 1:25 PM:
-

I'm going to mark this as fixed so that the release notes for alpha1 reflect 
that this change is present in it.  I've opened and closed YARN-5635 so that 
alpha2's release notes reflect this change being reverted.


was (Author: aw):
I'm going to mark this as fixed so that the release notes for alpha1 reflect 
that this change is present in it.  I've open and closed YARN-5635 so that 
alpha2's release notes reflect this change being reverted.

> Fix script exit code checking in NodeHealthScriptRunner#reportHealthStatus
> --
>
> Key: YARN-5567
> URL: https://issues.apache.org/jira/browse/YARN-5567
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: nodemanager
>Affects Versions: 2.8.0, 3.0.0-alpha1
>Reporter: Yufei Gu
>Assignee: Yufei Gu
> Fix For: 3.0.0-alpha1
>
> Attachments: YARN-5567.001.patch
>
>
> In case of FAILED_WITH_EXIT_CODE, health status should be false.
> {code}
>   case FAILED_WITH_EXIT_CODE:
> setHealthStatus(true, "", now);
> break;
> {code}
> should be 
> {code}
>   case FAILED_WITH_EXIT_CODE:
> setHealthStatus(false, "", now);
> break;
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org

[jira] [Resolved] (YARN-5567) Fix script exit code checking in NodeHealthScriptRunner#reportHealthStatus

2016-09-11 Thread Allen Wittenauer (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-5567?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Allen Wittenauer resolved YARN-5567.

Resolution: Fixed

> Fix script exit code checking in NodeHealthScriptRunner#reportHealthStatus
> --
>
> Key: YARN-5567
> URL: https://issues.apache.org/jira/browse/YARN-5567
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: nodemanager
>Affects Versions: 2.8.0, 3.0.0-alpha1
>Reporter: Yufei Gu
>Assignee: Yufei Gu
> Fix For: 3.0.0-alpha1
>
> Attachments: YARN-5567.001.patch
>
>
> In case of FAILED_WITH_EXIT_CODE, health status should be false.
> {code}
>   case FAILED_WITH_EXIT_CODE:
> setHealthStatus(true, "", now);
> break;
> {code}
> should be 
> {code}
>   case FAILED_WITH_EXIT_CODE:
> setHealthStatus(false, "", now);
> break;
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org

[jira] [Resolved] (YARN-5635) Revert YARN-5567

2016-09-11 Thread Allen Wittenauer (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-5635?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Allen Wittenauer resolved YARN-5635.

   Resolution: Fixed
Fix Version/s: 3.0.0-alpha2

> Revert YARN-5567
> 
>
> Key: YARN-5635
> URL: https://issues.apache.org/jira/browse/YARN-5635
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Allen Wittenauer
>Assignee: Allen Wittenauer
> Fix For: 3.0.0-alpha2
>
>
> YARN-5567 needs to be reverted.  The exit code is intentionally ignored since 
> it will cause mass destruction of Apache Hadoop clusters in the situation 
> there is a typo in the health script.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org

[jira] [Commented] (YARN-5635) Revert YARN-5567

2016-09-11 Thread Allen Wittenauer (JIRA)


[ 
https://issues.apache.org/jira/browse/YARN-5635?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15481753#comment-15481753
 ] 

Allen Wittenauer commented on YARN-5635:


YARN-5567 has already been reverted.

> Revert YARN-5567
> 
>
> Key: YARN-5635
> URL: https://issues.apache.org/jira/browse/YARN-5635
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Allen Wittenauer
>Assignee: Allen Wittenauer
> Fix For: 3.0.0-alpha2
>
>
> YARN-5567 needs to be reverted.  The exit code is intentionally ignored since 
> it will cause mass destruction of Apache Hadoop clusters in the situation 
> there is a typo in the health script.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org

[jira] [Created] (YARN-5635) Revert YARN-5567

2016-09-11 Thread Allen Wittenauer (JIRA)

Allen Wittenauer created YARN-5635:
--

 Summary: Revert YARN-5567
 Key: YARN-5635
 URL: https://issues.apache.org/jira/browse/YARN-5635
 Project: Hadoop YARN
  Issue Type: Bug
Reporter: Allen Wittenauer
Assignee: Allen Wittenauer


YARN-5567 needs to be reverted.  The exit code is intentionally ignored since 
it will cause mass destruction of Apache Hadoop clusters in the situation there 
is a typo in the health script.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org

[jira] [Commented] (YARN-5620) Core changes in NodeManager to support for upgrade and rollback of Containers

[jira] [Commented] (YARN-5631) Missing refreshClusterMaxPriority usage in rmadmin help message

[jira] [Commented] (YARN-4855) Should check if node exists when replace nodelabels

[jira] [Updated] (YARN-5587) Add support for resource profiles

[jira] [Commented] (YARN-5561) [Atsv2] : Support for ability to retrieve apps/app-attempt/containers and entities via REST

[jira] [Commented] (YARN-5545) App submit failure on queue with label when default queue partition capacity is zero

[jira] [Commented] (YARN-5148) [YARN-3368] Add page to new YARN UI to view server side configurations/logs/JVM-metrics

[jira] [Commented] (YARN-5586) Update the Resources class to consider all resource types

[jira] [Commented] (YARN-5620) Core changes in NodeManager to support for upgrade and rollback of Containers

[jira] [Commented] (YARN-5545) App submit failure on queue with label when default queue partition capacity is zero

[jira] [Commented] (YARN-5620) Core changes in NodeManager to support for upgrade and rollback of Containers

[jira] [Commented] (YARN-5561) [Atsv2] : Support for ability to retrieve apps/app-attempt/containers and entities via REST

[jira] [Commented] (YARN-3692) Allow REST API to set a user generated message when killing an application

[jira] [Comment Edited] (YARN-5561) [Atsv2] : Support for ability to retrieve apps/app-attempt/containers and entities via REST

[jira] [Commented] (YARN-5561) [Atsv2] : Support for ability to retrieve apps/app-attempt/containers and entities via REST

[jira] [Updated] (YARN-3692) Allow REST API to set a user generated message when killing an application

[jira] [Commented] (YARN-5621) Support LinuxContainerExecutor to create symlinks for continuously localized resources

[jira] [Commented] (YARN-5561) [Atsv2] : Support for ability to retrieve apps/app-attempt/containers and entities via REST

[jira] [Updated] (YARN-5621) Support LinuxContainerExecutor to create symlinks for continuously localized resources

[jira] [Created] (YARN-5636) Support reserving resources on certain nodes for certain applications

[jira] [Commented] (YARN-5545) App submit failure on queue with label when default queue partition capacity is zero

[jira] [Commented] (YARN-4232) TopCLI console support for HA mode

[jira] [Comment Edited] (YARN-4232) TopCLI console support for HA mode

[jira] [Commented] (YARN-5567) Fix script exit code checking in NodeHealthScriptRunner#reportHealthStatus

[jira] [Updated] (YARN-5148) [YARN-3368] Add page to new YARN UI to view server side configurations/logs/JVM-metrics

[jira] [Updated] (YARN-5148) [YARN-3368] Add page to new YARN UI to view server side configurations/logs/JVM-metrics

[jira] [Commented] (YARN-5567) Fix script exit code checking in NodeHealthScriptRunner#reportHealthStatus

[jira] [Comment Edited] (YARN-5567) Fix script exit code checking in NodeHealthScriptRunner#reportHealthStatus

[jira] [Resolved] (YARN-5567) Fix script exit code checking in NodeHealthScriptRunner#reportHealthStatus

[jira] [Resolved] (YARN-5635) Revert YARN-5567

[jira] [Commented] (YARN-5635) Revert YARN-5567

[jira] [Created] (YARN-5635) Revert YARN-5567

32 matches

Site Navigation

Mail list logo

Footer information