[jira] [Created] (YARN-4702) FairScheduler: Allow setting maxResources for ad hoc queues

2016-02-17 Thread Karthik Kambatla (JIRA)
Karthik Kambatla created YARN-4702:
--

 Summary: FairScheduler: Allow setting maxResources for ad hoc 
queues
 Key: YARN-4702
 URL: https://issues.apache.org/jira/browse/YARN-4702
 Project: Hadoop YARN
  Issue Type: Improvement
  Components: fairscheduler
Affects Versions: 2.8.0
Reporter: Karthik Kambatla
Assignee: Karthik Kambatla


FairScheduler allows creating queues on the fly. Unlike other queues, one 
cannot set maxResources for these queues since they don't show up in the xml 
config. Adding a default maxResources for ad hoc pools would help. 
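A minimal sketch of the fallback idea, with illustrative names only (AdHocQueueLimits and the ad hoc default are hypothetical, not existing FairScheduler classes or settings): look up the configured limit first and fall back to a default for queues created on the fly.

{code}
import java.util.Map;

import org.apache.hadoop.yarn.api.records.Resource;

// Hypothetical helper, not the actual FairScheduler code.
class AdHocQueueLimits {
  private final Map<String, Resource> configuredMaxResources; // parsed from fair-scheduler.xml
  private final Resource defaultAdHocMaxResources;            // proposed default for ad hoc queues

  AdHocQueueLimits(Map<String, Resource> configured, Resource adHocDefault) {
    this.configuredMaxResources = configured;
    this.defaultAdHocMaxResources = adHocDefault;
  }

  // Configured queues keep their xml limit; ad hoc queues get the default.
  Resource maxResourcesFor(String queueName) {
    Resource configured = configuredMaxResources.get(queueName);
    return configured != null ? configured : defaultAdHocMaxResources;
  }
}
{code}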



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-4690) Skip object allocation in FSAppAttempt#getResourceUsage when possible

2016-02-17 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4690?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15151747#comment-15151747
 ] 

Hudson commented on YARN-4690:
--

FAILURE: Integrated in Hadoop-trunk-Commit #9319 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/9319/])
YARN-4690. Skip object allocation in FSAppAttempt#getResourceUsage when (sjlee: 
rev 7de70680fe44967e2afc92ba4c92f8e7afa7b151)
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/FSAppAttempt.java
* hadoop-yarn-project/CHANGES.txt


> Skip object allocation in FSAppAttempt#getResourceUsage when possible
> -
>
> Key: YARN-4690
> URL: https://issues.apache.org/jira/browse/YARN-4690
> Project: Hadoop YARN
>  Issue Type: Improvement
>Reporter: Ming Ma
>Assignee: Ming Ma
> Fix For: 2.8.0, 2.7.3, 2.9.0, 2.6.5
>
> Attachments: YARN-4690.patch
>
>
> YARN-2768 addresses an important bottleneck. Here is another similar instance 
> where object allocation in Resources#subtract will slow down the fair 
> scheduler's event processing thread.
> {noformat}
> org.apache.hadoop.yarn.factories.impl.pb.RecordFactoryPBImpl.newRecordInstance(RecordFactoryPBImpl.java)
> org.apache.hadoop.yarn.util.Records.newRecord(Records.java)
> 
> org.apache.hadoop.yarn.util.resource.Resources.createResource(Resources.java)
> org.apache.hadoop.yarn.util.resource.Resources.clone(Resources.java)
> org.apache.hadoop.yarn.util.resource.Resources.subtract(Resources.java)
> 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FSAppAttempt.getResourceUsage(FSAppAttempt.java)
> 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FSLeafQueue.getResourceUsage(FSLeafQueue.java)
> 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.policies.FairSharePolicy$FairShareComparator.compare(FairSharePolicy.java)
> 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.policies.FairSharePolicy$FairShareComparator.compare(FairSharePolicy.java)
> java.util.TimSort.binarySort(TimSort.java)
> java.util.TimSort.sort(TimSort.java)
> java.util.TimSort.sort(TimSort.java)
> java.util.Arrays.sort(Arrays.java)
> java.util.Collections.sort(Collections.java)
> 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FSParentQueue.assignContainer(FSParentQueue.java)
> 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler.attemptScheduling(FairScheduler.java)
> 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler.nodeUpdate(FairScheduler.java)
> 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler.handle(FairScheduler.java)
> 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler.handle(FairScheduler.java)
> 
> org.apache.hadoop.yarn.sls.scheduler.ResourceSchedulerWrapper.handle(ResourceSchedulerWrapper.java)
> 
> org.apache.hadoop.yarn.sls.scheduler.ResourceSchedulerWrapper.handle(ResourceSchedulerWrapper.java)
> {noformat}
> One way to fix it is to return {{getCurrentConsumption()}} if there is no 
> preemption, which is the normal case. This means the {{getResourceUsage}} method 
> will return a reference to {{FSAppAttempt}}'s internal resource object. But 
> that should be OK, as {{getResourceUsage}} doesn't expect the caller to modify 
> the object.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-4690) Skip object allocation in FSAppAttempt#getResourceUsage when possible

2016-02-17 Thread Sangjin Lee (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4690?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15151727#comment-15151727
 ] 

Sangjin Lee commented on YARN-4690:
---

Sounds good. Hadn't checked what {{getPreemptedResources()}} does.

I'm also +1. I'll commit it shortly.

> Skip object allocation in FSAppAttempt#getResourceUsage when possible
> -
>
> Key: YARN-4690
> URL: https://issues.apache.org/jira/browse/YARN-4690
> Project: Hadoop YARN
>  Issue Type: Improvement
>Reporter: Ming Ma
>Assignee: Ming Ma
> Attachments: YARN-4690.patch
>
>
> YARN-2768 addresses an important bottleneck. Here is another similar instance 
> where object allocation in Resources#subtract will slow down the fair 
> scheduler's event processing thread.
> {noformat}
> org.apache.hadoop.yarn.factories.impl.pb.RecordFactoryPBImpl.newRecordInstance(RecordFactoryPBImpl.java)
> org.apache.hadoop.yarn.util.Records.newRecord(Records.java)
> 
> org.apache.hadoop.yarn.util.resource.Resources.createResource(Resources.java)
> org.apache.hadoop.yarn.util.resource.Resources.clone(Resources.java)
> org.apache.hadoop.yarn.util.resource.Resources.subtract(Resources.java)
> 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FSAppAttempt.getResourceUsage(FSAppAttempt.java)
> 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FSLeafQueue.getResourceUsage(FSLeafQueue.java)
> 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.policies.FairSharePolicy$FairShareComparator.compare(FairSharePolicy.java)
> 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.policies.FairSharePolicy$FairShareComparator.compare(FairSharePolicy.java)
> java.util.TimSort.binarySort(TimSort.java)
> java.util.TimSort.sort(TimSort.java)
> java.util.TimSort.sort(TimSort.java)
> java.util.Arrays.sort(Arrays.java)
> java.util.Collections.sort(Collections.java)
> 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FSParentQueue.assignContainer(FSParentQueue.java)
> 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler.attemptScheduling(FairScheduler.java)
> 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler.nodeUpdate(FairScheduler.java)
> 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler.handle(FairScheduler.java)
> 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler.handle(FairScheduler.java)
> 
> org.apache.hadoop.yarn.sls.scheduler.ResourceSchedulerWrapper.handle(ResourceSchedulerWrapper.java)
> 
> org.apache.hadoop.yarn.sls.scheduler.ResourceSchedulerWrapper.handle(ResourceSchedulerWrapper.java)
> {noformat}
> One way to fix it is to return {{getCurrentConsumption()}} if there is no 
> preemption, which is the normal case. This means the {{getResourceUsage}} method 
> will return a reference to {{FSAppAttempt}}'s internal resource object. But 
> that should be OK, as {{getResourceUsage}} doesn't expect the caller to modify 
> the object.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-4569) Remove incorrect part of maxResources in FairScheduler documentation

2016-02-17 Thread Ray Chiang (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4569?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15151679#comment-15151679
 ] 

Ray Chiang commented on YARN-4569:
--

Thanks for the commit!

> Remove incorrect part of maxResources in FairScheduler documentation
> 
>
> Key: YARN-4569
> URL: https://issues.apache.org/jira/browse/YARN-4569
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: documentation
>Reporter: Ray Chiang
>Assignee: Ray Chiang
>  Labels: supportability
> Fix For: 2.9.0
>
> Attachments: YARN-4569.001.patch
>
>
> The maxResources property states:
> {panel}
> For the single-resource fairness policy, the vcores value is ignored.
> {panel}
> This is not correct and should be removed.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-4569) Remove incorrect part of maxResources in FairScheduler documentation

2016-02-17 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4569?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15151645#comment-15151645
 ] 

Hudson commented on YARN-4569:
--

FAILURE: Integrated in Hadoop-trunk-Commit #9318 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/9318/])
YARN-4569. Remove incorrect part of maxResources in FairScheduler (kasha: rev 
a0c95b5fc4c90ee3383619156619a66dfba889f7)
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site/src/site/markdown/FairScheduler.md
* hadoop-yarn-project/CHANGES.txt


> Remove incorrect part of maxResources in FairScheduler documentation
> 
>
> Key: YARN-4569
> URL: https://issues.apache.org/jira/browse/YARN-4569
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: documentation
>Reporter: Ray Chiang
>Assignee: Ray Chiang
>  Labels: supportability
> Attachments: YARN-4569.001.patch
>
>
> The maxResources property states:
> {panel}
> For the single-resource fairness policy, the vcores value is ignored.
> {panel}
> This is not correct and should be removed.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-4569) Remove incorrect part of maxResources in FairScheduler documentation

2016-02-17 Thread Karthik Kambatla (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4569?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15151617#comment-15151617
 ] 

Karthik Kambatla commented on YARN-4569:


+1. Checking this in. 

> Remove incorrect part of maxResources in FairScheduler documentation
> 
>
> Key: YARN-4569
> URL: https://issues.apache.org/jira/browse/YARN-4569
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: documentation
>Reporter: Ray Chiang
>Assignee: Ray Chiang
>  Labels: supportability
> Attachments: YARN-4569.001.patch
>
>
> The maxResources property states:
> {panel}
> For the single-resource fairness policy, the vcores value is ignored.
> {panel}
> This is not correct and should be removed.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-4689) FairScheduler: Cleanup preemptContainer to be more readable

2016-02-17 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4689?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15151601#comment-15151601
 ] 

Hudson commented on YARN-4689:
--

FAILURE: Integrated in Hadoop-trunk-Commit #9317 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/9317/])
YARN-4689. FairScheduler: Cleanup preemptContainer to be more readable. (kasha: 
rev 2ab4c476ed22d3ccf15b215710b67e32bbc7e5f0)
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/FSParentQueue.java
* hadoop-yarn-project/CHANGES.txt
YARN-4689. Fix up CHANGES.txt (kasha: rev 
1c248ea4a8442bccc78a8ce4539ce1f4b6d6b0ee)
* hadoop-yarn-project/CHANGES.txt


> FairScheduler: Cleanup preemptContainer to be more readable
> ---
>
> Key: YARN-4689
> URL: https://issues.apache.org/jira/browse/YARN-4689
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: fairscheduler
>Reporter: Karthik Kambatla
>Assignee: Kai Sasaki
>Priority: Trivial
>  Labels: newbie
> Attachments: YARN-4689.01.patch
>
>
> In FairScheduler#preemptContainer, we check if a queue is preemptable. The 
> code there could be cleaner if we don't use continue, but just the plain old 
> if-else. 
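Purely illustrative of the cleanup being asked for (a simplified loop with hypothetical Queue methods, not the actual FSParentQueue code): the same check written with {{continue}} and with a plain if.

{code}
import java.util.List;

final class PreemptLoopSketch {
  interface Queue {                 // hypothetical stand-in for the real queue type
    boolean isPreemptable();
    void preemptFrom();
  }

  // Current style, using continue:
  static void withContinue(List<Queue> queues) {
    for (Queue q : queues) {
      if (!q.isPreemptable()) {
        continue;
      }
      q.preemptFrom();
    }
  }

  // Suggested style, plain if (no continue needed):
  static void withIfElse(List<Queue> queues) {
    for (Queue q : queues) {
      if (q.isPreemptable()) {
        q.preemptFrom();
      }
    }
  }
}
{code}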



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Comment Edited] (YARN-4333) Fair scheduler should support preemption within queue

2016-02-17 Thread Karthik Kambatla (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4333?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15151595#comment-15151595
 ] 

Karthik Kambatla edited comment on YARN-4333 at 2/18/16 2:14 AM:
-

Thanks a lot for filing this JIRA and working on it. The preemption logic is 
broken in different places. We have other JIRAs where preemption between 
sibling queues doesn't work well. 

I quickly skimmed through the patch here. A few high-level comments on the patch 
and preemption in general:
# Today, all preemption is top-down. When we encounter a {{Schedulable}} 
(app/queue) that is starved, we should trigger preemption from the closest 
parent that is not starved. Preempting resources from another application under 
the same leaf-queue that is over its fairshare is automatically taken care of.
# The preemptionTimeout and preemptionThreshold are defined for queues and not 
apps. We should decide what values we want to use for apps. Maybe our best 
bet is to use the ones the corresponding leaf-queue uses? 

[~ashwinshankar77], [~sandyr], [~asuresh] - would like to hear your thoughts as 
well. 


was (Author: kasha):
Thanks a lot for filing this JIRA and working on it. The preemption logic is 
broken in different places. We have other JIRAs where preemption between 
sibling queues doesn't work well. 

I quickly skimmed through the patch here. A few high-level comments on the patch 
and preemption in general:
# Today, all preemption is top-down. When we encounter a {{Schedulable}} 
(app/queue) that is starved, we should trigger preemption from the closest 
parent that is not starved. Preempting resources from another application under 
the same leaf-queue that is over its fairshare is automatically taken care of.
# The preemptionTimeout and preemptionThreshold are defined for queues and not 
apps. We should decide what values we want to use for apps. Maybe our best 
bet is to use the ones the corresponding leaf-queue uses? 

[~ashwinshankar77], [~sandyr] - would like to hear your thoughts as well. 

> Fair scheduler should support preemption within queue
> -
>
> Key: YARN-4333
> URL: https://issues.apache.org/jira/browse/YARN-4333
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: fairscheduler
>Affects Versions: 2.6.0
>Reporter: Tao Jie
>Assignee: Tao Jie
> Attachments: YARN-4333.001.patch, YARN-4333.002.patch, 
> YARN-4333.003.patch
>
>
> Now each app in the fair scheduler is allocated its fairshare; however, fairshare 
> resources are not ensured even if fairSharePreemption is enabled.
> Consider: 
> 1. When the cluster is idle, we submit app1 to queueA, which takes the 
> maxResource of queueA.  
> 2. Then the cluster becomes busy, but app1 does not release any resources, so 
> queueA's resource usage is over its fairshare.
> 3. Then we submit app2 (maybe with higher priority) to queueA. Now app2 has 
> its own fairshare, but cannot obtain any resources, since queueA is still 
> over its fairshare and resources will not be assigned to queueA anymore. Also, 
> preemption is not triggered in this case.
> So we should allow preemption within a queue when an app is starved for its fairshare.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-4333) Fair scheduler should support preemption within queue

2016-02-17 Thread Karthik Kambatla (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4333?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15151595#comment-15151595
 ] 

Karthik Kambatla commented on YARN-4333:


Thanks a lot for filing this JIRA and working on it. The preemption logic is 
broken in different places. We have other JIRAs where preemption between 
sibling queues doesn't work well. 

I quickly skimmed through the patch here. A few high-level comments on the patch 
and preemption in general:
# Today, all preemption is top-down. When we encounter a {{Schedulable}} 
(app/queue) that is starved, we should trigger preemption from the closest 
parent that is not starved. Preempting resources from another application under 
the same leaf-queue that is over its fairshare is automatically taken care of. 
(A rough sketch of this walk-up follows at the end of this comment.)
# The preemptionTimeout and preemptionThreshold are defined for queues and not 
apps. We should decide what values we want to use for apps. Maybe our best 
bet is to use the ones the corresponding leaf-queue uses? 

[~ashwinshankar77], [~sandyr] - would like to hear your thoughts as well. 
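A rough sketch of the walk-up mentioned in point 1, using hypothetical {{SchedulableNode}}/{{isStarved()}} helpers rather than the actual FairScheduler classes:

{code}
final class PreemptionWalkSketch {

  // Hypothetical stand-in for an app/queue node in the scheduler hierarchy.
  interface SchedulableNode {
    SchedulableNode getParent();    // null for the root
    boolean isStarved();
  }

  // Walk up from a starved schedulable to the closest ancestor that is not
  // starved; preemption would then be triggered from that ancestor downwards.
  static SchedulableNode closestNonStarvedAncestor(SchedulableNode starved) {
    SchedulableNode node = starved.getParent();
    while (node != null && node.isStarved()) {
      node = node.getParent();
    }
    return node;                    // null if even the root is starved
  }
}
{code}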

> Fair scheduler should support preemption within queue
> -
>
> Key: YARN-4333
> URL: https://issues.apache.org/jira/browse/YARN-4333
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: fairscheduler
>Affects Versions: 2.6.0
>Reporter: Tao Jie
>Assignee: Tao Jie
> Attachments: YARN-4333.001.patch, YARN-4333.002.patch, 
> YARN-4333.003.patch
>
>
> Now each app in the fair scheduler is allocated its fairshare; however, fairshare 
> resources are not ensured even if fairSharePreemption is enabled.
> Consider: 
> 1. When the cluster is idle, we submit app1 to queueA, which takes the 
> maxResource of queueA.  
> 2. Then the cluster becomes busy, but app1 does not release any resources, so 
> queueA's resource usage is over its fairshare.
> 3. Then we submit app2 (maybe with higher priority) to queueA. Now app2 has 
> its own fairshare, but cannot obtain any resources, since queueA is still 
> over its fairshare and resources will not be assigned to queueA anymore. Also, 
> preemption is not triggered in this case.
> So we should allow preemption within a queue when an app is starved for its fairshare.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-4690) Skip object allocation in FSAppAttempt#getResourceUsage when possible

2016-02-17 Thread Karthik Kambatla (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4690?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15151571#comment-15151571
 ] 

Karthik Kambatla commented on YARN-4690:


The optimization might not help, since the scope of the cached variable is 
limited to {{getResourceUsage()}}, which does nothing else. Also, 
{{getPreemptedResources()}} isn't heavy; it is just the accessor for 
{{preemptedResources}}.
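For reference, a minimal sketch of the approach as described above (not the committed code; written as a standalone helper rather than the actual {{FSAppAttempt}} method): skip the clone-and-subtract in the common case where nothing has been preempted.

{code}
import org.apache.hadoop.yarn.api.records.Resource;
import org.apache.hadoop.yarn.util.resource.Resources;

final class ResourceUsageSketch {
  // currentConsumption and preempted correspond to getCurrentConsumption()
  // and getPreemptedResources() on FSAppAttempt.
  static Resource usage(Resource currentConsumption, Resource preempted) {
    return preempted.equals(Resources.none())
        ? currentConsumption                                  // common case: no preemption, no allocation
        : Resources.subtract(currentConsumption, preempted);  // preemption round: subtract allocates a copy
  }
}
{code}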

I am +1 on the patch posted here. [~sjlee0] - if you are fine with it, feel 
free to go ahead and commit it. 

> Skip object allocation in FSAppAttempt#getResourceUsage when possible
> -
>
> Key: YARN-4690
> URL: https://issues.apache.org/jira/browse/YARN-4690
> Project: Hadoop YARN
>  Issue Type: Improvement
>Reporter: Ming Ma
>Assignee: Ming Ma
> Attachments: YARN-4690.patch
>
>
> YARN-2768 addresses an important bottleneck. Here is another similar instance 
> where object allocation in Resources#subtract will slow down the fair 
> scheduler's event processing thread.
> {noformat}
> org.apache.hadoop.yarn.factories.impl.pb.RecordFactoryPBImpl.newRecordInstance(RecordFactoryPBImpl.java)
> org.apache.hadoop.yarn.util.Records.newRecord(Records.java)
> 
> org.apache.hadoop.yarn.util.resource.Resources.createResource(Resources.java)
> org.apache.hadoop.yarn.util.resource.Resources.clone(Resources.java)
> org.apache.hadoop.yarn.util.resource.Resources.subtract(Resources.java)
> 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FSAppAttempt.getResourceUsage(FSAppAttempt.java)
> 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FSLeafQueue.getResourceUsage(FSLeafQueue.java)
> 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.policies.FairSharePolicy$FairShareComparator.compare(FairSharePolicy.java)
> 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.policies.FairSharePolicy$FairShareComparator.compare(FairSharePolicy.java)
> java.util.TimSort.binarySort(TimSort.java)
> java.util.TimSort.sort(TimSort.java)
> java.util.TimSort.sort(TimSort.java)
> java.util.Arrays.sort(Arrays.java)
> java.util.Collections.sort(Collections.java)
> 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FSParentQueue.assignContainer(FSParentQueue.java)
> 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler.attemptScheduling(FairScheduler.java)
> 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler.nodeUpdate(FairScheduler.java)
> 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler.handle(FairScheduler.java)
> 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler.handle(FairScheduler.java)
> 
> org.apache.hadoop.yarn.sls.scheduler.ResourceSchedulerWrapper.handle(ResourceSchedulerWrapper.java)
> 
> org.apache.hadoop.yarn.sls.scheduler.ResourceSchedulerWrapper.handle(ResourceSchedulerWrapper.java)
> {noformat}
> One way to fix it is to return {{getCurrentConsumption()}} if there is no 
> preemption, which is the normal case. This means the {{getResourceUsage}} method 
> will return a reference to {{FSAppAttempt}}'s internal resource object. But 
> that should be OK, as {{getResourceUsage}} doesn't expect the caller to modify 
> the object.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (YARN-4701) When task logs are not available, port 8041 is referenced instead of port 8042

2016-02-17 Thread Haibo Chen (JIRA)
Haibo Chen created YARN-4701:


 Summary: When task logs are not available, port 8041 is referenced 
instead of port 8042
 Key: YARN-4701
 URL: https://issues.apache.org/jira/browse/YARN-4701
 Project: Hadoop YARN
  Issue Type: Bug
  Components: yarn
Reporter: Haibo Chen
Assignee: Haibo Chen


Accessing logs for an Oozie task attempt in the workflow tool in Hue shows 
"Logs not available for attempt_1433822010707_0001_m_00_0. Aggregation may 
not be complete, Check back later or try the nodemanager at 
quickstart.cloudera:8041"
However, the NodeManager HTTP port is 8042, not 8041. Accessing port 8041 shows 
"It looks like you are making an HTTP request to a Hadoop IPC port. This is not 
the correct port for the web interface on this daemon."

To users of Hue this is not particularly helpful without an HTTP link. Can we 
provide a link to the task logs at 
"http://node_manager_host_address:8042/node/application/" here, as well as 
the current message, to assist users in Hue?



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-4168) Test TestLogAggregationService.testLocalFileDeletionOnDiskFull failing

2016-02-17 Thread Vinod Kumar Vavilapalli (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-4168?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vinod Kumar Vavilapalli updated YARN-4168:
--
Target Version/s: 2.8.0

> Test TestLogAggregationService.testLocalFileDeletionOnDiskFull failing
> --
>
> Key: YARN-4168
> URL: https://issues.apache.org/jira/browse/YARN-4168
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: test
>Affects Versions: 3.0.0
> Environment: Jenkins
>Reporter: Steve Loughran
>Assignee: Takashi Ohnishi
>Priority: Critical
> Attachments: YARN-4168.1.patch
>
>
> {{TestLogAggregationService.testLocalFileDeletionOnDiskFull}} failing on 
> [Jenkins build 
> 1136|https://builds.apache.org/view/H-L/view/Hadoop/job/Hadoop-Yarn-trunk/1136/testReport/junit/org.apache.hadoop.yarn.server.nodemanager.containermanager.logaggregation/TestLogAggregationService/testLocalFileDeletionOnDiskFull/]
> {code}
> java.lang.AssertionError: null
>   at org.junit.Assert.fail(Assert.java:86)
>   at org.junit.Assert.assertTrue(Assert.java:41)
>   at org.junit.Assert.assertFalse(Assert.java:64)
>   at org.junit.Assert.assertFalse(Assert.java:74)
>   at 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.logaggregation.TestLogAggregationService.verifyLocalFileDeletion(TestLogAggregationService.java:229)
>   at 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.logaggregation.TestLogAggregationService.testLocalFileDeletionOnDiskFull(TestLogAggregationService.java:285)
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Reopened] (YARN-4168) Test TestLogAggregationService.testLocalFileDeletionOnDiskFull failing

2016-02-17 Thread Vinod Kumar Vavilapalli (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-4168?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vinod Kumar Vavilapalli reopened YARN-4168:
---
  Assignee: Takashi Ohnishi

Thinking more, I'll close YARN-1978 as a dup of this one, as this patch is much 
closer to the final solution.

[~bwtakacy], can you also do the following?
 - Regarding my earlier comment, I don't see why delSvrc should even be stopped - 
that was causing the tasks to get cancelled. I think we should simply remove the 
delSvrc.stop() call in verifyLocalFileDeletion() together with the existence 
checks you already added.
 - Can you fix the formatting in the patch? The indentation is not consistent 
in the new code.

> Test TestLogAggregationService.testLocalFileDeletionOnDiskFull failing
> --
>
> Key: YARN-4168
> URL: https://issues.apache.org/jira/browse/YARN-4168
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: test
>Affects Versions: 3.0.0
> Environment: Jenkins
>Reporter: Steve Loughran
>Assignee: Takashi Ohnishi
>Priority: Critical
> Attachments: YARN-4168.1.patch
>
>
> {{TestLogAggregationService.testLocalFileDeletionOnDiskFull}} failing on 
> [Jenkins build 
> 1136|https://builds.apache.org/view/H-L/view/Hadoop/job/Hadoop-Yarn-trunk/1136/testReport/junit/org.apache.hadoop.yarn.server.nodemanager.containermanager.logaggregation/TestLogAggregationService/testLocalFileDeletionOnDiskFull/]
> {code}
> java.lang.AssertionError: null
>   at org.junit.Assert.fail(Assert.java:86)
>   at org.junit.Assert.assertTrue(Assert.java:41)
>   at org.junit.Assert.assertFalse(Assert.java:64)
>   at org.junit.Assert.assertFalse(Assert.java:74)
>   at 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.logaggregation.TestLogAggregationService.verifyLocalFileDeletion(TestLogAggregationService.java:229)
>   at 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.logaggregation.TestLogAggregationService.testLocalFileDeletionOnDiskFull(TestLogAggregationService.java:285)
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (YARN-1978) TestLogAggregationService#testLocalFileDeletionAfterUpload fails sometimes

2016-02-17 Thread Vinod Kumar Vavilapalli (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-1978?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vinod Kumar Vavilapalli resolved YARN-1978.
---
Resolution: Duplicate

Closing this as a dup of YARN-4168 which is much closer to completion.

> TestLogAggregationService#testLocalFileDeletionAfterUpload fails sometimes
> --
>
> Key: YARN-1978
> URL: https://issues.apache.org/jira/browse/YARN-1978
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Vinod Kumar Vavilapalli
>Assignee: Vinod Kumar Vavilapalli
> Attachments: YARN-1978.txt
>
>
> This happens in a Windows VM, though the issue isn't related to Windows.
> {code}
> ---
> Test set: 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.logaggregation.TestLogAggregationService
> ---
> Tests run: 11, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 6.859 sec 
> <<< FAILURE! - in 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.logaggregation.TestLogAggregationService
> testLocalFileDeletionAfterUpload(org.apache.hadoop.yarn.server.nodemanager.containermanager.logaggregation.TestLogAggregationService)
>   Time elapsed: 0.906 sec  <<< FAILURE!
> junit.framework.AssertionFailedError: check 
> Y:\hadoop-yarn-project\hadoop-yarn\hadoop-yarn-server\hadoop-yarn-server-nodemanager\target\TestLogAggregationService-localLogDir\application_1234_0001\container_1234_0001_01_01\stdout
> at junit.framework.Assert.fail(Assert.java:50)
> at junit.framework.Assert.assertTrue(Assert.java:20)
> at junit.framework.Assert.assertFalse(Assert.java:34)
> at 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.logaggregation.TestLogAggregationService.testLocalFileDeletionAfterUpload(TestLogAggregationService.java:201)
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-4692) [Umbrella] Simplified and first-class support for services in YARN

2016-02-17 Thread Wangda Tan (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4692?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15151477#comment-15151477
 ] 

Wangda Tan commented on YARN-4692:
--

Thanks [~vinodkv] and other folks working on this. This documentation is pretty 
comprehensive already; some thoughts/suggestions:

1) For running containers, instead of classifying them into service/batch, I 
would prefer to tag them by application priority. For example, 0 is production 
service tasks, 5 is batch jobs, etc. The reasons are:
- A service container is not always more important than other containers.
- One important service can preempt containers from less important services.

2) Whether a container is service or batch depends on the duration of the task; 
we already had lots of discussion on YARN-1039.

3) For 3.2.2 container auto restart, beyond restarting a container when it dies, 
we could let the framework check the health of running tasks. For example, 
support an embedded REST API to get the health status of containers. With this, 
the framework can restart malfunctioning containers.

4) For 3.2.7 Scheduling / Queue model:
Beyond the queue model, we should consider long-running containers when 
reserving a large container on a node.

5) Debuggability for service containers is also very important:
- Tools similar to [cAdvisor|https://github.com/google/cadvisor] could be very 
helpful for figuring out issues with service tasks.
- We also need a tool to show aggregated scheduling-related information for 
apps/queues/cluster.

*For comments from [~asuresh]:*
bq. we can give applications the ability to specify Preemptability of 
containers in a particular role...
Instead of adding a new field, I think we can reuse container priority and 
application priority to describe preemptability.

bq. Allow LR Applications to specify peak, min and variance/mean (also many 
transient and steady-state) of a Resource request to allow schedulers to make 
better allocation decisions.
I think this is hard for end users to know. Our framework should be able to 
figure out such metrics for running containers. For newly requested containers, 
we'd better assume they will use 100% of the requested resources.

bq. In YARN-4597 Chris Douglas proposed ...
In my mind, YARN-4597 targets low-latency batch tasks; if service tasks run for 
an hour or more, it's not a big deal to take several minutes to set them up.

And I agree that the reservation system (YARN-1051) is the ultimate solution for 
the queue model and container allocation for services.

> [Umbrella] Simplified and first-class support for services in YARN
> --
>
> Key: YARN-4692
> URL: https://issues.apache.org/jira/browse/YARN-4692
> Project: Hadoop YARN
>  Issue Type: New Feature
>Reporter: Vinod Kumar Vavilapalli
>Assignee: Vinod Kumar Vavilapalli
> Attachments: 
> YARN-First-Class-And-Simplified-Support-For-Services-v0.pdf
>
>
> YARN-896 focused on getting the ball rolling on the support for services 
> (long running applications) on YARN.
> I’d like propose the next stage of this effort: _Simplified and first-class 
> support for services in YARN_.
> The chief rationale for filing a separate new JIRA is threefold:
>  - Do a fresh survey of all the things that are already implemented in the 
> project
>  - Weave a comprehensive story around what we further need and attempt to 
> rally the community around a concrete end-goal, and
>  - Additionally focus on functionality that YARN-896 and friends left for 
> higher layers to take care of and see how much of that is better integrated 
> into the YARN platform itself.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-4696) EntityGroupFSTimelineStore to work in the absence of an RM

2016-02-17 Thread Li Lu (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4696?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15151422#comment-15151422
 ] 

Li Lu commented on YARN-4696:
-

Thanks [~ste...@apache.org]! This new override-based solution works better 
IMHO. Some minor questions:

- Since we have already switched to the override-based approach, do we still 
need to introduce the new config key? I guess not, but I just want to make sure, 
because we're still changing yarn-default? 
- I noticed the following lines of comments:
{code}
* If they return null, then {@link #getAppState(ApplicationId)} will
* also need to be reworked.
{code}
This coupling appears to be a little bit error-prone, since people may easily 
ignore this comment. Do you think it is worth the effort to encapsulate 
getAppState into an interface inside the storage? We may also want to move the 
yarnClient there so that the semantics are completely isolated from the actual 
implementation. For UTs we can easily build a trivial app state fetcher that 
does nothing. Not sure if this is over-design though...
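A rough sketch of the kind of encapsulation meant above, with hypothetical names (AppStateFetcher is not an existing interface; the store would hold one, and tests could inject the trivial variant):

{code}
import java.io.IOException;

import org.apache.hadoop.yarn.api.records.ApplicationId;
import org.apache.hadoop.yarn.api.records.YarnApplicationState;

// Hypothetical abstraction over "ask the RM for the app state".
interface AppStateFetcher {
  YarnApplicationState getAppState(ApplicationId appId) throws IOException;
}

// Trivial fetcher for unit tests: never talks to an RM, always "unknown".
class NullAppStateFetcher implements AppStateFetcher {
  @Override
  public YarnApplicationState getAppState(ApplicationId appId) {
    return null;   // callers would fall back to file-age based completion
  }
}
{code}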

> EntityGroupFSTimelineStore to work in the absence of an RM
> --
>
> Key: YARN-4696
> URL: https://issues.apache.org/jira/browse/YARN-4696
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: timelineserver
>Affects Versions: 2.8.0
>Reporter: Steve Loughran
>Assignee: Steve Loughran
> Attachments: YARN-4696-001.patch, YARN-4696-002.patch
>
>
> {{EntityGroupFSTimelineStore}} now depends on an RM being up and running; the 
> configuration pointing to it. This is a new change, and impacts testing where 
> you have historically been able to test without an RM running.
> The sole purpose of the probe is to automatically determine if an app is 
> running; it falls back to "unknown" if not. If the RM connection was 
> optional, the "unknown" codepath could be called directly, relying on age of 
> file as a metric of completion
> Options
> # add a flag to disable RM connect
> # skip automatically if RM not defined/set to 0.0.0.0
> # disable retries on yarn client IPC; if it fails, tag app as unknown.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-4676) Automatic and Asynchronous Decommissioning Nodes Status Tracking

2016-02-17 Thread Daniel Zhi (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4676?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15151397#comment-15151397
 ] 

Daniel Zhi commented on YARN-4676:
--

I am puzzled about these 5 timed-out JUnit tests 
(org.apache.hadoop.http.TestHttpServerLifecycle, 
org.apache.hadoop.yarn.client.cli.TestYarnCLI, etc.). All of them PASS on my 
local machine (with or without my patch), and I don't see how my code change in 
the patch could cause the timeout errors --- I am not sure whether the timeout 
errors are related to my patch, or how to fix them.

> Automatic and Asynchronous Decommissioning Nodes Status Tracking
> 
>
> Key: YARN-4676
> URL: https://issues.apache.org/jira/browse/YARN-4676
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: resourcemanager
>Affects Versions: 2.8.0
>Reporter: Daniel Zhi
>Assignee: Daniel Zhi
>  Labels: features
> Attachments: GracefulDecommissionYarnNode.pdf, HADOOP-4676.004.patch, 
> HADOOP-4676.005.patch
>
>
> DecommissioningNodeWatcher inside ResourceTrackingService tracks the status of 
> DECOMMISSIONING nodes automatically and asynchronously after the 
> client/admin makes a graceful decommission request. It tracks 
> DECOMMISSIONING node status to decide when, after all running containers on 
> the node have completed, the node will be transitioned into the DECOMMISSIONED 
> state. NodesListManager detects and handles include and exclude list changes to 
> kick off decommission or recommission as necessary.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-4634) Scheduler UI/Metrics need to consider cases like non-queue label mappings

2016-02-17 Thread Wangda Tan (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4634?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15151277#comment-15151277
 ] 

Wangda Tan commented on YARN-4634:
--

[~sunilg], is your patch able to handle the following case?
Labels are assigned to nodes, the labels are non-exclusive, and no label is 
assigned to the queue. 
In this case, should we show the queue hierarchy under labels or not? In other 
words, queues could use resources of other labels even if there's no queue-label 
mapping.
To me, we can ignore a label only when it is a completely "orphan" label, which 
is not assigned to any node or queue.


> Scheduler UI/Metrics need to consider cases like non-queue label mappings
> -
>
> Key: YARN-4634
> URL: https://issues.apache.org/jira/browse/YARN-4634
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Affects Versions: 2.7.1
>Reporter: Sunil G
>Assignee: Sunil G
> Attachments: 0001-YARN-4634.patch
>
>
> Currently, when label-queue mappings are not available, a few 
> assumptions are made in the UI and in metrics.
> In the above case, where labels are enabled and available in the cluster but 
> without any queue mappings, the UI displays queues under labels. This is not correct.
> Currently, the labels-enabled check and the availability of labels are considered to 
> render the scheduler UI. Henceforth we also need to check whether 
> - queue mappings are available
> - nodes are mapped with labels with the proper exclusivity flags on
> This ticket will also look at the default queue configurations when 
> labels are not mapped. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-4654) Yarn node label CLI should parse "=" correctly when trying to remove all labels on a node

2016-02-17 Thread Wangda Tan (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4654?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15151135#comment-15151135
 ] 

Wangda Tan commented on YARN-4654:
--

Tried this patch locally, works well for me, +1, thanks 
[~Naganarasimha]/[~rohithsharma].

> Yarn node label CLI should parse "=" correctly when trying to remove all 
> labels on a node
> -
>
> Key: YARN-4654
> URL: https://issues.apache.org/jira/browse/YARN-4654
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Wangda Tan
>Assignee: Naganarasimha G R
> Attachments: YARN-4654.v1.001.patch, YARN-4654.v1.002.patch, 
> YARN-4654.v1.003.patch
>
>
> Currently, when adding labels to nodes, user can run:
> {{yarn rmadmin -replaceLabelsOnNode "host1=x host2=y"}}
> However, when removing labels from a node, user has to run:
> {{yarn rmadmin -replaceLabelsOnNode "host1 host2"}}
> Instead of:
> {{yarn rmadmin -replaceLabelsOnNode "host1= host2="}}
> We should handle both cases ("=" exists and does not exist) when removing labels 
> from a node.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-4696) EntityGroupFSTimelineStore to work in the absence of an RM

2016-02-17 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4696?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15151085#comment-15151085
 ] 

Hadoop QA commented on YARN-4696:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 31s 
{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s 
{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:red}-1{color} | {color:red} test4tests {color} | {color:red} 0m 0s 
{color} | {color:red} The patch doesn't appear to include any new or modified 
tests. Please justify why no new tests are needed for this patch. Also please 
list what manual steps were performed to verify this patch. {color} |
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 30s 
{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 9m 
39s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 3m 18s 
{color} | {color:green} trunk passed with JDK v1.8.0_72 {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 3m 3s 
{color} | {color:green} trunk passed with JDK v1.7.0_95 {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 
46s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 40s 
{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
54s {color} | {color:green} trunk passed {color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red} 0m 38s 
{color} | {color:red} 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-timeline-pluginstorage
 in trunk has 1 extant Findbugs warnings. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 36s 
{color} | {color:green} trunk passed with JDK v1.8.0_72 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 37s 
{color} | {color:green} trunk passed with JDK v1.7.0_95 {color} |
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 14s 
{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 1m 
31s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 3m 26s 
{color} | {color:green} the patch passed with JDK v1.8.0_72 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 3m 26s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 3m 4s 
{color} | {color:green} the patch passed with JDK v1.7.0_95 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 3m 4s 
{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red} 0m 45s 
{color} | {color:red} hadoop-yarn-project/hadoop-yarn: patch generated 5 new + 
9 unchanged - 0 fixed = 14 total (was 9) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 40s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
49s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 
0s {color} | {color:green} Patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} xml {color} | {color:green} 0m 1s 
{color} | {color:green} The patch has no ill-formed XML file. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 3m 
54s {color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} javadoc {color} | {color:red} 2m 55s 
{color} | {color:red} 
hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-timeline-pluginstorage-jdk1.8.0_72
 with JDK v1.8.0_72 generated 1 new + 7 unchanged - 0 fixed = 8 total (was 7) 
{color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 32s 
{color} | {color:green} the patch passed with JDK v1.8.0_72 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 24s 
{color} | {color:green} the patch passed with JDK v1.7.0_95 {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 2m 48s 
{color} | {color:green} hadoop-yarn-common in the patch passed with JDK 
v1.8.0_72. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 4m 9s 

[jira] [Created] (YARN-4700) ATS storage has one extra record each time the RM got restarted

2016-02-17 Thread Li Lu (JIRA)
Li Lu created YARN-4700:
---

 Summary: ATS storage has one extra record each time the RM got 
restarted
 Key: YARN-4700
 URL: https://issues.apache.org/jira/browse/YARN-4700
 Project: Hadoop YARN
  Issue Type: Sub-task
  Components: timelineserver
Affects Versions: YARN-2928
Reporter: Li Lu


When testing the new web UI for ATS v2, I noticed that we're creating one extra 
record for each finished application (still held in the RM state store) each 
time the RM gets restarted. It's quite possible that we add the cluster start 
timestamp to the default cluster id, so each time we create a new record for the 
same application (the cluster id is part of the row key). We need to fix this 
behavior, probably by having a better default cluster id. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-4108) CapacityScheduler: Improve preemption to preempt only those containers that would satisfy the incoming request

2016-02-17 Thread Wangda Tan (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-4108?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wangda Tan updated YARN-4108:
-
Attachment: YARN-4108.3.patch

Attached ver.3 patch.

> CapacityScheduler: Improve preemption to preempt only those containers that 
> would satisfy the incoming request
> --
>
> Key: YARN-4108
> URL: https://issues.apache.org/jira/browse/YARN-4108
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: capacity scheduler
>Reporter: Wangda Tan
>Assignee: Wangda Tan
> Attachments: YARN-4108-design-doc-V3.pdf, 
> YARN-4108-design-doc-v1.pdf, YARN-4108-design-doc-v2.pdf, YARN-4108.1.patch, 
> YARN-4108.2.patch, YARN-4108.3.patch, YARN-4108.poc.1.patch, 
> YARN-4108.poc.2-WIP.patch, YARN-4108.poc.3-WIP.patch, 
> YARN-4108.poc.4-WIP.patch
>
>
> This is sibling JIRA for YARN-2154. We should make sure container preemption 
> is more effective.
> *Requirements:*:
> 1) Can handle case of user-limit preemption
> 2) Can handle case of resource placement requirements, such as: hard-locality 
> (I only want to use rack-1) / node-constraints (YARN-3409) / black-list (I 
> don't want to use rack1 and host\[1-3\])
> 3) Can handle preemption within a queue: cross user preemption (YARN-2113), 
> cross applicaiton preemption (such as priority-based (YARN-1963) / 
> fairness-based (YARN-3319)).



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-4108) CapacityScheduler: Improve preemption to preempt only those containers that would satisfy the incoming request

2016-02-17 Thread Wangda Tan (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4108?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15151019#comment-15151019
 ] 

Wangda Tan commented on YARN-4108:
--

Thanks [~sunilg]!

Addressed 1-3. 
For 4, I think it's possible for one queue to kill another queue's container. So 
far I haven't seen issues here, since container allocation acquires the scheduler 
lock and other operations such as moveQueue lock the scheduler too. Do you think 
there are any problems here?

> CapacityScheduler: Improve preemption to preempt only those containers that 
> would satisfy the incoming request
> --
>
> Key: YARN-4108
> URL: https://issues.apache.org/jira/browse/YARN-4108
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: capacity scheduler
>Reporter: Wangda Tan
>Assignee: Wangda Tan
> Attachments: YARN-4108-design-doc-V3.pdf, 
> YARN-4108-design-doc-v1.pdf, YARN-4108-design-doc-v2.pdf, YARN-4108.1.patch, 
> YARN-4108.2.patch, YARN-4108.poc.1.patch, YARN-4108.poc.2-WIP.patch, 
> YARN-4108.poc.3-WIP.patch, YARN-4108.poc.4-WIP.patch
>
>
> This is sibling JIRA for YARN-2154. We should make sure container preemption 
> is more effective.
> *Requirements:*:
> 1) Can handle case of user-limit preemption
> 2) Can handle case of resource placement requirements, such as: hard-locality 
> (I only want to use rack-1) / node-constraints (YARN-3409) / black-list (I 
> don't want to use rack1 and host\[1-3\])
> 3) Can handle preemption within a queue: cross user preemption (YARN-2113), 
> cross applicaiton preemption (such as priority-based (YARN-1963) / 
> fairness-based (YARN-3319)).



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-4696) EntityGroupFSTimelineStore to work in the absence of an RM

2016-02-17 Thread Steve Loughran (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-4696?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Steve Loughran updated YARN-4696:
-
Attachment: YARN-4696-002.patch

Patch 002

# removes the switch in exchange for making creation/use of the RM something that 
can be subclassed or mocked away
# switched to CompositeService for automatic handling of child service 
lifecycle; by adding the yarnClient & the others they get this lifecycle (and 
there is no need for special yarnClient != null checks anywhere in the code)
# also cleaned up the {{cacheItem.getStore().close()}} calls - I managed to get 
an NPE if the store was null; they are services, so they can be handled via 
{{ServiceOperations}}

Finally: when the web API catches an illegal argument exception (or any other), 
the string value is included. This helps track down problems like application 
ID conversion trouble in your plugin, which would otherwise fail with no 
meaningful error messages or stack traces on either the client or the server.
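A small sketch of the two mechanisms mentioned above (an illustrative subclass, not the patch itself): children added to a {{CompositeService}} get init/start/stop for free, and {{ServiceOperations.stopQuietly()}} gives a null-safe stop for the store.

{code}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.service.CompositeService;
import org.apache.hadoop.service.Service;
import org.apache.hadoop.service.ServiceOperations;
import org.apache.hadoop.yarn.client.api.YarnClient;

// Illustrative only: shows the lifecycle wiring, not EntityGroupFSTimelineStore.
class TimelineStoreShellSketch extends CompositeService {
  TimelineStoreShellSketch() {
    super(TimelineStoreShellSketch.class.getName());
  }

  @Override
  protected void serviceInit(Configuration conf) throws Exception {
    YarnClient yarnClient = YarnClient.createYarnClient();
    addService(yarnClient);                   // lifecycle now handled by CompositeService
    super.serviceInit(conf);
  }

  static void closeStore(Service store) {
    ServiceOperations.stopQuietly(store);     // tolerates null, swallows stop() failures
  }
}
{code}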

> EntityGroupFSTimelineStore to work in the absence of an RM
> --
>
> Key: YARN-4696
> URL: https://issues.apache.org/jira/browse/YARN-4696
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: timelineserver
>Affects Versions: 2.8.0
>Reporter: Steve Loughran
>Assignee: Steve Loughran
> Attachments: YARN-4696-001.patch, YARN-4696-002.patch
>
>
> {{EntityGroupFSTimelineStore}} now depends on an RM being up and running; the 
> configuration pointing to it. This is a new change, and impacts testing where 
> you have historically been able to test without an RM running.
> The sole purpose of the probe is to automatically determine if an app is 
> running; it falls back to "unknown" if not. If the RM connection was 
> optional, the "unknown" codepath could be called directly, relying on age of 
> file as a metric of completion
> Options
> # add a flag to disable RM connect
> # skip automatically if RM not defined/set to 0.0.0.0
> # disable retries on yarn client IPC; if it fails, tag app as unknown.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-4697) NM aggregation thread pool is not bound by limits

2016-02-17 Thread Naganarasimha G R (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-4697?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Naganarasimha G R updated YARN-4697:

Assignee: Haibo Chen  (was: Naganarasimha G R)

> NM aggregation thread pool is not bound by limits
> -
>
> Key: YARN-4697
> URL: https://issues.apache.org/jira/browse/YARN-4697
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: nodemanager
>Reporter: Haibo Chen
>Assignee: Haibo Chen
> Attachments: yarn4697.001.patch, yarn4697.002.patch
>
>
> In LogAggregationService.java we create a thread pool to upload logs from 
> the NodeManager to HDFS if log aggregation is turned on. This is a cached 
> thread pool, which, based on the javadoc, is an unlimited pool of threads.
> In the case that we have had a problem with log aggregation, this could cause 
> a problem on restart. The number of threads created at that point could be 
> huge, would put a large load on the NameNode, and in the worst case could even 
> bring it down due to file descriptor issues.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Assigned] (YARN-4697) NM aggregation thread pool is not bound by limits

2016-02-17 Thread Naganarasimha G R (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-4697?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Naganarasimha G R reassigned YARN-4697:
---

Assignee: Naganarasimha G R  (was: Haibo Chen)

> NM aggregation thread pool is not bound by limits
> -
>
> Key: YARN-4697
> URL: https://issues.apache.org/jira/browse/YARN-4697
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: nodemanager
>Reporter: Haibo Chen
>Assignee: Naganarasimha G R
> Attachments: yarn4697.001.patch, yarn4697.002.patch
>
>
> In LogAggregationService.java we create a thread pool to upload logs from 
> the NodeManager to HDFS if log aggregation is turned on. This is a cached 
> thread pool, which, based on the javadoc, is an unlimited pool of threads.
> In the case that we have had a problem with log aggregation, this could cause 
> a problem on restart. The number of threads created at that point could be 
> huge, would put a large load on the NameNode, and in the worst case could even 
> bring it down due to file descriptor issues.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-4697) NM aggregation thread pool is not bound by limits

2016-02-17 Thread Haibo Chen (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-4697?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Haibo Chen updated YARN-4697:
-
Attachment: yarn4697.002.patch

> NM aggregation thread pool is not bound by limits
> -
>
> Key: YARN-4697
> URL: https://issues.apache.org/jira/browse/YARN-4697
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: nodemanager
>Reporter: Haibo Chen
>Assignee: Haibo Chen
> Attachments: yarn4697.001.patch, yarn4697.002.patch
>
>
> In LogAggregationService.java we create a thread pool to upload logs from 
> the NodeManager to HDFS if log aggregation is turned on. This is a cached 
> thread pool, which, based on the javadoc, is an unlimited pool of threads.
> In the case that we have had a problem with log aggregation, this could cause 
> a problem on restart. The number of threads created at that point could be 
> huge, would put a large load on the NameNode, and in the worst case could even 
> bring it down due to file descriptor issues.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-4697) NM aggregation thread pool is not bound by limits

2016-02-17 Thread Haibo Chen (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4697?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15150864#comment-15150864
 ] 

Haibo Chen commented on YARN-4697:
--

Hi Naganarasimha G R, 

Thanks very much for your comments. I have addressed the threadPool 
accessibility issue and also modified yarn-default.xml to match 
YarnConfiguration. To answer your other comments:

1. Yes, 50 should be safe. (The default I set is 100.) But maybe sometimes 
even 50 threads dedicated just to log aggregation is too much? Some users may 
also want to use more than 50 if they have powerful machines and many YARN 
applications. If this is configurable, users themselves can decide.
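A sketch of what a bounded, configurable pool could look like; the config key and default below are illustrative, not necessarily the ones in the patch.

{code}
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

import org.apache.hadoop.conf.Configuration;

import com.google.common.util.concurrent.ThreadFactoryBuilder;

final class BoundedAggregationPoolSketch {
  // Hypothetical key/default, shown only to illustrate "configurable pool size".
  static final String POOL_SIZE_KEY = "yarn.nodemanager.logaggregation.threadpool-size-max";
  static final int DEFAULT_POOL_SIZE = 100;

  static ExecutorService create(Configuration conf) {
    int poolSize = conf.getInt(POOL_SIZE_KEY, DEFAULT_POOL_SIZE);
    // Fixed-size pool instead of the unbounded cached pool.
    return Executors.newFixedThreadPool(poolSize,
        new ThreadFactoryBuilder().setNameFormat("LogAggregationService #%d").build());
  }
}
{code}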

2. The purpose of the semaphore is to block the threads in the thread pool, 
because the main thread always acquires the semaphore first. Because I set the 
thread pool size to 1, once that single thread tries to acquire the semaphore 
while executing either of the two runnables, it blocks, and the other runnable 
will not be executed if the thread pool can indeed create only 1 thread. (If 
another thread were available in the thread pool, there would be another thread 
blocking on the semaphore, failing the test.) The immediate release after 
acquire in the runnable is just to safely release the resource. I'll try to add 
comments in the test code.
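Roughly, the pattern described above looks like this (a simplified illustration, not the actual test code):

{code}
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Semaphore;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

final class PoolSizeCheckSketch {
  public static void main(String[] args) throws InterruptedException {
    Semaphore gate = new Semaphore(1);
    gate.acquire();                               // main thread grabs the permit first
    AtomicInteger started = new AtomicInteger();

    ExecutorService pool = Executors.newFixedThreadPool(1);   // pool under test (size 1)
    Runnable task = () -> {
      started.incrementAndGet();
      try {
        gate.acquire();                           // blocks while main holds the permit
        gate.release();                           // release immediately, just to be safe
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt();
      }
    };
    pool.submit(task);
    pool.submit(task);

    Thread.sleep(500);
    // With a single-thread pool, only the first task can have started.
    System.out.println("tasks started: " + started.get());   // expected: 1

    gate.release();                               // unblock and shut down
    pool.shutdown();
    pool.awaitTermination(5, TimeUnit.SECONDS);
  }
}
{code}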


> NM aggregation thread pool is not bound by limits
> -
>
> Key: YARN-4697
> URL: https://issues.apache.org/jira/browse/YARN-4697
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: nodemanager
>Reporter: Haibo Chen
>Assignee: Haibo Chen
> Attachments: yarn4697.001.patch
>
>
> In LogAggregationService.java we create a thread pool to upload logs from 
> the NodeManager to HDFS if log aggregation is turned on. This is a cached 
> thread pool, which, based on the javadoc, is an unlimited pool of threads.
> In the case that we have had a problem with log aggregation, this could cause 
> a problem on restart. The number of threads created at that point could be 
> huge, would put a large load on the NameNode, and in the worst case could even 
> bring it down due to file descriptor issues.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-4699) Scheduler UI and REST o/p is not in sync when -replaceLabelsOnNode is used to change label of a node

2016-02-17 Thread Sunil G (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-4699?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sunil G updated YARN-4699:
--
Attachment: ForLabelX-AfterSwitch.png
ForLabelY-AfterSwitch.png
AfterAppFInish-LabelY-Metrics.png

> Scheduler UI and REST o/p is not in sync when -replaceLabelsOnNode is used to 
> change label of a node
> 
>
> Key: YARN-4699
> URL: https://issues.apache.org/jira/browse/YARN-4699
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: capacity scheduler
>Affects Versions: 2.7.2
>Reporter: Sunil G
>Assignee: Sunil G
>Priority: Critical
> Attachments: AfterAppFInish-LabelY-Metrics.png, 
> ForLabelX-AfterSwitch.png, ForLabelY-AfterSwitch.png
>
>
> Scenario is as follows:
> a. 2 nodes are available in the cluster (node1 with label "x", node2 with 
> label "y")
> b. Submit an application to node1 for label "x". 
> c. Change node1 label to "y" by using *replaceLabelsOnNode* command.
> d. Verify Scheduler UI for metrics such as "Used Capacity", "Absolute 
> Capacity" etc. "x" still shows some capacity.
> e. Change node1 label back to "x" and verify UI and REST o/p
> Output:
> 1. "Used Capacity", "Absolute Capacity" etc are not decremented once labels 
> is changed for a node.
> 2. UI tab for respective label shows wrong GREEN color in these cases.
> 3. REST o/p is wrong for each label after executing above scenario.
> Attaching screen shots also. This ticket will try to cover UI and REST o/p 
> fix when label is changed runtime.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (YARN-4699) Scheduler UI and REST o/p is not in sync when -replaceLabelsOnNode is used to change label of a node

2016-02-17 Thread Sunil G (JIRA)
Sunil G created YARN-4699:
-

 Summary: Scheduler UI and REST o/p is not in sync when 
-replaceLabelsOnNode is used to change label of a node
 Key: YARN-4699
 URL: https://issues.apache.org/jira/browse/YARN-4699
 Project: Hadoop YARN
  Issue Type: Bug
  Components: capacity scheduler
Affects Versions: 2.7.2
Reporter: Sunil G
Assignee: Sunil G
Priority: Critical


Scenario is as follows:
a. 2 nodes are available in the cluster (node1 with label "x", node2 with label 
"y")
b. Submit an application to node1 for label "x". 
c. Change node1 label to "y" by using *replaceLabelsOnNode* command.
d. Verify Scheduler UI for metrics such as "Used Capacity", "Absolute Capacity" 
etc. "x" still shows some capacity.
e. Change node1 label back to "x" and verify UI and REST o/p

Output:
1. "Used Capacity", "Absolute Capacity" etc are not decremented once labels is 
changed for a node.
2. UI tab for respective label shows wrong GREEN color in these cases.
3. REST o/p is wrong for each label after executing above scenario.

Attaching screen shots also. This ticket will try to cover UI and REST o/p fix 
when label is changed runtime.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-4698) Negative value in RM UI counters due to double container release

2016-02-17 Thread Dmytro Kabakchei (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-4698?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dmytro Kabakchei updated YARN-4698:
---
Description: 
We noticed that on our cluster there are negative values in RM UI counters:
- Containers Running: -19
- Memory Used: -38GB
- Vcores Used: -19

After we checked the RM logs, we found that the following events had happened:
- Assigned container: 67019 times
- Released container: 67019 times
- Invalid container released: 19 times

Some related log records can be found in the "Example.log-cut" attachment.

After some investigation we concluded that there is some kind of race 
condition for a container that was scheduled for killing but completed 
successfully before the kill.
Also, there is a patch that possibly mitigates the effects of the issue but 
doesn't solve the original problem (see mitigating2.5.1.diff).
Unfortunately, the cluster and all other logs are lost, because the report was 
made about a year ago but wasn't submitted properly. Also, we don't know if 
the issue exists in other versions.

  was:
We noticed that on our cluster there are negative values in RM UI counters:
-Containers Running: -19
-Memory Used: -38GB
-Vcores Used: -19

After we checked RM logs, we found, that the following events had happened:
- Assigned container: 67019 times
- Released container: 67019 times
- Invalid container released: 19 times

Some log records related can be found within "Example.log-cut" attachment.

After some investigation we made a conclusion that there is some kind of race 
condition for container that was scheduled for killing, but was completed 
successfully before kill.
Also, there is a patch that possibly mitigates effects of the issue, but 
doesn't solve original problem (see mitigating2.5.1diff).
Unfortunately, the cluster and all other logs are lost, because the report was 
made about a year ago, but wasn't submitted properly. Also, we don't know if 
the issue exist in other versions.


> Negative value in RM UI counters due to double container release
> 
>
> Key: YARN-4698
> URL: https://issues.apache.org/jira/browse/YARN-4698
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: fairscheduler, resourcemanager
>Affects Versions: 2.5.1
>Reporter: Dmytro Kabakchei
>Priority: Minor
> Attachments: Example.log-cut, mitigating2.5.1.diff
>
>
> We noticed that on our cluster there are negative values in RM UI counters:
> - Containers Running: -19
> - Memory Used: -38GB
> - Vcores Used: -19
> After we checked the RM logs, we found that the following events had happened:
> - Assigned container: 67019 times
> - Released container: 67019 times
> - Invalid container released: 19 times
> Some related log records can be found in the "Example.log-cut" attachment.
> After some investigation we concluded that there is some kind of race 
> condition for a container that was scheduled for killing but completed 
> successfully before the kill.
> Also, there is a patch that possibly mitigates the effects of the issue but 
> doesn't solve the original problem (see mitigating2.5.1.diff).
> Unfortunately, the cluster and all other logs are lost, because the report 
> was made about a year ago but wasn't submitted properly. Also, we don't know 
> if the issue exists in other versions.
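
Purely as an illustration of how a duplicate release can drive a counter 
negative, and how a guard could absorb it (this is an assumption sketched for 
illustration, not what mitigating2.5.1.diff does):

{code}
// Illustrative only: ignore a second release of the same container id so the
// running-containers counter is decremented at most once per container.
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

class ReleaseGuardSketch {
  private final Set<String> releasedContainers = ConcurrentHashMap.newKeySet();
  private int runningContainers;

  synchronized void onContainerReleased(String containerId) {
    // add() returns false if the id was already recorded: a double release.
    if (!releasedContainers.add(containerId)) {
      return;   // ignore the duplicate instead of decrementing counters twice
    }
    runningContainers--;
  }
}
{code}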



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-4698) Negative value in RM UI counters due to double container release

2016-02-17 Thread Dmytro Kabakchei (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-4698?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dmytro Kabakchei updated YARN-4698:
---
Description: 
We noticed that on our cluster there are negative values in RM UI counters:
-Containers Running: -19
-Memory Used: -38GB
-Vcores Used: -19

After we checked RM logs, we found, that the following events had happened:
- Assigned container: 67019 times
- Released container: 67019 times
- Invalid container released: 19 times

Some log records related can be found within "Example.log-cut" attachment.

After some investigation we made a conclusion that there is some kind of race 
condition for container that was scheduled for killing, but was completed 
successfully before kill.
Also, there is a patch that possibly mitigates effects of the issue, but 
doesn't solve original problem (see mitigating2.5.1diff).
Unfortunately, the cluster and all other logs are lost, because the report was 
made about a year ago, but wasn't submitted properly. Also, we don't know if 
the issue exist in other versions.

  was:
We noticed that on our cluster there are negative values in RM UI counters:
-Containers Running: -19
-Memory Used: -38GB
-Vcores Used: -19

After we checked RM logs, we found, that the following events had happened:
- Assigned container: 67019 times
- Released container: 67019 times
- Invalid container released: 19 times

Some log records related can be found within "Example.log-cut" attachment.

After some investigation we made a conclusion that there is some kind of race 
condition for container that was scheduled for killing, but was completed 
successfully before kill.
Also, there is a patch that is possibly mitigates effects of the issue, but 
doesn't solve original problem (see mitigating2.5.1diff).
Unfortunately, the cluster and all other logs are lost, because the report was 
made about a year ago, but wasn't submitted properly. Also, we don't know if 
the issue exist in other versions.


> Negative value in RM UI counters due to double container release
> 
>
> Key: YARN-4698
> URL: https://issues.apache.org/jira/browse/YARN-4698
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: fairscheduler, resourcemanager
>Affects Versions: 2.5.1
>Reporter: Dmytro Kabakchei
>Priority: Minor
> Attachments: Example.log-cut, mitigating2.5.1.diff
>
>
> We noticed that on our cluster there are negative values in RM UI counters:
> -Containers Running: -19
> -Memory Used: -38GB
> -Vcores Used: -19
> After we checked RM logs, we found, that the following events had happened:
> - Assigned container: 67019 times
> - Released container: 67019 times
> - Invalid container released: 19 times
> Some log records related can be found within "Example.log-cut" attachment.
> After some investigation we made a conclusion that there is some kind of race 
> condition for container that was scheduled for killing, but was completed 
> successfully before kill.
> Also, there is a patch that possibly mitigates effects of the issue, but 
> doesn't solve original problem (see mitigating2.5.1diff).
> Unfortunately, the cluster and all other logs are lost, because the report 
> was made about a year ago, but wasn't submitted properly. Also, we don't know 
> if the issue exist in other versions.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-4698) Negative value in RM UI counters due to double container release

2016-02-17 Thread Dmytro Kabakchei (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4698?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15150625#comment-15150625
 ] 

Dmytro Kabakchei commented on YARN-4698:


Has anybody else run into this issue? Does anybody have any ideas about the 
cause and how to solve it?

> Negative value in RM UI counters due to double container release
> 
>
> Key: YARN-4698
> URL: https://issues.apache.org/jira/browse/YARN-4698
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: fairscheduler, resourcemanager
>Affects Versions: 2.5.1
>Reporter: Dmytro Kabakchei
>Priority: Minor
> Attachments: Example.log-cut, mitigating2.5.1.diff
>
>
> We noticed that on our cluster there are negative values in RM UI counters:
> -Containers Running: -19
> -Memory Used: -38GB
> -Vcores Used: -19
> After we checked RM logs, we found, that the following events had happened:
> - Assigned container: 67019 times
> - Released container: 67019 times
> - Invalid container released: 19 times
> Some log records related can be found within "Example.log-cut" attachment.
> After some investigation we made a conclusion that there is some kind of race 
> condition for container that was scheduled for killing, but was completed 
> successfully before kill.
> Also, there is a patch that is possibly mitigates effects of the issue, but 
> doesn't solve original problem (see mitigating2.5.1diff).
> Unfortunately, the cluster and all other logs are lost, because the report 
> was made about a year ago, but wasn't submitted properly. Also, we don't know 
> if the issue exist in other versions.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-4698) Negative value in RM UI counters due to double container release

2016-02-17 Thread Dmytro Kabakchei (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-4698?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dmytro Kabakchei updated YARN-4698:
---
Attachment: mitigating2.5.1.diff

> Negative value in RM UI counters due to double container release
> 
>
> Key: YARN-4698
> URL: https://issues.apache.org/jira/browse/YARN-4698
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: fairscheduler, resourcemanager
>Affects Versions: 2.5.1
>Reporter: Dmytro Kabakchei
>Priority: Minor
> Attachments: Example.log-cut, mitigating2.5.1.diff
>
>
> We noticed that on our cluster there are negative values in RM UI counters:
> -Containers Running: -19
> -Memory Used: -38GB
> -Vcores Used: -19
> After we checked RM logs, we found, that the following events had happened:
> - Assigned container: 67019 times
> - Released container: 67019 times
> - Invalid container released: 19 times
> Some log records related can be found within "Example.log-cut" attachment.
> After some investigation we made a conclusion that there is some kind of race 
> condition for container that was scheduled for killing, but was completed 
> successfully before kill.
> Also, there is a patch that is possibly mitigates effects of the issue, but 
> doesn't solve original problem (see mitigating2.5.1diff).
> Unfortunately, the cluster and all other logs are lost, because the report 
> was made about a year ago, but wasn't submitted properly. Also, we don't know 
> if the issue exist in other versions.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-4698) Negative value in RM UI counters due to double container release

2016-02-17 Thread Dmytro Kabakchei (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-4698?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dmytro Kabakchei updated YARN-4698:
---
Description: 
We noticed that on our cluster there are negative values in RM UI counters:
-Containers Running: -19
-Memory Used: -38GB
-Vcores Used: -19

After we checked RM logs, we found, that the following events had happened:
- Assigned container: 67019 times
- Released container: 67019 times
- Invalid container released: 19 times

Some log records related can be found within "Example.log-cut" attachment.

After some investigation we made a conclusion that there is some kind of race 
condition for container that was scheduled for killing, but was completed 
successfully before kill.
Also, there is a patch that is possibly mitigates effects of the issue, but 
doesn't solve original problem (see mitigating2.5.1diff).
Unfortunately, the cluster and all other logs are lost, because the report was 
made about a year ago, but wasn't submitted properly. Also, we don't know if 
the issue exist in other versions.

  was:
We noticed that on our cluster there are negative values in RM UI counters:
-Containers Running: -19
-Memory Used: -38GB
-Vcores Used: -19

After we checked RM logs, we found, that the following events had happened:
- Assigned container: 67019 times
- Released container: 67019 times
- Invalid container released: 19 times

Some log records related can be found within "Example.log-cut" attachment.

After some investigation we made a conclusion that there is some kind of race 
condition for container that was scheduled for killing, but was completed 
successfully before kill.
Also, there is a patch that is possibly mitigates effects of the issue, but 
doesn't solve original problem (see mitigating01.patch).
Unfortunately, the cluster and all other logs are lost, because the report was 
made about a year ago, but wasn't submitted properly. Also, we don't know if 
the issue exist in other versions.


> Negative value in RM UI counters due to double container release
> 
>
> Key: YARN-4698
> URL: https://issues.apache.org/jira/browse/YARN-4698
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: fairscheduler, resourcemanager
>Affects Versions: 2.5.1
>Reporter: Dmytro Kabakchei
>Priority: Minor
> Attachments: Example.log-cut, mitigating2.5.1.diff
>
>
> We noticed that on our cluster there are negative values in RM UI counters:
> -Containers Running: -19
> -Memory Used: -38GB
> -Vcores Used: -19
> After we checked RM logs, we found, that the following events had happened:
> - Assigned container: 67019 times
> - Released container: 67019 times
> - Invalid container released: 19 times
> Some log records related can be found within "Example.log-cut" attachment.
> After some investigation we made a conclusion that there is some kind of race 
> condition for container that was scheduled for killing, but was completed 
> successfully before kill.
> Also, there is a patch that is possibly mitigates effects of the issue, but 
> doesn't solve original problem (see mitigating2.5.1diff).
> Unfortunately, the cluster and all other logs are lost, because the report 
> was made about a year ago, but wasn't submitted properly. Also, we don't know 
> if the issue exist in other versions.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-4698) Negative value in RM UI counters due to double container release

2016-02-17 Thread Dmytro Kabakchei (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-4698?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dmytro Kabakchei updated YARN-4698:
---
Attachment: Example.log-cut

> Negative value in RM UI counters due to double container release
> 
>
> Key: YARN-4698
> URL: https://issues.apache.org/jira/browse/YARN-4698
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: fairscheduler, resourcemanager
>Affects Versions: 2.5.1
>Reporter: Dmytro Kabakchei
>Priority: Minor
> Attachments: Example.log-cut
>
>
> We noticed that on our cluster there are negative values in RM UI counters:
> -Containers Running: -19
> -Memory Used: -38GB
> -Vcores Used: -19
> After we checked RM logs, we found, that the following events had happened:
> - Assigned container: 67019 times
> - Released container: 67019 times
> - Invalid container released: 19 times
> Some log records related can be found within "Example.log-cut" attachment.
> After some investigation we made a conclusion that there is some kind of race 
> condition for container that was scheduled for killing, but was completed 
> successfully before kill.
> Also, there is a patch that is possibly mitigates effects of the issue, but 
> doesn't solve original problem (see mitigating01.patch).
> Unfortunately, the cluster and all other logs are lost, because the report 
> was made about a year ago, but wasn't submitted properly. Also, we don't know 
> if the issue exist in other versions.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-4698) Negative value in RM UI counters due to double container release

2016-02-17 Thread Dmytro Kabakchei (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-4698?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dmytro Kabakchei updated YARN-4698:
---
Description: 
We noticed that on our cluster there are negative values in RM UI counters:
-Containers Running: -19
-Memory Used: -38GB
-Vcores Used: -19

After we checked RM logs, we found, that the following events had happened:
- Assigned container: 67019 times
- Released container: 67019 times
- Invalid container released: 19 times

Some log records related can be found within "Example.log-cut" attachment.

After some investigation we made a conclusion that there is some kind of race 
condition for container that was scheduled for killing, but was completed 
successfully before kill.
Also, there is a patch that is possibly mitigates effects of the issue, but 
doesn't solve original problem (see mitigating01.patch).
Unfortunately, the cluster and all other logs are lost, because the report was 
made about a year ago, but wasn't submitted properly. Also, we don't know if 
the issue exist in other versions.

  was:
We noticed that on our cluster there are negative values in RM UI counters:
-Containers Running: -19
-Memory Used: -38GB
-Vcores Used: -19

After we checked RM logs, we found, that the following events had happened:
- Assigned container: 67019 times
- Released container: 67019 times
- Invalid container released: 19 times


> Negative value in RM UI counters due to double container release
> 
>
> Key: YARN-4698
> URL: https://issues.apache.org/jira/browse/YARN-4698
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: fairscheduler, resourcemanager
>Affects Versions: 2.5.1
>Reporter: Dmytro Kabakchei
>Priority: Minor
> Attachments: Example.log-cut
>
>
> We noticed that on our cluster there are negative values in RM UI counters:
> -Containers Running: -19
> -Memory Used: -38GB
> -Vcores Used: -19
> After we checked RM logs, we found, that the following events had happened:
> - Assigned container: 67019 times
> - Released container: 67019 times
> - Invalid container released: 19 times
> Some log records related can be found within "Example.log-cut" attachment.
> After some investigation we made a conclusion that there is some kind of race 
> condition for container that was scheduled for killing, but was completed 
> successfully before kill.
> Also, there is a patch that is possibly mitigates effects of the issue, but 
> doesn't solve original problem (see mitigating01.patch).
> Unfortunately, the cluster and all other logs are lost, because the report 
> was made about a year ago, but wasn't submitted properly. Also, we don't know 
> if the issue exist in other versions.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-4698) Negative value in RM UI counters due to double container release

2016-02-17 Thread Dmytro Kabakchei (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-4698?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dmytro Kabakchei updated YARN-4698:
---
Description: 
We noticed that on our cluster there are negative values in RM UI counters:
-Containers Running: -19
-Memory Used: -38GB
-Vcores Used: -19

After we checked RM logs, we found, that the following events had happened:
- Assigned container: 67019 times
- Released container: 67019 times
- Invalid container released: 19 times

  was:
We noticed that on our cluster there are negative values in RM UI counters:
-Containers Running: -19
-Memory Used: -38GB
-Vcores Used: -19

After we checked RM logs, we found, that the following events had happened:
I checked their resource manager logs.
These events happened.
Assigned container: 67019 times
Released container: 67019 times
Invalid container released: 19 times


> Negative value in RM UI counters due to double container release
> 
>
> Key: YARN-4698
> URL: https://issues.apache.org/jira/browse/YARN-4698
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: fairscheduler, resourcemanager
>Affects Versions: 2.5.1
>Reporter: Dmytro Kabakchei
>Priority: Minor
>
> We noticed that on our cluster there are negative values in RM UI counters:
> -Containers Running: -19
> -Memory Used: -38GB
> -Vcores Used: -19
> After we checked RM logs, we found, that the following events had happened:
> - Assigned container: 67019 times
> - Released container: 67019 times
> - Invalid container released: 19 times



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (YARN-4698) Negative value in RM UI counters due to double container release

2016-02-17 Thread Dmytro Kabakchei (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-4698?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dmytro Kabakchei updated YARN-4698:
---
Description: 
We noticed that on our cluster there are negative values in RM UI counters:
-Containers Running: -19
-Memory Used: -38GB
-Vcores Used: -19

After we checked RM logs, we found, that the following events had happened:
I checked their resource manager logs.
These events happened.
Assigned container: 67019 times
Released container: 67019 times
Invalid container released: 19 times

  was:
We noticed that on our cluster there are negative values in RM UI counters:
-Containers Running: -19
-Memory Used: -38GB
-Vcores Used: -19

After we checked RM logs, we found, that the following events had happened:


> Negative value in RM UI counters due to double container release
> 
>
> Key: YARN-4698
> URL: https://issues.apache.org/jira/browse/YARN-4698
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: fairscheduler, resourcemanager
>Affects Versions: 2.5.1
>Reporter: Dmytro Kabakchei
>Priority: Minor
>
> We noticed that on our cluster there are negative values in RM UI counters:
> -Containers Running: -19
> -Memory Used: -38GB
> -Vcores Used: -19
> After we checked RM logs, we found, that the following events had happened:
> I checked their resource manager logs.
> These events happened.
> Assigned container: 67019 times
> Released container: 67019 times
> Invalid container released: 19 times



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-4694) Document ATS v1.5

2016-02-17 Thread Steve Loughran (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4694?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15150577#comment-15150577
 ] 

Steve Loughran commented on YARN-4694:
--

Things I'd like to see in here

* overall concepts
* how the FS writer works (including how it could fail)
* how updates propagate —and what your app needs to do
* why is there a summary DB as well as the data, how they integrate
* what the plugins are, why, how to use and deploy.
* how to test
* whether there are any security implications of the new model

> Document ATS v1.5
> -
>
> Key: YARN-4694
> URL: https://issues.apache.org/jira/browse/YARN-4694
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: timelineserver
>Affects Versions: 2.8.0
>Reporter: Xuan Gong
>Assignee: Xuan Gong
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (YARN-4698) Negative value in RM UI counters due to double container release

2016-02-17 Thread Dmytro Kabakchei (JIRA)
Dmytro Kabakchei created YARN-4698:
--

 Summary: Negative value in RM UI counters due to double container 
release
 Key: YARN-4698
 URL: https://issues.apache.org/jira/browse/YARN-4698
 Project: Hadoop YARN
  Issue Type: Bug
  Components: fairscheduler, resourcemanager
Affects Versions: 2.5.1
Reporter: Dmytro Kabakchei
Priority: Minor


We noticed that on our cluster there are negative values in RM UI counters:
-Containers Running: -19
-Memory Used: -38GB
-Vcores Used: -19

After we checked RM logs, we found, that the following events had happened:



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-4696) EntityGroupFSTimelineStore to work in the absence of an RM

2016-02-17 Thread Steve Loughran (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4696?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15150557#comment-15150557
 ] 

Steve Loughran commented on YARN-4696:
--

It is for testing: the previous ATS ran happily standalone, whereas v1.5 hangs 
looking for an RM at 0.0.0.0. Some form of mock RM could do it, but 

# line 46; I was thinking about moving all the YarnConfiguration imports to 
static ones ... there were a lot of them. But if not, then not.
# I'll look at findbugs

Mocking is a thought; it might be easiest to isolate the construction/use of 
the yarn client so it's straightforward to mock/subclass.
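
A sketch of that isolation idea, assuming a hypothetical store class; the 
class and method names are illustrative and not from EntityGroupFSTimelineStore:

{code}
// Sketch of isolating yarn client construction behind a single factory method
// that a test subclass can override with a stub, so no live RM is needed.
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.yarn.client.api.YarnClient;

public class StoreWithPluggableClientSketch {
  private YarnClient yarnClient;

  // Single choke point for construction; tests override this to return a stub.
  protected YarnClient createAndStartYarnClient(Configuration conf) {
    YarnClient client = YarnClient.createYarnClient();
    client.init(conf);
    client.start();
    return client;
  }

  public void start(Configuration conf) {
    this.yarnClient = createAndStartYarnClient(conf);
  }
}
{code}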

> EntityGroupFSTimelineStore to work in the absence of an RM
> --
>
> Key: YARN-4696
> URL: https://issues.apache.org/jira/browse/YARN-4696
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: timelineserver
>Affects Versions: 2.8.0
>Reporter: Steve Loughran
> Attachments: YARN-4696-001.patch
>
>
> {{EntityGroupFSTimelineStore}} now depends on an RM being up and running, with 
> the configuration pointing to it. This is a new change, and it impacts testing, 
> where you have historically been able to test without an RM running.
> The sole purpose of the probe is to automatically determine whether an app is 
> running; it falls back to "unknown" if not. If the RM connection were 
> optional, the "unknown" codepath could be called directly, relying on the age 
> of the file as a metric of completion.
> Options
> # add a flag to disable RM connect
> # skip automatically if RM not defined/set to 0.0.0.0
> # disable retries on yarn client IPC; if it fails, tag app as unknown.
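
A hedged sketch of option 2 (skip the probe when the RM address is unset or 
left at the wildcard default); the class and method names are illustrative:

{code}
// Illustrative guard, not the patch: treat a wildcard RM address as "no RM
// configured" and fall back to the "unknown" codepath instead of probing.
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.yarn.conf.YarnConfiguration;

public class RmProbeGuardSketch {
  static boolean shouldProbeRm(Configuration conf) {
    String rmAddress = conf.getTrimmed(
        YarnConfiguration.RM_ADDRESS, YarnConfiguration.DEFAULT_RM_ADDRESS);
    // The default RM address binds to the 0.0.0.0 wildcard; treat that as
    // "no RM configured".
    return rmAddress != null && !rmAddress.startsWith("0.0.0.0");
  }
}
{code}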



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-4697) NM aggregation thread pool is not bound by limits

2016-02-17 Thread Naganarasimha G R (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4697?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15150523#comment-15150523
 ] 

Naganarasimha G R commented on YARN-4697:
-

hi [~haibochen],
Thanks for working on this patch, and yes, it would be better to limit the 
number of threads in the executor service. But a few nits/queries on the patch:
* If *YarnConfiguration* is modified, then *yarn-default.xml* should also be 
modified. But do we need to keep it configurable in the first place? IMO just 
having a fixed value like 50 should be safe.
* *threadPool* can have default access instead of public, so that it is only 
accessible to test cases.
* I didn't understand the need for the *Semaphore*, since in the *Runnable* we 
release in the finally block immediately after *semaphore.acquire()*. And even 
otherwise, I thought we could submit multiple runnables (say 5/10) with a short 
sleep and check that the number of live threads whose name contains 
LogAggregationService is only 1, right?
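
A small sketch of the thread-count check suggested in the last point; the 
"LogAggregationService" name fragment is assumed to match the pool's thread 
factory and is not taken from the patch:

{code}
// Illustrative helper: count live threads whose name contains a fragment.
public class ThreadCountCheckSketch {
  static long countLiveThreadsNamed(String fragment) {
    return Thread.getAllStackTraces().keySet().stream()
        .filter(Thread::isAlive)
        .filter(t -> t.getName().contains(fragment))
        .count();
  }

  // Usage in a test: submit several runnables, sleep briefly, then assert
  // countLiveThreadsNamed("LogAggregationService") == 1.
}
{code}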

> NM aggregation thread pool is not bound by limits
> -
>
> Key: YARN-4697
> URL: https://issues.apache.org/jira/browse/YARN-4697
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: nodemanager
>Reporter: Haibo Chen
>Assignee: Haibo Chen
> Attachments: yarn4697.001.patch
>
>
> In LogAggregationService.java we create a thread pool to upload logs from 
> the nodemanager to HDFS if log aggregation is turned on. This is a cached 
> thread pool, which, based on the javadoc, is an unlimited pool of threads.
> If we have had a problem with log aggregation, this could cause a problem on 
> restart. The number of threads created at that point could be huge and would 
> put a large load on the NameNode, and in the worst case could even bring it 
> down due to file descriptor issues.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-4697) NM aggregation thread pool is not bound by limits

2016-02-17 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4697?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15150459#comment-15150459
 ] 

Hadoop QA commented on YARN-4697:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 15s 
{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s 
{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 
0s {color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 25s 
{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 6m 
42s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 47s 
{color} | {color:green} trunk passed with JDK v1.8.0_72 {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 2m 10s 
{color} | {color:green} trunk passed with JDK v1.7.0_95 {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 
37s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 58s 
{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
26s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 2s 
{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 57s 
{color} | {color:green} trunk passed with JDK v1.8.0_72 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 3m 20s 
{color} | {color:green} trunk passed with JDK v1.7.0_95 {color} |
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 11s 
{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 
50s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 43s 
{color} | {color:green} the patch passed with JDK v1.8.0_72 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 1m 43s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 2m 9s 
{color} | {color:green} the patch passed with JDK v1.7.0_95 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 2m 9s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 
34s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 54s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
23s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 
0s {color} | {color:green} Patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 
23s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 52s 
{color} | {color:green} the patch passed with JDK v1.8.0_72 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 3m 11s 
{color} | {color:green} the patch passed with JDK v1.7.0_95 {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 0m 21s {color} 
| {color:red} hadoop-yarn-api in the patch failed with JDK v1.8.0_72. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 8m 48s 
{color} | {color:green} hadoop-yarn-server-nodemanager in the patch passed with 
JDK v1.8.0_72. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 0m 23s {color} 
| {color:red} hadoop-yarn-api in the patch failed with JDK v1.7.0_95. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 9m 19s 
{color} | {color:green} hadoop-yarn-server-nodemanager in the patch passed with 
JDK v1.7.0_95. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 
20s {color} | {color:green} Patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 53m 18s {color} 
| {color:black} {color} |
\\
\\
|| Reason || Tests ||
| JDK v1.8.0_72 

[jira] [Commented] (YARN-4654) Yarn node label CLI should parse "=" correctly when trying to remove all labels on a node

2016-02-17 Thread Naganarasimha G R (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4654?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15150424#comment-15150424
 ] 

Naganarasimha G R commented on YARN-4654:
-

[~wangda], 
Is the latest patch fine ?

> Yarn node label CLI should parse "=" correctly when trying to remove all 
> labels on a node
> -
>
> Key: YARN-4654
> URL: https://issues.apache.org/jira/browse/YARN-4654
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Wangda Tan
>Assignee: Naganarasimha G R
> Attachments: YARN-4654.v1.001.patch, YARN-4654.v1.002.patch, 
> YARN-4654.v1.003.patch
>
>
> Currently, when adding labels to nodes, a user can run:
> {{yarn rmadmin -replaceLabelsOnNode "host1=x host2=y"}}
> However, when removing labels from a node, the user has to run:
> {{yarn rmadmin -replaceLabelsOnNode "host1 host2"}}
> Instead of:
> {{yarn rmadmin -replaceLabelsOnNode "host1= host2="}}
> We should handle both the "=" present and "=" absent cases when removing 
> labels from a node.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-4547) LeafQueue#getApplications() is read-only interface, but it provides reference to caller

2016-02-17 Thread Rohith Sharma K S (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-4547?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15150103#comment-15150103
 ] 

Rohith Sharma K S commented on YARN-4547:
-

I am a bit confused about when to use Duplicate, Done, Implemented, or any of 
the other less commonly used states. Since YARN-4617 is a different issue and 
this fix was incorporated as part of the YARN-4617 patch, I chose Done. Any 
suggestions are welcome.

> LeafQueue#getApplications() is read-only interface, but it provides reference 
> to caller
> ---
>
> Key: YARN-4547
> URL: https://issues.apache.org/jira/browse/YARN-4547
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: capacity scheduler
>Reporter: Rohith Sharma K S
>Assignee: Rohith Sharma K S
>
> The API below is a read-only interface, but it returns a reference to the 
> caller. This allows the caller to modify the orderingPolicy entities. If a 
> reference to the ordering policy is required, the caller can use 
> {{LeafQueue#getOrderingPolicy()#getSchedulableEntities()}}.
> The returned object should be a clone of 
> orderingPolicy.getSchedulableEntities().
> {code}
>   /**
>* Obtain (read-only) collection of active applications.
>*/
>   public Collection getApplications() {
> return orderingPolicy.getSchedulableEntities();
>   }
> {code}
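
A hedged sketch of the suggested fix, returning an immutable snapshot instead 
of the live collection (names are illustrative, not the CapacityScheduler 
source):

{code}
// Illustrative only: callers get an unmodifiable copy, so they cannot mutate
// the collection backing the ordering policy.
import java.util.ArrayList;
import java.util.Collection;
import java.util.Collections;

class ReadOnlyApplicationsSketch<T> {
  private final Collection<T> schedulableEntities = new ArrayList<>();

  /** Read-only snapshot: callers cannot modify the queue's internal state. */
  public Collection<T> getApplications() {
    return Collections.unmodifiableCollection(
        new ArrayList<>(schedulableEntities));
  }
}
{code}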



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)