[jira] [Commented] (YARN-7681) Scheduler should double-check placement constraint before actual allocation is made

2018-01-07 Thread Arun Suresh (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-7681?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16315524#comment-16315524
 ] 

Arun Suresh commented on YARN-7681:
---

[~cheersyang], when the processor is turned on, the SchedulingRequest does not 
flow through the {{AppPlacementAllocator}}; the code paths are largely separate.
So, in that case, we need to perform the check in the 
[attemptAllocationOnNode|https://github.com/apache/hadoop/blob/YARN-6592/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/CapacityScheduler.java#L2504-L2519]
 method.
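
For illustration, a minimal sketch of what that double-check inside {{attemptAllocationOnNode}} could look like (a sketch only; the helper {{PlacementConstraintsUtil.canSatisfyConstraints}} and the field names below are assumptions, not the committed implementation):
{code}
// Hedged sketch: re-validate the placement constraint right before committing the
// allocation, since allocation tags may have changed after the placement decision.
// canSatisfyConstraints(), placementConstraintManager and allocationTagsManager are assumed names.
boolean attemptAllocationOnNode(SchedulerApplicationAttempt appAttempt,
    SchedulingRequest schedulingRequest, SchedulerNode schedulerNode) {
  try {
    if (!PlacementConstraintsUtil.canSatisfyConstraints(
        appAttempt.getApplicationId(), schedulingRequest, schedulerNode,
        placementConstraintManager, allocationTagsManager)) {
      LOG.info("Placement constraint no longer satisfied on node "
          + schedulerNode.getNodeID() + "; rejecting this allocation.");
      return false;
    }
  } catch (InvalidAllocationTagsQueryException e) {
    LOG.warn("Could not re-check placement constraint", e);
    return false;
  }
  // ... proceed with the actual container allocation ...
  return true;
}
{code}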

> Scheduler should double-check placement constraint before actual allocation 
> is made
> ---
>
> Key: YARN-7681
> URL: https://issues.apache.org/jira/browse/YARN-7681
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: RM, scheduler
>Reporter: Weiwei Yang
>Assignee: Weiwei Yang
>
> This JIRA was created based on the discussions under YARN-7612; see the comments 
> after [this 
> comment|https://issues.apache.org/jira/browse/YARN-7612?focusedCommentId=16303051&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-16303051].
>  AllocationTagsManager maintains tag info that helps make placement 
> decisions during the placement phase; however, tags change along with the 
> container's lifecycle, so it is possible that a placement violates the 
> constraints by the scheduling phase. Propose adding an extra check in the 
> scheduler to make sure the constraints are still satisfied when the actual 
> allocation is made.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Created] (YARN-7710) http://ip:8088/cluster show different ID with same name

2018-01-07 Thread jimmy (JIRA)
jimmy created YARN-7710:
---

 Summary: http://ip:8088/cluster show different ID with same name  
 Key: YARN-7710
 URL: https://issues.apache.org/jira/browse/YARN-7710
 Project: Hadoop YARN
  Issue Type: Bug
  Components: applications
Affects Versions: 2.7.3
 Environment: hadoop2.7.3 
jdk 1.8
Reporter: jimmy
Priority: Minor


1. Create five threads.
2. Submit five StreamJobs with different names.
3. Visit http://ip:8088; sometimes the same name is shown for different IDs.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-7710) http://ip:8088/cluster show different ID with same name

2018-01-07 Thread jimmy (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-7710?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16315573#comment-16315573
 ] 

jimmy commented on YARN-7710:
-

Every thread uses the following code to submit an application:
StreamJob streamJob = new StreamJob();
streamJob.setConf(jobConf);
jobReturnCode = streamJob.run(args);   // args is the String[] of streaming arguments
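
If the five threads share a single {{jobConf}} instance, the name set by one thread can be picked up by another thread's submission. A minimal sketch that gives each thread its own configuration is shown below (the shared-JobConf cause and names such as {{baseConf}} are assumptions, not something confirmed in this JIRA):
{code}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.mapred.JobConf;
import org.apache.hadoop.streaming.StreamJob;

// Hedged sketch: per-thread JobConf copies so concurrent submissions cannot share a job name.
public class MultiStreamJobSubmitter {
  public static void main(String[] unused) {
    Configuration baseConf = new Configuration();      // stand-in for the existing cluster config
    for (int i = 0; i < 5; i++) {
      final String jobName = "stream-job-" + i;
      new Thread(() -> {
        JobConf conf = new JobConf(baseConf);          // per-thread copy, not a shared instance
        conf.setJobName(jobName);
        StreamJob streamJob = new StreamJob();
        streamJob.setConf(conf);
        try {
          int rc = streamJob.run(new String[] {/* streaming arguments for this job */});
          System.out.println(jobName + " exited with " + rc);
        } catch (Exception e) {
          e.printStackTrace();
        }
      }).start();
    }
  }
}
{code}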


> http://ip:8088/cluster show different ID with same name  
> -
>
> Key: YARN-7710
> URL: https://issues.apache.org/jira/browse/YARN-7710
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: applications
>Affects Versions: 2.7.3
> Environment: hadoop2.7.3 
> jdk 1.8
>Reporter: jimmy
>Priority: Minor
>
> 1. Create five threads.
> 2. Submit five StreamJobs with different names.
> 3. Visit http://ip:8088; sometimes the same name is shown for different IDs.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Comment Edited] (YARN-7710) http://ip:8088/cluster show different ID with same name

2018-01-07 Thread jimmy (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-7710?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16315573#comment-16315573
 ] 

jimmy edited comment on YARN-7710 at 1/8/18 1:45 AM:
-

Every thread uses the following code to submit an application:
StreamJob streamJob = new StreamJob();
streamJob.setConf(jobConf);
jobReturnCode = streamJob.run(args);   // args is the String[] of streaming arguments



was (Author: zjilvufe):
erery thread use  follow codes to submit application.
StreamJob streamJob = new StreamJob();
streamJob.setConf(jobConf);
jobReturnCode = streamJob.run(String[] args);


> http://ip:8088/cluster show different ID with same name  
> -
>
> Key: YARN-7710
> URL: https://issues.apache.org/jira/browse/YARN-7710
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: applications
>Affects Versions: 2.7.3
> Environment: hadoop2.7.3 
> jdk 1.8
>Reporter: jimmy
>Priority: Minor
>
> 1. Create five threads.
> 2. Submit five StreamJobs with different names.
> 3. Visit http://ip:8088; sometimes the same name is shown for different IDs.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-6486) FairScheduler: Deprecate continuous scheduling

2018-01-07 Thread Wilfred Spiegelenburg (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-6486?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wilfred Spiegelenburg updated YARN-6486:

Attachment: YARN-6486.004.patch

Updated the patch based on the comments.

> FairScheduler: Deprecate continuous scheduling
> --
>
> Key: YARN-6486
> URL: https://issues.apache.org/jira/browse/YARN-6486
> Project: Hadoop YARN
>  Issue Type: Task
>  Components: fairscheduler
>Affects Versions: 2.9.0
>Reporter: Wilfred Spiegelenburg
>Assignee: Wilfred Spiegelenburg
> Attachments: YARN-6486.001.patch, YARN-6486.002.patch, 
> YARN-6486.003.patch, YARN-6486.004.patch
>
>
> Mark continuous scheduling as deprecated in 2.9 and remove the code in 3.0. 
> Removing continuous scheduling from the code will be tracked in a separate JIRA.
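
For context, the deprecation half of such a change typically amounts to a warning like the sketch below when the feature is still switched on (the constant names and exact wording are assumptions, not the attached patch):
{code}
// Hedged sketch: log a deprecation warning if continuous scheduling is still enabled.
// FairSchedulerConfiguration.CONTINUOUS_SCHEDULING_ENABLED is assumed to be the config key constant.
if (conf.getBoolean(FairSchedulerConfiguration.CONTINUOUS_SCHEDULING_ENABLED,
    FairSchedulerConfiguration.DEFAULT_CONTINUOUS_SCHEDULING_ENABLED)) {
  LOG.warn("Continuous scheduling is deprecated and will be removed in a later"
      + " release; rely on node-heartbeat-driven scheduling instead.");
}
{code}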



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-7689) TestRMContainerAllocator fails after YARN-6124

2018-01-07 Thread Wilfred Spiegelenburg (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-7689?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16315577#comment-16315577
 ] 

Wilfred Spiegelenburg commented on YARN-7689:
-

The mvninstall failure is not related to the patch, as far as I can trace.

> TestRMContainerAllocator fails after YARN-6124
> --
>
> Key: YARN-7689
> URL: https://issues.apache.org/jira/browse/YARN-7689
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: scheduler
>Affects Versions: 3.1.0
>Reporter: Wilfred Spiegelenburg
>Assignee: Wilfred Spiegelenburg
> Attachments: YARN-7689.001.patch, YARN-7689.002.patch
>
>
> After the change that was made for YARN-6124 multiple tests in the 
> TestRMContainerAllocator from MapReduce fail with the following NPE:
> {code}
> java.lang.NullPointerException
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.AbstractYarnScheduler.reinitialize(AbstractYarnScheduler.java:1437)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.fifo.FifoScheduler.reinitialize(FifoScheduler.java:320)
>   at 
> org.apache.hadoop.mapreduce.v2.app.rm.TestRMContainerAllocator$ExcessReduceContainerAllocateScheduler.(TestRMContainerAllocator.java:1808)
>   at 
> org.apache.hadoop.mapreduce.v2.app.rm.TestRMContainerAllocator$MyResourceManager2.createScheduler(TestRMContainerAllocator.java:970)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$RMActiveServices.serviceInit(ResourceManager.java:659)
>   at 
> org.apache.hadoop.service.AbstractService.init(AbstractService.java:164)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.createAndInitActiveServices(ResourceManager.java:1133)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.serviceInit(ResourceManager.java:316)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.MockRM.serviceInit(MockRM.java:1334)
>   at 
> org.apache.hadoop.service.AbstractService.init(AbstractService.java:164)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.MockRM.(MockRM.java:162)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.MockRM.(MockRM.java:141)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.MockRM.(MockRM.java:137)
>   at 
> org.apache.hadoop.mapreduce.v2.app.rm.TestRMContainerAllocator$MyResourceManager.(TestRMContainerAllocator.java:928)
> {code}
> In the test we just call reinitialize on a scheduler and never call init.
> The stop of the service is guarded, and so should be the start and the re-init.
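
For illustration, guarding the re-init the same way the stop is guarded could look roughly like this (the field {{schedulingMonitorManager}}, assumed to be what YARN-6124 introduced, and the call it makes are assumptions, not the attached patch):
{code}
// Hedged sketch: make reinitialize() tolerate a scheduler that was never init()-ed,
// mirroring the existing guard in stop(). Field and method names are assumptions.
@Override
public void reinitialize(Configuration conf, RMContext rmContext) throws IOException {
  if (schedulingMonitorManager != null) {
    schedulingMonitorManager.reinitialize(rmContext, conf);
  }
}
{code}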



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Updated] (YARN-4227) FairScheduler: RM quits processing expired container from a removed node

2018-01-07 Thread Wilfred Spiegelenburg (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-4227?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wilfred Spiegelenburg updated YARN-4227:

Attachment: YARN-4227.006.patch

Updated the patch with the isDebugEnabled checks.

> FairScheduler: RM quits processing expired container from a removed node
> 
>
> Key: YARN-4227
> URL: https://issues.apache.org/jira/browse/YARN-4227
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: fairscheduler
>Affects Versions: 2.3.0, 2.5.0, 2.7.1
>Reporter: Wilfred Spiegelenburg
>Assignee: Wilfred Spiegelenburg
>Priority: Critical
> Attachments: YARN-4227.006.patch, YARN-4227.2.patch, 
> YARN-4227.3.patch, YARN-4227.4.patch, YARN-4227.5.patch, YARN-4227.patch
>
>
> Under some circumstances the node is removed before an expired container 
> event is processed, causing the RM to exit:
> {code}
> 2015-10-04 21:14:01,063 INFO 
> org.apache.hadoop.yarn.util.AbstractLivelinessMonitor: 
> Expired:container_1436927988321_1307950_01_12 Timed out after 600 secs
> 2015-10-04 21:14:01,063 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.rmcontainer.RMContainerImpl: 
> container_1436927988321_1307950_01_12 Container Transitioned from 
> ACQUIRED to EXPIRED
> 2015-10-04 21:14:01,063 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FSSchedulerApp: 
> Completed container: container_1436927988321_1307950_01_12 in state: 
> EXPIRED event:EXPIRE
> 2015-10-04 21:14:01,063 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.RMAuditLogger: USER=system_op   
>OPERATION=AM Released Container TARGET=SchedulerApp RESULT=SUCCESS  
> APPID=application_1436927988321_1307950 
> CONTAINERID=container_1436927988321_1307950_01_12
> 2015-10-04 21:14:01,063 FATAL 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Error in 
> handling event type CONTAINER_EXPIRED to the scheduler
> java.lang.NullPointerException
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler.completedContainer(FairScheduler.java:849)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler.handle(FairScheduler.java:1273)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler.handle(FairScheduler.java:122)
>   at 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$SchedulerEventDispatcher$EventProcessor.run(ResourceManager.java:585)
>   at java.lang.Thread.run(Thread.java:745)
> 2015-10-04 21:14:01,063 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Exiting, bbye..
> {code}
> The stack trace is from 2.3.0 but the same issue has been observed in 2.5.0 
> and 2.6.0 by different customers.
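
For illustration, the kind of guard at the top of the (assumed) {{completedContainer}} handling that keeps the dispatcher alive in this situation might look like the sketch below (method names such as {{getFSSchedulerNode}} are assumptions, not the attached patch):
{code}
// Hedged sketch: skip a CONTAINER_EXPIRED completion whose node has already been removed,
// instead of letting the NullPointerException terminate the ResourceManager.
FSSchedulerNode node = getFSSchedulerNode(rmContainer.getContainer().getNodeId());
if (node == null) {
  LOG.info("Container " + rmContainer.getContainerId()
      + " completed on an already-removed node; skipping release.");
  return;
}
// ... existing completedContainer() handling ...
{code}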



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-7699) queueUsagePercentage is coming as INF for getApp REST api call

2018-01-07 Thread Rohith Sharma K S (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-7699?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16315679#comment-16315679
 ] 

Rohith Sharma K S commented on YARN-7699:
-

+1 lgtm

> queueUsagePercentage is coming as INF for getApp REST api call
> --
>
> Key: YARN-7699
> URL: https://issues.apache.org/jira/browse/YARN-7699
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: webapp
>Affects Versions: 2.9.0, 3.0.0
>Reporter: Sunil G
>Assignee: Sunil G
> Attachments: YARN-7699.001.patch
>
>
> REST o/p for getApp call where queueUsagePercentage is INF.
> Credit to [~charanh]
> {noformat}
> {
> "app": {
> "id": "application_1514964945154_0001",
> "user": "hrt_qa",
> "name": "DistributedShell",
> "queue": "a1",
> "state": "RUNNING",
> "finalStatus": "UNDEFINED",
> "progress": 90.0,
> "trackingUI": "ApplicationMaster",
> "trackingUrl": 
> "http://ctr-e136-1513029738776-28101-01-04.hwx.site:8088/proxy/application_1514964945154_0001/";,
> "diagnostics": "",
> "clusterId": 1514964945154,
> "applicationType": "YARN",
> "applicationTags": "",
> "priority": 0,
> "startedTime": 1514965015754,
> "finishedTime": 0,
> "elapsedTime": 418844,
> "amContainerLogs": 
> "http://ctr-e136-1513029738776-28101-01-07.hwx.site:8042/node/containerlogs/container_e43_1514964945154_0001_01_01/hrt_qa";,
> "amHostHttpAddress": 
> "ctr-e136-1513029738776-28101-01-07.hwx.site:8042",
> "allocatedMB": 600,
> "allocatedVCores": 1,
> "reservedMB": 0,
> "reservedVCores": 0,
> "runningContainers": 1,
> "memorySeconds": 1227070,
> "vcoreSeconds": 2041,
> "queueUsagePercentage": INF,
> "clusterUsagePercentage": 3.2552083,
> "resourceSecondsMap": {
> "entry": {
> "key": "memory-mb",
> "value": "1227070"
> },
> "entry": {
> "key": "vcores",
> "value": "2041"
> }
> },
> "preemptedResourceMB": 0,
> "preemptedResourceVCores": 0,
> "numNonAMContainerPreempted": 0,
> "numAMContainerPreempted": 0,
> "preemptedMemorySeconds": 0,
> "preemptedVcoreSeconds": 0,
> "preemptedResourceSecondsMap": {
> 
> },
> "resourceRequests": [
> {
> "priority": 0,
> "resourceName": "*",
> "capability": {
> "memory": 600,
> "vCores": 1
> },
> "numContainers": 1,
> "relaxLocality": true,
> "nodeLabelExpression": "x",
> "executionTypeRequest": {
> "executionType": "GUARANTEED",
> "enforceExecutionType": false
> },
> "enforceExecutionType": false
> }
> ],
> "logAggregationStatus": "NOT_START",
> "unmanagedApplication": false,
> "appNodeLabelExpression": "x",
> "amNodeLabelExpression": "x",
> "resourceInfo": {
> "resourceUsagesByPartition": [
> {
> "partitionName": "",
> "used": {
> "memory": 0,
> "vCores": 0
> },
> "reserved": {
> "memory": 0,
> "vCores": 0
> },
> "pending": {
> "memory": 0,
> "vCores": 0
> },
> "amUsed": {
> "memory": 0,
> "vCores": 0
> },
> "amLimit": {
> "memory": 0,
> "vCores": 0
> }
> },
> {
> "partitionName": "x",
> "used": {
> "memory": 0,
> "vCores": 0
> },
> "reserved": {
> "memory": 0,
> "vCores": 0
> },
> "pending": {
> "memory": 600,
> "vCores": 1
> },
> "amUsed": {
> "memory": 0,
> "vCores": 0
> },
> "amLimit": {
> "memory": 0,
>  

[jira] [Commented] (YARN-7699) queueUsagePercentage is coming as INF for getApp REST api call

2018-01-07 Thread Rohith Sharma K S (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-7699?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16315680#comment-16315680
 ] 

Rohith Sharma K S commented on YARN-7699:
-

Committed to trunk! [~sunilg], it appears there are conflicts in branch-3.0 and 
the branch-2.* branches. Can you provide patches for the other branches?

> queueUsagePercentage is coming as INF for getApp REST api call
> --
>
> Key: YARN-7699
> URL: https://issues.apache.org/jira/browse/YARN-7699
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: webapp
>Affects Versions: 2.9.0, 3.0.0
>Reporter: Sunil G
>Assignee: Sunil G
> Attachments: YARN-7699.001.patch
>
>
> REST o/p for getApp call where queueUsagePercentage is INF.
> Credit to [~charanh]
> {noformat}
> {
> "app": {
> "id": "application_1514964945154_0001",
> "user": "hrt_qa",
> "name": "DistributedShell",
> "queue": "a1",
> "state": "RUNNING",
> "finalStatus": "UNDEFINED",
> "progress": 90.0,
> "trackingUI": "ApplicationMaster",
> "trackingUrl": 
> "http://ctr-e136-1513029738776-28101-01-04.hwx.site:8088/proxy/application_1514964945154_0001/";,
> "diagnostics": "",
> "clusterId": 1514964945154,
> "applicationType": "YARN",
> "applicationTags": "",
> "priority": 0,
> "startedTime": 1514965015754,
> "finishedTime": 0,
> "elapsedTime": 418844,
> "amContainerLogs": 
> "http://ctr-e136-1513029738776-28101-01-07.hwx.site:8042/node/containerlogs/container_e43_1514964945154_0001_01_01/hrt_qa";,
> "amHostHttpAddress": 
> "ctr-e136-1513029738776-28101-01-07.hwx.site:8042",
> "allocatedMB": 600,
> "allocatedVCores": 1,
> "reservedMB": 0,
> "reservedVCores": 0,
> "runningContainers": 1,
> "memorySeconds": 1227070,
> "vcoreSeconds": 2041,
> "queueUsagePercentage": INF,
> "clusterUsagePercentage": 3.2552083,
> "resourceSecondsMap": {
> "entry": {
> "key": "memory-mb",
> "value": "1227070"
> },
> "entry": {
> "key": "vcores",
> "value": "2041"
> }
> },
> "preemptedResourceMB": 0,
> "preemptedResourceVCores": 0,
> "numNonAMContainerPreempted": 0,
> "numAMContainerPreempted": 0,
> "preemptedMemorySeconds": 0,
> "preemptedVcoreSeconds": 0,
> "preemptedResourceSecondsMap": {
> 
> },
> "resourceRequests": [
> {
> "priority": 0,
> "resourceName": "*",
> "capability": {
> "memory": 600,
> "vCores": 1
> },
> "numContainers": 1,
> "relaxLocality": true,
> "nodeLabelExpression": "x",
> "executionTypeRequest": {
> "executionType": "GUARANTEED",
> "enforceExecutionType": false
> },
> "enforceExecutionType": false
> }
> ],
> "logAggregationStatus": "NOT_START",
> "unmanagedApplication": false,
> "appNodeLabelExpression": "x",
> "amNodeLabelExpression": "x",
> "resourceInfo": {
> "resourceUsagesByPartition": [
> {
> "partitionName": "",
> "used": {
> "memory": 0,
> "vCores": 0
> },
> "reserved": {
> "memory": 0,
> "vCores": 0
> },
> "pending": {
> "memory": 0,
> "vCores": 0
> },
> "amUsed": {
> "memory": 0,
> "vCores": 0
> },
> "amLimit": {
> "memory": 0,
> "vCores": 0
> }
> },
> {
> "partitionName": "x",
> "used": {
> "memory": 0,
> "vCores": 0
> },
> "reserved": {
> "memory": 0,
> "vCores": 0
> },
> "pending": {
> "memory": 600,
> "vCores": 1
> },
> "amUsed": {
> "memory": 0,
>  

[jira] [Updated] (YARN-7699) queueUsagePercentage is coming as INF for getApp REST api call

2018-01-07 Thread Sunil G (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-7699?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sunil G updated YARN-7699:
--
Attachment: YARN-7699-branch-3.0.001.patch

Thanks [~rohithsharma].
This patch can be applied to branch-3.0, branch-2 and branch-2.9.

> queueUsagePercentage is coming as INF for getApp REST api call
> --
>
> Key: YARN-7699
> URL: https://issues.apache.org/jira/browse/YARN-7699
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: webapp
>Affects Versions: 2.9.0, 3.0.0
>Reporter: Sunil G
>Assignee: Sunil G
> Attachments: YARN-7699-branch-3.0.001.patch, YARN-7699.001.patch
>
>
> REST o/p for getApp call where queueUsagePercentage is INF.
> Credit to [~charanh]
> {noformat}
> {
> "app": {
> "id": "application_1514964945154_0001",
> "user": "hrt_qa",
> "name": "DistributedShell",
> "queue": "a1",
> "state": "RUNNING",
> "finalStatus": "UNDEFINED",
> "progress": 90.0,
> "trackingUI": "ApplicationMaster",
> "trackingUrl": 
> "http://ctr-e136-1513029738776-28101-01-04.hwx.site:8088/proxy/application_1514964945154_0001/";,
> "diagnostics": "",
> "clusterId": 1514964945154,
> "applicationType": "YARN",
> "applicationTags": "",
> "priority": 0,
> "startedTime": 1514965015754,
> "finishedTime": 0,
> "elapsedTime": 418844,
> "amContainerLogs": 
> "http://ctr-e136-1513029738776-28101-01-07.hwx.site:8042/node/containerlogs/container_e43_1514964945154_0001_01_01/hrt_qa";,
> "amHostHttpAddress": 
> "ctr-e136-1513029738776-28101-01-07.hwx.site:8042",
> "allocatedMB": 600,
> "allocatedVCores": 1,
> "reservedMB": 0,
> "reservedVCores": 0,
> "runningContainers": 1,
> "memorySeconds": 1227070,
> "vcoreSeconds": 2041,
> "queueUsagePercentage": INF,
> "clusterUsagePercentage": 3.2552083,
> "resourceSecondsMap": {
> "entry": {
> "key": "memory-mb",
> "value": "1227070"
> },
> "entry": {
> "key": "vcores",
> "value": "2041"
> }
> },
> "preemptedResourceMB": 0,
> "preemptedResourceVCores": 0,
> "numNonAMContainerPreempted": 0,
> "numAMContainerPreempted": 0,
> "preemptedMemorySeconds": 0,
> "preemptedVcoreSeconds": 0,
> "preemptedResourceSecondsMap": {
> 
> },
> "resourceRequests": [
> {
> "priority": 0,
> "resourceName": "*",
> "capability": {
> "memory": 600,
> "vCores": 1
> },
> "numContainers": 1,
> "relaxLocality": true,
> "nodeLabelExpression": "x",
> "executionTypeRequest": {
> "executionType": "GUARANTEED",
> "enforceExecutionType": false
> },
> "enforceExecutionType": false
> }
> ],
> "logAggregationStatus": "NOT_START",
> "unmanagedApplication": false,
> "appNodeLabelExpression": "x",
> "amNodeLabelExpression": "x",
> "resourceInfo": {
> "resourceUsagesByPartition": [
> {
> "partitionName": "",
> "used": {
> "memory": 0,
> "vCores": 0
> },
> "reserved": {
> "memory": 0,
> "vCores": 0
> },
> "pending": {
> "memory": 0,
> "vCores": 0
> },
> "amUsed": {
> "memory": 0,
> "vCores": 0
> },
> "amLimit": {
> "memory": 0,
> "vCores": 0
> }
> },
> {
> "partitionName": "x",
> "used": {
> "memory": 0,
> "vCores": 0
> },
> "reserved": {
> "memory": 0,
> "vCores": 0
> },
> "pending": {
> "memory": 600,
> "vCores": 1
> },
> "amUsed": {
> "memory": 0,
> "vCores": 0
> },

[jira] [Updated] (YARN-7242) Support to specify values of different resource types in DistributedShell for easier testing

2018-01-07 Thread Sunil G (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-7242?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sunil G updated YARN-7242:
--
Summary: Support to specify values of different resource types in 
DistributedShell for easier testing  (was: Support specify values of different 
resource types in DistributedShell for easier testing)

> Support to specify values of different resource types in DistributedShell for 
> easier testing
> 
>
> Key: YARN-7242
> URL: https://issues.apache.org/jira/browse/YARN-7242
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: nodemanager, resourcemanager
>Reporter: Wangda Tan
>Assignee: Gergely Novák
>Priority: Critical
>  Labels: newbie
> Attachments: YARN-7242.001.patch, YARN-7242.002.patch, 
> YARN-7242.003.patch, YARN-7242.004.patch, YARN-7242.005.patch
>
>
> Currently, DS supports specifying a resource profile; it would be better to also 
> allow the user to directly specify resource keys/values from the command line.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-7242) Support to specify values of different resource types in DistributedShell for easier testing

2018-01-07 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-7242?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16315789#comment-16315789
 ] 

Hudson commented on YARN-7242:
--

SUCCESS: Integrated in Jenkins build Hadoop-trunk-Commit #13459 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/13459/])
YARN-7242. Support to specify values of different resource types in (sunilg: 
rev 01f3f2167ec20b52a18bc2cf250fb4229cfd2c14)
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-applications-distributedshell/src/test/java/org/apache/hadoop/yarn/applications/distributedshell/TestDistributedShell.java
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-applications-distributedshell/src/main/java/org/apache/hadoop/yarn/applications/distributedshell/ApplicationMaster.java
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-applications-distributedshell/src/main/java/org/apache/hadoop/yarn/applications/distributedshell/Client.java


> Support to specify values of different resource types in DistributedShell for 
> easier testing
> 
>
> Key: YARN-7242
> URL: https://issues.apache.org/jira/browse/YARN-7242
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: nodemanager, resourcemanager
>Reporter: Wangda Tan
>Assignee: Gergely Novák
>Priority: Critical
>  Labels: newbie
> Fix For: 3.1.0
>
> Attachments: YARN-7242.001.patch, YARN-7242.002.patch, 
> YARN-7242.003.patch, YARN-7242.004.patch, YARN-7242.005.patch
>
>
> Currently, DS supports specifying a resource profile; it would be better to also 
> allow the user to directly specify resource keys/values from the command line.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-7699) queueUsagePercentage is coming as INF for getApp REST api call

2018-01-07 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-7699?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16315788#comment-16315788
 ] 

Hudson commented on YARN-7699:
--

SUCCESS: Integrated in Jenkins build Hadoop-trunk-Commit #13459 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/13459/])
YARN-7699. queueUsagePercentage is coming as INF for getApp REST api 
(rohithsharmaks: rev c2d6fa36560d122ff24dd7db84f68f4ba3fb8123)
* (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/common/fica/FiCaSchedulerApp.java
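
For context, a fix for an INF usage percentage generally comes down to a divide-by-zero guard like the sketch below (a hedged, illustrative sketch with assumed variable names; the actual FiCaSchedulerApp change is not reproduced here):
{code}
// Hedged sketch: return 0% instead of INF when the queue/partition capacity is zero.
static float usagePercentage(long usedMemoryMB, long queueCapacityMemoryMB) {
  if (queueCapacityMemoryMB <= 0) {       // guard against a zero-capacity queue or partition
    return 0.0f;
  }
  return usedMemoryMB * 100.0f / queueCapacityMemoryMB;
}
{code}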


> queueUsagePercentage is coming as INF for getApp REST api call
> --
>
> Key: YARN-7699
> URL: https://issues.apache.org/jira/browse/YARN-7699
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: webapp
>Affects Versions: 2.9.0, 3.0.0
>Reporter: Sunil G
>Assignee: Sunil G
> Attachments: YARN-7699-branch-3.0.001.patch, YARN-7699.001.patch
>
>
> REST o/p for getApp call where queueUsagePercentage is INF.
> Credit to [~charanh]
> {noformat}
> {
> "app": {
> "id": "application_1514964945154_0001",
> "user": "hrt_qa",
> "name": "DistributedShell",
> "queue": "a1",
> "state": "RUNNING",
> "finalStatus": "UNDEFINED",
> "progress": 90.0,
> "trackingUI": "ApplicationMaster",
> "trackingUrl": 
> "http://ctr-e136-1513029738776-28101-01-04.hwx.site:8088/proxy/application_1514964945154_0001/";,
> "diagnostics": "",
> "clusterId": 1514964945154,
> "applicationType": "YARN",
> "applicationTags": "",
> "priority": 0,
> "startedTime": 1514965015754,
> "finishedTime": 0,
> "elapsedTime": 418844,
> "amContainerLogs": 
> "http://ctr-e136-1513029738776-28101-01-07.hwx.site:8042/node/containerlogs/container_e43_1514964945154_0001_01_01/hrt_qa";,
> "amHostHttpAddress": 
> "ctr-e136-1513029738776-28101-01-07.hwx.site:8042",
> "allocatedMB": 600,
> "allocatedVCores": 1,
> "reservedMB": 0,
> "reservedVCores": 0,
> "runningContainers": 1,
> "memorySeconds": 1227070,
> "vcoreSeconds": 2041,
> "queueUsagePercentage": INF,
> "clusterUsagePercentage": 3.2552083,
> "resourceSecondsMap": {
> "entry": {
> "key": "memory-mb",
> "value": "1227070"
> },
> "entry": {
> "key": "vcores",
> "value": "2041"
> }
> },
> "preemptedResourceMB": 0,
> "preemptedResourceVCores": 0,
> "numNonAMContainerPreempted": 0,
> "numAMContainerPreempted": 0,
> "preemptedMemorySeconds": 0,
> "preemptedVcoreSeconds": 0,
> "preemptedResourceSecondsMap": {
> 
> },
> "resourceRequests": [
> {
> "priority": 0,
> "resourceName": "*",
> "capability": {
> "memory": 600,
> "vCores": 1
> },
> "numContainers": 1,
> "relaxLocality": true,
> "nodeLabelExpression": "x",
> "executionTypeRequest": {
> "executionType": "GUARANTEED",
> "enforceExecutionType": false
> },
> "enforceExecutionType": false
> }
> ],
> "logAggregationStatus": "NOT_START",
> "unmanagedApplication": false,
> "appNodeLabelExpression": "x",
> "amNodeLabelExpression": "x",
> "resourceInfo": {
> "resourceUsagesByPartition": [
> {
> "partitionName": "",
> "used": {
> "memory": 0,
> "vCores": 0
> },
> "reserved": {
> "memory": 0,
> "vCores": 0
> },
> "pending": {
> "memory": 0,
> "vCores": 0
> },
> "amUsed": {
> "memory": 0,
> "vCores": 0
> },
> "amLimit": {
> "memory": 0,
> "vCores": 0
> }
> },
> {
> "partitionName": "x",
> "used": {
> "memory": 0,
> "vCores": 0
> },
> "reserved"

[jira] [Commented] (YARN-6929) yarn.nodemanager.remote-app-log-dir structure is not scalable

2018-01-07 Thread Prabhu Joseph (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-6929?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16315795#comment-16315795
 ] 

Prabhu Joseph commented on YARN-6929:
-

[~jlowe] Can you review this when you get some time? Thanks.
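
For reference, the layout change proposed in the quoted description below amounts to adding a date level under each user's logs directory; a minimal sketch (class and helper names are assumptions, not the attached patch):
{code}
import java.time.LocalDate;
import org.apache.hadoop.fs.Path;

// Hedged sketch of the proposed layout:
//   <remote-app-log-dir>/<user>/logs/<date>/<applicationId>
// so a single "logs" directory no longer has to hold every aggregated application.
public class DatedLogDirLayout {
  public static Path appLogDir(Path remoteRootLogDir, String user, String appId) {
    String date = LocalDate.now().toString();   // e.g. 2018-01-07
    return new Path(remoteRootLogDir,
        user + Path.SEPARATOR + "logs" + Path.SEPARATOR + date + Path.SEPARATOR + appId);
  }
}
{code}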

> yarn.nodemanager.remote-app-log-dir structure is not scalable
> -
>
> Key: YARN-6929
> URL: https://issues.apache.org/jira/browse/YARN-6929
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: log-aggregation
>Affects Versions: 2.7.3
>Reporter: Prabhu Joseph
>Assignee: Prabhu Joseph
> Attachments: YARN-6929.1.patch, YARN-6929.2.patch, YARN-6929.2.patch, 
> YARN-6929.3.patch, YARN-6929.patch
>
>
> The current directory structure for yarn.nodemanager.remote-app-log-dir is 
> not scalable. The maximum subdirectory limit is 1048576 by default (HDFS-6102). 
> With a yarn.log-aggregation.retain-seconds retention of 7 days, there is a good 
> chance that LogAggregationService fails to create a new directory with 
> FSLimitException$MaxDirectoryItemsExceededException.
> The current structure is 
> <remote-app-log-dir>/<user>/logs/<applicationId>. This can be 
> improved by adding the date as a subdirectory, like 
> <remote-app-log-dir>/<user>/logs/<date>/<applicationId>.
> {code}
> WARN 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.logaggregation.LogAggregationService:
>  Application failed to init aggregation 
> org.apache.hadoop.yarn.exceptions.YarnRuntimeException: 
> org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.hdfs.protocol.FSLimitException$MaxDirectoryItemsExceededException):
>  The directory item limit of /app-logs/yarn/logs is exceeded: limit=1048576 
> items=1048576 
> at 
> org.apache.hadoop.hdfs.server.namenode.FSDirectory.verifyMaxDirItems(FSDirectory.java:2021)
>  
> at 
> org.apache.hadoop.hdfs.server.namenode.FSDirectory.addChild(FSDirectory.java:2072)
>  
> at 
> org.apache.hadoop.hdfs.server.namenode.FSDirectory.unprotectedMkdir(FSDirectory.java:1841)
>  
> at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.mkdirsRecursively(FSNamesystem.java:4351)
>  
> at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.mkdirsInternal(FSNamesystem.java:4262)
>  
> at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.mkdirsInt(FSNamesystem.java:4221)
>  
> at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.mkdirs(FSNamesystem.java:4194)
>  
> at 
> org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.mkdirs(NameNodeRpcServer.java:813)
>  
> at 
> org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.mkdirs(ClientNamenodeProtocolServerSideTranslatorPB.java:600)
>  
> at 
> org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
>  
> at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:619)
>  
> at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:962) 
> at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2039) 
> at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2035) 
> at java.security.AccessController.doPrivileged(Native Method) 
> at javax.security.auth.Subject.doAs(Subject.java:415) 
> at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628)
>  
> at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2033) 
> at 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.logaggregation.LogAggregationService.createAppDir(LogAggregationService.java:308)
>  
> at 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.logaggregation.LogAggregationService.initAppAggregator(LogAggregationService.java:366)
>  
> at 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.logaggregation.LogAggregationService.initApp(LogAggregationService.java:320)
>  
> at 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.logaggregation.LogAggregationService.handle(LogAggregationService.java:443)
>  
> at 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.logaggregation.LogAggregationService.handle(LogAggregationService.java:67)
>  
> at 
> org.apache.hadoop.yarn.event.AsyncDispatcher.dispatch(AsyncDispatcher.java:173)
>  
> at 
> org.apache.hadoop.yarn.event.AsyncDispatcher$1.run(AsyncDispatcher.java:106) 
> at java.lang.Thread.run(Thread.java:745) 
> Caused by: 
> org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.hdfs.protocol.FSLimitException$MaxDirectoryItemsExceededException):
>  The directory item limit of /app-logs/yarn/logs is exceeded: limit=1048576 
> items=1048576 
> at 
> org.apache.hadoop.hdfs.server.namenode.FSDirectory.verifyMaxDirItems(FSDirectory.java:2021)
>  
> at 
> org.apache.hadoop.hdfs.server.namenode.FSDirectory.addChild(FSDirectory.java:2072)
>  
> at 
> org.apache.hadoop.hdfs.server.namenode.FSDirectory.unprotectedMkdir(FSDirectory.java:1

[jira] [Commented] (YARN-6929) yarn.nodemanager.remote-app-log-dir structure is not scalable

2018-01-07 Thread genericqa (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-6929?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16315808#comment-16315808
 ] 

genericqa commented on YARN-6929:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m  
0s{color} | {color:blue} Docker mode activated. {color} |
| {color:red}-1{color} | {color:red} patch {color} | {color:red}  0m  7s{color} 
| {color:red} YARN-6929 does not apply to trunk. Rebase required? Wrong Branch? 
See https://wiki.apache.org/hadoop/HowToContribute for help. {color} |
\\
\\
|| Subsystem || Report/Notes ||
| JIRA Issue | YARN-6929 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12894301/YARN-6929.3.patch |
| Console output | 
https://builds.apache.org/job/PreCommit-YARN-Build/19139/console |
| Powered by | Apache Yetus 0.7.0-SNAPSHOT   http://yetus.apache.org |


This message was automatically generated.



> yarn.nodemanager.remote-app-log-dir structure is not scalable
> -
>
> Key: YARN-6929
> URL: https://issues.apache.org/jira/browse/YARN-6929
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: log-aggregation
>Affects Versions: 2.7.3
>Reporter: Prabhu Joseph
>Assignee: Prabhu Joseph
> Attachments: YARN-6929.1.patch, YARN-6929.2.patch, YARN-6929.2.patch, 
> YARN-6929.3.patch, YARN-6929.patch
>
>
> The current directory structure for yarn.nodemanager.remote-app-log-dir is 
> not scalable. The maximum subdirectory limit is 1048576 by default (HDFS-6102). 
> With a yarn.log-aggregation.retain-seconds retention of 7 days, there is a good 
> chance that LogAggregationService fails to create a new directory with 
> FSLimitException$MaxDirectoryItemsExceededException.
> The current structure is 
> <remote-app-log-dir>/<user>/logs/<applicationId>. This can be 
> improved by adding the date as a subdirectory, like 
> <remote-app-log-dir>/<user>/logs/<date>/<applicationId>.
> {code}
> WARN 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.logaggregation.LogAggregationService:
>  Application failed to init aggregation 
> org.apache.hadoop.yarn.exceptions.YarnRuntimeException: 
> org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.hdfs.protocol.FSLimitException$MaxDirectoryItemsExceededException):
>  The directory item limit of /app-logs/yarn/logs is exceeded: limit=1048576 
> items=1048576 
> at 
> org.apache.hadoop.hdfs.server.namenode.FSDirectory.verifyMaxDirItems(FSDirectory.java:2021)
>  
> at 
> org.apache.hadoop.hdfs.server.namenode.FSDirectory.addChild(FSDirectory.java:2072)
>  
> at 
> org.apache.hadoop.hdfs.server.namenode.FSDirectory.unprotectedMkdir(FSDirectory.java:1841)
>  
> at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.mkdirsRecursively(FSNamesystem.java:4351)
>  
> at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.mkdirsInternal(FSNamesystem.java:4262)
>  
> at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.mkdirsInt(FSNamesystem.java:4221)
>  
> at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.mkdirs(FSNamesystem.java:4194)
>  
> at 
> org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.mkdirs(NameNodeRpcServer.java:813)
>  
> at 
> org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.mkdirs(ClientNamenodeProtocolServerSideTranslatorPB.java:600)
>  
> at 
> org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
>  
> at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:619)
>  
> at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:962) 
> at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2039) 
> at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2035) 
> at java.security.AccessController.doPrivileged(Native Method) 
> at javax.security.auth.Subject.doAs(Subject.java:415) 
> at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628)
>  
> at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2033) 
> at 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.logaggregation.LogAggregationService.createAppDir(LogAggregationService.java:308)
>  
> at 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.logaggregation.LogAggregationService.initAppAggregator(LogAggregationService.java:366)
>  
> at 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.logaggregation.LogAggregationService.initApp(LogAggregationService.java:320)
>  
> at 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.logaggregation.LogAggregationService.handle(LogAggregationService.java:443)
>  
> at 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.logaggregation.LogAggregationService.handle(LogAggregationService.java:67)
>

[jira] [Updated] (YARN-7711) YARN UI2 should redirect into Active RM in HA

2018-01-07 Thread Rohith Sharma K S (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-7711?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rohith Sharma K S updated YARN-7711:

Summary: YARN UI2 should redirect into Active RM in HA  (was: New YARN UI 
should redirect into Active RM in HA)

> YARN UI2 should redirect into Active RM in HA
> -
>
> Key: YARN-7711
> URL: https://issues.apache.org/jira/browse/YARN-7711
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Rohith Sharma K S
>
> When UI2 is enabled in HA mode, if a REST query goes to the standby RM, it 
> is not redirected to the active RM. 
> It should redirect to the active RM, as the old UI does.
> cc: [~sunilg]



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Created] (YARN-7711) New YARN UI should redirect into Active RM in HA

2018-01-07 Thread Rohith Sharma K S (JIRA)
Rohith Sharma K S created YARN-7711:
---

 Summary: New YARN UI should redirect into Active RM in HA
 Key: YARN-7711
 URL: https://issues.apache.org/jira/browse/YARN-7711
 Project: Hadoop YARN
  Issue Type: Bug
Reporter: Rohith Sharma K S


When UI2 is enabled in HA mode, if a REST query goes to the standby RM, it is 
not redirected to the active RM. 
It should redirect to the active RM, as the old UI does.

cc: [~sunilg]



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org