[jira] [Updated] (YARN-8690) Currently path not consistent in LocalResourceRequest to yarn 2.7
[ https://issues.apache.org/jira/browse/YARN-8690?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] jialei weng updated YARN-8690:
------------------------------
Description:

With the YARN-1953 change, in YARN 2.9.1 we cannot use a path such as hdfs://hostname/path for local resource allocation, because it is resolved to hdfs://hostname:0/path. We have to add the port explicitly, e.g. hdfs://hostname:443/path, to make it work. This is not a consistent change. Can we make it consistent without requiring customers to change their paths? [~leftnoteasy]

How the resource location path is handled in 2.7:

{code}
public static Path getPathFromYarnURL(URL url) throws URISyntaxException {
  String scheme = url.getScheme() == null ? "" : url.getScheme();

  String authority = "";
  if (url.getHost() != null) {
    authority = url.getHost();
    if (url.getUserInfo() != null) {
      authority = url.getUserInfo() + "@" + authority;
    }
    if (url.getPort() > 0) {
      authority += ":" + url.getPort();
    }
  }

  return new Path(
      (new URI(scheme, authority, url.getFile(), null, null)).normalize());
}
{code}

How the resource location logic is handled in 2.9 (note the port is passed through unconditionally):

{code}
public Path toPath() throws URISyntaxException {
  return new Path(new URI(getScheme(), getUserInfo(), getHost(), getPort(),
      getFile(), null, null));
}
{code}

> Currently path not consistent in LocalResourceRequest to yarn 2.7
> -----------------------------------------------------------------
>
>                 Key: YARN-8690
>                 URL: https://issues.apache.org/jira/browse/YARN-8690
>             Project: Hadoop YARN
>          Issue Type: Bug
>    Affects Versions: 2.9.1
>            Reporter: jialei weng
>            Assignee: Wangda Tan
>            Priority: Major
>

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
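The inconsistency described above comes down to how java.net.URI treats the port: the seven-argument constructor emits any non-negative port literally (so an unset port stored as 0 becomes ":0"), while the 2.7 code only appends the port to the authority when it is positive. A minimal standalone sketch of the two behaviors (class and method names are illustrative, not YARN code):

```java
import java.net.URI;
import java.net.URISyntaxException;

public class PortGuardDemo {

    // 2.7-style conversion: only append the port to the authority when it
    // is positive, so an unset port (stored as 0) is simply dropped.
    static String convert27(String scheme, String host, int port, String file)
            throws URISyntaxException {
        String authority = host;
        if (port > 0) {
            authority += ":" + port;
        }
        return new URI(scheme, authority, file, null, null).normalize().toString();
    }

    // 2.9-style conversion: hand the port straight to the 7-arg URI
    // constructor, which emits any non-negative port literally.
    static String convert29(String scheme, String host, int port, String file)
            throws URISyntaxException {
        return new URI(scheme, null, host, port, file, null, null).toString();
    }

    public static void main(String[] args) throws URISyntaxException {
        // An unset port modeled as 0, as in the bug report:
        System.out.println(convert27("hdfs", "hostname", 0, "/path"));
        System.out.println(convert29("hdfs", "hostname", 0, "/path"));
    }
}
```

Running this prints "hdfs://hostname/path" for the 2.7-style guard and "hdfs://hostname:0/path" for the 2.9-style pass-through, matching the behavior reported in the issue.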
[jira] [Created] (YARN-8690) Currently path not consistent in LocalResourceRequest to yarn 2.7
jialei weng created YARN-8690:
---------------------------------
Summary: Currently path not consistent in LocalResourceRequest to yarn 2.7
Key: YARN-8690
URL: https://issues.apache.org/jira/browse/YARN-8690
Project: Hadoop YARN
Issue Type: Bug
Affects Versions: 2.9.1
Reporter: jialei weng
Assignee: Wangda Tan

With the YARN-1953 change, in YARN 2.9.1 we cannot use a path such as hdfs://hostname/path for local resource allocation, because it is resolved to hdfs://hostname:0/path. We have to add the port explicitly, e.g. hdfs://hostname:443/path, to make it work. This is not a consistent change. Can we make it consistent without requiring customers to change their paths? [~leftnoteasy]
[jira] [Commented] (YARN-8685) Add containers query support for nodes/node REST API in RMWebServices
[ https://issues.apache.org/jira/browse/YARN-8685?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16586902#comment-16586902 ] Tao Yang commented on YARN-8685:
--------------------------------
Hi [~cheersyang], it would be better if we could get these from the NM containers REST API, but the NM only knows about running containers; we have to get the containers that are in the ALLOCATED/ACQUIRED state from the RM. ContainerInfo also differs a lot between the RM and the NM, just like AppInfo. Thoughts?

> Add containers query support for nodes/node REST API in RMWebServices
> ---------------------------------------------------------------------
>
>                 Key: YARN-8685
>                 URL: https://issues.apache.org/jira/browse/YARN-8685
>             Project: Hadoop YARN
>          Issue Type: Improvement
>          Components: restapi
>    Affects Versions: 3.2.0
>            Reporter: Tao Yang
>            Assignee: Tao Yang
>            Priority: Major
>      Attachments: YARN-8685.001.patch
>
> Currently we can only query running containers from the NM containers REST API, but we can't get the valid containers that are in the ALLOCATED/ACQUIRED state. We need to get all containers allocated on specified nodes for debugging. I want to add an "includeContainers" query param (default false) to the nodes/node REST API in RMWebServices, so that we can get the valid containers on nodes when "includeContainers=true" is specified.
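A client would combine the existing RM nodes endpoint with the proposed parameter. A sketch of the request URL a caller might build (the "includeContainers" parameter is the proposal in this JIRA, not in released Hadoop; host and port are placeholders):

```java
public class NodeContainersQuery {

    // Build the URL for the RM nodes/node REST endpoint, opting in to the
    // proposed container listing. rmWebapp is the RM web address (e.g. the
    // default webapp port 8088), nodeId is the NM identifier.
    static String nodeContainersUrl(String rmWebapp, String nodeId) {
        return String.format(
                "http://%s/ws/v1/cluster/nodes/%s?includeContainers=true",
                rmWebapp, nodeId);
    }

    public static void main(String[] args) {
        // Placeholder host names, for illustration only.
        System.out.println(nodeContainersUrl("rm-host:8088", "nm-host:45454"));
    }
}
```

With "includeContainers=true" the response would also carry containers in the ALLOCATED/ACQUIRED state that the NM endpoint cannot report.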
[jira] [Comment Edited] (YARN-8513) CapacityScheduler infinite loop when queue is near fully utilized
[ https://issues.apache.org/jira/browse/YARN-8513?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16586854#comment-16586854 ] Wangda Tan edited comment on YARN-8513 at 8/21/18 3:37 AM:
-----------------------------------------------------------
Interesting, [~cheersyang], I can only think of a reservation allocation causing the issue, but given that we already have the logic below, it should not happen:
{code}
// And it should not be a reserved container
if (assignment.getAssignmentInformation().getNumReservations() > 0) {
  return false;
}
{code}
We should be able to see what kind of allocation causes the issue, or whether CSAssignment indicates an allocation happened when it actually didn't. What is the {{maximum-container-assignments}} setting now?

> CapacityScheduler infinite loop when queue is near fully utilized
> -----------------------------------------------------------------
>
>                 Key: YARN-8513
>                 URL: https://issues.apache.org/jira/browse/YARN-8513
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: capacity scheduler, yarn
>    Affects Versions: 3.1.0, 2.9.1
>        Environment: Ubuntu 14.04.5 and 16.04.4
> YARN is configured with one label and 5 queues.
>            Reporter: Chen Yufei
>            Priority: Major
>      Attachments: jstack-1.log, jstack-2.log, jstack-3.log, jstack-4.log, jstack-5.log, top-during-lock.log, top-when-normal.log, yarn3-jstack1.log, yarn3-jstack2.log, yarn3-jstack3.log, yarn3-jstack4.log, yarn3-jstack5.log, yarn3-resourcemanager.log, yarn3-top
>
> ResourceManager sometimes does not respond to any request when a queue is nearly fully utilized. Sending SIGTERM won't stop the RM; only SIGKILL can. After an RM restart, it can recover running jobs and start accepting new ones.
>
> CapacityScheduler seems to be in an infinite loop printing the following log messages (more than 25,000 lines per second):
>
> {{2018-07-10 17:16:29,227 INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.ParentQueue: assignedContainer queue=root usedCapacity=0.99816763 absoluteUsedCapacity=0.99816763 used= cluster=}}
> {{2018-07-10 17:16:29,227 INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler: Failed to accept allocation proposal}}
> {{2018-07-10 17:16:29,227 INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.allocator.AbstractContainerAllocator: assignedContainer application attempt=appattempt_1530619767030_1652_01 container=null queue=org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.allocator.RegularContainerAllocator@14420943 clusterResource= type=NODE_LOCAL requestedPartition=}}
>
> I have encountered this problem several times after upgrading to YARN 2.9.1, while the same configuration works fine under version 2.7.3.
>
> YARN-4477 is an infinite loop bug in FairScheduler; not sure if this is a similar problem.
[jira] [Commented] (YARN-2497) Fair scheduler should support strict node labels
[ https://issues.apache.org/jira/browse/YARN-2497?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16586856#comment-16586856 ] genericqa commented on YARN-2497:
---------------------------------
| (x) *-1 overall* |

|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 0m 0s | Docker mode activated. |
| -1 | patch | 0m 7s | YARN-2497 does not apply to branch-3.0. Rebase required? Wrong branch? See https://wiki.apache.org/hadoop/HowToContribute for help. |

|| Subsystem || Report/Notes ||
| JIRA Issue | YARN-2497 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12895017/YARN-2497.branch-3.0.001.patch |
| Console output | https://builds.apache.org/job/PreCommit-YARN-Build/21642/console |
| Powered by | Apache Yetus 0.8.0-SNAPSHOT http://yetus.apache.org |

This message was automatically generated.

> Fair scheduler should support strict node labels
> ------------------------------------------------
>
>                 Key: YARN-2497
>                 URL: https://issues.apache.org/jira/browse/YARN-2497
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>          Components: fairscheduler
>            Reporter: Wangda Tan
>            Assignee: Daniel Templeton
>            Priority: Major
>      Attachments: YARN-2497.001.patch, YARN-2497.002.patch, YARN-2497.003.patch, YARN-2497.004.patch, YARN-2497.005.patch, YARN-2497.006.patch, YARN-2497.007.patch, YARN-2497.008.patch, YARN-2497.009.patch, YARN-2497.010.patch, YARN-2497.011.patch, YARN-2497.branch-3.0.001.patch, YARN-2499.WIP01.patch
>
[jira] [Commented] (YARN-8513) CapacityScheduler infinite loop when queue is near fully utilized
[ https://issues.apache.org/jira/browse/YARN-8513?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16586854#comment-16586854 ] Wangda Tan commented on YARN-8513:
----------------------------------
Interesting, [~cheersyang], I can only think of a reservation allocation causing the issue:
{code}
// And it should not be a reserved container
if (assignment.getAssignmentInformation().getNumReservations() > 0) {
  return false;
}
{code}
We should be able to see what kind of allocation causes the issue, or whether CSAssignment indicates an allocation happened when it actually didn't. What is the {{maximum-container-assignments}} setting now?

> CapacityScheduler infinite loop when queue is near fully utilized
> -----------------------------------------------------------------
>
>                 Key: YARN-8513
>                 URL: https://issues.apache.org/jira/browse/YARN-8513
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: capacity scheduler, yarn
>    Affects Versions: 3.1.0, 2.9.1
>            Reporter: Chen Yufei
>            Priority: Major
>
[jira] [Created] (YARN-8689) Support partition fairness in fair scheduler strict node label
zhuqi created YARN-8689:
---------------------------
Summary: Support partition fairness in fair scheduler strict node label
Key: YARN-8689
URL: https://issues.apache.org/jira/browse/YARN-8689
Project: Hadoop YARN
Issue Type: Task
Components: fairscheduler
Reporter: zhuqi
[jira] [Commented] (YARN-8513) CapacityScheduler infinite loop when queue is near fully utilized
[ https://issues.apache.org/jira/browse/YARN-8513?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16586811#comment-16586811 ] Weiwei Yang commented on YARN-8513:
-----------------------------------
Discussed with [~cyfdecyf] in the Slack channel; it looks like this issue was caused by the greedy container-assignments-per-heartbeat mechanism: for some reason it never stops trying to assign new containers for this particular node in a while loop. I suggested adding the following config as a work-around:
{noformat}
yarn.scheduler.capacity.per-node-heartbeat.multiple-assignments-enable=true
yarn.scheduler.capacity.per-node-heartbeat.maximum-container-assignments=10
{noformat}
In the meantime, [~cyfdecyf] is trying to apply this config change to their cluster to see if it helps, and I am trying to reproduce the issue locally.

> CapacityScheduler infinite loop when queue is near fully utilized
> -----------------------------------------------------------------
>
>                 Key: YARN-8513
>                 URL: https://issues.apache.org/jira/browse/YARN-8513
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: capacity scheduler, yarn
>    Affects Versions: 3.1.0, 2.9.1
>            Reporter: Chen Yufei
>            Priority: Major
>
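The work-around above bounds the greedy per-heartbeat assignment loop. A simplified model of why the cap helps (not CapacityScheduler code; names are illustrative): without a limit, the loop only exits when a round assigns nothing, so an allocation that is repeatedly proposed and then rejected can spin forever.

```java
import java.util.function.BooleanSupplier;

public class HeartbeatAssignDemo {

    // Try to assign containers on one node heartbeat. tryAssign returns true
    // when a round claims to have produced an allocation. maxAssignments
    // models the maximum-container-assignments setting suggested above;
    // a value <= 0 here stands in for "unbounded".
    static int assignOnHeartbeat(BooleanSupplier tryAssign, int maxAssignments) {
        int assigned = 0;
        while (maxAssignments <= 0 || assigned < maxAssignments) {
            if (!tryAssign.getAsBoolean()) {
                break; // nothing allocated this round: stop
            }
            assigned++;
        }
        return assigned;
    }

    public static void main(String[] args) {
        // A scheduler that always claims success would never exit the loop
        // when unbounded; with a cap of 10 it stops after 10 rounds.
        System.out.println(assignOnHeartbeat(() -> true, 10));
    }
}
```

This mirrors the reported symptom: if CSAssignment claims an allocation happened but the proposal is then rejected each round, the "assigned something, keep going" condition never turns false, and only the cap breaks the loop.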
[jira] [Created] (YARN-8688) Duplicate queue names in fair scheduler allocation file
Shen Yinjie created YARN-8688:
---------------------------------
Summary: Duplicate queue names in fair scheduler allocation file
Key: YARN-8688
URL: https://issues.apache.org/jira/browse/YARN-8688
Project: Hadoop YARN
Issue Type: Improvement
Affects Versions: 3.1.0, 2.8.2
Reporter: Shen Yinjie

When duplicate queue names are configured in the fair scheduler allocation file, the RM cannot recognize the error, even after an RM restart.
[jira] [Commented] (YARN-8569) Create an interface to provide cluster information to application
[ https://issues.apache.org/jira/browse/YARN-8569?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16586769#comment-16586769 ] Eric Yang commented on YARN-8569:
---------------------------------
First patch to demo what the interface looks like. Here is my test JSON for launching the app:
{code}
{
  "name": "sleeper-service",
  "version": "1.0",
  "components": [
    {
      "name": "ping",
      "number_of_containers": 2,
      "artifact": {
        "id": "hadoop/centos:latest",
        "type": "DOCKER"
      },
      "launch_command": "sleep,1",
      "resource": {
        "cpus": 1,
        "memory": "256"
      },
      "restart_policy": "NEVER",
      "configuration": {
        "env": {
          "YARN_CONTAINER_RUNTIME_DOCKER_DELAYED_REMOVAL": "true",
          "YARN_CONTAINER_RUNTIME_DOCKER_RUN_OVERRIDE_DISABLE": "true",
          "YARN_CONTAINER_RUNTIME_YARN_SYSFS": "true"
        },
        "properties": {
          "docker.network": "host"
        }
      }
    }
  ]
}
{code}
The patch localizes a copy of service.json in HDFS and distributes it as a localized resource to the nodes that are going to start the container. The other copy, [appname].json, is not used because the file's content changes so frequently that it triggers an IOException during localization. Let me know if this is the direction we want to continue in. If all looks good, I will provide live updates to this file when the application transitions between STARTED, FLEXING, STABLE, etc.

> Create an interface to provide cluster information to application
> -----------------------------------------------------------------
>
>                 Key: YARN-8569
>                 URL: https://issues.apache.org/jira/browse/YARN-8569
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>            Reporter: Eric Yang
>            Assignee: Eric Yang
>            Priority: Major
>              Labels: Docker
>      Attachments: YARN-8569.001.patch
>
> Some programs require container hostnames to be known for the application to run.
> For example, distributed tensorflow requires a launch_command that looks like:
> {code}
> # On ps0.example.com:
> $ python trainer.py \
> --ps_hosts=ps0.example.com:,ps1.example.com: \
> --worker_hosts=worker0.example.com:,worker1.example.com: \
> --job_name=ps --task_index=0
> # On ps1.example.com:
> $ python trainer.py \
> --ps_hosts=ps0.example.com:,ps1.example.com: \
> --worker_hosts=worker0.example.com:,worker1.example.com: \
> --job_name=ps --task_index=1
> # On worker0.example.com:
> $ python trainer.py \
> --ps_hosts=ps0.example.com:,ps1.example.com: \
> --worker_hosts=worker0.example.com:,worker1.example.com: \
> --job_name=worker --task_index=0
> # On worker1.example.com:
> $ python trainer.py \
> --ps_hosts=ps0.example.com:,ps1.example.com: \
> --worker_hosts=worker0.example.com:,worker1.example.com: \
> --job_name=worker --task_index=1
> {code}
> This is a bit cumbersome to orchestrate via Distributed Shell or YARN Services launch_command. In addition, the dynamic parameters do not work with the YARN flex command. This is the classic pain point for application developers attempting to automate system environment settings as parameters to the end-user application.
> It would be great if the YARN Docker integration could provide a simple option to expose the hostnames of the YARN service via a mounted file. The file content gets updated when a flex command is performed. This allows application developers to consume system environment settings via a standard interface. It is like /proc/devices for Linux, but for Hadoop. This may involve updating a file in the distributed cache and allowing the file to be mounted via container-executor.
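As a sketch of how an application could consume the proposed cluster information, a container entrypoint might derive the tensorflow-style host-list arguments from the component hostnames. Everything here is an assumption for illustration: the hostnames would come from the mounted file this JIRA proposes, and port 2222 is an arbitrary example value (the ports are elided in the launch commands above).

```java
import java.util.List;

public class ClusterInfoDemo {

    // Build a "host:port,host:port,..." argument from component hostnames.
    // The hostnames would be read from the mounted cluster-info file
    // proposed in this JIRA; the port is a caller-supplied example value.
    static String hostsArg(List<String> hosts, int port) {
        StringBuilder sb = new StringBuilder();
        for (String h : hosts) {
            if (sb.length() > 0) {
                sb.append(',');
            }
            sb.append(h).append(':').append(port);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Example worker component with two containers:
        System.out.println(
            hostsArg(List.of("worker0.example.com", "worker1.example.com"), 2222));
    }
}
```

With such a helper, the --ps_hosts and --worker_hosts parameters could be regenerated from the mounted file each time a flex command changes the container set, instead of being hard-coded per node.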
[jira] [Updated] (YARN-8569) Create an interface to provide cluster information to application
[ https://issues.apache.org/jira/browse/YARN-8569?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eric Yang updated YARN-8569:
----------------------------
Attachment: YARN-8569.001.patch

> Create an interface to provide cluster information to application
> -----------------------------------------------------------------
>
>                 Key: YARN-8569
>                 URL: https://issues.apache.org/jira/browse/YARN-8569
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>            Reporter: Eric Yang
>            Assignee: Eric Yang
>            Priority: Major
>              Labels: Docker
>      Attachments: YARN-8569.001.patch
>
[jira] [Commented] (YARN-8509) Total pending resource calculation in preemption should use user-limit factor instead of minimum-user-limit-percent
[ https://issues.apache.org/jira/browse/YARN-8509?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16586721#comment-16586721 ] genericqa commented on YARN-8509:
---------------------------------
| (x) *-1 overall* |

|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 0m 39s | Docker mode activated. |
|| Prechecks ||
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
| +1 | test4tests | 0m 0s | The patch appears to include 6 new or modified test files. |
|| trunk Compile Tests ||
| +1 | mvninstall | 21m 40s | trunk passed |
| +1 | compile | 0m 57s | trunk passed |
| +1 | checkstyle | 0m 56s | trunk passed |
| +1 | mvnsite | 0m 53s | trunk passed |
| +1 | shadedclient | 13m 27s | branch has no errors when building and testing our client artifacts. |
| +1 | findbugs | 1m 15s | trunk passed |
| +1 | javadoc | 0m 34s | trunk passed |
|| Patch Compile Tests ||
| +1 | mvninstall | 0m 54s | the patch passed |
| +1 | compile | 0m 50s | the patch passed |
| +1 | javac | 0m 50s | the patch passed |
| -0 | checkstyle | 0m 52s | hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager: The patch generated 13 new + 949 unchanged - 5 fixed = 962 total (was 954) |
| +1 | mvnsite | 0m 46s | the patch passed |
| +1 | whitespace | 0m 0s | The patch has no whitespace issues. |
| +1 | shadedclient | 12m 43s | patch has no errors when building and testing our client artifacts. |
| +1 | findbugs | 1m 17s | the patch passed |
| +1 | javadoc | 0m 26s | the patch passed |
|| Other Tests ||
| -1 | unit | 69m 6s | hadoop-yarn-server-resourcemanager in the patch failed. |
| +1 | asflicense | 0m 25s | The patch does not generate ASF License warnings. |
| | | 127m 7s | |

|| Reason || Tests ||
| Failed junit tests | hadoop.yarn.server.resourcemanager.scheduler.capacity.TestApplicationLimits |
| | hadoop.yarn.server.resourcemanager.monitor.capacity.TestProportionalCapacityPreemptionPolicy |
| | hadoop.yarn.server.resourcemanager.scheduler.capacity.TestApplicationLimitsByPartition |
| | hadoop.yarn.server.resourcemanager.reservation.TestCapacityOverTimePolicy |
| | hadoop.yarn.server.resourcemanager.scheduler.capacity.TestNodeLabelContainerAllocation |
| | hadoop.yarn.server.resourcemanager.scheduler.capacity.TestCapacityScheduler |

|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:ba1ab08 |
| JIRA Issue | YARN-8509 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12936340/YARN-8509.004.patch |
| Optional Tests | asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle |
| uname | Linux c057753f2232 3.13.0-153-generic #203-Ubuntu SMP Thu Jun 14 08:52:28 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality |
[jira] [Commented] (YARN-8298) Yarn Service Upgrade: Support express upgrade of a service
[ https://issues.apache.org/jira/browse/YARN-8298?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16586667#comment-16586667 ] genericqa commented on YARN-8298: - | (/) *{color:green}+1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 20s{color} | {color:blue} Docker mode activated. {color} | || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 5 new or modified test files. {color} | || || || || {color:brown} trunk Compile Tests {color} || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 14s{color} | {color:blue} Maven dependency ordering for branch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 17m 27s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 8m 42s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 1m 11s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 2m 23s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 14m 7s{color} | {color:green} branch has no errors when building and testing our client artifacts. 
{color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 3m 21s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 2m 9s{color} | {color:green} trunk passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 13s{color} | {color:blue} Maven dependency ordering for patch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 2m 3s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 7m 34s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} cc {color} | {color:green} 7m 34s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 7m 34s{color} | {color:green} the patch passed {color} | | {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange} 1m 22s{color} | {color:orange} hadoop-yarn-project/hadoop-yarn: The patch generated 6 new + 387 unchanged - 2 fixed = 393 total (was 389) {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 2m 21s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 11m 34s{color} | {color:green} patch has no errors when building and testing our client artifacts. 
{color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 4m 12s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 2m 7s{color} | {color:green} the patch passed {color} | || || || || {color:brown} Other Tests {color} || | {color:green}+1{color} | {color:green} unit {color} | {color:green} 3m 32s{color} | {color:green} hadoop-yarn-common in the patch passed. {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 25m 13s{color} | {color:green} hadoop-yarn-client in the patch passed. {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 12m 53s{color} | {color:green} hadoop-yarn-services-core in the patch passed. {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 1m 56s{color} | {color:green} hadoop-yarn-services-api in the patch passed. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 39s{color} | {color:green} The patch does not generate ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black}124m 28s{color} | {color:black} {color} | \\ \\ || Subsystem || Report/Notes || | Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:ba1ab08 | | JIRA Issue | YARN-8298 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12936332/YARN-8298.005.patch | | Optional Tests | asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs
[jira] [Commented] (YARN-3611) Support Docker Containers In LinuxContainerExecutor
[ https://issues.apache.org/jira/browse/YARN-3611?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16586630#comment-16586630 ] Shane Kumpf commented on YARN-3611: --- [~zhouyunfan] - thank you for your interest! Please see the [YARN containerization docs|https://github.com/apache/hadoop/blob/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site/src/site/markdown/DockerContainers.md] as a starting point. If you have specific questions after that, please do reach out on the hadoop-user mailing list. > Support Docker Containers In LinuxContainerExecutor > --- > > Key: YARN-3611 > URL: https://issues.apache.org/jira/browse/YARN-3611 > Project: Hadoop YARN > Issue Type: Bug > Components: yarn >Reporter: Sidharta Seethana >Assignee: Sidharta Seethana >Priority: Major > Labels: Docker > > Support Docker Containers In LinuxContainerExecutor > LinuxContainerExecutor provides useful functionality today with respect to > localization, cgroups based resource management and isolation for CPU, > network, disk etc. as well as security with a well-defined mechanism to > execute privileged operations using the container-executor utility. Bringing > docker support to LinuxContainerExecutor lets us use all of this > functionality when running docker containers under YARN, while not requiring > users and admins to configure and use a different ContainerExecutor. > There are several aspects here that need to be worked through : > * Mechanism(s) to let clients request docker-specific functionality - we > could initially implement this via environment variables without impacting > the client API. 
> * Security - both docker daemon as well as application > * Docker image localization > * Running a docker container via container-executor as a specified user > * “Isolate” the docker container in terms of CPU/network/disk/etc > * Communicating with and/or signaling the running container (ensure correct > pid handling) > * Figure out workarounds for certain performance-sensitive scenarios like > HDFS short-circuit reads > * All of these need to be achieved without changing the current behavior of > LinuxContainerExecutor -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
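The environment-variable mechanism mentioned in the first bullet above can be sketched as follows. This is only an illustration of the idea (the `DockerEnvExample` class name is invented); the `YARN_CONTAINER_RUNTIME_*` variable names are the ones used in the MR pi example later in this thread:

```java
import java.util.HashMap;
import java.util.Map;

public class DockerEnvExample {
    // Builds the container environment a client would set to request the
    // Docker runtime from LinuxContainerExecutor, without any client API change.
    public static Map<String, String> dockerRuntimeEnv(String image) {
        Map<String, String> env = new HashMap<>();
        // Select the Docker container runtime
        env.put("YARN_CONTAINER_RUNTIME_TYPE", "docker");
        // Image to launch the container from
        env.put("YARN_CONTAINER_RUNTIME_DOCKER_IMAGE", image);
        return env;
    }
}
```

A client would merge this map into the container launch environment it already submits.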
[jira] [Comment Edited] (YARN-8623) Update Docker examples to use image which exists
[ https://issues.apache.org/jira/browse/YARN-8623?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16586592#comment-16586592 ] Shane Kumpf edited comment on YARN-8623 at 8/20/18 10:34 PM: - [~elek] - thanks, those details are helpful. It does appear _apache/hadoop-runner_ is closer to what we want than I originally thought, but the user setup clashes with our needs. With a goal of trying to provide a working MR pi example, MapReduce expects to run (and write data) as the end user (or a static local user, such as nobody, depending on config), so we need to propagate the user identity into the container. I expect Spark needs this as well. Removing the use of sudo in the entrypoint script, gating that {{sudo chmod}} in the starter script via an env variable, or opening up the sudo rules would all seem to work to allow us to use this for YARN as well. I think we should open a separate HADOOP Jira to discuss making the image work for both cases if that makes sense to others. [~elek] [~ccondit-target] thoughts? was (Author: shaneku...@gmail.com): [~elek] - thanks, those details are helpful. It does appear _apache/hadoop-runner_ is closer to what we want than I originally thought, but the user setup clashes with our needs. With a goal of trying to provide a working MR pi example, MapReduce expects to run (and write data) as the end user (or a static local user, such as nobody, depending on config). I expect Spark does as well. Removing the use of sudo in the entrypoint script, gating that {{sudo chmod}} in the starter script via an env variable, or opening up the sudo rules would all seem to work to allow us to use this for YARN as well. I think we should open a separate HADOOP Jira to discuss making the image work for both cases if that makes sense to others. [~elek] [~ccondit-target] thoughts? 
> Update Docker examples to use image which exists > > > Key: YARN-8623 > URL: https://issues.apache.org/jira/browse/YARN-8623 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Craig Condit >Priority: Minor > Labels: Docker > > The example Docker image given in the documentation > (images/hadoop-docker:latest) does not exist. We could change > images/hadoop-docker:latest to apache/hadoop-runner:latest, which does exist. > We'd need to do a quick sanity test to see if the image works with YARN.
[jira] [Commented] (YARN-8623) Update Docker examples to use image which exists
[ https://issues.apache.org/jira/browse/YARN-8623?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16586601#comment-16586601 ] Shane Kumpf commented on YARN-8623: --- I was able to run the below MR pi job with the modified _apache/hadoop-runner_ image, after a quick hack to the sudo rules.
{code:java}
YARN_EXAMPLES_JAR=$HADOOP_HOME/share/hadoop/mapreduce/hadoop-mapreduce-examples-*.jar
IMAGE_ID="local/hadoop-runner-new:latest"
MOUNTS="/usr/local/hadoop:/usr/local/hadoop:ro,/etc/hadoop/conf:/etc/hadoop/conf:ro,/etc/passwd:/etc/passwd:ro,/etc/group:/etc/group:ro"
yarn jar $YARN_EXAMPLES_JAR pi \
  -Dmapreduce.map.env.YARN_CONTAINER_RUNTIME_TYPE=docker \
  -Dmapreduce.map.env.YARN_CONTAINER_RUNTIME_DOCKER_MOUNTS=$MOUNTS \
  -Dmapreduce.map.env.YARN_CONTAINER_RUNTIME_DOCKER_IMAGE=$IMAGE_ID \
  -Dmapreduce.reduce.env.YARN_CONTAINER_RUNTIME_TYPE=docker \
  -Dmapreduce.reduce.env.YARN_CONTAINER_RUNTIME_DOCKER_MOUNTS=$MOUNTS \
  -Dmapreduce.reduce.env.YARN_CONTAINER_RUNTIME_DOCKER_IMAGE=$IMAGE_ID 1 4{code}
Hadoop bits were installed to {{/usr/local/hadoop}} on the host, and the Hadoop config is in {{/etc/hadoop/conf}} on the host. The appropriate mounts were added to {{docker.allowed.ro-mounts}} and the image prefix to {{docker.trusted.registries}} in {{container-executor.cfg}}. The above assumes the use of {{/etc/passwd}} and {{/etc/group}} for propagating the user and group into the container. We should point to the other ways of [managing user propagation|https://github.com/apache/hadoop/blob/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site/src/site/markdown/DockerContainers.md#user-management-in-docker-container] as part of this example documentation. 
> Update Docker examples to use image which exists > > > Key: YARN-8623 > URL: https://issues.apache.org/jira/browse/YARN-8623 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Craig Condit >Priority: Minor > Labels: Docker > > The example Docker image given in the documentation > (images/hadoop-docker:latest) does not exist. We could change > images/hadoop-docker:latest to apache/hadoop-runner:latest, which does exist. > We'd need to do a quick sanity test to see if the image works with YARN.
[jira] [Updated] (YARN-8509) Total pending resource calculation in preemption should use user-limit factor instead of minimum-user-limit-percent
[ https://issues.apache.org/jira/browse/YARN-8509?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zian Chen updated YARN-8509: Attachment: YARN-8509.004.patch > Total pending resource calculation in preemption should use user-limit factor > instead of minimum-user-limit-percent > --- > > Key: YARN-8509 > URL: https://issues.apache.org/jira/browse/YARN-8509 > Project: Hadoop YARN > Issue Type: Bug > Components: yarn >Reporter: Zian Chen >Assignee: Zian Chen >Priority: Major > Labels: capacityscheduler > Attachments: YARN-8509.001.patch, YARN-8509.002.patch, > YARN-8509.003.patch, YARN-8509.004.patch > > > In LeafQueue#getTotalPendingResourcesConsideringUserLimit, we calculate total > pending resources based on the user-limit percent and user-limit factor, which > caps pending resources for each user at the minimum of the user-limit pending and > the actual pending. This prevents the queue from taking more pending resources to > achieve queue balance after all queues are satisfied with their ideal allocation. > > We need to change the logic to let queue pending go beyond the user limit.
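The capping behavior described in the issue above can be sketched as below. This is a hedged illustration of the min(user-limit pending, actual pending) logic, not the actual LeafQueue code; the class and method names are invented:

```java
public class PendingCapSketch {
    // Caps a user's contribution to the queue's total pending resource at the
    // user's remaining headroom under the user limit, i.e.
    // min(actual pending, user-limit pending).
    public static long cappedPending(long actualPending, long userLimit, long userUsed) {
        long headroom = Math.max(0, userLimit - userUsed);
        return Math.min(actualPending, headroom);
    }
}
```

With a cap like this, a user with large actual pending contributes only its headroom, which is why queue pending cannot grow beyond the user limit.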
[jira] [Commented] (YARN-8623) Update Docker examples to use image which exists
[ https://issues.apache.org/jira/browse/YARN-8623?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16586592#comment-16586592 ] Shane Kumpf commented on YARN-8623: --- [~elek] - thanks, those details are helpful. It does appear _apache/hadoop-runner_ is closer to what we want than I originally thought, but the user setup clashes with our needs. With a goal of trying to provide a working MR pi example, MapReduce expects to run (and write data) as the end user (or a static local user, such as nobody, depending on config). I expect Spark does as well. Removing the use of sudo in the entrypoint script, gating that {{sudo chmod}} in the starter script via an env variable, or opening up the sudo rules would all seem to work to allow us to use this for YARN as well. I think we should open a separate HADOOP Jira to discuss making the image work for both cases if that makes sense to others. [~elek] [~ccondit-target] thoughts? > Update Docker examples to use image which exists > > > Key: YARN-8623 > URL: https://issues.apache.org/jira/browse/YARN-8623 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Craig Condit >Priority: Minor > Labels: Docker > > The example Docker image given in the documentation > (images/hadoop-docker:latest) does not exist. We could change > images/hadoop-docker:latest to apache/hadoop-runner:latest, which does exist. > We'd need to do a quick sanity test to see if the image works with YARN.
[jira] [Commented] (YARN-8673) [AMRMProxy] More robust responseId resync after an YarnRM master slave switch
[ https://issues.apache.org/jira/browse/YARN-8673?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16586591#comment-16586591 ] genericqa commented on YARN-8673: - | (/) *{color:green}+1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 16s{color} | {color:blue} Docker mode activated. {color} | || || || || {color:brown} Prechecks {color} || | {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue} 0m 0s{color} | {color:blue} Findbugs executables are not available. {color} | | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 1 new or modified test files. {color} | || || || || {color:brown} branch-2 Compile Tests {color} || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 26s{color} | {color:blue} Maven dependency ordering for branch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 14m 45s{color} | {color:green} branch-2 passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 7m 1s{color} | {color:green} branch-2 passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 54s{color} | {color:green} branch-2 passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 3m 3s{color} | {color:green} branch-2 passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 2m 24s{color} | {color:green} branch-2 passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 12s{color} | {color:blue} Maven dependency ordering for patch {color} | | 
{color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 2m 24s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 6m 13s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 6m 13s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 51s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 2m 53s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 2m 11s{color} | {color:green} the patch passed {color} | || || || || {color:brown} Other Tests {color} || | {color:green}+1{color} | {color:green} unit {color} | {color:green} 3m 16s{color} | {color:green} hadoop-yarn-common in the patch passed. {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 2m 24s{color} | {color:green} hadoop-yarn-server-common in the patch passed. {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 15m 8s{color} | {color:green} hadoop-yarn-server-nodemanager in the patch passed. {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 59m 52s{color} | {color:green} hadoop-yarn-server-resourcemanager in the patch passed. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 39s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} | | {color:black}{color} | {color:black} {color} | {color:black}126m 52s{color} | {color:black} {color} | \\ \\ || Subsystem || Report/Notes || | Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:a716388 | | JIRA Issue | YARN-8673 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12936319/YARN-8673-branch-2.v2.patch | | Optional Tests | asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle | | uname | Linux 3892fdc4613e 3.13.0-143-generic #192-Ubuntu SMP Tue Feb 27 10:45:36 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | /testptch/patchprocess/precommit/personality/provided.sh | | git revision | branch-2 / 18ebe18 | | maven | version: Apache Maven 3.3.9 (bb52d8502b132ec0a5a3f4c09453c07478323dc5; 2015-11-10T16:41:47+00:00) | | Default Java | 1.7.0_181 | | Test Results | https://builds.apache.org/job/PreCommit-YARN-Build/21638/testReport/ | | Max. process+thread count | 788 (vs. ulimit of 1) | | modules | C: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common
[jira] [Updated] (YARN-8599) Build Master module for MaWo app
[ https://issues.apache.org/jira/browse/YARN-8599?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yesha Vora updated YARN-8599: - Attachment: YARN-8599.001.patch > Build Master module for MaWo app > > > Key: YARN-8599 > URL: https://issues.apache.org/jira/browse/YARN-8599 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Yesha Vora >Assignee: Yesha Vora >Priority: Major > Attachments: YARN-8599.001.patch > > > The Master component of the MaWo application is responsible for driving end-to-end > job execution. Its responsibilities are: > * Get the Job definition and create a Queue of Tasks > * Assign Tasks to Workers > * Manage the Worker lifecycle
[jira] [Updated] (YARN-8598) Build Master Job Module for MaWo Application
[ https://issues.apache.org/jira/browse/YARN-8598?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yesha Vora updated YARN-8598: - Attachment: YARN-8598.001.patch > Build Master Job Module for MaWo Application > > > Key: YARN-8598 > URL: https://issues.apache.org/jira/browse/YARN-8598 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Yesha Vora >Assignee: Yesha Vora >Priority: Major > Attachments: YARN-8598.001.patch > > > A job in the MaWo application is a collection of Tasks. A Job consists of a setup > task, a list of tasks and a teardown task. > * JobBuilder > ** SimpleTaskJobBuilder: SimpleJobBuilder should be able to parse a simple > job description file. In this file format, each line is considered a Task. > ** SimpleTaskJsonJobBuilder: Utility to parse a JSON job description file.
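The "each line is a Task" format described above could be parsed as sketched below. This is purely illustrative (the MaWo patch is not public here, and the class name is invented):

```java
import java.util.ArrayList;
import java.util.List;

public class SimpleJobParseSketch {
    // Each non-empty line of the simple job description becomes one task command.
    public static List<String> parseTasks(String description) {
        List<String> tasks = new ArrayList<>();
        for (String line : description.split("\n")) {
            String trimmed = line.trim();
            if (!trimmed.isEmpty()) {
                tasks.add(trimmed);
            }
        }
        return tasks;
    }
}
```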
[jira] [Updated] (YARN-8597) Build Worker utility for MaWo Application
[ https://issues.apache.org/jira/browse/YARN-8597?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yesha Vora updated YARN-8597: - Attachment: YARN-8597.001.patch > Build Worker utility for MaWo Application > - > > Key: YARN-8597 > URL: https://issues.apache.org/jira/browse/YARN-8597 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Yesha Vora >Assignee: Yesha Vora >Priority: Major > Attachments: YARN-8597.001.patch > > > The worker is responsible for executing Tasks. > * Worker > ** Create a worker class which drives the worker life cycle > ** Create the WorkAssignment Protocol. It should handle register/deregister of > workers and sending heartbeats > ** Lifecycle: Register worker, Run Setup Task, Get Task from master and > execute it using TaskRunner, Run Teardown Task > * TaskRunner > ** Simple Task Runner: This runner should be able to execute a simple task > ** Composite Task Runner: This runner should be able to execute a composite > task > * TaskWallTimeLimiter > ** Create a utility which can abort the task if the execution time exceeds > the task timeout.
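The TaskWallTimeLimiter utility described above is commonly built on a bounded `Future.get`. A minimal sketch, assuming interruption is an acceptable abort mechanism (names invented, not the actual MaWo code):

```java
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

public class WallTimeLimiterSketch {
    // Runs the task; returns true if it finishes within timeoutMillis,
    // otherwise interrupts it and returns false.
    public static boolean runWithTimeout(Runnable task, long timeoutMillis)
            throws InterruptedException {
        ExecutorService pool = Executors.newSingleThreadExecutor();
        Future<?> future = pool.submit(task);
        try {
            future.get(timeoutMillis, TimeUnit.MILLISECONDS);
            return true;
        } catch (TimeoutException e) {
            future.cancel(true); // interrupt the overrunning task
            return false;
        } catch (ExecutionException e) {
            return false; // the task threw; treat as failed
        } finally {
            pool.shutdownNow();
        }
    }
}
```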
[jira] [Updated] (YARN-8597) Build Worker utility for MaWo Application
[ https://issues.apache.org/jira/browse/YARN-8597?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yesha Vora updated YARN-8597: - Attachment: (was: YARN-8597.001.patch) > Build Worker utility for MaWo Application > - > > Key: YARN-8597 > URL: https://issues.apache.org/jira/browse/YARN-8597 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Yesha Vora >Assignee: Yesha Vora >Priority: Major > Attachments: YARN-8597.001.patch > > > The worker is responsible for executing Tasks. > * Worker > ** Create a worker class which drives the worker life cycle > ** Create the WorkAssignment Protocol. It should handle register/deregister of > workers and sending heartbeats > ** Lifecycle: Register worker, Run Setup Task, Get Task from master and > execute it using TaskRunner, Run Teardown Task > * TaskRunner > ** Simple Task Runner: This runner should be able to execute a simple task > ** Composite Task Runner: This runner should be able to execute a composite > task > * TaskWallTimeLimiter > ** Create a utility which can abort the task if the execution time exceeds > the task timeout.
[jira] [Commented] (YARN-8581) [AMRMProxy] Add sub-cluster timeout in LocalityMulticastAMRMProxyPolicy
[ https://issues.apache.org/jira/browse/YARN-8581?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16586549#comment-16586549 ] Giovanni Matteo Fumarola commented on YARN-8581: Thanks [~botong] . Committed to trunk. > [AMRMProxy] Add sub-cluster timeout in LocalityMulticastAMRMProxyPolicy > --- > > Key: YARN-8581 > URL: https://issues.apache.org/jira/browse/YARN-8581 > Project: Hadoop YARN > Issue Type: Sub-task > Components: amrmproxy, federation >Reporter: Botong Huang >Assignee: Botong Huang >Priority: Major > Attachments: YARN-8581-branch-2.v2.patch, YARN-8581.v1.patch, > YARN-8581.v2.patch > > > In Federation, every time an AM heartbeat comes in, > LocalityMulticastAMRMProxyPolicy in AMRMProxy splits the asks according to > the list of active and enabled sub-clusters. However, if we haven't been able > to heartbeat to a sub-cluster for some time (network issues, or we keep > hitting some exception from YarnRM, or YarnRM master-slave switch is taking a > long time etc.), we should consider the sub-cluster as unhealthy and stop > routing asks there, until the heartbeat channel becomes healthy again.
[jira] [Commented] (YARN-8581) [AMRMProxy] Add sub-cluster timeout in LocalityMulticastAMRMProxyPolicy
[ https://issues.apache.org/jira/browse/YARN-8581?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16586548#comment-16586548 ] Hudson commented on YARN-8581: -- FAILURE: Integrated in Jenkins build Hadoop-trunk-Commit #14806 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/14806/]) YARN-8581. [AMRMProxy] Add sub-cluster timeout in (gifuma: rev e0f6ffdbad6f43fd43ec57fb68ebf5275b8b9ba0) * (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/test/java/org/apache/hadoop/yarn/conf/TestYarnConfigurationFields.java * (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-common/src/main/java/org/apache/hadoop/yarn/server/federation/utils/FederationStateStoreFacade.java * (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-common/src/test/java/org/apache/hadoop/yarn/server/federation/utils/FederationPoliciesTestUtil.java * (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-common/src/test/java/org/apache/hadoop/yarn/server/federation/policies/amrmproxy/TestLocalityMulticastAMRMProxyPolicy.java * (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-common/src/main/java/org/apache/hadoop/yarn/server/federation/policies/amrmproxy/LocalityMulticastAMRMProxyPolicy.java * (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/java/org/apache/hadoop/yarn/conf/YarnConfiguration.java > [AMRMProxy] Add sub-cluster timeout in LocalityMulticastAMRMProxyPolicy > --- > > Key: YARN-8581 > URL: https://issues.apache.org/jira/browse/YARN-8581 > Project: Hadoop YARN > Issue Type: Sub-task > Components: amrmproxy, federation >Reporter: Botong Huang >Assignee: Botong Huang >Priority: Major > Attachments: YARN-8581-branch-2.v2.patch, YARN-8581.v1.patch, > YARN-8581.v2.patch > > > In Federation, every time an AM heartbeat comes in, > LocalityMulticastAMRMProxyPolicy in AMRMProxy splits the asks according to > the list of active and enabled sub-clusters. 
However, if we haven't been able > to heartbeat to a sub-cluster for some time (network issues, or we keep > hitting some exception from YarnRM, or YarnRM master-slave switch is taking a > long time etc.), we should consider the sub-cluster as unhealthy and stop > routing asks there, until the heartbeat channel becomes healthy again.
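The health tracking described above amounts to remembering the last successful heartbeat per sub-cluster and skipping any sub-cluster whose heartbeat is older than the timeout. A minimal sketch (class and method names invented, not the actual LocalityMulticastAMRMProxyPolicy code):

```java
import java.util.HashMap;
import java.util.Map;

public class SubClusterHealthSketch {
    private final long timeoutMs;
    private final Map<String, Long> lastHeartbeat = new HashMap<>();

    public SubClusterHealthSketch(long timeoutMs) {
        this.timeoutMs = timeoutMs;
    }

    // Record a successful heartbeat to a sub-cluster.
    public void recordHeartbeat(String subCluster, long nowMs) {
        lastHeartbeat.put(subCluster, nowMs);
    }

    // A sub-cluster is healthy only if it heartbeated within the timeout window;
    // unhealthy sub-clusters would be skipped when splitting asks.
    public boolean isHealthy(String subCluster, long nowMs) {
        Long last = lastHeartbeat.get(subCluster);
        return last != null && nowMs - last <= timeoutMs;
    }
}
```

Once a heartbeat succeeds again, `recordHeartbeat` makes the sub-cluster eligible for routing without any explicit reset.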
[jira] [Updated] (YARN-8298) Yarn Service Upgrade: Support express upgrade of a service
[ https://issues.apache.org/jira/browse/YARN-8298?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chandni Singh updated YARN-8298: Attachment: YARN-8298.005.patch > Yarn Service Upgrade: Support express upgrade of a service > -- > > Key: YARN-8298 > URL: https://issues.apache.org/jira/browse/YARN-8298 > Project: Hadoop YARN > Issue Type: Sub-task >Affects Versions: 3.1.1 >Reporter: Chandni Singh >Assignee: Chandni Singh >Priority: Major > Attachments: YARN-8298.001.patch, YARN-8298.002.patch, > YARN-8298.003.patch, YARN-8298.004.patch, YARN-8298.005.patch > > > Currently service upgrade involves 2 steps > * initiate upgrade by providing new spec > * trigger upgrade of each instance/component > > We need to add the ability to upgrade the service in one shot: > # Aborting the upgrade will not be supported > # Upgrade finalization will be done automatically.
[jira] [Commented] (YARN-8298) Yarn Service Upgrade: Support express upgrade of a service
[ https://issues.apache.org/jira/browse/YARN-8298?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16586541#comment-16586541 ] genericqa commented on YARN-8298: - (x) *-1 overall*

|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 0m 40s | Docker mode activated. |
|| || || || Prechecks ||
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
| +1 | test4tests | 0m 0s | The patch appears to include 5 new or modified test files. |
|| || || || trunk Compile Tests ||
| 0 | mvndep | 0m 13s | Maven dependency ordering for branch |
| +1 | mvninstall | 17m 47s | trunk passed |
| +1 | compile | 10m 49s | trunk passed |
| +1 | checkstyle | 1m 28s | trunk passed |
| +1 | mvnsite | 2m 25s | trunk passed |
| +1 | shadedclient | 14m 14s | branch has no errors when building and testing our client artifacts. |
| +1 | findbugs | 3m 22s | trunk passed |
| +1 | javadoc | 1m 53s | trunk passed |
|| || || || Patch Compile Tests ||
| 0 | mvndep | 0m 12s | Maven dependency ordering for patch |
| +1 | mvninstall | 2m 1s | the patch passed |
| +1 | compile | 9m 49s | the patch passed |
| +1 | cc | 9m 49s | the patch passed |
| +1 | javac | 9m 49s | the patch passed |
| -0 | checkstyle | 1m 21s | hadoop-yarn-project/hadoop-yarn: The patch generated 6 new + 387 unchanged - 2 fixed = 393 total (was 389) |
| +1 | mvnsite | 2m 14s | the patch passed |
| -1 | whitespace | 0m 0s | The patch has 1 line(s) that end in whitespace. Use git apply --whitespace=fix <>. Refer https://git-scm.com/docs/git-apply |
| +1 | shadedclient | 10m 58s | patch has no errors when building and testing our client artifacts. |
| +1 | findbugs | 4m 8s | the patch passed |
| +1 | javadoc | 1m 47s | the patch passed |
|| || || || Other Tests ||
| +1 | unit | 3m 26s | hadoop-yarn-common in the patch passed. |
| +1 | unit | 24m 51s | hadoop-yarn-client in the patch passed. |
| +1 | unit | 12m 48s | hadoop-yarn-services-core in the patch passed. |
| +1 | unit | 1m 49s | hadoop-yarn-services-api in the patch passed. |
| +1 | asflicense | 0m 36s | The patch does not generate ASF License warnings. |
| | | 127m 6s | |

|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:ba1ab08 |
| JIRA Issue | YARN-8298 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12936309/YARN-8298.004.patch |
| Optional Tests | asflicense
[jira] [Commented] (YARN-8581) [AMRMProxy] Add sub-cluster timeout in LocalityMulticastAMRMProxyPolicy
[ https://issues.apache.org/jira/browse/YARN-8581?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16586539#comment-16586539 ] genericqa commented on YARN-8581: - (/) *+1 overall*

|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 0m 29s | Docker mode activated. |
|| || || || Prechecks ||
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
| +1 | test4tests | 0m 0s | The patch appears to include 3 new or modified test files. |
|| || || || trunk Compile Tests ||
| 0 | mvndep | 0m 22s | Maven dependency ordering for branch |
| +1 | mvninstall | 21m 5s | trunk passed |
| +1 | compile | 14m 52s | trunk passed |
| +1 | checkstyle | 1m 49s | trunk passed |
| +1 | mvnsite | 1m 49s | trunk passed |
| +1 | shadedclient | 16m 46s | branch has no errors when building and testing our client artifacts. |
| +1 | findbugs | 3m 4s | trunk passed |
| +1 | javadoc | 1m 30s | trunk passed |
|| || || || Patch Compile Tests ||
| 0 | mvndep | 0m 19s | Maven dependency ordering for patch |
| +1 | mvninstall | 1m 16s | the patch passed |
| +1 | compile | 12m 46s | the patch passed |
| +1 | javac | 12m 46s | the patch passed |
| +1 | checkstyle | 1m 41s | the patch passed |
| +1 | mvnsite | 1m 41s | the patch passed |
| +1 | whitespace | 0m 0s | The patch has no whitespace issues. |
| +1 | shadedclient | 13m 48s | patch has no errors when building and testing our client artifacts. |
| +1 | findbugs | 3m 6s | the patch passed |
| +1 | javadoc | 1m 26s | the patch passed |
|| || || || Other Tests ||
| +1 | unit | 0m 58s | hadoop-yarn-api in the patch passed. |
| +1 | unit | 2m 43s | hadoop-yarn-server-common in the patch passed. |
| +1 | asflicense | 0m 54s | The patch does not generate ASF License warnings. |
| | | 101m 45s | |

|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:ba1ab08 |
| JIRA Issue | YARN-8581 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12936314/YARN-8581.v2.patch |
| Optional Tests | asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle |
| uname | Linux 8c46de123fc4 3.13.0-153-generic #203-Ubuntu SMP Thu Jun 14 08:52:28 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / 8736fc3 |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0_171 |
| findbugs | v3.1.0-RC1 |
| Test Results | https://builds.apache.org/job/PreCommit-YARN-Build/21636/testReport/ |
| Max. process+thread count | 301 (vs. ulimit of 1) |
| modules | C:
[jira] [Commented] (YARN-8581) [AMRMProxy] Add sub-cluster timeout in LocalityMulticastAMRMProxyPolicy
[ https://issues.apache.org/jira/browse/YARN-8581?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16586493#comment-16586493 ] genericqa commented on YARN-8581: - (/) *+1 overall*

|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 15m 38s | Docker mode activated. |
|| || || || Prechecks ||
| 0 | findbugs | 0m 0s | Findbugs executables are not available. |
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
| +1 | test4tests | 0m 0s | The patch appears to include 3 new or modified test files. |
|| || || || branch-2 Compile Tests ||
| 0 | mvndep | 0m 27s | Maven dependency ordering for branch |
| +1 | mvninstall | 14m 41s | branch-2 passed |
| +1 | compile | 7m 8s | branch-2 passed |
| +1 | checkstyle | 0m 58s | branch-2 passed |
| +1 | mvnsite | 1m 24s | branch-2 passed |
| +1 | javadoc | 1m 1s | branch-2 passed |
|| || || || Patch Compile Tests ||
| 0 | mvndep | 0m 12s | Maven dependency ordering for patch |
| +1 | mvninstall | 1m 3s | the patch passed |
| +1 | compile | 6m 16s | the patch passed |
| +1 | javac | 6m 16s | the patch passed |
| +1 | checkstyle | 0m 52s | the patch passed |
| +1 | mvnsite | 1m 17s | the patch passed |
| +1 | whitespace | 0m 0s | The patch has no whitespace issues. |
| +1 | javadoc | 0m 59s | the patch passed |
|| || || || Other Tests ||
| +1 | unit | 0m 41s | hadoop-yarn-api in the patch passed. |
| +1 | unit | 2m 28s | hadoop-yarn-server-common in the patch passed. |
| +1 | asflicense | 0m 38s | The patch does not generate ASF License warnings. |
| | | 57m 40s | |

|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:a716388 |
| JIRA Issue | YARN-8581 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12936315/YARN-8581-branch-2.v2.patch |
| Optional Tests | asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle |
| uname | Linux 503aaef576f0 3.13.0-143-generic #192-Ubuntu SMP Tue Feb 27 10:45:36 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | branch-2 / 18ebe18 |
| maven | version: Apache Maven 3.3.9 (bb52d8502b132ec0a5a3f4c09453c07478323dc5; 2015-11-10T16:41:47+00:00) |
| Default Java | 1.7.0_181 |
| Test Results | https://builds.apache.org/job/PreCommit-YARN-Build/21637/testReport/ |
| Max. process+thread count | 86 (vs. ulimit of 1) |
| modules | C: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-common U: hadoop-yarn-project/hadoop-yarn |
| Console output | https://builds.apache.org/job/PreCommit-YARN-Build/21637/console |
| Powered by | Apache Yetus 0.8.0-SNAPSHOT http://yetus.apache.org |

This message was automatically generated. > [AMRMProxy] Add sub-cluster timeout in
[jira] [Commented] (YARN-8298) Yarn Service Upgrade: Support express upgrade of a service
[ https://issues.apache.org/jira/browse/YARN-8298?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16586458#comment-16586458 ] Chandni Singh commented on YARN-8298: - {quote} We have built a sufficient number of knobs for an individual container to upgrade in a rolling fashion. However, it will depend on an external orchestrator to perform the rolling upgrade. Express upgrade is designed to be atomic; therefore, it simplifies the upgrade process by doing all instances of a component in parallel. A Docker container takes only a few seconds to stop and start, so the interruption time is minimized to a few seconds {quote} [~eyang] When an express upgrade is performed, I am of the opinion that the upgrade of a single component should be done in a rolling fashion; otherwise, if there is a failure, the service is disrupted. If we provide express upgrade, that should be the default behavior. If the upgrade of an instance fails, the other instances of the component should not be upgraded. A Docker container may take only a few seconds to stop and start, but meanwhile the other instances of the component will remain active. Besides that, with the 2nd approach, I meant that the scheduler should not do any sort of orchestration, including upgrading instances of a particular component before another. This is blocked by YARN-8665 as it needs support for cancelling an upgrade in case of failure. Given that, if you want to go with the 2nd approach, then patch 4 contains all the changes. 
> Yarn Service Upgrade: Support express upgrade of a service > -- > > Key: YARN-8298 > URL: https://issues.apache.org/jira/browse/YARN-8298 > Project: Hadoop YARN > Issue Type: Sub-task >Affects Versions: 3.1.1 >Reporter: Chandni Singh >Assignee: Chandni Singh >Priority: Major > Attachments: YARN-8298.001.patch, YARN-8298.002.patch, > YARN-8298.003.patch, YARN-8298.004.patch > > > Currently service upgrade involves 2 steps > * initiate upgrade by providing new spec > * trigger upgrade of each instance/component > > We need to add the ability to upgrade the service in one shot: > # Aborting the upgrade will not be supported > # Upgrade finalization will be done automatically.
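The rolling behavior argued for in the comments above — upgrade one instance at a time and stop at the first failure so remaining instances keep serving — can be sketched as follows. This is a hypothetical illustration of the discussed policy, not code from the YARN service framework.

```java
import java.util.List;
import java.util.function.Predicate;

// Hypothetical sketch of a rolling component upgrade: instances are
// upgraded one at a time, and a failure stops the rollout so the
// remaining instances keep serving the old version.
public class RollingUpgradeSketch {
    // Returns the number of instances successfully upgraded before
    // either finishing or hitting the first failure.
    public static int rollingUpgrade(List<String> instances,
                                     Predicate<String> upgradeOne) {
        int upgraded = 0;
        for (String instance : instances) {
            if (!upgradeOne.test(instance)) {
                break; // stop on first failure; do not touch the rest
            }
            upgraded++;
        }
        return upgraded;
    }

    public static void main(String[] args) {
        List<String> instances = List.of("comp-0", "comp-1", "comp-2");
        // Simulated upgrade that fails on comp-1: only comp-0 is upgraded,
        // comp-1 and comp-2 are left running the old version.
        int ok = rollingUpgrade(instances, id -> !id.equals("comp-1"));
        System.out.println(ok); // prints 1
    }
}
```

An express upgrade, by contrast, would apply `upgradeOne` to all instances in parallel, trading availability during a failed upgrade for a shorter total upgrade window.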
[jira] [Updated] (YARN-8673) [AMRMProxy] More robust responseId resync after an YarnRM master slave switch
[ https://issues.apache.org/jira/browse/YARN-8673?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Botong Huang updated YARN-8673: --- Attachment: YARN-8673-branch-2.v2.patch > [AMRMProxy] More robust responseId resync after an YarnRM master slave switch > - > > Key: YARN-8673 > URL: https://issues.apache.org/jira/browse/YARN-8673 > Project: Hadoop YARN > Issue Type: Sub-task > Components: amrmproxy >Reporter: Botong Huang >Assignee: Botong Huang >Priority: Major > Attachments: YARN-8673-branch-2.v2.patch, YARN-8673.v1.patch, > YARN-8673.v2.patch > > > After a master-slave switch of YarnRM, an _ApplicationNotRegisteredException_ > will be thrown from the new YarnRM. The AM will re-register and reset the > responseId to zero. _AMRMClientRelayer_ inside _FederationInterceptor_ > follows the same protocol, and does the automatic re-register and responseId > resync. However, when exceptions or a temporary network issue happen in the > allocate call after re-register, the resync logic might be broken. This patch > improves the robustness of the process by parsing the expected responseId > from the YarnRM exception message, so that whenever the responseId is out of sync > for whatever reason, we can automatically resync and move on.
[jira] [Commented] (YARN-8298) Yarn Service Upgrade: Support express upgrade of a service
[ https://issues.apache.org/jira/browse/YARN-8298?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16586442#comment-16586442 ] Eric Yang commented on YARN-8298: - [~csingh] Rolling upgrade on top of express upgrade is novelty idea, but most of the uninterrupted logic needs to come from software that hosted in the container rather than the upgrade framework itself. It would be good to proceed with option 2. We have built sufficient number of knobs for individual container to upgrade in a rolling fashion. However, it will depend on external orchestrator to perform the rolling upgrade. Express upgrade is design to be atomic, therefore, it simplifies upgrade process by doing all instances of a component in parallel. Docker container takes only a few second to stop and start, therefore, the interruption time is minimized to few seconds. By having both features built, user can choose one or the other. This approach matches perfectly with Ambari definition of rolling upgrade, and express upgrade respectively. > Yarn Service Upgrade: Support express upgrade of a service > -- > > Key: YARN-8298 > URL: https://issues.apache.org/jira/browse/YARN-8298 > Project: Hadoop YARN > Issue Type: Sub-task >Affects Versions: 3.1.1 >Reporter: Chandni Singh >Assignee: Chandni Singh >Priority: Major > Attachments: YARN-8298.001.patch, YARN-8298.002.patch, > YARN-8298.003.patch, YARN-8298.004.patch > > > Currently service upgrade involves 2 steps > * initiate upgrade by providing new spec > * trigger upgrade of each instance/component > > We need to add the ability to upgrade the service in one shot: > # Aborting the upgrade will not be supported > # Upgrade finalization will be done automatically. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-8673) [AMRMProxy] More robust responseId resync after an YarnRM master slave switch
[ https://issues.apache.org/jira/browse/YARN-8673?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16586410#comment-16586410 ] Hudson commented on YARN-8673: -- SUCCESS: Integrated in Jenkins build Hadoop-trunk-Commit #14805 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/14805/]) YARN-8673. [AMRMProxy] More robust responseId resync after an YarnRM (gifuma: rev 8736fc39ac3b3de168d2c216f3d1c0edb48fb3f9)
* (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/client/AMRMClientUtils.java
* (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-common/src/main/java/org/apache/hadoop/yarn/server/AMRMClientRelayer.java
* (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/amrmproxy/FederationInterceptor.java
* (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-common/src/main/java/org/apache/hadoop/yarn/server/uam/UnmanagedApplicationManager.java
* (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/ApplicationMasterService.java
* (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-common/src/test/java/org/apache/hadoop/yarn/server/TestAMRMClientRelayer.java

> [AMRMProxy] More robust responseId resync after an YarnRM master slave switch > - > > Key: YARN-8673 > URL: https://issues.apache.org/jira/browse/YARN-8673 > Project: Hadoop YARN > Issue Type: Sub-task > Components: amrmproxy >Reporter: Botong Huang >Assignee: Botong Huang >Priority: Major > Attachments: YARN-8673.v1.patch, YARN-8673.v2.patch > > > After a master-slave switch of YarnRM, an _ApplicationNotRegisteredException_ > will be thrown from the new YarnRM. The AM will re-register and reset the > responseId to zero. 
_AMRMClientRelayer_ inside _FederationInterceptor_ > follows the same protocol, and does the automatic re-register and responseId > resync. However, when exceptions or a temporary network issue happen in the > allocate call after re-register, the resync logic might be broken. This patch > improves the robustness of the process by parsing the expected responseId > from the YarnRM exception message, so that whenever the responseId is out of sync > for whatever reason, we can automatically resync and move on.
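The resync trick described above — recovering the responseId the RM expects from an out-of-sync exception message — can be sketched as follows. The message format and the helper name here are assumptions for illustration, not the actual wording produced by ApplicationMasterService or the parser in AMRMClientUtils.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical sketch: recover the responseId the RM expects from an
// out-of-sync exception message, so the client can resync and retry
// the allocate call instead of failing.
public class ResponseIdResyncSketch {
    // Assumed message shape, e.g.
    // "Invalid responseId in AllocateRequest. Expected responseId: 42"
    private static final Pattern EXPECTED_ID =
        Pattern.compile("Expected responseId: (\\d+)");

    // Returns the expected responseId, or -1 if the message does not match.
    public static int parseExpectedResponseId(String exceptionMessage) {
        if (exceptionMessage == null) {
            return -1;
        }
        Matcher m = EXPECTED_ID.matcher(exceptionMessage);
        return m.find() ? Integer.parseInt(m.group(1)) : -1;
    }

    public static void main(String[] args) {
        String msg = "Invalid responseId in AllocateRequest. Expected responseId: 42";
        System.out.println(parseExpectedResponseId(msg));           // prints 42
        System.out.println(parseExpectedResponseId("other error")); // prints -1
    }
}
```

On a successful parse the caller would set its local responseId to the parsed value and reissue the allocate call; on -1 it falls back to the normal re-register path.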
[jira] [Updated] (YARN-8581) [AMRMProxy] Add sub-cluster timeout in LocalityMulticastAMRMProxyPolicy
[ https://issues.apache.org/jira/browse/YARN-8581?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Botong Huang updated YARN-8581: --- Attachment: YARN-8581-branch-2.v2.patch > [AMRMProxy] Add sub-cluster timeout in LocalityMulticastAMRMProxyPolicy > --- > > Key: YARN-8581 > URL: https://issues.apache.org/jira/browse/YARN-8581 > Project: Hadoop YARN > Issue Type: Sub-task > Components: amrmproxy, federation >Reporter: Botong Huang >Assignee: Botong Huang >Priority: Major > Attachments: YARN-8581-branch-2.v2.patch, YARN-8581.v1.patch, > YARN-8581.v2.patch > > > In Federation, every time an AM heartbeat comes in, > LocalityMulticastAMRMProxyPolicy in AMRMProxy splits the asks according to > the list of active and enabled sub-clusters. However, if we haven't been able > to heartbeat to a sub-cluster for some time (network issues, or we keep > hitting some exception from YarnRM, or YarnRM master-slave switch is taking a > long time etc.), we should consider the sub-cluster as unhealthy and stop > routing asks there, until the heartbeat channel becomes healthy again.
[jira] [Updated] (YARN-8581) [AMRMProxy] Add sub-cluster timeout in LocalityMulticastAMRMProxyPolicy
[ https://issues.apache.org/jira/browse/YARN-8581?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Botong Huang updated YARN-8581: --- Attachment: YARN-8581.v2.patch > [AMRMProxy] Add sub-cluster timeout in LocalityMulticastAMRMProxyPolicy > --- > > Key: YARN-8581 > URL: https://issues.apache.org/jira/browse/YARN-8581 > Project: Hadoop YARN > Issue Type: Sub-task > Components: amrmproxy, federation >Reporter: Botong Huang >Assignee: Botong Huang >Priority: Major > Attachments: YARN-8581.v1.patch, YARN-8581.v2.patch > > > In Federation, every time an AM heartbeat comes in, > LocalityMulticastAMRMProxyPolicy in AMRMProxy splits the asks according to > the list of active and enabled sub-clusters. However, if we haven't been able > to heartbeat to a sub-cluster for some time (network issues, or we keep > hitting some exception from YarnRM, or YarnRM master-slave switch is taking a > long time etc.), we should consider the sub-cluster as unhealthy and stop > routing asks there, until the heartbeat channel becomes healthy again.
[jira] [Comment Edited] (YARN-8673) [AMRMProxy] More robust responseId resync after an YarnRM master slave switch
[ https://issues.apache.org/jira/browse/YARN-8673?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16586390#comment-16586390 ] Giovanni Matteo Fumarola edited comment on YARN-8673 at 8/20/18 7:24 PM: - LGTM +1. Committed to Trunk. Thanks [~botong] . was (Author: giovanni.fumarola): LGTM +1. Committed to Trunk. > [AMRMProxy] More robust responseId resync after an YarnRM master slave switch > - > > Key: YARN-8673 > URL: https://issues.apache.org/jira/browse/YARN-8673 > Project: Hadoop YARN > Issue Type: Sub-task > Components: amrmproxy >Reporter: Botong Huang >Assignee: Botong Huang >Priority: Major > Attachments: YARN-8673.v1.patch, YARN-8673.v2.patch > > > After a master-slave switch of YarnRM, an _ApplicationNotRegisteredException_ > will be thrown from the new YarnRM. The AM will re-register and reset the > responseId to zero. _AMRMClientRelayer_ inside _FederationInterceptor_ > follows the same protocol, and does the automatic re-register and responseId > resync. However, when exceptions or a temporary network issue happen in the > allocate call after re-register, the resync logic might be broken. This patch > improves the robustness of the process by parsing the expected responseId > from the YarnRM exception message, so that whenever the responseId is out of sync > for whatever reason, we can automatically resync and move on.
[jira] [Commented] (YARN-8673) [AMRMProxy] More robust responseId resync after an YarnRM master slave switch
[ https://issues.apache.org/jira/browse/YARN-8673?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16586390#comment-16586390 ] Giovanni Matteo Fumarola commented on YARN-8673: LGTM +1. Committed to Trunk. > [AMRMProxy] More robust responseId resync after an YarnRM master slave switch > - > > Key: YARN-8673 > URL: https://issues.apache.org/jira/browse/YARN-8673 > Project: Hadoop YARN > Issue Type: Sub-task > Components: amrmproxy >Reporter: Botong Huang >Assignee: Botong Huang >Priority: Major > Attachments: YARN-8673.v1.patch, YARN-8673.v2.patch > > > After a master-slave switch of YarnRM, an _ApplicationNotRegisteredException_ > will be thrown from the new YarnRM. The AM will re-register and reset the > responseId to zero. _AMRMClientRelayer_ inside _FederationInterceptor_ > follows the same protocol, and does the automatic re-register and responseId > resync. However, when exceptions or a temporary network issue happen in the > allocate call after re-register, the resync logic might be broken. This patch > improves the robustness of the process by parsing the expected responseId > from the YarnRM exception message, so that whenever the responseId is out of sync > for whatever reason, we can automatically resync and move on.
[jira] [Commented] (YARN-8581) [AMRMProxy] Add sub-cluster timeout in LocalityMulticastAMRMProxyPolicy
[ https://issues.apache.org/jira/browse/YARN-8581?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16586384#comment-16586384 ] Giovanni Matteo Fumarola commented on YARN-8581: LGTM +1. Do you mind rebasing? Hunk #3 FAILED at 145. 1 out of 3 hunks FAILED -- saving rejects to file hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-common/src/test/java/org/apache/hadoop/yarn/server/federation/utils/FederationPoliciesTestUtil.java.rej > [AMRMProxy] Add sub-cluster timeout in LocalityMulticastAMRMProxyPolicy > --- > > Key: YARN-8581 > URL: https://issues.apache.org/jira/browse/YARN-8581 > Project: Hadoop YARN > Issue Type: Sub-task > Components: amrmproxy, federation >Reporter: Botong Huang >Assignee: Botong Huang >Priority: Major > Attachments: YARN-8581.v1.patch > > > In Federation, every time an AM heartbeat comes in, > LocalityMulticastAMRMProxyPolicy in AMRMProxy splits the asks according to > the list of active and enabled sub-clusters. However, if we haven't been able > to heartbeat to a sub-cluster for some time (network issues, or we keep > hitting some exception from YarnRM, or YarnRM master-slave switch is taking a > long time etc.), we should consider the sub-cluster as unhealthy and stop > routing asks there, until the heartbeat channel becomes healthy again.
[jira] [Updated] (YARN-8298) Yarn Service Upgrade: Support express upgrade of a service
[ https://issues.apache.org/jira/browse/YARN-8298?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chandni Singh updated YARN-8298: Attachment: YARN-8298.004.patch > Yarn Service Upgrade: Support express upgrade of a service > -- > > Key: YARN-8298 > URL: https://issues.apache.org/jira/browse/YARN-8298 > Project: Hadoop YARN > Issue Type: Sub-task >Affects Versions: 3.1.1 >Reporter: Chandni Singh >Assignee: Chandni Singh >Priority: Major > Attachments: YARN-8298.001.patch, YARN-8298.002.patch, > YARN-8298.003.patch, YARN-8298.004.patch > > > Currently service upgrade involves 2 steps > * initiate upgrade by providing new spec > * trigger upgrade of each instance/component > > We need to add the ability to upgrade the service in one shot: > # Aborting the upgrade will not be supported > # Upgrade finalization will be done automatically.
[jira] [Comment Edited] (YARN-8298) Yarn Service Upgrade: Support express upgrade of a service
[ https://issues.apache.org/jira/browse/YARN-8298?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16586347#comment-16586347 ] Chandni Singh edited comment on YARN-8298 at 8/20/18 6:29 PM: -- [~eyang] Supporting orchestration of upgrade by ServiceMaster requires much more work and is blocked by https://issues.apache.org/jira/browse/YARN-8665 There is another major problem with express upgrade. The instances of a single component should also be done one by one (rolling fashion). Otherwise, upgrade can cause disruption in component availability. One of the main reasons behind upgrade is that service is not disrupted. However if we upgrade all the instances in parallel, then failure in upgrade causes disruption. There are 2 ways to proceed: 1. Support canceling upgrade first https://issues.apache.org/jira/browse/YARN-8665 and then re-work this jira 2. We merge this way of express upgrade where all instances are upgraded in parallel. It is a convenient way for dev testing. Work on YARN-8665 and then modify express upgrade. I am fine with either way. was (Author: csingh): [~eyang] Supporting orchestration of upgrade by ServiceMaster requires much more work and is blocked by https://issues.apache.org/jira/browse/YARN-8665 There is another major problem with express upgrade. The instances of a single component should also be done one by one (rolling fashion). Otherwise, upgrade can cause disruption in component availability. One of the main reasons behind upgrade is that service is not disrupted. However if we upgrade all the instances in parallel, then failure in upgrade causes disruption. There are 2 ways to proceed: 1. Support canceling upgrade first https://issues.apache.org/jira/browse/YARN-8665 and then re-work this jira 2. We merge this way of express upgrade where all instances are upgraded in parallel. It is a convenient way for dev testing. I am fine with either way. 
> Yarn Service Upgrade: Support express upgrade of a service > -- > > Key: YARN-8298 > URL: https://issues.apache.org/jira/browse/YARN-8298 > Project: Hadoop YARN > Issue Type: Sub-task >Affects Versions: 3.1.1 >Reporter: Chandni Singh >Assignee: Chandni Singh >Priority: Major > Attachments: YARN-8298.001.patch, YARN-8298.002.patch, > YARN-8298.003.patch > > > Currently service upgrade involves 2 steps > * initiate upgrade by providing new spec > * trigger upgrade of each instance/component > > We need to add the ability to upgrade the service in one shot: > # Aborting the upgrade will not be supported > # Upgrade finalization will be done automatically. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-8298) Yarn Service Upgrade: Support express upgrade of a service
[ https://issues.apache.org/jira/browse/YARN-8298?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16586347#comment-16586347 ] Chandni Singh commented on YARN-8298: - [~eyang] Supporting orchestration of the upgrade by the ServiceMaster requires much more work and is blocked by https://issues.apache.org/jira/browse/YARN-8665 There is another major problem with express upgrade. The instances of a single component should also be upgraded one by one (in rolling fashion). Otherwise, the upgrade can disrupt component availability. One of the main goals of an upgrade is that the service is not disrupted. However, if we upgrade all the instances in parallel, then a failure during the upgrade causes disruption. There are 2 ways to proceed: 1. Support canceling an upgrade first (https://issues.apache.org/jira/browse/YARN-8665) and then re-work this jira 2. Merge this version of express upgrade, where all instances are upgraded in parallel. It is a convenient way for dev testing. I am fine with either way. > Yarn Service Upgrade: Support express upgrade of a service > -- > > Key: YARN-8298 > URL: https://issues.apache.org/jira/browse/YARN-8298 > Project: Hadoop YARN > Issue Type: Sub-task >Affects Versions: 3.1.1 >Reporter: Chandni Singh >Assignee: Chandni Singh >Priority: Major > Attachments: YARN-8298.001.patch, YARN-8298.002.patch, > YARN-8298.003.patch > > > Currently service upgrade involves 2 steps > * initiate upgrade by providing new spec > * trigger upgrade of each instance/component > > We need to add the ability to upgrade the service in one shot: > # Aborting the upgrade will not be supported > # Upgrade finalization will be done automatically.
[jira] [Commented] (YARN-8632) No data in file realtimetrack.json after running SchedulerLoadSimulator
[ https://issues.apache.org/jira/browse/YARN-8632?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16586344#comment-16586344 ] Yufei Gu commented on YARN-8632: For that sake, we need to "setUncaughtExceptionHandler" for the thread, and provide a handler. Catching every exception in {{run()}} isn't enough. > No data in file realtimetrack.json after running SchedulerLoadSimulator > --- > > Key: YARN-8632 > URL: https://issues.apache.org/jira/browse/YARN-8632 > Project: Hadoop YARN > Issue Type: Bug > Components: scheduler-load-simulator >Reporter: Xianghao Lu >Assignee: Xianghao Lu >Priority: Major > Attachments: YARN-8632-branch-2.7.2.001.patch, YARN-8632.001.patch, > YARN-8632.002.patch > > > Recently, I have been using > [SchedulerLoadSimulator|https://hadoop.apache.org/docs/r2.7.2/hadoop-sls/SchedulerLoadSimulator.html] > to validate the impact of changes on my FairScheduler. I encountered some > problems. > Firstly, I fix a npe bug with the patch in > https://issues.apache.org/jira/browse/YARN-4302 > Secondly, everything seems to be ok, but I just get "[]" in file > realtimetrack.json. Finally, I find the MetricsLogRunnable thread will exit > because of npe, > the reason is "wrapper.getQueueSet()" is still null when executing "String > metrics = web.generateRealTimeTrackingMetrics();" > So, we should put "String metrics = web.generateRealTimeTrackingMetrics();" > in try section to avoid MetricsLogRunnable thread exit with unexpected > exception. > My hadoop version is 2.7.2, it seems that hadoop trunk branch also has the > second problem and I have made a patch to solve it. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
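Yufei's point can be illustrated with a minimal, hypothetical sketch (not the actual SLS code; the names here are invented): the metrics-logging thread needs a try/catch inside run() so a routine failure, such as an NPE while the queue set is still null, does not kill it, plus an UncaughtExceptionHandler as a safety net for anything that escapes:

```java
import java.util.Set;
import java.util.concurrent.atomic.AtomicBoolean;

public class MetricsLoggerExample {

    // Stand-in for web.generateRealTimeTrackingMetrics(): throws NPE while
    // the queue set (wrapper.getQueueSet()) is still null.
    public static String generateMetrics(Set<String> queueSet) {
        return "[" + String.join(",", queueSet) + "]";
    }

    public static void main(String[] args) throws InterruptedException {
        AtomicBoolean handlerFired = new AtomicBoolean(false);
        Thread logger = new Thread(() -> {
            try {
                // Early tick: queues not initialized yet, so this throws NPE...
                System.out.println(generateMetrics(null));
            } catch (RuntimeException e) {
                // ...but the try section keeps the logging thread alive.
                System.err.println("skipping metrics tick: " + e);
            }
            // A throw outside the try section would still kill the thread:
            generateMetrics(null);
        });
        // Hence the safety net: an UncaughtExceptionHandler sees whatever escapes.
        logger.setUncaughtExceptionHandler((t, e) -> handlerFired.set(true));
        logger.start();
        logger.join();
        System.out.println("handler fired: " + handlerFired.get()); // true
    }
}
```

With only the try/catch, an exception thrown outside the guarded section silently terminates the thread, which is exactly why realtimetrack.json ends up containing just "[]".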
[jira] [Commented] (YARN-8298) Yarn Service Upgrade: Support express upgrade of a service
[ https://issues.apache.org/jira/browse/YARN-8298?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16586330#comment-16586330 ] Eric Yang commented on YARN-8298: - [~csingh] Patch 2 was work in progress. It only figured out the dependency order and submitted the upgrade request. The enhancement depends on the ServiceScheduler walking through the component list in a for loop and checking the upgrade status of each component before proceeding to the next component. This also aligns with your suggestion to introduce the ability to cancel or abort an upgrade. > Yarn Service Upgrade: Support express upgrade of a service > -- > > Key: YARN-8298 > URL: https://issues.apache.org/jira/browse/YARN-8298 > Project: Hadoop YARN > Issue Type: Sub-task >Affects Versions: 3.1.1 >Reporter: Chandni Singh >Assignee: Chandni Singh >Priority: Major > Attachments: YARN-8298.001.patch, YARN-8298.002.patch, > YARN-8298.003.patch > > > Currently service upgrade involves 2 steps > * initiate upgrade by providing new spec > * trigger upgrade of each instance/component > > We need to add the ability to upgrade the service in one shot: > # Aborting the upgrade will not be supported > # Upgrade finalization will be done automatically.
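The component-by-component loop described above can be sketched as follows. This is a hypothetical illustration of the proposed ServiceScheduler behavior, not actual YARN code; ComponentClient, triggerUpgrade, and pollState are invented names standing in for the AM protocol:

```java
import java.util.List;

public class RollingUpgradeSketch {

    // Invented client abstraction standing in for the AM protocol calls.
    public interface ComponentClient {
        void triggerUpgrade(String component);
        String pollState(String component); // "IN_PROGRESS", "SUCCEEDED", or "FAILED"
    }

    // Upgrade components one at a time, in dependency order; stop at the
    // first failure so dependent components are never touched.
    public static boolean upgradeInOrder(List<String> orderedComponents,
                                         ComponentClient client) {
        for (String comp : orderedComponents) {
            client.triggerUpgrade(comp);
            String state = client.pollState(comp);
            while ("IN_PROGRESS".equals(state)) {
                state = client.pollState(comp); // real code would back off between polls
            }
            if (!"SUCCEEDED".equals(state)) {
                return false; // failed: do not proceed to the next component
            }
        }
        return true;
    }
}
```

Stopping at the first failed component is what makes cancel/abort support (YARN-8665) a prerequisite: the service is left partially upgraded and needs a way back.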
[jira] [Comment Edited] (YARN-8298) Yarn Service Upgrade: Support express upgrade of a service
[ https://issues.apache.org/jira/browse/YARN-8298?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16586193#comment-16586193 ] Chandni Singh edited comment on YARN-8298 at 8/20/18 6:02 PM: -- [~eyang] Please see {quote} actionUpgradeExpress computes components order before calling backend. The logic is to upgrade component by component. Instance order is irrelevant within a component in this design. I think this match up fine with how we upgrade services today in Hadoop software. {quote} The code which you pasted does not seem to upgrade component by component. 1. The below line finds all the containers that will be sent to AM for upgrades {code:java} containersToUpgrade = ServiceApiUtil .validateAndResolveCompsUpgrade(persistedService, components); {code} 2. The below request is sent to AM and the AM issues event to upgrade all the instances {code} CompInstancesUpgradeRequestProto.Builder upgradeRequestBuilder = CompInstancesUpgradeRequestProto.newBuilder(); upgradeRequestBuilder.addAllContainerIds(containerIdsToUpgrade); {code} Instances are processing these events asynchronously, so all the instances will get upgraded without any order guarantees Having to upgrade component by component based on dependencies could only be supported if we have support for canceling the upgrade when there is any failure. Also in case the upgrade is cancelled, we need a way for the user to check the status of the upgrade. was (Author: csingh): [~eyang] Please see {quote} actionUpgradeExpress computes components order before calling backend. The logic is to upgrade component by component. Instance order is irrelevant within a component in this design. I think this match up fine with how we upgrade services today in Hadoop software. {quote} The code which you pasted does not seem to upgrade component by component. 1. 
The below line finds all the containers that will be sent to AM for upgrades {code:java} containersToUpgrade = ServiceApiUtil .validateAndResolveCompsUpgrade(persistedService, components); {code} 2. The below request is sent to AM and the AM issues event to upgrade all the instances {code} CompInstancesUpgradeRequestProto.Builder upgradeRequestBuilder = CompInstancesUpgradeRequestProto.newBuilder(); upgradeRequestBuilder.addAllContainerIds(containerIdsToUpgrade); {code} Instances are processing these events asynchronously, so all the instances will get upgraded without any order guarantees Having to upgrade component by component could only be supported if we have support for canceling the upgrade when there is any failure. Also in case the upgrade is cancelled, we need a way for the user to check the status of the upgrade. > Yarn Service Upgrade: Support express upgrade of a service > -- > > Key: YARN-8298 > URL: https://issues.apache.org/jira/browse/YARN-8298 > Project: Hadoop YARN > Issue Type: Sub-task >Affects Versions: 3.1.1 >Reporter: Chandni Singh >Assignee: Chandni Singh >Priority: Major > Attachments: YARN-8298.001.patch, YARN-8298.002.patch, > YARN-8298.003.patch > > > Currently service upgrade involves 2 steps > * initiate upgrade by providing new spec > * trigger upgrade of each instance/component > > We need to add the ability to upgrade the service in one shot: > # Aborting the upgrade will not be supported > # Upgrade finalization will be done automatically. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-8687) YARN service example is out-dated
[ https://issues.apache.org/jira/browse/YARN-8687?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eric Yang updated YARN-8687: Issue Type: Sub-task (was: Bug) Parent: YARN-7054 > YARN service example is out-dated > - > > Key: YARN-8687 > URL: https://issues.apache.org/jira/browse/YARN-8687 > Project: Hadoop YARN > Issue Type: Sub-task > Components: yarn-native-services >Reporter: Eric Yang >Priority: Major > > Example for YARN service is using file type "ENV". > {code} > { > "name": "httpd-service", > "version": "1.0", > "lifetime": "3600", > "components": [ > { > "name": "httpd", > "number_of_containers": 2, > "artifact": { > "id": "centos/httpd-24-centos7:latest", > "type": "DOCKER" > }, > "launch_command": "/usr/bin/run-httpd", > "resource": { > "cpus": 1, > "memory": "1024" > }, > "configuration": { > "files": [ > { > "type": "TEMPLATE", > "dest_file": "/var/www/html/index.html", > "properties": { > "content": > "TitleHello from > ${COMPONENT_INSTANCE_NAME}!" > } > } > ] > } > }, > { > "name": "httpd-proxy", > "number_of_containers": 1, > "artifact": { > "id": "centos/httpd-24-centos7:latest", > "type": "DOCKER" > }, > "launch_command": "/usr/bin/run-httpd", > "resource": { > "cpus": 1, > "memory": "1024" > }, > "configuration": { > "files": [ > { > "type": "TEMPLATE", > "dest_file": "/etc/httpd/conf.d/httpd-proxy.conf", > "src_file": "httpd-proxy.conf" > } > ] > } > } > ], > "quicklinks": { > "Apache HTTP Server": > "http://httpd-proxy-0.${SERVICE_NAME}.${USER}.${DOMAIN}:8080; > } > } > {code} > The type has changed to "TEMPLATE" in the code. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Created] (YARN-8687) YARN service example is out-dated
Eric Yang created YARN-8687: --- Summary: YARN service example is out-dated Key: YARN-8687 URL: https://issues.apache.org/jira/browse/YARN-8687 Project: Hadoop YARN Issue Type: Bug Components: yarn-native-services Reporter: Eric Yang Example for YARN service is using file type "ENV". {code} { "name": "httpd-service", "version": "1.0", "lifetime": "3600", "components": [ { "name": "httpd", "number_of_containers": 2, "artifact": { "id": "centos/httpd-24-centos7:latest", "type": "DOCKER" }, "launch_command": "/usr/bin/run-httpd", "resource": { "cpus": 1, "memory": "1024" }, "configuration": { "files": [ { "type": "TEMPLATE", "dest_file": "/var/www/html/index.html", "properties": { "content": "TitleHello from ${COMPONENT_INSTANCE_NAME}!" } } ] } }, { "name": "httpd-proxy", "number_of_containers": 1, "artifact": { "id": "centos/httpd-24-centos7:latest", "type": "DOCKER" }, "launch_command": "/usr/bin/run-httpd", "resource": { "cpus": 1, "memory": "1024" }, "configuration": { "files": [ { "type": "TEMPLATE", "dest_file": "/etc/httpd/conf.d/httpd-proxy.conf", "src_file": "httpd-proxy.conf" } ] } } ], "quicklinks": { "Apache HTTP Server": "http://httpd-proxy-0.${SERVICE_NAME}.${USER}.${DOMAIN}:8080; } } {code} The type has changed to "TEMPLATE" in the code. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Comment Edited] (YARN-8298) Yarn Service Upgrade: Support express upgrade of a service
[ https://issues.apache.org/jira/browse/YARN-8298?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16586193#comment-16586193 ] Chandni Singh edited comment on YARN-8298 at 8/20/18 4:45 PM: -- [~eyang] Please see {quote} actionUpgradeExpress computes components order before calling backend. The logic is to upgrade component by component. Instance order is irrelevant within a component in this design. I think this match up fine with how we upgrade services today in Hadoop software. {quote} The code which you pasted does not seem to upgrade component by component. 1. The below line finds all the containers that will be sent to AM for upgrades {code:java} containersToUpgrade = ServiceApiUtil .validateAndResolveCompsUpgrade(persistedService, components); {code} 2. The below request is sent to AM and the AM issues event to upgrade all the instances {code} CompInstancesUpgradeRequestProto.Builder upgradeRequestBuilder = CompInstancesUpgradeRequestProto.newBuilder(); upgradeRequestBuilder.addAllContainerIds(containerIdsToUpgrade); {code} Instances are processing these events asynchronously, so all the instances will get upgraded without any order guarantees Having to upgrade component by component could only be supported if we have support for canceling the upgrade when there is any failure. Also in case the upgrade is cancelled, we need a way for the user to check the status of the upgrade. was (Author: csingh): [~eyang] Please see {quote} actionUpgradeExpress computes components order before calling backend. The logic is to upgrade component by component. Instance order is irrelevant within a component in this design. I think this match up fine with how we upgrade services today in Hadoop software. {quote} The code which you pasted does not seem to upgrade component by component. 1. 
The below line finds all the containers that will be sent to AM for upgrades {code:java} containersToUpgrade = ServiceApiUtil .validateAndResolveCompsUpgrade(persistedService, components); {code} 2. The below request is sent to AM and the AM issues event to upgrade all the instances {code} CompInstancesUpgradeRequestProto.Builder upgradeRequestBuilder = CompInstancesUpgradeRequestProto.newBuilder(); upgradeRequestBuilder.addAllContainerIds(containerIdsToUpgrade); {code} Having to upgrade component by component could only be supported if we have support for canceling the upgrade when there is any failure. Also in case the upgrade is cancelled, we need a way for the user to check the status of the upgrade. > Yarn Service Upgrade: Support express upgrade of a service > -- > > Key: YARN-8298 > URL: https://issues.apache.org/jira/browse/YARN-8298 > Project: Hadoop YARN > Issue Type: Sub-task >Affects Versions: 3.1.1 >Reporter: Chandni Singh >Assignee: Chandni Singh >Priority: Major > Attachments: YARN-8298.001.patch, YARN-8298.002.patch, > YARN-8298.003.patch > > > Currently service upgrade involves 2 steps > * initiate upgrade by providing new spec > * trigger upgrade of each instance/component > > We need to add the ability to upgrade the service in one shot: > # Aborting the upgrade will not be supported > # Upgrade finalization will be done automatically. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-8298) Yarn Service Upgrade: Support express upgrade of a service
[ https://issues.apache.org/jira/browse/YARN-8298?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16586193#comment-16586193 ] Chandni Singh commented on YARN-8298: - [~eyang] Please see {quote} actionUpgradeExpress computes components order before calling backend. The logic is to upgrade component by component. Instance order is irrelevant within a component in this design. I think this match up fine with how we upgrade services today in Hadoop software. {quote} The code which you pasted does not seem to upgrade component by component. 1. The below line finds all the containers that will be sent to AM for upgrades {code:java} containersToUpgrade = ServiceApiUtil .validateAndResolveCompsUpgrade(persistedService, components); {code} 2. The below request is sent to AM and the AM issues event to upgrade all the instances {code} CompInstancesUpgradeRequestProto.Builder upgradeRequestBuilder = CompInstancesUpgradeRequestProto.newBuilder(); upgradeRequestBuilder.addAllContainerIds(containerIdsToUpgrade); {code} Having to upgrade component by component could only be supported if we have support for canceling the upgrade when there is any failure. Also in case the upgrade is cancelled, we need a way for the user to check the status of the upgrade. > Yarn Service Upgrade: Support express upgrade of a service > -- > > Key: YARN-8298 > URL: https://issues.apache.org/jira/browse/YARN-8298 > Project: Hadoop YARN > Issue Type: Sub-task >Affects Versions: 3.1.1 >Reporter: Chandni Singh >Assignee: Chandni Singh >Priority: Major > Attachments: YARN-8298.001.patch, YARN-8298.002.patch, > YARN-8298.003.patch > > > Currently service upgrade involves 2 steps > * initiate upgrade by providing new spec > * trigger upgrade of each instance/component > > We need to add the ability to upgrade the service in one shot: > # Aborting the upgrade will not be supported > # Upgrade finalization will be done automatically. 
-- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-8298) Yarn Service Upgrade: Support express upgrade of a service
[ https://issues.apache.org/jira/browse/YARN-8298?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16586180#comment-16586180 ] Eric Yang commented on YARN-8298: - [~csingh] {quote} Even though the patch 2 referenced resolveCompsDependency, it wasn't being used to orchestrate upgrade of comp instances in any way. {quote} Quote from patch 2: {code} + public int actionUpgradeExpress(Service service) throws YarnException, + IOException { +int retries = 0; +ApplicationReport appReport = upgradePrecheck(service); +List components = ServiceApiUtil.resolveCompsDependency(service); +LOG.info("Upgrading {} with component list order: {}", +service.getName(), components); +ClientAMProtocol proxy = createAMProxy(service.getName(), appReport); +UpgradeServiceRequestProto.Builder requestBuilder = +UpgradeServiceRequestProto.newBuilder(); +requestBuilder.setVersion(service.getVersion()); +if (service.getState().equals(ServiceState.UPGRADING_AUTO_FINALIZE)) { + requestBuilder.setAutoFinalize(true); +} +UpgradeServiceResponseProto responseProto = proxy.upgrade( +requestBuilder.build()); +if (responseProto.hasError()) { + LOG.error("Service {} upgrade to version {} failed because {}", + service.getName(), service.getVersion(), responseProto.getError()); + throw new YarnException("Failed to upgrade service " + service.getName() + + " to version " + service.getVersion() + " because " + + responseProto.getError()); +} + +Service persistedService = getStatus(service.getName()); +List containersToUpgrade = null; +List containerIdsToUpgrade = new ArrayList<>(); +// AM cache changes might take a few seconds +while (retries < 30) { + try { +persistedService = getStatus(service.getName()); +retries++; +containersToUpgrade = ServiceApiUtil +.validateAndResolveCompsUpgrade(persistedService, components); + } catch (YarnException e) { +LOG.info("Waiting for service to become ready for upgrade, retries: {} / 30", retries); +try { + Thread.sleep(3000L); +} catch 
(InterruptedException ie) { +} + } +} +if (containersToUpgrade == null) { + LOG.error("No containers to upgrade."); + return EXIT_FALSE; +} +containersToUpgrade +.forEach(compInst -> containerIdsToUpgrade.add(compInst.getId())); +LOG.info("instances to upgrade {}", containerIdsToUpgrade); +CompInstancesUpgradeRequestProto.Builder upgradeRequestBuilder = +CompInstancesUpgradeRequestProto.newBuilder(); +upgradeRequestBuilder.addAllContainerIds(containerIdsToUpgrade); +proxy.upgrade(upgradeRequestBuilder.build()); +return EXIT_SUCCESS; + } {code} actionUpgradeExpress computes the component order before calling the backend. The logic is to upgrade component by component. Instance order is irrelevant within a component in this design. I think this matches up fine with how we upgrade services in Hadoop software today. > Yarn Service Upgrade: Support express upgrade of a service > -- > > Key: YARN-8298 > URL: https://issues.apache.org/jira/browse/YARN-8298 > Project: Hadoop YARN > Issue Type: Sub-task >Affects Versions: 3.1.1 >Reporter: Chandni Singh >Assignee: Chandni Singh >Priority: Major > Attachments: YARN-8298.001.patch, YARN-8298.002.patch, > YARN-8298.003.patch > > > Currently service upgrade involves 2 steps > * initiate upgrade by providing new spec > * trigger upgrade of each instance/component > > We need to add the ability to upgrade the service in one shot: > # Aborting the upgrade will not be supported > # Upgrade finalization will be done automatically.
[jira] [Comment Edited] (YARN-8298) Yarn Service Upgrade: Support express upgrade of a service
[ https://issues.apache.org/jira/browse/YARN-8298?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16586154#comment-16586154 ] Chandni Singh edited comment on YARN-8298 at 8/20/18 4:30 PM: -- [~eyang] Even though the patch 2 referenced resolveCompsDependency, it wasn't being used to orchestrate the upgrade of comp instances in any way. If we need to orchestrate the upgrade of instances based on component dependency, it requires support for canceling the current upgrade. For example, if the upgrade of compA fails, then the upgrade of compB will not be triggered either. was (Author: csingh): [~eyang] Even though the patch 2 referenced resolveCompsDependency, it wasn't being used to orchestrate upgrade of comp instances in any way. > Yarn Service Upgrade: Support express upgrade of a service > -- > > Key: YARN-8298 > URL: https://issues.apache.org/jira/browse/YARN-8298 > Project: Hadoop YARN > Issue Type: Sub-task >Affects Versions: 3.1.1 >Reporter: Chandni Singh >Assignee: Chandni Singh >Priority: Major > Attachments: YARN-8298.001.patch, YARN-8298.002.patch, > YARN-8298.003.patch > > > Currently service upgrade involves 2 steps > * initiate upgrade by providing new spec > * trigger upgrade of each instance/component > > We need to add the ability to upgrade the service in one shot: > # Aborting the upgrade will not be supported > # Upgrade finalization will be done automatically.
[jira] [Updated] (YARN-8686) Queue Management API - not returning JSON or XML response data when passing Accept header
[ https://issues.apache.org/jira/browse/YARN-8686?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Akhil PB updated YARN-8686: --- Priority: Critical (was: Major) > Queue Management API - not returning JSON or XML response data when passing > Accept header > - > > Key: YARN-8686 > URL: https://issues.apache.org/jira/browse/YARN-8686 > Project: Hadoop YARN > Issue Type: Sub-task > Components: yarn >Reporter: Akhil PB >Assignee: Akhil PB >Priority: Critical > > API should return JSON or XML response data based on Accept header. Instead, > API returns plain text for success as well as error scenarios. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-8298) Yarn Service Upgrade: Support express upgrade of a service
[ https://issues.apache.org/jira/browse/YARN-8298?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16586154#comment-16586154 ] Chandni Singh commented on YARN-8298: - [~eyang] Even though the patch 2 referenced resolveCompsDependency, it wasn't being used to orchestrate upgrade of comp instances in any way. > Yarn Service Upgrade: Support express upgrade of a service > -- > > Key: YARN-8298 > URL: https://issues.apache.org/jira/browse/YARN-8298 > Project: Hadoop YARN > Issue Type: Sub-task >Affects Versions: 3.1.1 >Reporter: Chandni Singh >Assignee: Chandni Singh >Priority: Major > Attachments: YARN-8298.001.patch, YARN-8298.002.patch, > YARN-8298.003.patch > > > Currently service upgrade involves 2 steps > * initiate upgrade by providing new spec > * trigger upgrade of each instance/component > > We need to add the ability to upgrade the service in one shot: > # Aborting the upgrade will not be supported > # Upgrade finalization will be done automatically. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-8242) YARN NM: OOM error while reading back the state store on recovery
[ https://issues.apache.org/jira/browse/YARN-8242?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16586123#comment-16586123 ] Hudson commented on YARN-8242: -- SUCCESS: Integrated in Jenkins build Hadoop-trunk-Commit #14804 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/14804/]) YARN-8242. YARN NM: OOM error while reading back the state store on (jlowe: rev 65e7469712be6cf393e29ef73cc94727eec81227) * (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/security/NMTokenSecretManagerInNM.java * (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/security/NMContainerTokenSecretManager.java * (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/recovery/NMLeveldbStateStoreService.java * (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/ContainerManagerImpl.java * (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/recovery/NMMemoryStateStoreService.java * (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/recovery/NMStateStoreService.java * (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/localizer/ResourceLocalizationService.java * (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/recovery/NMNullStateStoreService.java * (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/DeletionService.java * (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/recovery/TestNMLeveldbStateStoreService.java * (add) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/recovery/RecoveryIterator.java > YARN NM: OOM error while reading back the state store on recovery > - > > Key: YARN-8242 > URL: https://issues.apache.org/jira/browse/YARN-8242 > Project: Hadoop YARN > Issue Type: Improvement > Components: yarn >Affects Versions: 2.6.0, 2.9.0, 2.6.5, 2.8.3, 3.1.0, 2.7.6, 3.0.2 >Reporter: Kanwaljeet Sachdev >Assignee: Pradeep Ambati >Priority: Critical > Fix For: 3.2.0, 3.1.2 > > Attachments: YARN-8242.001.patch, YARN-8242.002.patch, > YARN-8242.003.patch, YARN-8242.004.patch, YARN-8242.005.patch, > YARN-8242.006.patch, YARN-8242.007.patch, YARN-8242.008.patch > > > On startup the NM reads its state store and builds a list of application in > the state store to process. If the number of applications in the state store > is large and have a lot of "state" connected to it the NM can run OOM and > never get to the point that it can start processing the recovery. > Since it never starts the recovery there is no way for the NM to ever pass > this point. It will require a change in heap size to get the NM started. > > Following is the stack trace > {code:java} > at java.lang.OutOfMemoryError. (OutOfMemoryError.java:48) at > com.google.protobuf.ByteString.copyFrom (ByteString.java:192) at > com.google.protobuf.CodedInputStream.readBytes (CodedInputStream.java:324) at > org.apache.hadoop.yarn.proto.YarnProtos$StringStringMapProto. > (YarnProtos.java:47069) at > org.apache.hadoop.yarn.proto.YarnProtos$StringStringMapProto. 
> (YarnProtos.java:47014) at > org.apache.hadoop.yarn.proto.YarnProtos$StringStringMapProto$1.parsePartialFrom > (YarnProtos.java:47102) at > org.apache.hadoop.yarn.proto.YarnProtos$StringStringMapProto$1.parsePartialFrom > (YarnProtos.java:47097) at com.google.protobuf.CodedInputStream.readMessage > (CodedInputStream.java:309) at > org.apache.hadoop.yarn.proto.YarnProtos$ContainerLaunchContextProto.<init> > (YarnProtos.java:41016) at > org.apache.hadoop.yarn.proto.YarnProtos$ContainerLaunchContextProto.<init> > (YarnProtos.java:40942) at > org.apache.hadoop.yarn.proto.YarnProtos$ContainerLaunchContextProto$1.parsePartialFrom > (YarnProtos.java:41080) at > org.apache.hadoop.yarn.proto.YarnProtos$ContainerLaunchContextProto$1.parsePartialFrom > (YarnProtos.java:41075) at
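The committed change above adds a RecoveryIterator so the NM streams state-store records instead of materializing them all before recovery starts. A minimal sketch of that idea, with illustrative names rather than the actual NMLeveldbStateStoreService API:

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

// Sketch of the RecoveryIterator idea behind YARN-8242: pull recovered
// records one at a time so peak heap holds one record, not the whole store.
// Class and method names here are illustrative, not the actual YARN classes.
public class RecoverySketch {

    // Stand-in for a state store that hands back entries lazily.
    static Iterator<String> openStore(List<String> backing) {
        return backing.iterator();
    }

    // Process each record as it is read; it becomes garbage immediately after.
    static int recoverCount(List<String> backing) {
        int recovered = 0;
        Iterator<String> it = openStore(backing);
        while (it.hasNext()) {
            String record = it.next(); // only one live record at a time
            recovered++;
        }
        return recovered;
    }

    public static void main(String[] args) {
        int n = recoverCount(Arrays.asList("container_1", "container_2", "container_3"));
        System.out.println("recovered=" + n); // prints recovered=3
    }
}
```

The contrast with the pre-patch behavior is the absence of any `List` that accumulates every record before processing begins.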
[jira] [Commented] (YARN-8298) Yarn Service Upgrade: Support express upgrade of a service
[ https://issues.apache.org/jira/browse/YARN-8298?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16586120#comment-16586120 ] Eric Yang commented on YARN-8298: - [~csingh] Thank you for the patch. In patch 3, it doesn't reference resolveCompsDependency in the ServiceMaster upgrade logic. How do the component dependencies get resolved? > Yarn Service Upgrade: Support express upgrade of a service > -- > > Key: YARN-8298 > URL: https://issues.apache.org/jira/browse/YARN-8298 > Project: Hadoop YARN > Issue Type: Sub-task >Affects Versions: 3.1.1 >Reporter: Chandni Singh >Assignee: Chandni Singh >Priority: Major > Attachments: YARN-8298.001.patch, YARN-8298.002.patch, > YARN-8298.003.patch > > > Currently service upgrade involves 2 steps: > * initiate upgrade by providing a new spec > * trigger upgrade of each instance/component > > We need to add the ability to upgrade the service in one shot: > # Aborting the upgrade will not be supported > # Upgrade finalization will be done automatically. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-8242) YARN NM: OOM error while reading back the state store on recovery
[ https://issues.apache.org/jira/browse/YARN-8242?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16586119#comment-16586119 ] Pradeep Ambati commented on YARN-8242: -- Thanks @jlowe for reviewing and committing the patch. > YARN NM: OOM error while reading back the state store on recovery > - > > Key: YARN-8242 > URL: https://issues.apache.org/jira/browse/YARN-8242 > Project: Hadoop YARN > Issue Type: Improvement > Components: yarn >Affects Versions: 2.6.0, 2.9.0, 2.6.5, 2.8.3, 3.1.0, 2.7.6, 3.0.2 >Reporter: Kanwaljeet Sachdev >Assignee: Pradeep Ambati >Priority: Critical > Fix For: 3.2.0, 3.1.2 > > Attachments: YARN-8242.001.patch, YARN-8242.002.patch, > YARN-8242.003.patch, YARN-8242.004.patch, YARN-8242.005.patch, > YARN-8242.006.patch, YARN-8242.007.patch, YARN-8242.008.patch > > > On startup the NM reads its state store and builds a list of applications in > the state store to process. If the number of applications in the state store > is large and each has a lot of "state" attached to it, the NM can run OOM and > never get to the point that it can start processing the recovery. > Since it never starts the recovery, there is no way for the NM to ever pass > this point. It will require a change in heap size to get the NM started. > > Following is the stack trace: > {code:java} > at java.lang.OutOfMemoryError.<init> (OutOfMemoryError.java:48) at > com.google.protobuf.ByteString.copyFrom (ByteString.java:192) at > com.google.protobuf.CodedInputStream.readBytes (CodedInputStream.java:324) at > org.apache.hadoop.yarn.proto.YarnProtos$StringStringMapProto.<init> > (YarnProtos.java:47069) at > org.apache.hadoop.yarn.proto.YarnProtos$StringStringMapProto.<init> 
> (YarnProtos.java:47014) at > org.apache.hadoop.yarn.proto.YarnProtos$StringStringMapProto$1.parsePartialFrom > (YarnProtos.java:47102) at > org.apache.hadoop.yarn.proto.YarnProtos$StringStringMapProto$1.parsePartialFrom > (YarnProtos.java:47097) at com.google.protobuf.CodedInputStream.readMessage > (CodedInputStream.java:309) at > org.apache.hadoop.yarn.proto.YarnProtos$ContainerLaunchContextProto.<init> > (YarnProtos.java:41016) at > org.apache.hadoop.yarn.proto.YarnProtos$ContainerLaunchContextProto.<init> > (YarnProtos.java:40942) at > org.apache.hadoop.yarn.proto.YarnProtos$ContainerLaunchContextProto$1.parsePartialFrom > (YarnProtos.java:41080) at > org.apache.hadoop.yarn.proto.YarnProtos$ContainerLaunchContextProto$1.parsePartialFrom > (YarnProtos.java:41075) at com.google.protobuf.CodedInputStream.readMessage > (CodedInputStream.java:309) at > org.apache.hadoop.yarn.proto.YarnServiceProtos$StartContainerRequestProto.<init> > (YarnServiceProtos.java:24517) at > org.apache.hadoop.yarn.proto.YarnServiceProtos$StartContainerRequestProto.<init> 
> (YarnServiceProtos.java:24464) at > org.apache.hadoop.yarn.proto.YarnServiceProtos$StartContainerRequestProto$1.parsePartialFrom > (YarnServiceProtos.java:24568) at > org.apache.hadoop.yarn.proto.YarnServiceProtos$StartContainerRequestProto$1.parsePartialFrom > (YarnServiceProtos.java:24563) at > com.google.protobuf.AbstractParser.parsePartialFrom (AbstractParser.java:141) > at com.google.protobuf.AbstractParser.parseFrom (AbstractParser.java:176) at > com.google.protobuf.AbstractParser.parseFrom (AbstractParser.java:188) at > com.google.protobuf.AbstractParser.parseFrom (AbstractParser.java:193) at > com.google.protobuf.AbstractParser.parseFrom (AbstractParser.java:49) at > org.apache.hadoop.yarn.proto.YarnServiceProtos$StartContainerRequestProto.parseFrom > (YarnServiceProtos.java:24739) at > org.apache.hadoop.yarn.server.nodemanager.recovery.NMLeveldbStateStoreService.loadContainerState > (NMLeveldbStateStoreService.java:217) at > org.apache.hadoop.yarn.server.nodemanager.recovery.NMLeveldbStateStoreService.loadContainersState > (NMLeveldbStateStoreService.java:170) at > org.apache.hadoop.yarn.server.nodemanager.containermanager.ContainerManagerImpl.recover > (ContainerManagerImpl.java:253) at > org.apache.hadoop.yarn.server.nodemanager.containermanager.ContainerManagerImpl.serviceInit > (ContainerManagerImpl.java:237) at > org.apache.hadoop.service.AbstractService.init (AbstractService.java:163) at > org.apache.hadoop.service.CompositeService.serviceInit > (CompositeService.java:107) at > org.apache.hadoop.yarn.server.nodemanager.NodeManager.serviceInit > (NodeManager.java:255) at org.apache.hadoop.service.AbstractService.init > (AbstractService.java:163) at > org.apache.hadoop.yarn.server.nodemanager.NodeManager.initAndStartNodeManager > (NodeManager.java:474) at > org.apache.hadoop.yarn.server.nodemanager.NodeManager.main > (NodeManager.java:521){code}
[jira] [Commented] (YARN-8648) Container cgroups are leaked when using docker
[ https://issues.apache.org/jira/browse/YARN-8648?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16586076#comment-16586076 ] Eric Yang commented on YARN-8648: - [~Jim_Brennan] {quote} I think this is mitigated if we use the "cgroup" section of the container-executor.cfg to constrain it. This is currently used to enable updating params, but I think it could be used for this as well. It already defines the CGROUPS_ROOT (e.g., /sys/fs/cgroup), and the YARN_HIERARCHY (e.g., hadoop-yarn). We could either add another config parameter to define the list of hierarchies to clean up (e.g., cpuset, freezer, hugetlb, etc...), or we can parse /proc/mounts to determine the full list. I think it's safer to add the config parameter. {quote} The proposal looks good. As long as the default list matches docker's cgroup usage when docker is enabled, instead of being an empty list, I think this solution can work. > Container cgroups are leaked when using docker > -- > > Key: YARN-8648 > URL: https://issues.apache.org/jira/browse/YARN-8648 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Jim Brennan >Assignee: Jim Brennan >Priority: Major > Labels: Docker > > When you run with docker and enable cgroups for cpu, docker creates cgroups > for all resources on the system, not just for cpu. For instance, if the > {{yarn.nodemanager.linux-container-executor.cgroups.hierarchy=/hadoop-yarn}}, > the nodemanager will create a cgroup for each container under > {{/sys/fs/cgroup/cpu/hadoop-yarn}}. In the docker case, we pass this path > via the {{--cgroup-parent}} command line argument. Docker then creates a > cgroup for the docker container under that, for instance: > {{/sys/fs/cgroup/cpu/hadoop-yarn/container_id/docker_container_id}}. > When the container exits, docker cleans up the {{docker_container_id}} > cgroup, and the nodemanager cleans up the {{container_id}} cgroup. All is > good under {{/sys/fs/cgroup/hadoop-yarn}}. 
> The problem is that docker also creates that same hierarchy under every > resource under {{/sys/fs/cgroup}}. On the rhel7 system I am using, these > are: blkio, cpuset, devices, freezer, hugetlb, memory, net_cls, net_prio, > perf_event, and systemd. So for instance, docker creates > {{/sys/fs/cgroup/cpuset/hadoop-yarn/container_id/docker_container_id}}, but > it only cleans up the leaf cgroup {{docker_container_id}}. Nobody cleans up > the {{container_id}} cgroups for these other resources. On one of our busy > clusters, we found > 100,000 of these leaked cgroups. > I found this in our 2.8-based version of hadoop, but I have been able to > repro with current hadoop.
[jira] [Commented] (YARN-8648) Container cgroups are leaked when using docker
[ https://issues.apache.org/jira/browse/YARN-8648?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16586060#comment-16586060 ] Jim Brennan commented on YARN-8648: --- Thanks [~eyang]! My main concern about the minimal fix is the security aspect, since we will need to add an option to container-executor to tell it to delete all cgroups with a particular name as root (since docker will create them as root). I think this is mitigated if we use the "cgroup" section of the container-executor.cfg to constrain it. This is currently used to enable updating params, but I think it could be used for this as well. It already defines the CGROUPS_ROOT (e.g., /sys/fs/cgroup), and the YARN_HIERARCHY (e.g., hadoop-yarn). We could either add another config parameter to define the list of hierarchies to clean up (e.g., cpuset, freezer, hugetlb, etc...), or we can parse /proc/mounts to determine the full list. I think it's safer to add the config parameter. I will start working on this version unless there are objections. > Container cgroups are leaked when using docker > -- > > Key: YARN-8648 > URL: https://issues.apache.org/jira/browse/YARN-8648 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Jim Brennan >Assignee: Jim Brennan >Priority: Major > Labels: Docker > > When you run with docker and enable cgroups for cpu, docker creates cgroups > for all resources on the system, not just for cpu. For instance, if the > {{yarn.nodemanager.linux-container-executor.cgroups.hierarchy=/hadoop-yarn}}, > the nodemanager will create a cgroup for each container under > {{/sys/fs/cgroup/cpu/hadoop-yarn}}. In the docker case, we pass this path > via the {{--cgroup-parent}} command line argument. Docker then creates a > cgroup for the docker container under that, for instance: > {{/sys/fs/cgroup/cpu/hadoop-yarn/container_id/docker_container_id}}. 
> When the container exits, docker cleans up the {{docker_container_id}} > cgroup, and the nodemanager cleans up the {{container_id}} cgroup. All is > good under {{/sys/fs/cgroup/hadoop-yarn}}. > The problem is that docker also creates that same hierarchy under every > resource under {{/sys/fs/cgroup}}. On the rhel7 system I am using, these > are: blkio, cpuset, devices, freezer, hugetlb, memory, net_cls, net_prio, > perf_event, and systemd. So for instance, docker creates > {{/sys/fs/cgroup/cpuset/hadoop-yarn/container_id/docker_container_id}}, but > it only cleans up the leaf cgroup {{docker_container_id}}. Nobody cleans up > the {{container_id}} cgroups for these other resources. On one of our busy > clusters, we found > 100,000 of these leaked cgroups. > I found this in our 2.8-based version of hadoop, but I have been able to > repro with current hadoop.
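The cleanup being discussed (remove the leaked {{container_id}} directory under each configured controller hierarchy) can be sketched as follows. This is an illustration only: the real change would live in the native container-executor and run as root, the controller list stands in for the proposed config parameter, and a temp directory stands in for CGROUPS_ROOT:

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;

// Hedged sketch of the proposed leak cleanup: for each controller hierarchy,
// remove <root>/<controller>/<yarn-hierarchy>/<container_id>. A temp dir
// stands in for /sys/fs/cgroup; names are illustrative, not the YARN patch.
public class CgroupCleanupSketch {
    static int cleanup(Path cgroupsRoot, String yarnHierarchy,
                       String containerId, String[] controllers) throws IOException {
        int removed = 0;
        for (String controller : controllers) {
            Path leaked = cgroupsRoot.resolve(controller)
                                     .resolve(yarnHierarchy)
                                     .resolve(containerId);
            if (Files.isDirectory(leaked)) {
                // Leaked cgroup dirs are empty leaves: an rmdir-style delete,
                // never a recursive remove.
                Files.delete(leaked);
                removed++;
            }
        }
        return removed;
    }

    // Build a mock leaked hierarchy and clean it; returns how many were removed.
    static int demo() {
        try {
            Path root = Files.createTempDirectory("mock-cgroup-root");
            String[] controllers = {"cpu", "cpuset", "freezer", "hugetlb"};
            for (String c : controllers) {
                Files.createDirectories(
                    root.resolve(c).resolve("hadoop-yarn").resolve("container_42"));
            }
            return cleanup(root, "hadoop-yarn", "container_42", controllers);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("removed=" + demo()); // prints removed=4
    }
}
```

Constraining both the root and the hierarchy name, as the container-executor.cfg "cgroup" section already does, is what keeps a root-privileged delete like this from being abusable.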
[jira] [Commented] (YARN-8664) ApplicationMasterProtocolPBServiceImpl#allocate throw NPE when NM losting
[ https://issues.apache.org/jira/browse/YARN-8664?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16585921#comment-16585921 ] genericqa commented on YARN-8664: - | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 0s{color} | {color:blue} Docker mode activated. {color} | | {color:red}-1{color} | {color:red} docker {color} | {color:red} 0m 6s{color} | {color:red} Docker failed to build yetus/hadoop:749e106. {color} | \\ \\ || Subsystem || Report/Notes || | JIRA Issue | YARN-8664 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12936272/YARN-8664-branch-2.8.01.patch | | Console output | https://builds.apache.org/job/PreCommit-YARN-Build/21634/console | | Powered by | Apache Yetus 0.8.0-SNAPSHOT http://yetus.apache.org | This message was automatically generated. > ApplicationMasterProtocolPBServiceImpl#allocate throw NPE when NM losting > - > > Key: YARN-8664 > URL: https://issues.apache.org/jira/browse/YARN-8664 > Project: Hadoop YARN > Issue Type: Bug > Components: resourcemanager >Affects Versions: 2.8.2 > Environment: >Reporter: Jiandan Yang >Assignee: Jiandan Yang >Priority: Major > Attachments: YARN-8664-branch-2.8.001.pathch, > YARN-8664-branch-2.8.01.patch, YARN-8664-branch-2.8.2.001.patch, > YARN-8664-branch-2.8.2.002.patch > > > ResourceManager logs about exception is: > {code:java} > 2018-08-09 00:52:30,746 WARN [IPC Server handler 5 on 8030] > org.apache.hadoop.ipc.Server: IPC Server handler 5 on 8030, call Call#305638 > Retry#0 org.apache.hadoop.yarn.api.ApplicationMasterProtocolPB.allocate from > 11.13.73.101:51083 > java.lang.NullPointerException > at > org.apache.hadoop.yarn.proto.YarnProtos$ResourceProto.isInitialized(YarnProtos.java:6402) > at > org.apache.hadoop.yarn.proto.YarnProtos$ResourceProto$Builder.build(YarnProtos.java:6642) > at > 
org.apache.hadoop.yarn.api.records.impl.pb.ResourcePBImpl.mergeLocalToProto(ResourcePBImpl.java:254) > at > org.apache.hadoop.yarn.api.records.impl.pb.ResourcePBImpl.getProto(ResourcePBImpl.java:61) > at > org.apache.hadoop.yarn.api.records.impl.pb.NodeReportPBImpl.convertToProtoFormat(NodeReportPBImpl.java:313) > at > org.apache.hadoop.yarn.api.records.impl.pb.NodeReportPBImpl.mergeLocalToBuilder(NodeReportPBImpl.java:264) > at > org.apache.hadoop.yarn.api.records.impl.pb.NodeReportPBImpl.mergeLocalToProto(NodeReportPBImpl.java:287) > at > org.apache.hadoop.yarn.api.records.impl.pb.NodeReportPBImpl.getProto(NodeReportPBImpl.java:224) > at > org.apache.hadoop.yarn.api.protocolrecords.impl.pb.AllocateResponsePBImpl.convertToProtoFormat(AllocateResponsePBImpl.java:714) > at > org.apache.hadoop.yarn.api.protocolrecords.impl.pb.AllocateResponsePBImpl.access$400(AllocateResponsePBImpl.java:69) > at > org.apache.hadoop.yarn.api.protocolrecords.impl.pb.AllocateResponsePBImpl$6$1.next(AllocateResponsePBImpl.java:680) > at > org.apache.hadoop.yarn.api.protocolrecords.impl.pb.AllocateResponsePBImpl$6$1.next(AllocateResponsePBImpl.java:669) > at > com.google.protobuf.AbstractMessageLite$Builder.checkForNullValues(AbstractMessageLite.java:336) > at > com.google.protobuf.AbstractMessageLite$Builder.addAll(AbstractMessageLite.java:323) > at > org.apache.hadoop.yarn.proto.YarnServiceProtos$AllocateResponseProto$Builder.addAllUpdatedNodes(YarnServiceProtos.java:12846) > at > org.apache.hadoop.yarn.api.protocolrecords.impl.pb.AllocateResponsePBImpl.mergeLocalToBuilder(AllocateResponsePBImpl.java:145) > at > org.apache.hadoop.yarn.api.protocolrecords.impl.pb.AllocateResponsePBImpl.mergeLocalToProto(AllocateResponsePBImpl.java:176) > at > org.apache.hadoop.yarn.api.protocolrecords.impl.pb.AllocateResponsePBImpl.getProto(AllocateResponsePBImpl.java:97) > at > 
org.apache.hadoop.yarn.api.impl.pb.service.ApplicationMasterProtocolPBServiceImpl.allocate(ApplicationMasterProtocolPBServiceImpl.java:61) > at > org.apache.hadoop.yarn.proto.ApplicationMasterProtocol$ApplicationMasterProtocolService$2.callBlockingMethod(ApplicationMasterProtocol.java:99) > at > org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:447) > at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:989) > at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:846) > at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:789) > at java.security.AccessController.doPrivileged(Native Method) > at javax.security.auth.Subject.doAs(Subject.java:422) >
[jira] [Commented] (YARN-8685) Add containers query support for nodes/node REST API in RMWebServices
[ https://issues.apache.org/jira/browse/YARN-8685?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16585910#comment-16585910 ] Weiwei Yang commented on YARN-8685: --- Hi [~Tao Yang] Does it make more sense to extend the NM containers REST API for this? E.g., add a filter (by state). If we maintain two classes for container info, we'll end up modifying both if anything changes. Better to avoid that. What do you think? > Add containers query support for nodes/node REST API in RMWebServices > - > > Key: YARN-8685 > URL: https://issues.apache.org/jira/browse/YARN-8685 > Project: Hadoop YARN > Issue Type: Improvement > Components: restapi >Affects Versions: 3.2.0 >Reporter: Tao Yang >Assignee: Tao Yang >Priority: Major > Attachments: YARN-8685.001.patch > > > Currently we can only query running containers from the NM containers REST API, > but can't get the valid containers which are in ALLOCATED/ACQUIRED state. We > have the requirement to get all containers allocated on specified nodes for > debugging. I want to add an "includeContainers" query param (default false) > for the nodes/node REST API in RMWebServices, so that we can get valid containers > on nodes if "includeContainers=true" is specified.
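The alternative suggested above, one endpoint filtering its existing container list by requested states rather than a second container-info class, can be sketched like this. ContainerView and the state strings are illustrative stand-ins for the web-service DAO types, not actual YARN classes:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hedged sketch of a state filter on one container list, avoiding a parallel
// info class that would have to be kept in sync.
public class ContainerFilterSketch {
    static class ContainerView {
        final String id;
        final String state;
        ContainerView(String id, String state) { this.id = id; this.state = state; }
    }

    // An empty filter means "no filter": return everything.
    static List<ContainerView> byState(List<ContainerView> all, Set<String> wanted) {
        if (wanted.isEmpty()) return all;
        List<ContainerView> out = new ArrayList<>();
        for (ContainerView c : all) {
            if (wanted.contains(c.state)) out.add(c);
        }
        return out;
    }

    // Ask for the ALLOCATED/ACQUIRED containers the issue says are missing today.
    static int demo() {
        List<ContainerView> all = Arrays.asList(
            new ContainerView("container_1", "RUNNING"),
            new ContainerView("container_2", "ALLOCATED"),
            new ContainerView("container_3", "ACQUIRED"));
        Set<String> wanted = new HashSet<>(Arrays.asList("ALLOCATED", "ACQUIRED"));
        return byState(all, wanted).size();
    }

    public static void main(String[] args) {
        System.out.println("matched=" + demo()); // prints matched=2
    }
}
```

A query param would then just select the state set, whether the endpoint lives on the NM or in RMWebServices.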
[jira] [Commented] (YARN-8685) Add containers query support for nodes/node REST API in RMWebServices
[ https://issues.apache.org/jira/browse/YARN-8685?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16585907#comment-16585907 ] genericqa commented on YARN-8685: - | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 24s{color} | {color:blue} Docker mode activated. {color} | || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 7 new or modified test files. {color} | || || || || {color:brown} trunk Compile Tests {color} || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 53s{color} | {color:blue} Maven dependency ordering for branch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 19m 28s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 3m 10s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 1m 13s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 36s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 15m 23s{color} | {color:green} branch has no errors when building and testing our client artifacts. 
{color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 11s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 0s{color} | {color:green} trunk passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 13s{color} | {color:blue} Maven dependency ordering for patch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 1m 28s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 3m 11s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 3m 11s{color} | {color:green} the patch passed {color} | | {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange} 1m 4s{color} | {color:orange} hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server: The patch generated 4 new + 42 unchanged - 0 fixed = 46 total (was 42) {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 24s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 14m 16s{color} | {color:green} patch has no errors when building and testing our client artifacts. 
{color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 23s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 0s{color} | {color:green} the patch passed {color} | || || || || {color:brown} Other Tests {color} || | {color:red}-1{color} | {color:red} unit {color} | {color:red} 78m 53s{color} | {color:red} hadoop-yarn-server-resourcemanager in the patch failed. {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 1m 42s{color} | {color:green} hadoop-yarn-server-router in the patch passed. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 34s{color} | {color:green} The patch does not generate ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black}150m 28s{color} | {color:black} {color} | \\ \\ || Reason || Tests || | Failed junit tests | hadoop.yarn.server.resourcemanager.applicationsmanager.TestAMRestart | | | hadoop.yarn.server.resourcemanager.scheduler.capacity.TestIncreaseAllocationExpirer | \\ \\ || Subsystem || Report/Notes || | Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:ba1ab08 | | JIRA Issue | YARN-8685 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12936257/YARN-8685.001.patch | | Optional Tests | asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle | | uname | Linux 321bb3677aa6 3.13.0-153-generic #203-Ubuntu SMP Thu Jun 14 08:52:28 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality |
[jira] [Updated] (YARN-8685) Add containers query support for nodes/node REST API in RMWebServices
[ https://issues.apache.org/jira/browse/YARN-8685?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Weiwei Yang updated YARN-8685: -- Issue Type: Improvement (was: Bug) > Add containers query support for nodes/node REST API in RMWebServices > - > > Key: YARN-8685 > URL: https://issues.apache.org/jira/browse/YARN-8685 > Project: Hadoop YARN > Issue Type: Improvement > Components: restapi >Affects Versions: 3.2.0 >Reporter: Tao Yang >Assignee: Tao Yang >Priority: Major > Attachments: YARN-8685.001.patch > > > Currently we can only query running containers from the NM containers REST API, > but can't get the valid containers which are in ALLOCATED/ACQUIRED state. We > have the requirement to get all containers allocated on specified nodes for > debugging. I want to add an "includeContainers" query param (default false) > for the nodes/node REST API in RMWebServices, so that we can get valid containers > on nodes if "includeContainers=true" is specified.
[jira] [Commented] (YARN-8686) Queue Management API - not returning JSON or XML response data when passing Accept header
[ https://issues.apache.org/jira/browse/YARN-8686?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16585898#comment-16585898 ] Weiwei Yang commented on YARN-8686: --- Hmm, it is not? I thought I used this before. Have you tried the "contentType" header? > Queue Management API - not returning JSON or XML response data when passing Accept header > - > > Key: YARN-8686 > URL: https://issues.apache.org/jira/browse/YARN-8686 > Project: Hadoop YARN > Issue Type: Sub-task > Components: yarn >Reporter: Akhil PB >Assignee: Akhil PB >Priority: Major > > The API should return JSON or XML response data based on the Accept header. Instead, > the API returns plain text for success as well as error scenarios.
[jira] [Updated] (YARN-8664) ApplicationMasterProtocolPBServiceImpl#allocate throw NPE when NM losting
[ https://issues.apache.org/jira/browse/YARN-8664?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Weiwei Yang updated YARN-8664: -- Attachment: YARN-8664-branch-2.8.01.patch > ApplicationMasterProtocolPBServiceImpl#allocate throw NPE when NM losting > - > > Key: YARN-8664 > URL: https://issues.apache.org/jira/browse/YARN-8664 > Project: Hadoop YARN > Issue Type: Bug > Components: resourcemanager >Affects Versions: 2.8.2 > Environment: >Reporter: Jiandan Yang >Assignee: Jiandan Yang >Priority: Major > Attachments: YARN-8664-branch-2.8.001.pathch, > YARN-8664-branch-2.8.01.patch, YARN-8664-branch-2.8.2.001.patch, > YARN-8664-branch-2.8.2.002.patch > > > ResourceManager logs about exception is: > {code:java} > 2018-08-09 00:52:30,746 WARN [IPC Server handler 5 on 8030] > org.apache.hadoop.ipc.Server: IPC Server handler 5 on 8030, call Call#305638 > Retry#0 org.apache.hadoop.yarn.api.ApplicationMasterProtocolPB.allocate from > 11.13.73.101:51083 > java.lang.NullPointerException > at > org.apache.hadoop.yarn.proto.YarnProtos$ResourceProto.isInitialized(YarnProtos.java:6402) > at > org.apache.hadoop.yarn.proto.YarnProtos$ResourceProto$Builder.build(YarnProtos.java:6642) > at > org.apache.hadoop.yarn.api.records.impl.pb.ResourcePBImpl.mergeLocalToProto(ResourcePBImpl.java:254) > at > org.apache.hadoop.yarn.api.records.impl.pb.ResourcePBImpl.getProto(ResourcePBImpl.java:61) > at > org.apache.hadoop.yarn.api.records.impl.pb.NodeReportPBImpl.convertToProtoFormat(NodeReportPBImpl.java:313) > at > org.apache.hadoop.yarn.api.records.impl.pb.NodeReportPBImpl.mergeLocalToBuilder(NodeReportPBImpl.java:264) > at > org.apache.hadoop.yarn.api.records.impl.pb.NodeReportPBImpl.mergeLocalToProto(NodeReportPBImpl.java:287) > at > org.apache.hadoop.yarn.api.records.impl.pb.NodeReportPBImpl.getProto(NodeReportPBImpl.java:224) > at > org.apache.hadoop.yarn.api.protocolrecords.impl.pb.AllocateResponsePBImpl.convertToProtoFormat(AllocateResponsePBImpl.java:714) > at > 
org.apache.hadoop.yarn.api.protocolrecords.impl.pb.AllocateResponsePBImpl.access$400(AllocateResponsePBImpl.java:69) > at > org.apache.hadoop.yarn.api.protocolrecords.impl.pb.AllocateResponsePBImpl$6$1.next(AllocateResponsePBImpl.java:680) > at > org.apache.hadoop.yarn.api.protocolrecords.impl.pb.AllocateResponsePBImpl$6$1.next(AllocateResponsePBImpl.java:669) > at > com.google.protobuf.AbstractMessageLite$Builder.checkForNullValues(AbstractMessageLite.java:336) > at > com.google.protobuf.AbstractMessageLite$Builder.addAll(AbstractMessageLite.java:323) > at > org.apache.hadoop.yarn.proto.YarnServiceProtos$AllocateResponseProto$Builder.addAllUpdatedNodes(YarnServiceProtos.java:12846) > at > org.apache.hadoop.yarn.api.protocolrecords.impl.pb.AllocateResponsePBImpl.mergeLocalToBuilder(AllocateResponsePBImpl.java:145) > at > org.apache.hadoop.yarn.api.protocolrecords.impl.pb.AllocateResponsePBImpl.mergeLocalToProto(AllocateResponsePBImpl.java:176) > at > org.apache.hadoop.yarn.api.protocolrecords.impl.pb.AllocateResponsePBImpl.getProto(AllocateResponsePBImpl.java:97) > at > org.apache.hadoop.yarn.api.impl.pb.service.ApplicationMasterProtocolPBServiceImpl.allocate(ApplicationMasterProtocolPBServiceImpl.java:61) > at > org.apache.hadoop.yarn.proto.ApplicationMasterProtocol$ApplicationMasterProtocolService$2.callBlockingMethod(ApplicationMasterProtocol.java:99) > at > org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:447) > at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:989) > at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:846) > at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:789) > at java.security.AccessController.doPrivileged(Native Method) > at javax.security.auth.Subject.doAs(Subject.java:422) > at > org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1804) > at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2457) > {code} > ApplicationMasterService#allocate will call 
AllocateResponse#setUpdatedNodes > when an NM is lost, and AllocateResponse#getProto will call > ResourcePBImpl#getProto to transform NodeReportPBImpl#capacity into PB > format. Because ResourcePBImpl is not thread safe and > multiple AMs may call allocate at the same time, ResourcePBImpl#getProto may > throw NullPointerException or UnsupportedOperationException. > I wrote test code which can reproduce the exception. > {code:java} > @Test > public void testResource1() throws InterruptedException { >
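The hazard described here, getProto() lazily rebuilding shared state while several allocate() calls read it, can be illustrated with a toy record. This mirrors the clear-then-refill pattern and the style of fix (serializing the rebuild); it is not the actual ResourcePBImpl code, and Jiandan's reproducer test is truncated above:

```java
import java.util.ArrayList;
import java.util.List;

// Hedged sketch of the race: a lazily rebuilt record shared across threads.
// Without the lock, a reader can observe the list mid-rebuild and hit an NPE
// or a half-built value, like the allocate() NPE in the stack trace above.
public class LazyProtoSketch {
    static class LazyRecord {
        private List<Integer> proto;      // rebuilt on every read, like mergeLocalToProto
        private final int memory;
        LazyRecord(int memory) { this.memory = memory; }

        // Remove `synchronized` and concurrent callers can see proto half-built.
        synchronized int read() {
            proto = new ArrayList<>();    // clear...
            proto.add(memory);            // ...then refill: the unsafe window
            return proto.get(0);
        }
    }

    // Hammer one shared record from several threads; true if every read was intact.
    static boolean hammer() {
        final LazyRecord shared = new LazyRecord(4096);
        final boolean[] ok = {true};
        Thread[] threads = new Thread[8];
        for (int i = 0; i < threads.length; i++) {
            threads[i] = new Thread(() -> {
                for (int j = 0; j < 10_000; j++) {
                    try {
                        if (shared.read() != 4096) ok[0] = false;
                    } catch (RuntimeException e) { // NPE etc. if unsynchronized
                        ok[0] = false;
                    }
                }
            });
            threads[i].start();
        }
        try {
            for (Thread t : threads) t.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return ok[0];
    }

    public static void main(String[] args) {
        System.out.println(hammer() ? "consistent" : "corrupted");
    }
}
```

With the rebuild synchronized every read is intact; dropping the keyword recreates the window the bug report describes.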
[jira] [Commented] (YARN-8683) Support scheduling request for outstanding requests info in RMAppAttemptBlock
[ https://issues.apache.org/jira/browse/YARN-8683?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16585870#comment-16585870 ] Weiwei Yang commented on YARN-8683: --- Hi [~Tao Yang] Thanks for the patch, it looks good. Do you have a screenshot of how this looks on the UI? I would like to take a look. Just want to make sure the look and feel won't confuse people that are not using {{SchedulingRequest}}. Thanks! > Support scheduling request for outstanding requests info in RMAppAttemptBlock > - > > Key: YARN-8683 > URL: https://issues.apache.org/jira/browse/YARN-8683 > Project: Hadoop YARN > Issue Type: Bug > Components: webapp >Affects Versions: 3.2.0 >Reporter: Tao Yang >Assignee: Tao Yang >Priority: Major > Attachments: YARN-8683.001.patch, YARN-8683.002.patch > > > Currently the outstanding requests info in the app attempt page only shows pending > resource requests; pending scheduling requests should be shown here too.
[jira] [Commented] (YARN-7494) Add muti node lookup support for better placement
[ https://issues.apache.org/jira/browse/YARN-7494?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16585857#comment-16585857 ] Weiwei Yang commented on YARN-7494: --- Hi [~sunilg], looks good, +1 once the checkstyle issues are fixed. > Add muti node lookup support for better placement > - > > Key: YARN-7494 > URL: https://issues.apache.org/jira/browse/YARN-7494 > Project: Hadoop YARN > Issue Type: Sub-task > Components: capacity scheduler >Reporter: Sunil Govindan >Assignee: Sunil Govindan >Priority: Major > Attachments: YARN-7494.001.patch, YARN-7494.002.patch, > YARN-7494.003.patch, YARN-7494.004.patch, YARN-7494.005.patch, > YARN-7494.006.patch, YARN-7494.007.patch, YARN-7494.008.patch, > YARN-7494.009.patch, YARN-7494.010.patch, YARN-7494.11.patch, > YARN-7494.12.patch, YARN-7494.13.patch, YARN-7494.14.patch, > YARN-7494.15.patch, YARN-7494.16.patch, YARN-7494.17.patch, > YARN-7494.18.patch, YARN-7494.v0.patch, YARN-7494.v1.patch, > multi-node-designProposal.png > > > Instead of single node, for effectiveness we can consider a multi node lookup > based on partition to start with. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-8683) Support scheduling request for outstanding requests info in RMAppAttemptBlock
[ https://issues.apache.org/jira/browse/YARN-8683?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16585846#comment-16585846 ] genericqa commented on YARN-8683: - | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 23s{color} | {color:blue} Docker mode activated. {color} | || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:red}-1{color} | {color:red} test4tests {color} | {color:red} 0m 0s{color} | {color:red} The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color} | || || || || {color:brown} trunk Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 26m 48s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 51s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 44s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 54s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 13m 31s{color} | {color:green} branch has no errors when building and testing our client artifacts. 
{color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 22s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 32s{color} | {color:green} trunk passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 47s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 39s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 39s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 34s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 42s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 12m 31s{color} | {color:green} patch has no errors when building and testing our client artifacts. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 16s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 26s{color} | {color:green} the patch passed {color} | || || || || {color:brown} Other Tests {color} || | {color:red}-1{color} | {color:red} unit {color} | {color:red} 80m 38s{color} | {color:red} hadoop-yarn-server-resourcemanager in the patch failed. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 39s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} | | {color:black}{color} | {color:black} {color} | {color:black}142m 50s{color} | {color:black} {color} | \\ \\ || Reason || Tests || | Failed junit tests | hadoop.yarn.server.resourcemanager.scheduler.capacity.TestIncreaseAllocationExpirer | | | hadoop.yarn.server.resourcemanager.scheduler.capacity.TestCapacitySchedulerSchedulingRequestUpdate | | | hadoop.yarn.server.resourcemanager.scheduler.fair.TestFairScheduler | | | hadoop.yarn.server.resourcemanager.TestApplicationMasterService | | | hadoop.yarn.server.resourcemanager.applicationsmanager.TestAMRestart | \\ \\ || Subsystem || Report/Notes || | Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:ba1ab08 | | JIRA Issue | YARN-8683 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12936254/YARN-8683.002.patch | | Optional Tests | asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle | | uname | Linux a8b161ef3702 3.13.0-143-generic #192-Ubuntu SMP Tue Feb 27 10:45:36 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | /testptch/patchprocess/precommit/personality/provided.sh | | git revision | trunk / e3d73bb | | maven | version: Apache Maven 3.3.9 | | Default Java | 1.8.0_181 | | findbugs | v3.1.0-RC1 | | unit |
[jira] [Updated] (YARN-8686) Queue Management API - not returning JSON or XML response data when passing Accept header
[ https://issues.apache.org/jira/browse/YARN-8686?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Akhil PB updated YARN-8686: --- Summary: Queue Management API - not returning JSON or XML response data when passing Accept header (was: Queue Management API - not returning JSON or XML response data when passing accept header) > Queue Management API - not returning JSON or XML response data when passing > Accept header > - > > Key: YARN-8686 > URL: https://issues.apache.org/jira/browse/YARN-8686 > Project: Hadoop YARN > Issue Type: Sub-task > Components: yarn >Reporter: Akhil PB >Assignee: Akhil PB >Priority: Major > > API should return JSON or XML response data based on Accept header. Instead, > API returns plain text for success as well as error scenarios. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Created] (YARN-8686) Queue Management API - not returning JSON or XML response data when passing accept header
Akhil PB created YARN-8686: -- Summary: Queue Management API - not returning JSON or XML response data when passing accept header Key: YARN-8686 URL: https://issues.apache.org/jira/browse/YARN-8686 Project: Hadoop YARN Issue Type: Sub-task Components: yarn Reporter: Akhil PB Assignee: Akhil PB API should return JSON or XML response data based on Accept header. Instead, API returns plain text for success as well as error scenarios. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
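The behaviour YARN-8686 asks for is ordinary HTTP content negotiation: pick the response media type from the request's Accept header instead of always returning plain text. A minimal, self-contained sketch of that selection logic follows; the class and method names are hypothetical and this is not the actual RMWebServices code:

```java
public class AcceptHeaderNegotiator {
    // Returns the media type to use for the response body given the
    // client's Accept header. Falls back to JSON when nothing matches,
    // rather than emitting text/plain as the bug describes.
    public static String negotiate(String acceptHeader) {
        if (acceptHeader == null || acceptHeader.isEmpty()) {
            return "application/json";
        }
        for (String part : acceptHeader.split(",")) {
            // drop any quality parameters such as ";q=0.9"
            String type = part.trim().split(";")[0].trim().toLowerCase();
            if (type.equals("application/xml") || type.equals("text/xml")) {
                return "application/xml";
            }
            if (type.equals("application/json") || type.equals("*/*")) {
                return "application/json";
            }
        }
        return "application/json";
    }

    public static void main(String[] args) {
        System.out.println(negotiate("application/xml")); // application/xml
        System.out.println(negotiate("text/plain"));      // application/json
    }
}
```

In a JAX-RS resource the same effect is usually achieved declaratively with @Produces({MediaType.APPLICATION_JSON, MediaType.APPLICATION_XML}) on the endpoint methods, letting the framework negotiate.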
[jira] [Commented] (YARN-8129) Improve error message for invalid value in fields attribute
[ https://issues.apache.org/jira/browse/YARN-8129?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16585825#comment-16585825 ] Abhishek Modi commented on YARN-8129: - Thanks [~suma.shivaprasad] for the review. [~suma.shivaprasad] [~rohithsharma] [~vrushalic] could you please commit it if it looks good. > Improve error message for invalid value in fields attribute > --- > > Key: YARN-8129 > URL: https://issues.apache.org/jira/browse/YARN-8129 > Project: Hadoop YARN > Issue Type: Sub-task > Components: ATSv2 >Reporter: Charan Hebri >Assignee: Abhishek Modi >Priority: Minor > Attachments: YARN-8129.001.patch > > > Query with invalid values for the 'fields' attributes throws a message that > isn't very informative. > Reader log, > {noformat} > 2018-04-09 08:59:46,069 INFO reader.TimelineReaderWebServices > (TimelineReaderWebServices.java:getEntities(595)) - Received URL > /ws/v2/timeline/users/hrt_qa/flows/test_flow/apps?limit=3=INFOS from > user hrt_qa > 2018-04-09 08:59:46,070 INFO reader.TimelineReaderWebServices > (TimelineReaderWebServices.java:handleException(173)) - Processed URL > /ws/v2/timeline/users/hrt_qa/flows/test_flow/apps?limit=3=INFOS but > encountered exception (Took 1 ms.){noformat} > Here INFOS is the invalid value for the fields attribute. > Response, > {noformat} > { > "exception": "BadRequestException", > "message": "java.lang.Exception: No enum constant > org.apache.hadoop.yarn.server.timelineservice.storage.TimelineReader.Field.INFOS", > "javaClassName": "org.apache.hadoop.yarn.webapp.BadRequestException" > }{noformat} > The message shouldn't ideally contain the enum information. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
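The improvement YARN-8129 asks for — not leaking the raw "No enum constant ..." text to the client — amounts to catching the IllegalArgumentException from Enum.valueOf and rethrowing with a human-readable message. A sketch under assumptions: the class name is hypothetical and the enum constants below are illustrative, not the exact TimelineReader.Field values:

```java
import java.util.Arrays;

public class FieldsParamParser {
    // Illustrative stand-in for the reader's Field enum.
    public enum Field { ALL, EVENTS, INFO, CONFIGS, METRICS }

    // Parses a "fields" query-param value, producing a friendly error
    // that lists the valid values instead of the raw enum exception.
    public static Field parseField(String raw) {
        try {
            return Field.valueOf(raw.trim().toUpperCase());
        } catch (IllegalArgumentException e) {
            throw new IllegalArgumentException(
                "Invalid fields value: " + raw
                + ". Valid values are: " + Arrays.toString(Field.values()));
        }
    }
}
```

With this shape, a query like fields=INFOS yields "Invalid fields value: INFOS. Valid values are: [ALL, EVENTS, INFO, CONFIGS, METRICS]" rather than the enum-internal message shown in the report.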
[jira] [Commented] (YARN-7494) Add muti node lookup support for better placement
[ https://issues.apache.org/jira/browse/YARN-7494?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16585789#comment-16585789 ] genericqa commented on YARN-7494: - | (/) *{color:green}+1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 11m 53s{color} | {color:blue} Docker mode activated. {color} | || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 6 new or modified test files. {color} | || || || || {color:brown} trunk Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 22m 42s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 41s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 41s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 46s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 11m 17s{color} | {color:green} branch has no errors when building and testing our client artifacts. 
{color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 16s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 26s{color} | {color:green} trunk passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 44s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 40s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 40s{color} | {color:green} the patch passed {color} | | {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange} 0m 40s{color} | {color:orange} hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager: The patch generated 35 new + 671 unchanged - 4 fixed = 706 total (was 675) {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 45s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 10m 49s{color} | {color:green} patch has no errors when building and testing our client artifacts. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 12s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 22s{color} | {color:green} the patch passed {color} | || || || || {color:brown} Other Tests {color} || | {color:green}+1{color} | {color:green} unit {color} | {color:green} 64m 46s{color} | {color:green} hadoop-yarn-server-resourcemanager in the patch passed. 
{color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 23s{color} | {color:green} The patch does not generate ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black}129m 31s{color} | {color:black} {color} | \\ \\ || Subsystem || Report/Notes || | Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:ba1ab08 | | JIRA Issue | YARN-7494 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12936240/YARN-7494.18.patch | | Optional Tests | asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle | | uname | Linux 33645c430996 4.4.0-43-generic #63-Ubuntu SMP Wed Oct 12 13:48:03 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | /testptch/patchprocess/precommit/personality/provided.sh | | git revision | trunk / e3d73bb | | maven | version: Apache Maven 3.3.9 | | Default Java | 1.8.0_181 | | findbugs | v3.1.0-RC1 | | checkstyle | https://builds.apache.org/job/PreCommit-YARN-Build/21631/artifact/out/diff-checkstyle-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-resourcemanager.txt | | Test Results | https://builds.apache.org/job/PreCommit-YARN-Build/21631/testReport/ | | Max. process+thread count | 903 (vs. ulimit of 1) | | modules | C: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager U:
[jira] [Commented] (YARN-8685) Add containers query support for nodes/node REST API in RMWebServices
[ https://issues.apache.org/jira/browse/YARN-8685?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16585762#comment-16585762 ] Tao Yang commented on YARN-8685: Attached v1 patch for review. > Add containers query support for nodes/node REST API in RMWebServices > - > > Key: YARN-8685 > URL: https://issues.apache.org/jira/browse/YARN-8685 > Project: Hadoop YARN > Issue Type: Bug > Components: restapi >Affects Versions: 3.2.0 >Reporter: Tao Yang >Assignee: Tao Yang >Priority: Major > Attachments: YARN-8685.001.patch > > > Currently we can only query running containers from NM containers REST API, > but can't get the valid containers which are in ALLOCATED/ACQUIRED state. We > have the requirements to get all containers allocated on specified nodes for > debugging. I want to add a "includeContainers" query param (default false) > for nodes/node REST API in RMWebServices, so that we can get valid containers > on nodes if "includeContainers=true" specified. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-8685) Add containers query support for nodes/node REST API in RMWebServices
[ https://issues.apache.org/jira/browse/YARN-8685?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tao Yang updated YARN-8685: --- Attachment: YARN-8685.001.patch > Add containers query support for nodes/node REST API in RMWebServices > - > > Key: YARN-8685 > URL: https://issues.apache.org/jira/browse/YARN-8685 > Project: Hadoop YARN > Issue Type: Bug > Components: restapi >Affects Versions: 3.2.0 >Reporter: Tao Yang >Assignee: Tao Yang >Priority: Major > Attachments: YARN-8685.001.patch > > > Currently we can only query running containers from NM containers REST API, > but can't get the valid containers which are in ALLOCATED/ACQUIRED state. We > have the requirements to get all containers allocated on specified nodes for > debugging. I want to add a "includeContainers" query param (default false) > for nodes/node REST API in RMWebServices, so that we can get valid containers > on nodes if "includeContainers=true" specified. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-8683) Support scheduling request for outstanding requests info in RMAppAttemptBlock
[ https://issues.apache.org/jira/browse/YARN-8683?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tao Yang updated YARN-8683: --- Attachment: YARN-8683.002.patch > Support scheduling request for outstanding requests info in RMAppAttemptBlock > - > > Key: YARN-8683 > URL: https://issues.apache.org/jira/browse/YARN-8683 > Project: Hadoop YARN > Issue Type: Bug > Components: webapp >Affects Versions: 3.2.0 >Reporter: Tao Yang >Assignee: Tao Yang >Priority: Major > Attachments: YARN-8683.001.patch, YARN-8683.002.patch > > > Currently outstanding requests info in app attempt page only show pending > resource requests, pending scheduling requests should be shown here too. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-8683) Support scheduling request for outstanding requests info in RMAppAttemptBlock
[ https://issues.apache.org/jira/browse/YARN-8683?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16585709#comment-16585709 ] Tao Yang commented on YARN-8683: The failed UT seems unrelated to this patch. Attached v2 patch to correct the checkstyle. > Support scheduling request for outstanding requests info in RMAppAttemptBlock > - > > Key: YARN-8683 > URL: https://issues.apache.org/jira/browse/YARN-8683 > Project: Hadoop YARN > Issue Type: Bug > Components: webapp >Affects Versions: 3.2.0 >Reporter: Tao Yang >Assignee: Tao Yang >Priority: Major > Attachments: YARN-8683.001.patch, YARN-8683.002.patch > > > Currently outstanding requests info in app attempt page only show pending > resource requests, pending scheduling requests should be shown here too. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-8683) Support scheduling request for outstanding requests info in RMAppAttemptBlock
[ https://issues.apache.org/jira/browse/YARN-8683?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tao Yang updated YARN-8683: --- Attachment: (was: YARN-8683.002.patch) > Support scheduling request for outstanding requests info in RMAppAttemptBlock > - > > Key: YARN-8683 > URL: https://issues.apache.org/jira/browse/YARN-8683 > Project: Hadoop YARN > Issue Type: Bug > Components: webapp >Affects Versions: 3.2.0 >Reporter: Tao Yang >Assignee: Tao Yang >Priority: Major > Attachments: YARN-8683.001.patch, YARN-8683.002.patch > > > Currently outstanding requests info in app attempt page only show pending > resource requests, pending scheduling requests should be shown here too. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-8683) Support scheduling request for outstanding requests info in RMAppAttemptBlock
[ https://issues.apache.org/jira/browse/YARN-8683?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tao Yang updated YARN-8683: --- Attachment: YARN-8683.002.patch > Support scheduling request for outstanding requests info in RMAppAttemptBlock > - > > Key: YARN-8683 > URL: https://issues.apache.org/jira/browse/YARN-8683 > Project: Hadoop YARN > Issue Type: Bug > Components: webapp >Affects Versions: 3.2.0 >Reporter: Tao Yang >Assignee: Tao Yang >Priority: Major > Attachments: YARN-8683.001.patch, YARN-8683.002.patch > > > Currently outstanding requests info in app attempt page only show pending > resource requests, pending scheduling requests should be shown here too. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-8685) Add containers query support for nodes/node REST API in RMWebServices
[ https://issues.apache.org/jira/browse/YARN-8685?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tao Yang updated YARN-8685: --- Description: Currently we can only query running containers from NM containers REST API, but can't get the valid containers which are in ALLOCATED/ACQUIRED state. We have the requirements to get all containers allocated on specified nodes for debugging. I want to add a "includeContainers" query param (default false) for nodes/node REST API in RMWebServices, so that we can get valid containers on nodes if "includeContainers=true" specified. (was: Currently we can only query running containers from NM containers REST API, but can't get the valid containers which are in ALLOCATED/ACQUIRED state. We have the requirements to get all containers allocated on specified nodes for debugging or managing. I think we can add a "includeContainers" query param (default false) for nodes/node REST API in RMWebServices, so that we can get valid containers on nodes if "includeContainers=true" specified.) > Add containers query support for nodes/node REST API in RMWebServices > - > > Key: YARN-8685 > URL: https://issues.apache.org/jira/browse/YARN-8685 > Project: Hadoop YARN > Issue Type: Bug > Components: restapi >Affects Versions: 3.2.0 >Reporter: Tao Yang >Assignee: Tao Yang >Priority: Major > > Currently we can only query running containers from NM containers REST API, > but can't get the valid containers which are in ALLOCATED/ACQUIRED state. We > have the requirements to get all containers allocated on specified nodes for > debugging. I want to add a "includeContainers" query param (default false) > for nodes/node REST API in RMWebServices, so that we can get valid containers > on nodes if "includeContainers=true" specified. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Created] (YARN-8685) Add containers query support for nodes/node REST API in RMWebServices
Tao Yang created YARN-8685: -- Summary: Add containers query support for nodes/node REST API in RMWebServices Key: YARN-8685 URL: https://issues.apache.org/jira/browse/YARN-8685 Project: Hadoop YARN Issue Type: Bug Components: restapi Affects Versions: 3.2.0 Reporter: Tao Yang Assignee: Tao Yang Currently we can only query running containers from NM containers REST API, but can't get the valid containers which are in ALLOCATED/ACQUIRED state. We have the requirements to get all containers allocated on specified nodes for debugging or managing. I think we can add a "includeContainers" query param (default false) for nodes/node REST API in RMWebServices, so that we can get valid containers on nodes if "includeContainers=true" specified. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
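The "includeContainers" flag proposed in YARN-8685 is essentially a filter widening: false keeps today's running-only view, true also surfaces containers still in ALLOCATED/ACQUIRED state. A minimal sketch of that filter, with hypothetical class names (not the actual RMWebServices types):

```java
import java.util.ArrayList;
import java.util.List;

public class NodeContainersSketch {
    public enum State { ALLOCATED, ACQUIRED, RUNNING, COMPLETED }

    public static class Container {
        final String id;
        final State state;
        public Container(String id, State state) {
            this.id = id;
            this.state = state;
        }
    }

    // includeContainers=false reproduces the current behaviour (running
    // containers only); true also returns still-valid ALLOCATED/ACQUIRED
    // containers, while finished ones stay excluded.
    public static List<Container> visible(List<Container> all,
                                          boolean includeContainers) {
        List<Container> out = new ArrayList<>();
        for (Container c : all) {
            boolean keep = includeContainers
                ? c.state != State.COMPLETED
                : c.state == State.RUNNING;
            if (keep) {
                out.add(c);
            }
        }
        return out;
    }
}
```

For a node with one RUNNING, one ALLOCATED, and one COMPLETED container, the legacy view returns one container and the includeContainers=true view returns two.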
[jira] [Commented] (YARN-8683) Support scheduling request for outstanding requests info in RMAppAttemptBlock
[ https://issues.apache.org/jira/browse/YARN-8683?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16585662#comment-16585662 ] genericqa commented on YARN-8683: - | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 38s{color} | {color:blue} Docker mode activated. {color} | || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:red}-1{color} | {color:red} test4tests {color} | {color:red} 0m 0s{color} | {color:red} The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color} | || || || || {color:brown} trunk Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 19m 48s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 45s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 38s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 49s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 12m 35s{color} | {color:green} branch has no errors when building and testing our client artifacts. 
{color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 31s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 31s{color} | {color:green} trunk passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 52s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 50s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 50s{color} | {color:green} the patch passed {color} | | {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange} 0m 37s{color} | {color:orange} hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager: The patch generated 3 new + 126 unchanged - 0 fixed = 129 total (was 126) {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 59s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 13m 5s{color} | {color:green} patch has no errors when building and testing our client artifacts. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 18s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 27s{color} | {color:green} the patch passed {color} | || || || || {color:brown} Other Tests {color} || | {color:red}-1{color} | {color:red} unit {color} | {color:red} 74m 27s{color} | {color:red} hadoop-yarn-server-resourcemanager in the patch failed. 
{color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 45s{color} | {color:green} The patch does not generate ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black}130m 23s{color} | {color:black} {color} | \\ \\ || Reason || Tests || | Failed junit tests | hadoop.yarn.server.resourcemanager.reservation.TestCapacityOverTimePolicy | \\ \\ || Subsystem || Report/Notes || | Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:ba1ab08 | | JIRA Issue | YARN-8683 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12936228/YARN-8683.001.patch | | Optional Tests | asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle | | uname | Linux 4b6344a3ccc5 3.13.0-153-generic #203-Ubuntu SMP Thu Jun 14 08:52:28 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | /testptch/patchprocess/precommit/personality/provided.sh | | git revision | trunk / 4aacbff | | maven | version: Apache Maven 3.3.9 | | Default Java | 1.8.0_171 | | findbugs | v3.1.0-RC1 | | checkstyle | https://builds.apache.org/job/PreCommit-YARN-Build/21630/artifact/out/diff-checkstyle-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-resourcemanager.txt | | unit |
[jira] [Commented] (YARN-7494) Add muti node lookup support for better placement
[ https://issues.apache.org/jira/browse/YARN-7494?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16585620#comment-16585620 ] Sunil Govindan commented on YARN-7494: -- Fixed test case. Attaching new patch, > Add muti node lookup support for better placement > - > > Key: YARN-7494 > URL: https://issues.apache.org/jira/browse/YARN-7494 > Project: Hadoop YARN > Issue Type: Sub-task > Components: capacity scheduler >Reporter: Sunil Govindan >Assignee: Sunil Govindan >Priority: Major > Attachments: YARN-7494.001.patch, YARN-7494.002.patch, > YARN-7494.003.patch, YARN-7494.004.patch, YARN-7494.005.patch, > YARN-7494.006.patch, YARN-7494.007.patch, YARN-7494.008.patch, > YARN-7494.009.patch, YARN-7494.010.patch, YARN-7494.11.patch, > YARN-7494.12.patch, YARN-7494.13.patch, YARN-7494.14.patch, > YARN-7494.15.patch, YARN-7494.16.patch, YARN-7494.17.patch, > YARN-7494.18.patch, YARN-7494.v0.patch, YARN-7494.v1.patch, > multi-node-designProposal.png > > > Instead of single node, for effectiveness we can consider a multi node lookup > based on partition to start with. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-7494) Add muti node lookup support for better placement
[ https://issues.apache.org/jira/browse/YARN-7494?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sunil Govindan updated YARN-7494: - Attachment: YARN-7494.18.patch > Add muti node lookup support for better placement > - > > Key: YARN-7494 > URL: https://issues.apache.org/jira/browse/YARN-7494 > Project: Hadoop YARN > Issue Type: Sub-task > Components: capacity scheduler >Reporter: Sunil Govindan >Assignee: Sunil Govindan >Priority: Major > Attachments: YARN-7494.001.patch, YARN-7494.002.patch, > YARN-7494.003.patch, YARN-7494.004.patch, YARN-7494.005.patch, > YARN-7494.006.patch, YARN-7494.007.patch, YARN-7494.008.patch, > YARN-7494.009.patch, YARN-7494.010.patch, YARN-7494.11.patch, > YARN-7494.12.patch, YARN-7494.13.patch, YARN-7494.14.patch, > YARN-7494.15.patch, YARN-7494.16.patch, YARN-7494.17.patch, > YARN-7494.18.patch, YARN-7494.v0.patch, YARN-7494.v1.patch, > multi-node-designProposal.png > > > Instead of single node, for effectiveness we can consider a multi node lookup > based on partition to start with. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
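The idea in YARN-7494 — considering a sorted list of a partition's candidate nodes rather than a single node per scheduling attempt — can be sketched as follows. This is an illustrative simplification under assumptions (memory-only resources, a hypothetical Node type), not the capacity scheduler's actual placement code:

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

public class MultiNodeLookupSketch {
    public static class Node {
        final String name;
        final long availableMb;
        public Node(String name, long availableMb) {
            this.name = name;
            this.availableMb = availableMb;
        }
    }

    // Instead of testing a single node, sort the partition's candidates
    // by available resource (descending) and take the first node that
    // can satisfy the request. Returns null if no candidate fits.
    public static Node lookup(List<Node> partitionNodes, long requestMb) {
        List<Node> sorted = new ArrayList<>(partitionNodes);
        sorted.sort(Comparator.comparingLong((Node n) -> n.availableMb)
                              .reversed());
        for (Node n : sorted) {
            if (n.availableMb >= requestMb) {
                return n;
            }
        }
        return null;
    }
}
```

Real multi-node policies can plug in other orderings (e.g. dominant resource, utilization) in place of the descending-memory comparator used here.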
[jira] [Commented] (YARN-8298) Yarn Service Upgrade: Support express upgrade of a service
[ https://issues.apache.org/jira/browse/YARN-8298?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16585607#comment-16585607 ] genericqa commented on YARN-8298: -

-1 overall

|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 0m 35s | Docker mode activated. |
|| || || || Prechecks ||
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
| +1 | test4tests | 0m 0s | The patch appears to include 3 new or modified test files. |
|| || || || trunk Compile Tests ||
| 0 | mvndep | 1m 8s | Maven dependency ordering for branch |
| +1 | mvninstall | 17m 50s | trunk passed |
| +1 | compile | 8m 6s | trunk passed |
| +1 | checkstyle | 1m 9s | trunk passed |
| +1 | mvnsite | 2m 13s | trunk passed |
| +1 | shadedclient | 12m 36s | branch has no errors when building and testing our client artifacts. |
| +1 | findbugs | 3m 14s | trunk passed |
| +1 | javadoc | 1m 37s | trunk passed |
|| || || || Patch Compile Tests ||
| 0 | mvndep | 0m 12s | Maven dependency ordering for patch |
| +1 | mvninstall | 1m 56s | the patch passed |
| +1 | compile | 7m 12s | the patch passed |
| +1 | cc | 7m 12s | the patch passed |
| +1 | javac | 7m 12s | the patch passed |
| -0 | checkstyle | 1m 7s | hadoop-yarn-project/hadoop-yarn: The patch generated 9 new + 136 unchanged - 0 fixed = 145 total (was 136) |
| +1 | mvnsite | 2m 6s | the patch passed |
| -1 | whitespace | 0m 0s | The patch has 3 line(s) that end in whitespace. Use git apply --whitespace=fix <>. Refer https://git-scm.com/docs/git-apply |
| +1 | shadedclient | 9m 52s | patch has no errors when building and testing our client artifacts. |
| +1 | findbugs | 3m 43s | the patch passed |
| -1 | javadoc | 0m 40s | hadoop-yarn-project_hadoop-yarn_hadoop-yarn-common generated 1 new + 4189 unchanged - 0 fixed = 4190 total (was 4189) |
|| || || || Other Tests ||
| +1 | unit | 3m 13s | hadoop-yarn-common in the patch passed. |
| -1 | unit | 24m 17s | hadoop-yarn-client in the patch failed. |
| -1 | unit | 15m 41s | hadoop-yarn-services-core in the patch failed. |
| +1 | unit | 1m 50s | hadoop-yarn-services-api in the patch passed. |
| +1 | asflicense | 0m 33s | The patch does not generate ASF License warnings. |
| | | 120m 17s | |

|| Reason || Tests ||
| Failed junit tests | hadoop.yarn.client.cli.TestYarnCLI |
| | hadoop.yarn.service.TestServiceManager |

|| Subsystem || Report/Notes ||
| Docker |
[jira] [Created] (YARN-8684) Support for setting priority for services in spec file
Rohith Sharma K S created YARN-8684: --- Summary: Support for setting priority for services in spec file Key: YARN-8684 URL: https://issues.apache.org/jira/browse/YARN-8684 Project: Hadoop YARN Issue Type: Improvement Reporter: Rohith Sharma K S The YARN service spec file doesn't allow setting a priority. It would be nice if YARN services allowed setting a priority so that critical services, such as system-services, get higher preference.
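For illustration, a hypothetical service spec fragment showing where such a field might live. The {{priority}} field below is the proposed addition and does not exist in the current API; the surrounding layout follows the YARN service JSON spec, with illustrative values:

```json
{
  "name": "sleeper-service",
  "version": "1.0.0",
  "priority": 10,
  "components": [
    {
      "name": "sleeper",
      "number_of_containers": 2,
      "launch_command": "sleep 900000",
      "resource": {
        "cpus": 1,
        "memory": "256"
      }
    }
  ]
}
```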
[jira] [Updated] (YARN-8683) Support scheduling request for outstanding requests info in RMAppAttemptBlock
[ https://issues.apache.org/jira/browse/YARN-8683?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tao Yang updated YARN-8683: --- Attachment: YARN-8683.001.patch > Support scheduling request for outstanding requests info in RMAppAttemptBlock > - > > Key: YARN-8683 > URL: https://issues.apache.org/jira/browse/YARN-8683 > Project: Hadoop YARN > Issue Type: Bug > Components: webapp >Affects Versions: 3.2.0 >Reporter: Tao Yang >Assignee: Tao Yang >Priority: Major > Attachments: YARN-8683.001.patch > > > Currently, the outstanding requests info in the app attempt page only shows pending > resource requests; pending scheduling requests should be shown here too.
[jira] [Commented] (YARN-8683) Support scheduling request for outstanding requests info in RMAppAttemptBlock
[ https://issues.apache.org/jira/browse/YARN-8683?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16585487#comment-16585487 ] Tao Yang commented on YARN-8683: Attached v1 patch for review. Updates: (1) Total outstanding resource requests info on the app attempt page: added pending scheduling requests and "ExecutionType"/"AllocationTags"/"PlacementConstraint" columns. (2) Removed the redundant fields ("executionType" & "enforceExecutionType"), which are replaced by executionTypeRequest in ResourceRequestInfo. > Support scheduling request for outstanding requests info in RMAppAttemptBlock > - > > Key: YARN-8683 > URL: https://issues.apache.org/jira/browse/YARN-8683 > Project: Hadoop YARN > Issue Type: Bug > Components: webapp >Affects Versions: 3.2.0 >Reporter: Tao Yang >Assignee: Tao Yang >Priority: Major > Attachments: YARN-8683.001.patch > > > Currently, the outstanding requests info in the app attempt page only shows pending > resource requests; pending scheduling requests should be shown here too.
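A simplified, standalone sketch of the consolidation described in update (2): the two separate fields are replaced by a single nested object. The class bodies below are hypothetical illustrations that only mirror the shape of the YARN webapp DAO and {{ExecutionTypeRequest}} API, not the actual hadoop-yarn code:

```java
// Hypothetical sketch: a single executionTypeRequest object replaces the
// redundant top-level "executionType" and "enforceExecutionType" fields.

enum ExecutionType { GUARANTEED, OPPORTUNISTIC }

class ExecutionTypeRequestInfo {
    private final ExecutionType executionType;
    private final boolean enforceExecutionType;

    ExecutionTypeRequestInfo(ExecutionType type, boolean enforce) {
        this.executionType = type;
        this.enforceExecutionType = enforce;
    }

    ExecutionType getExecutionType() { return executionType; }
    boolean getEnforceExecutionType() { return enforceExecutionType; }
}

class ResourceRequestInfo {
    // One nested object instead of two loose fields: both pieces of
    // information travel together and cannot get out of sync.
    private final ExecutionTypeRequestInfo executionTypeRequest;

    ResourceRequestInfo(ExecutionTypeRequestInfo etr) {
        this.executionTypeRequest = etr;
    }

    ExecutionTypeRequestInfo getExecutionTypeRequest() {
        return executionTypeRequest;
    }
}
```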
[jira] [Created] (YARN-8683) Support scheduling request for outstanding requests info in RMAppAttemptBlock
Tao Yang created YARN-8683: -- Summary: Support scheduling request for outstanding requests info in RMAppAttemptBlock Key: YARN-8683 URL: https://issues.apache.org/jira/browse/YARN-8683 Project: Hadoop YARN Issue Type: Bug Components: webapp Affects Versions: 3.2.0 Reporter: Tao Yang Assignee: Tao Yang Currently, the outstanding requests info in the app attempt page only shows pending resource requests; pending scheduling requests should be shown here too.
[jira] [Commented] (YARN-8298) Yarn Service Upgrade: Support express upgrade of a service
[ https://issues.apache.org/jira/browse/YARN-8298?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16585469#comment-16585469 ] Chandni Singh commented on YARN-8298: - Patch 3 adds an {{EXPRESS_UPGRADING}} state that lets the ServiceMaster know it should upgrade all instances of the service. > Yarn Service Upgrade: Support express upgrade of a service > -- > > Key: YARN-8298 > URL: https://issues.apache.org/jira/browse/YARN-8298 > Project: Hadoop YARN > Issue Type: Sub-task >Affects Versions: 3.1.1 >Reporter: Chandni Singh >Assignee: Chandni Singh >Priority: Major > Attachments: YARN-8298.001.patch, YARN-8298.002.patch, > YARN-8298.003.patch > > > Currently, service upgrade involves 2 steps: > * initiate upgrade by providing a new spec > * trigger upgrade of each instance/component > > We need to add the ability to upgrade the service in one shot: > # Aborting the upgrade will not be supported > # Upgrade finalization will be done automatically.
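The express-upgrade flow can be sketched as a minimal state machine: one new state signals "upgrade everything, then finalize automatically", with no per-instance trigger and no abort path. Class and state names below are illustrative stand-ins, not the actual hadoop-yarn-services-core implementation:

```java
import java.util.ArrayList;
import java.util.List;

enum ServiceState { STABLE, UPGRADING, EXPRESS_UPGRADING }

class ComponentInstance {
    String version;
    ComponentInstance(String version) { this.version = version; }
}

class ServiceMaster {
    ServiceState state = ServiceState.STABLE;
    final List<ComponentInstance> instances = new ArrayList<>();

    // Express upgrade: unlike the two-step flow, every instance is
    // upgraded in one shot and finalization is automatic (no abort).
    void expressUpgrade(String newVersion) {
        state = ServiceState.EXPRESS_UPGRADING;
        for (ComponentInstance ci : instances) {
            ci.version = newVersion;
        }
        state = ServiceState.STABLE;  // automatic finalization
    }
}
```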
[jira] [Updated] (YARN-8298) Yarn Service Upgrade: Support express upgrade of a service
[ https://issues.apache.org/jira/browse/YARN-8298?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chandni Singh updated YARN-8298: Attachment: YARN-8298.003.patch > Yarn Service Upgrade: Support express upgrade of a service > -- > > Key: YARN-8298 > URL: https://issues.apache.org/jira/browse/YARN-8298 > Project: Hadoop YARN > Issue Type: Sub-task >Affects Versions: 3.1.1 >Reporter: Chandni Singh >Assignee: Chandni Singh >Priority: Major > Attachments: YARN-8298.001.patch, YARN-8298.002.patch, > YARN-8298.003.patch > > > Currently service upgrade involves 2 steps > * initiate upgrade by providing new spec > * trigger upgrade of each instance/component > > We need to add the ability to upgrade the service in one shot: > # Aborting the upgrade will not be supported > # Upgrade finalization will be done automatically.