[jira] [Updated] (YARN-8690) Currently path not consistent in LocalResourceRequest to yarn 2.7
[ https://issues.apache.org/jira/browse/YARN-8690?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] jialei weng updated YARN-8690:
------------------------------
Description:

With the YARN-1953 change, in YARN 2.9.1 we cannot use a path such as hdfs://hostname/path for local resource allocation, because it is resolved to hdfs://hostname:0/path. We have to add the port explicitly, e.g. hdfs://hostname:443/path, to make it work. This is not a consistent change. Can we make it consistent without requiring customers to change their paths? [~leftnoteasy]

How the resource location path is handled in 2.7:

{code}
public static Path getPathFromYarnURL(URL url) throws URISyntaxException {
  String scheme = url.getScheme() == null ? "" : url.getScheme();

  String authority = "";
  if (url.getHost() != null) {
    authority = url.getHost();
    if (url.getUserInfo() != null) {
      authority = url.getUserInfo() + "@" + authority;
    }
    if (url.getPort() > 0) {
      authority += ":" + url.getPort();
    }
  }

  return new Path(
      (new URI(scheme, authority, url.getFile(), null, null)).normalize());
}
{code}

How the resource location logic is handled in 2.9 (note the port is passed through unconditionally):

{code}
public Path toPath() throws URISyntaxException {
  return new Path(new URI(getScheme(), getUserInfo(), getHost(), getPort(),
      getFile(), null, null));
}
{code}

> Currently path not consistent in LocalResourceRequest to yarn 2.7
> -----------------------------------------------------------------
>
>                 Key: YARN-8690
>                 URL: https://issues.apache.org/jira/browse/YARN-8690
>             Project: Hadoop YARN
>          Issue Type: Bug
>    Affects Versions: 2.9.1
>            Reporter: jialei weng
>            Assignee: Wangda Tan
>            Priority: Major
>

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
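The inconsistency described above comes down to how java.net.URI treats the port: the seven-argument constructor emits any non-negative port literally (so an unset port stored as 0 becomes ":0"), while the 2.7 code only appends the port to the authority when it is positive. A minimal standalone sketch of the two behaviors (class and method names are illustrative, not YARN code):

```java
import java.net.URI;
import java.net.URISyntaxException;

public class PortGuardDemo {

    // 2.7-style conversion: only append the port to the authority when it
    // is positive, so an unset port (stored as 0) is simply dropped.
    static String convert27(String scheme, String host, int port, String file)
            throws URISyntaxException {
        String authority = host;
        if (port > 0) {
            authority += ":" + port;
        }
        return new URI(scheme, authority, file, null, null).normalize().toString();
    }

    // 2.9-style conversion: hand the port straight to the 7-arg URI
    // constructor, which emits any non-negative port literally.
    static String convert29(String scheme, String host, int port, String file)
            throws URISyntaxException {
        return new URI(scheme, null, host, port, file, null, null).toString();
    }

    public static void main(String[] args) throws URISyntaxException {
        // An unset port modeled as 0, as in the bug report:
        System.out.println(convert27("hdfs", "hostname", 0, "/path"));
        System.out.println(convert29("hdfs", "hostname", 0, "/path"));
    }
}
```

Running this prints "hdfs://hostname/path" for the 2.7-style guard and "hdfs://hostname:0/path" for the 2.9-style pass-through, matching the behavior reported in the issue.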
[jira] [Created] (YARN-8690) Currently path not consistent in LocalResourceRequest to yarn 2.7
jialei weng created YARN-8690:
---------------------------------
Summary: Currently path not consistent in LocalResourceRequest to yarn 2.7
Key: YARN-8690
URL: https://issues.apache.org/jira/browse/YARN-8690
Project: Hadoop YARN
Issue Type: Bug
Affects Versions: 2.9.1
Reporter: jialei weng
Assignee: Wangda Tan

With the YARN-1953 change, in YARN 2.9.1 we cannot use a path such as hdfs://hostname/path for local resource allocation, because it is resolved to hdfs://hostname:0/path. We have to add the port explicitly, e.g. hdfs://hostname:443/path, to make it work. This is not a consistent change. Can we make it consistent without requiring customers to change their paths? [~leftnoteasy]
[jira] [Commented] (YARN-8685) Add containers query support for nodes/node REST API in RMWebServices
[ https://issues.apache.org/jira/browse/YARN-8685?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16586902#comment-16586902 ] Tao Yang commented on YARN-8685:
--------------------------------
Hi [~cheersyang], it would be better if we could get these from the NM containers REST API, but the NM only knows about running containers; we have to get the containers that are in the ALLOCATED/ACQUIRED state from the RM. ContainerInfo also differs a lot between the RM and the NM, just like AppInfo. Thoughts?

> Add containers query support for nodes/node REST API in RMWebServices
> ---------------------------------------------------------------------
>
>                 Key: YARN-8685
>                 URL: https://issues.apache.org/jira/browse/YARN-8685
>             Project: Hadoop YARN
>          Issue Type: Improvement
>          Components: restapi
>    Affects Versions: 3.2.0
>            Reporter: Tao Yang
>            Assignee: Tao Yang
>            Priority: Major
>      Attachments: YARN-8685.001.patch
>
> Currently we can only query running containers from the NM containers REST API, but we can't get the valid containers that are in the ALLOCATED/ACQUIRED state. We need to get all containers allocated on specified nodes for debugging. I want to add an "includeContainers" query param (default false) to the nodes/node REST API in RMWebServices, so that we can get the valid containers on nodes when "includeContainers=true" is specified.
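A client would combine the existing RM nodes endpoint with the proposed parameter. A sketch of the request URL a caller might build (the "includeContainers" parameter is the proposal in this JIRA, not in released Hadoop; host and port are placeholders):

```java
public class NodeContainersQuery {

    // Build the URL for the RM nodes/node REST endpoint, opting in to the
    // proposed container listing. rmWebapp is the RM web address (e.g. the
    // default webapp port 8088), nodeId is the NM identifier.
    static String nodeContainersUrl(String rmWebapp, String nodeId) {
        return String.format(
                "http://%s/ws/v1/cluster/nodes/%s?includeContainers=true",
                rmWebapp, nodeId);
    }

    public static void main(String[] args) {
        // Placeholder host names, for illustration only.
        System.out.println(nodeContainersUrl("rm-host:8088", "nm-host:45454"));
    }
}
```

With "includeContainers=true" the response would also carry containers in the ALLOCATED/ACQUIRED state that the NM endpoint cannot report.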
[jira] [Comment Edited] (YARN-8513) CapacityScheduler infinite loop when queue is near fully utilized
[ https://issues.apache.org/jira/browse/YARN-8513?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16586854#comment-16586854 ] Wangda Tan edited comment on YARN-8513 at 8/21/18 3:37 AM:
-----------------------------------------------------------
Interesting, [~cheersyang], I can only think of a reservation allocation causing the issue, but given that we already have the logic below, it should not happen:
{code}
// And it should not be a reserved container
if (assignment.getAssignmentInformation().getNumReservations() > 0) {
  return false;
}
{code}
We should be able to see what kind of allocation causes the issue, or whether CSAssignment indicates an allocation happened when it actually didn't. What is the {{maximum-container-assignments}} setting now?

> CapacityScheduler infinite loop when queue is near fully utilized
> -----------------------------------------------------------------
>
>                 Key: YARN-8513
>                 URL: https://issues.apache.org/jira/browse/YARN-8513
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: capacity scheduler, yarn
>    Affects Versions: 3.1.0, 2.9.1
>        Environment: Ubuntu 14.04.5 and 16.04.4
> YARN is configured with one label and 5 queues.
>            Reporter: Chen Yufei
>            Priority: Major
>      Attachments: jstack-1.log, jstack-2.log, jstack-3.log, jstack-4.log, jstack-5.log, top-during-lock.log, top-when-normal.log, yarn3-jstack1.log, yarn3-jstack2.log, yarn3-jstack3.log, yarn3-jstack4.log, yarn3-jstack5.log, yarn3-resourcemanager.log, yarn3-top
>
> ResourceManager sometimes does not respond to any request when a queue is nearly fully utilized. Sending SIGTERM won't stop the RM; only SIGKILL can. After an RM restart, it can recover running jobs and start accepting new ones.
>
> CapacityScheduler seems to be in an infinite loop printing the following log messages (more than 25,000 lines per second):
>
> {{2018-07-10 17:16:29,227 INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.ParentQueue: assignedContainer queue=root usedCapacity=0.99816763 absoluteUsedCapacity=0.99816763 used= cluster=}}
> {{2018-07-10 17:16:29,227 INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler: Failed to accept allocation proposal}}
> {{2018-07-10 17:16:29,227 INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.allocator.AbstractContainerAllocator: assignedContainer application attempt=appattempt_1530619767030_1652_01 container=null queue=org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.allocator.RegularContainerAllocator@14420943 clusterResource= type=NODE_LOCAL requestedPartition=}}
>
> I have encountered this problem several times after upgrading to YARN 2.9.1, while the same configuration works fine under version 2.7.3.
>
> YARN-4477 is an infinite loop bug in FairScheduler; not sure if this is a similar problem.
[jira] [Commented] (YARN-2497) Fair scheduler should support strict node labels
[ https://issues.apache.org/jira/browse/YARN-2497?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16586856#comment-16586856 ] genericqa commented on YARN-2497:
---------------------------------
| (x) *-1 overall* |

|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 0m 0s | Docker mode activated. |
| -1 | patch | 0m 7s | YARN-2497 does not apply to branch-3.0. Rebase required? Wrong branch? See https://wiki.apache.org/hadoop/HowToContribute for help. |

|| Subsystem || Report/Notes ||
| JIRA Issue | YARN-2497 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12895017/YARN-2497.branch-3.0.001.patch |
| Console output | https://builds.apache.org/job/PreCommit-YARN-Build/21642/console |
| Powered by | Apache Yetus 0.8.0-SNAPSHOT http://yetus.apache.org |

This message was automatically generated.

> Fair scheduler should support strict node labels
> ------------------------------------------------
>
>                 Key: YARN-2497
>                 URL: https://issues.apache.org/jira/browse/YARN-2497
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>          Components: fairscheduler
>            Reporter: Wangda Tan
>            Assignee: Daniel Templeton
>            Priority: Major
>      Attachments: YARN-2497.001.patch, YARN-2497.002.patch, YARN-2497.003.patch, YARN-2497.004.patch, YARN-2497.005.patch, YARN-2497.006.patch, YARN-2497.007.patch, YARN-2497.008.patch, YARN-2497.009.patch, YARN-2497.010.patch, YARN-2497.011.patch, YARN-2497.branch-3.0.001.patch, YARN-2499.WIP01.patch
>
[jira] [Commented] (YARN-8513) CapacityScheduler infinite loop when queue is near fully utilized
[ https://issues.apache.org/jira/browse/YARN-8513?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16586854#comment-16586854 ] Wangda Tan commented on YARN-8513:
----------------------------------
Interesting, [~cheersyang], I can only think of a reservation allocation causing the issue:
{code}
// And it should not be a reserved container
if (assignment.getAssignmentInformation().getNumReservations() > 0) {
  return false;
}
{code}
We should be able to see what kind of allocation causes the issue, or whether CSAssignment indicates an allocation happened when it actually didn't. What is the {{maximum-container-assignments}} setting now?

> CapacityScheduler infinite loop when queue is near fully utilized
> -----------------------------------------------------------------
>
>                 Key: YARN-8513
>                 URL: https://issues.apache.org/jira/browse/YARN-8513
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: capacity scheduler, yarn
>    Affects Versions: 3.1.0, 2.9.1
>            Reporter: Chen Yufei
>            Priority: Major
>
[jira] [Created] (YARN-8689) Support partition fairness in fair scheduler strict node label
zhuqi created YARN-8689:
---------------------------
Summary: Support partition fairness in fair scheduler strict node label
Key: YARN-8689
URL: https://issues.apache.org/jira/browse/YARN-8689
Project: Hadoop YARN
Issue Type: Task
Components: fairscheduler
Reporter: zhuqi
[jira] [Commented] (YARN-8513) CapacityScheduler infinite loop when queue is near fully utilized
[ https://issues.apache.org/jira/browse/YARN-8513?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16586811#comment-16586811 ] Weiwei Yang commented on YARN-8513:
-----------------------------------
Discussed with [~cyfdecyf] in the Slack channel; it looks like this issue was caused by the greedy container-assignments-per-heartbeat mechanism: for some reason it never stops trying to assign new containers for this particular node in a while loop. I suggested adding the following config as a work-around:
{noformat}
yarn.scheduler.capacity.per-node-heartbeat.multiple-assignments-enable=true
yarn.scheduler.capacity.per-node-heartbeat.maximum-container-assignments=10
{noformat}
In the meantime, [~cyfdecyf] is trying to apply this config change to their cluster to see if it helps, and I am trying to reproduce the issue locally.

> CapacityScheduler infinite loop when queue is near fully utilized
> -----------------------------------------------------------------
>
>                 Key: YARN-8513
>                 URL: https://issues.apache.org/jira/browse/YARN-8513
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: capacity scheduler, yarn
>    Affects Versions: 3.1.0, 2.9.1
>            Reporter: Chen Yufei
>            Priority: Major
>
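The work-around above bounds the greedy per-heartbeat assignment loop. A simplified model of why the cap helps (not CapacityScheduler code; names are illustrative): without a limit, the loop only exits when a round assigns nothing, so an allocation that is repeatedly proposed and then rejected can spin forever.

```java
import java.util.function.BooleanSupplier;

public class HeartbeatAssignDemo {

    // Try to assign containers on one node heartbeat. tryAssign returns true
    // when a round claims to have produced an allocation. maxAssignments
    // models the maximum-container-assignments setting suggested above;
    // a value <= 0 here stands in for "unbounded".
    static int assignOnHeartbeat(BooleanSupplier tryAssign, int maxAssignments) {
        int assigned = 0;
        while (maxAssignments <= 0 || assigned < maxAssignments) {
            if (!tryAssign.getAsBoolean()) {
                break; // nothing allocated this round: stop
            }
            assigned++;
        }
        return assigned;
    }

    public static void main(String[] args) {
        // A scheduler that always claims success would never exit the loop
        // when unbounded; with a cap of 10 it stops after 10 rounds.
        System.out.println(assignOnHeartbeat(() -> true, 10));
    }
}
```

This mirrors the reported symptom: if CSAssignment claims an allocation happened but the proposal is then rejected each round, the "assigned something, keep going" condition never turns false, and only the cap breaks the loop.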
[jira] [Created] (YARN-8688) Duplicate queue names in fair scheduler allocation file
Shen Yinjie created YARN-8688:
---------------------------------
Summary: Duplicate queue names in fair scheduler allocation file
Key: YARN-8688
URL: https://issues.apache.org/jira/browse/YARN-8688
Project: Hadoop YARN
Issue Type: Improvement
Affects Versions: 3.1.0, 2.8.2
Reporter: Shen Yinjie

When duplicate queue names are configured in the fair scheduler allocation file, the RM cannot recognize the error, even after an RM restart.
[jira] [Commented] (YARN-8569) Create an interface to provide cluster information to application
[ https://issues.apache.org/jira/browse/YARN-8569?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16586769#comment-16586769 ] Eric Yang commented on YARN-8569:
---------------------------------
First patch to demo what the interface looks like. Here is my test JSON for launching the app:
{code}
{
  "name": "sleeper-service",
  "version": "1.0",
  "components": [
    {
      "name": "ping",
      "number_of_containers": 2,
      "artifact": {
        "id": "hadoop/centos:latest",
        "type": "DOCKER"
      },
      "launch_command": "sleep,1",
      "resource": {
        "cpus": 1,
        "memory": "256"
      },
      "restart_policy": "NEVER",
      "configuration": {
        "env": {
          "YARN_CONTAINER_RUNTIME_DOCKER_DELAYED_REMOVAL": "true",
          "YARN_CONTAINER_RUNTIME_DOCKER_RUN_OVERRIDE_DISABLE": "true",
          "YARN_CONTAINER_RUNTIME_YARN_SYSFS": "true"
        },
        "properties": {
          "docker.network": "host"
        }
      }
    }
  ]
}
{code}
The patch localizes a copy of service.json in HDFS and distributes it as a localized resource to the nodes that are going to start the container. The other copy, [appname].json, is not used because the file's content changes so frequently that it triggers an IOException during localization. Let me know if this is the direction we want to continue in. If all looks good, I will provide live updates to this file when the application transitions between STARTED, FLEXING, STABLE, etc.

> Create an interface to provide cluster information to application
> -----------------------------------------------------------------
>
>                 Key: YARN-8569
>                 URL: https://issues.apache.org/jira/browse/YARN-8569
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>            Reporter: Eric Yang
>            Assignee: Eric Yang
>            Priority: Major
>              Labels: Docker
>      Attachments: YARN-8569.001.patch
>
> Some programs require container hostnames to be known for the application to run.
> For example, distributed tensorflow requires a launch_command that looks like:
> {code}
> # On ps0.example.com:
> $ python trainer.py \
> --ps_hosts=ps0.example.com:,ps1.example.com: \
> --worker_hosts=worker0.example.com:,worker1.example.com: \
> --job_name=ps --task_index=0
> # On ps1.example.com:
> $ python trainer.py \
> --ps_hosts=ps0.example.com:,ps1.example.com: \
> --worker_hosts=worker0.example.com:,worker1.example.com: \
> --job_name=ps --task_index=1
> # On worker0.example.com:
> $ python trainer.py \
> --ps_hosts=ps0.example.com:,ps1.example.com: \
> --worker_hosts=worker0.example.com:,worker1.example.com: \
> --job_name=worker --task_index=0
> # On worker1.example.com:
> $ python trainer.py \
> --ps_hosts=ps0.example.com:,ps1.example.com: \
> --worker_hosts=worker0.example.com:,worker1.example.com: \
> --job_name=worker --task_index=1
> {code}
> This is a bit cumbersome to orchestrate via Distributed Shell or YARN Services launch_command. In addition, the dynamic parameters do not work with the YARN flex command. This is the classic pain point for application developers attempting to automate system environment settings as parameters to the end-user application.
> It would be great if the YARN Docker integration could provide a simple option to expose the hostnames of the YARN service via a mounted file. The file content gets updated when a flex command is performed. This allows application developers to consume system environment settings via a standard interface. It is like /proc/devices for Linux, but for Hadoop. This may involve updating a file in the distributed cache and allowing the file to be mounted via container-executor.
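As a sketch of how an application could consume the proposed cluster information, a container entrypoint might derive the tensorflow-style host-list arguments from the component hostnames. Everything here is an assumption for illustration: the hostnames would come from the mounted file this JIRA proposes, and port 2222 is an arbitrary example value (the ports are elided in the launch commands above).

```java
import java.util.List;

public class ClusterInfoDemo {

    // Build a "host:port,host:port,..." argument from component hostnames.
    // The hostnames would be read from the mounted cluster-info file
    // proposed in this JIRA; the port is a caller-supplied example value.
    static String hostsArg(List<String> hosts, int port) {
        StringBuilder sb = new StringBuilder();
        for (String h : hosts) {
            if (sb.length() > 0) {
                sb.append(',');
            }
            sb.append(h).append(':').append(port);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Example worker component with two containers:
        System.out.println(
            hostsArg(List.of("worker0.example.com", "worker1.example.com"), 2222));
    }
}
```

With such a helper, the --ps_hosts and --worker_hosts parameters could be regenerated from the mounted file each time a flex command changes the container set, instead of being hard-coded per node.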
[jira] [Updated] (YARN-8569) Create an interface to provide cluster information to application
[ https://issues.apache.org/jira/browse/YARN-8569?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eric Yang updated YARN-8569:
----------------------------
Attachment: YARN-8569.001.patch

> Create an interface to provide cluster information to application
> -----------------------------------------------------------------
>
>                 Key: YARN-8569
>                 URL: https://issues.apache.org/jira/browse/YARN-8569
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>            Reporter: Eric Yang
>            Assignee: Eric Yang
>            Priority: Major
>              Labels: Docker
>      Attachments: YARN-8569.001.patch
>
[jira] [Commented] (YARN-8509) Total pending resource calculation in preemption should use user-limit factor instead of minimum-user-limit-percent
[ https://issues.apache.org/jira/browse/YARN-8509?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16586721#comment-16586721 ] genericqa commented on YARN-8509:
---------------------------------
| (x) *-1 overall* |

|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 0m 39s | Docker mode activated. |
|| Prechecks ||
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
| +1 | test4tests | 0m 0s | The patch appears to include 6 new or modified test files. |
|| trunk Compile Tests ||
| +1 | mvninstall | 21m 40s | trunk passed |
| +1 | compile | 0m 57s | trunk passed |
| +1 | checkstyle | 0m 56s | trunk passed |
| +1 | mvnsite | 0m 53s | trunk passed |
| +1 | shadedclient | 13m 27s | branch has no errors when building and testing our client artifacts. |
| +1 | findbugs | 1m 15s | trunk passed |
| +1 | javadoc | 0m 34s | trunk passed |
|| Patch Compile Tests ||
| +1 | mvninstall | 0m 54s | the patch passed |
| +1 | compile | 0m 50s | the patch passed |
| +1 | javac | 0m 50s | the patch passed |
| -0 | checkstyle | 0m 52s | hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager: The patch generated 13 new + 949 unchanged - 5 fixed = 962 total (was 954) |
| +1 | mvnsite | 0m 46s | the patch passed |
| +1 | whitespace | 0m 0s | The patch has no whitespace issues. |
| +1 | shadedclient | 12m 43s | patch has no errors when building and testing our client artifacts. |
| +1 | findbugs | 1m 17s | the patch passed |
| +1 | javadoc | 0m 26s | the patch passed |
|| Other Tests ||
| -1 | unit | 69m 6s | hadoop-yarn-server-resourcemanager in the patch failed. |
| +1 | asflicense | 0m 25s | The patch does not generate ASF License warnings. |
| | | 127m 7s | |

|| Reason || Tests ||
| Failed junit tests | hadoop.yarn.server.resourcemanager.scheduler.capacity.TestApplicationLimits |
| | hadoop.yarn.server.resourcemanager.monitor.capacity.TestProportionalCapacityPreemptionPolicy |
| | hadoop.yarn.server.resourcemanager.scheduler.capacity.TestApplicationLimitsByPartition |
| | hadoop.yarn.server.resourcemanager.reservation.TestCapacityOverTimePolicy |
| | hadoop.yarn.server.resourcemanager.scheduler.capacity.TestNodeLabelContainerAllocation |
| | hadoop.yarn.server.resourcemanager.scheduler.capacity.TestCapacityScheduler |

|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:ba1ab08 |
| JIRA Issue | YARN-8509 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12936340/YARN-8509.004.patch |
| Optional Tests | asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle |
| uname | Linux c057753f2232 3.13.0-153-generic #203-Ubuntu SMP Thu Jun 14 08:52:28 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality |
[jira] [Commented] (YARN-8298) Yarn Service Upgrade: Support express upgrade of a service
[ https://issues.apache.org/jira/browse/YARN-8298?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16586667#comment-16586667 ] genericqa commented on YARN-8298: - | (/) *{color:green}+1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 20s{color} | {color:blue} Docker mode activated. {color} | || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 5 new or modified test files. {color} | || || || || {color:brown} trunk Compile Tests {color} || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 14s{color} | {color:blue} Maven dependency ordering for branch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 17m 27s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 8m 42s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 1m 11s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 2m 23s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 14m 7s{color} | {color:green} branch has no errors when building and testing our client artifacts. 
{color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 3m 21s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 2m 9s{color} | {color:green} trunk passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 13s{color} | {color:blue} Maven dependency ordering for patch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 2m 3s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 7m 34s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} cc {color} | {color:green} 7m 34s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 7m 34s{color} | {color:green} the patch passed {color} | | {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange} 1m 22s{color} | {color:orange} hadoop-yarn-project/hadoop-yarn: The patch generated 6 new + 387 unchanged - 2 fixed = 393 total (was 389) {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 2m 21s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 11m 34s{color} | {color:green} patch has no errors when building and testing our client artifacts. 
{color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 4m 12s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 2m 7s{color} | {color:green} the patch passed {color} | || || || || {color:brown} Other Tests {color} || | {color:green}+1{color} | {color:green} unit {color} | {color:green} 3m 32s{color} | {color:green} hadoop-yarn-common in the patch passed. {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 25m 13s{color} | {color:green} hadoop-yarn-client in the patch passed. {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 12m 53s{color} | {color:green} hadoop-yarn-services-core in the patch passed. {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 1m 56s{color} | {color:green} hadoop-yarn-services-api in the patch passed. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 39s{color} | {color:green} The patch does not generate ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black}124m 28s{color} | {color:black} {color} | \\ \\ || Subsystem || Report/Notes || | Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:ba1ab08 | | JIRA Issue | YARN-8298 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12936332/YARN-8298.005.patch | | Optional Tests | asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs
[jira] [Commented] (YARN-3611) Support Docker Containers In LinuxContainerExecutor
[ https://issues.apache.org/jira/browse/YARN-3611?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16586630#comment-16586630 ] Shane Kumpf commented on YARN-3611: --- [~zhouyunfan] - thank you for your interest! Please see the [YARN containerization docs|https://github.com/apache/hadoop/blob/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site/src/site/markdown/DockerContainers.md] as a starting point. If you have specific questions after that, please do reach out on the hadoop-user mailing list. > Support Docker Containers In LinuxContainerExecutor > --- > > Key: YARN-3611 > URL: https://issues.apache.org/jira/browse/YARN-3611 > Project: Hadoop YARN > Issue Type: Bug > Components: yarn >Reporter: Sidharta Seethana >Assignee: Sidharta Seethana >Priority: Major > Labels: Docker > > Support Docker Containers In LinuxContainerExecutor > LinuxContainerExecutor provides useful functionality today with respect to > localization, cgroups based resource management and isolation for CPU, > network, disk etc. as well as security with a well-defined mechanism to > execute privileged operations using the container-executor utility. Bringing > docker support to LinuxContainerExecutor lets us use all of this > functionality when running docker containers under YARN, while not requiring > users and admins to configure and use a different ContainerExecutor. > There are several aspects here that need to be worked through : > * Mechanism(s) to let clients request docker-specific functionality - we > could initially implement this via environment variables without impacting > the client API. 
> * Security - both docker daemon as well as application > * Docker image localization > * Running a docker container via container-executor as a specified user > * “Isolate” the docker container in terms of CPU/network/disk/etc > * Communicating with and/or signaling the running container (ensure correct > pid handling) > * Figure out workarounds for certain performance-sensitive scenarios like > HDFS short-circuit reads > * All of these need to be achieved without changing the current behavior of > LinuxContainerExecutor -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
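The environment-variable mechanism mentioned in the first bullet above can be sketched as follows. This is only an illustration of the idea (the `DockerEnvExample` class name is invented); the `YARN_CONTAINER_RUNTIME_*` variable names are the ones used in the MR pi example later in this thread:

```java
import java.util.HashMap;
import java.util.Map;

public class DockerEnvExample {
    // Builds the container environment a client would set to request the
    // Docker runtime from LinuxContainerExecutor, without any client API change.
    public static Map<String, String> dockerRuntimeEnv(String image) {
        Map<String, String> env = new HashMap<>();
        // Select the Docker container runtime
        env.put("YARN_CONTAINER_RUNTIME_TYPE", "docker");
        // Image to launch the container from
        env.put("YARN_CONTAINER_RUNTIME_DOCKER_IMAGE", image);
        return env;
    }
}
```

A client would merge this map into the container launch environment it already submits.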
[jira] [Comment Edited] (YARN-8623) Update Docker examples to use image which exists
[ https://issues.apache.org/jira/browse/YARN-8623?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16586592#comment-16586592 ] Shane Kumpf edited comment on YARN-8623 at 8/20/18 10:34 PM: - [~elek] - thanks, those details are helpful. It does appear _apache/hadoop-runner_ is closer to what we want than I originally thought, but the user setup clashes with our needs. With a goal of trying to provide a working MR pi example, MapReduce expects to run (and write data) as the end user (or a static local user, such as nobody, depending on config), so we need to propagate the user identity into the container. I expect Spark needs this as well. Removing the use of sudo in the entrypoint script, gating that {{sudo chmod}} in the starter script via an env variable, or opening up the sudo rules would all seem to work to allow us to use this for YARN as well. I think we should open a separate HADOOP Jira to discuss making the image work for both cases if that makes sense to others. [~elek] [~ccondit-target] thoughts? was (Author: shaneku...@gmail.com): [~elek] - thanks, those details are helpful. It does appear _apache/hadoop-runner_ is closer to what we want than I originally thought, but the user setup clashes with our needs. With a goal of trying to provide a working MR pi example, MapReduce expects to run (and write data) as the end user (or a static local user, such as nobody, depending on config). I expect Spark does as well. Removing the use of sudo in the entrypoint script, gating that {{sudo chmod}} in the starter script via an env variable, or opening up the sudo rules would all seem to work to allow us to use this for YARN as well. I think we should open a separate HADOOP Jira to discuss making the image work for both cases if that makes sense to others. [~elek] [~ccondit-target] thoughts? 
> Update Docker examples to use image which exists > > > Key: YARN-8623 > URL: https://issues.apache.org/jira/browse/YARN-8623 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Craig Condit >Priority: Minor > Labels: Docker > > The example Docker image given in the documentation > (images/hadoop-docker:latest) does not exist. We could change > images/hadoop-docker:latest to apache/hadoop-runner:latest, which does exist. > We'd need to do a quick sanity test to see if the image works with YARN.
[jira] [Commented] (YARN-8623) Update Docker examples to use image which exists
[ https://issues.apache.org/jira/browse/YARN-8623?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16586601#comment-16586601 ] Shane Kumpf commented on YARN-8623: --- I was able to run the below MR pi job with the modified _apache/hadoop-runner_ image, after a quick hack to the sudo rules.
{code:java}
YARN_EXAMPLES_JAR=$HADOOP_HOME/share/hadoop/mapreduce/hadoop-mapreduce-examples-*.jar
IMAGE_ID="local/hadoop-runner-new:latest"
MOUNTS="/usr/local/hadoop:/usr/local/hadoop:ro,/etc/hadoop/conf:/etc/hadoop/conf:ro,/etc/passwd:/etc/passwd:ro,/etc/group:/etc/group:ro"
yarn jar $YARN_EXAMPLES_JAR pi \
  -Dmapreduce.map.env.YARN_CONTAINER_RUNTIME_TYPE=docker \
  -Dmapreduce.map.env.YARN_CONTAINER_RUNTIME_DOCKER_MOUNTS=$MOUNTS \
  -Dmapreduce.map.env.YARN_CONTAINER_RUNTIME_DOCKER_IMAGE=$IMAGE_ID \
  -Dmapreduce.reduce.env.YARN_CONTAINER_RUNTIME_TYPE=docker \
  -Dmapreduce.reduce.env.YARN_CONTAINER_RUNTIME_DOCKER_MOUNTS=$MOUNTS \
  -Dmapreduce.reduce.env.YARN_CONTAINER_RUNTIME_DOCKER_IMAGE=$IMAGE_ID 1 4{code}
Hadoop bits were installed to {{/usr/local/hadoop}} on the host, and the Hadoop config is in {{/etc/hadoop/conf}} on the host. The appropriate mounts were added to {{docker.allowed.ro-mounts}} and the image prefix to {{docker.trusted.registries}} in {{container-executor.cfg}}. The above assumes the use of {{/etc/passwd}} and {{/etc/group}} for propagating the user and group into the container. We should point to the other ways of [managing user propagation|https://github.com/apache/hadoop/blob/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site/src/site/markdown/DockerContainers.md#user-management-in-docker-container] as part of this example documentation. 
> Update Docker examples to use image which exists > > > Key: YARN-8623 > URL: https://issues.apache.org/jira/browse/YARN-8623 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Craig Condit >Priority: Minor > Labels: Docker > > The example Docker image given in the documentation > (images/hadoop-docker:latest) does not exist. We could change > images/hadoop-docker:latest to apache/hadoop-runner:latest, which does exist. > We'd need to do a quick sanity test to see if the image works with YARN.
[jira] [Updated] (YARN-8509) Total pending resource calculation in preemption should use user-limit factor instead of minimum-user-limit-percent
[ https://issues.apache.org/jira/browse/YARN-8509?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zian Chen updated YARN-8509: Attachment: YARN-8509.004.patch > Total pending resource calculation in preemption should use user-limit factor > instead of minimum-user-limit-percent > --- > > Key: YARN-8509 > URL: https://issues.apache.org/jira/browse/YARN-8509 > Project: Hadoop YARN > Issue Type: Bug > Components: yarn >Reporter: Zian Chen >Assignee: Zian Chen >Priority: Major > Labels: capacityscheduler > Attachments: YARN-8509.001.patch, YARN-8509.002.patch, > YARN-8509.003.patch, YARN-8509.004.patch > > > In LeafQueue#getTotalPendingResourcesConsideringUserLimit, we calculate total > pending resources based on the user-limit percent and user-limit factor, which > caps pending resources for each user at the minimum of the user-limit pending and > the actual pending. This prevents the queue from taking more pending resources to > achieve queue balance after all queues are satisfied with their ideal allocation. > > We need to change the logic to let queue pending go beyond the user limit.
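The capping behavior described in the issue above can be sketched as below. This is a hedged illustration of the min(user-limit pending, actual pending) logic, not the actual LeafQueue code; the class and method names are invented:

```java
public class PendingCapSketch {
    // Caps a user's contribution to the queue's total pending resource at the
    // user's remaining headroom under the user limit, i.e.
    // min(actual pending, user-limit pending).
    public static long cappedPending(long actualPending, long userLimit, long userUsed) {
        long headroom = Math.max(0, userLimit - userUsed);
        return Math.min(actualPending, headroom);
    }
}
```

With a cap like this, a user with large actual pending contributes only its headroom, which is why queue pending cannot grow beyond the user limit.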
[jira] [Commented] (YARN-8623) Update Docker examples to use image which exists
[ https://issues.apache.org/jira/browse/YARN-8623?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16586592#comment-16586592 ] Shane Kumpf commented on YARN-8623: --- [~elek] - thanks, those details are helpful. It does appear _apache/hadoop-runner_ is closer to what we want than I originally thought, but the user setup clashes with our needs. With a goal of trying to provide a working MR pi example, MapReduce expects to run (and write data) as the end user (or a static local user, such as nobody, depending on config). I expect Spark does as well. Removing the use of sudo in the entrypoint script, gating that {{sudo chmod}} in the starter script via an env variable, or opening up the sudo rules would all seem to work to allow us to use this for YARN as well. I think we should open a separate HADOOP Jira to discuss making the image work for both cases if that makes sense to others. [~elek] [~ccondit-target] thoughts? > Update Docker examples to use image which exists > > > Key: YARN-8623 > URL: https://issues.apache.org/jira/browse/YARN-8623 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Craig Condit >Priority: Minor > Labels: Docker > > The example Docker image given in the documentation > (images/hadoop-docker:latest) does not exist. We could change > images/hadoop-docker:latest to apache/hadoop-runner:latest, which does exist. > We'd need to do a quick sanity test to see if the image works with YARN.
[jira] [Commented] (YARN-8673) [AMRMProxy] More robust responseId resync after an YarnRM master slave switch
[ https://issues.apache.org/jira/browse/YARN-8673?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16586591#comment-16586591 ] genericqa commented on YARN-8673: - | (/) *{color:green}+1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 16s{color} | {color:blue} Docker mode activated. {color} | || || || || {color:brown} Prechecks {color} || | {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue} 0m 0s{color} | {color:blue} Findbugs executables are not available. {color} | | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 1 new or modified test files. {color} | || || || || {color:brown} branch-2 Compile Tests {color} || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 26s{color} | {color:blue} Maven dependency ordering for branch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 14m 45s{color} | {color:green} branch-2 passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 7m 1s{color} | {color:green} branch-2 passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 54s{color} | {color:green} branch-2 passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 3m 3s{color} | {color:green} branch-2 passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 2m 24s{color} | {color:green} branch-2 passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 12s{color} | {color:blue} Maven dependency ordering for patch {color} | | 
{color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 2m 24s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 6m 13s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 6m 13s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 51s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 2m 53s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 2m 11s{color} | {color:green} the patch passed {color} | || || || || {color:brown} Other Tests {color} || | {color:green}+1{color} | {color:green} unit {color} | {color:green} 3m 16s{color} | {color:green} hadoop-yarn-common in the patch passed. {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 2m 24s{color} | {color:green} hadoop-yarn-server-common in the patch passed. {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 15m 8s{color} | {color:green} hadoop-yarn-server-nodemanager in the patch passed. {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 59m 52s{color} | {color:green} hadoop-yarn-server-resourcemanager in the patch passed. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 39s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} | | {color:black}{color} | {color:black} {color} | {color:black}126m 52s{color} | {color:black} {color} | \\ \\ || Subsystem || Report/Notes || | Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:a716388 | | JIRA Issue | YARN-8673 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12936319/YARN-8673-branch-2.v2.patch | | Optional Tests | asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle | | uname | Linux 3892fdc4613e 3.13.0-143-generic #192-Ubuntu SMP Tue Feb 27 10:45:36 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | /testptch/patchprocess/precommit/personality/provided.sh | | git revision | branch-2 / 18ebe18 | | maven | version: Apache Maven 3.3.9 (bb52d8502b132ec0a5a3f4c09453c07478323dc5; 2015-11-10T16:41:47+00:00) | | Default Java | 1.7.0_181 | | Test Results | https://builds.apache.org/job/PreCommit-YARN-Build/21638/testReport/ | | Max. process+thread count | 788 (vs. ulimit of 1) | | modules | C: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common
[jira] [Updated] (YARN-8599) Build Master module for MaWo app
[ https://issues.apache.org/jira/browse/YARN-8599?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yesha Vora updated YARN-8599: - Attachment: YARN-8599.001.patch > Build Master module for MaWo app > > > Key: YARN-8599 > URL: https://issues.apache.org/jira/browse/YARN-8599 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Yesha Vora >Assignee: Yesha Vora >Priority: Major > Attachments: YARN-8599.001.patch > > > The Master component of the MaWo application is responsible for driving end-to-end > job execution. Its responsibilities are: > * Get the Job definition and create a Queue of Tasks > * Assign Tasks to Workers > * Manage the Worker lifecycle
[jira] [Updated] (YARN-8598) Build Master Job Module for MaWo Application
[ https://issues.apache.org/jira/browse/YARN-8598?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yesha Vora updated YARN-8598: - Attachment: YARN-8598.001.patch > Build Master Job Module for MaWo Application > > > Key: YARN-8598 > URL: https://issues.apache.org/jira/browse/YARN-8598 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Yesha Vora >Assignee: Yesha Vora >Priority: Major > Attachments: YARN-8598.001.patch > > > A job in the MaWo application is a collection of Tasks. A Job consists of a setup > task, a list of tasks and a teardown task. > * JobBuilder > ** SimpleTaskJobBuilder: SimpleJobBuilder should be able to parse a simple > job description file. In this file format, each line is considered a Task. > ** SimpleTaskJsonJobBuilder: Utility to parse a JSON job description file.
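The "each line is a Task" format described above could be parsed as sketched below. This is purely illustrative (the MaWo patch is not public here, and the class name is invented):

```java
import java.util.ArrayList;
import java.util.List;

public class SimpleJobParseSketch {
    // Each non-empty line of the simple job description becomes one task command.
    public static List<String> parseTasks(String description) {
        List<String> tasks = new ArrayList<>();
        for (String line : description.split("\n")) {
            String trimmed = line.trim();
            if (!trimmed.isEmpty()) {
                tasks.add(trimmed);
            }
        }
        return tasks;
    }
}
```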
[jira] [Updated] (YARN-8597) Build Worker utility for MaWo Application
[ https://issues.apache.org/jira/browse/YARN-8597?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yesha Vora updated YARN-8597: - Attachment: YARN-8597.001.patch > Build Worker utility for MaWo Application > - > > Key: YARN-8597 > URL: https://issues.apache.org/jira/browse/YARN-8597 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Yesha Vora >Assignee: Yesha Vora >Priority: Major > Attachments: YARN-8597.001.patch > > > The worker is responsible for executing Tasks. > * Worker > ** Create a worker class which drives the worker life cycle > ** Create the WorkAssignment Protocol. It should handle register/deregister of > workers and sending heartbeats > ** Lifecycle: Register worker, Run Setup Task, Get Task from master and > execute it using TaskRunner, Run Teardown Task > * TaskRunner > ** Simple Task Runner: This runner should be able to execute a simple task > ** Composite Task Runner: This runner should be able to execute a composite > task > * TaskWallTimeLimiter > ** Create a utility which can abort the task if the execution time exceeds > the task timeout.
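The TaskWallTimeLimiter utility described above is commonly built on a bounded `Future.get`. A minimal sketch, assuming interruption is an acceptable abort mechanism (names invented, not the actual MaWo code):

```java
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

public class WallTimeLimiterSketch {
    // Runs the task; returns true if it finishes within timeoutMillis,
    // otherwise interrupts it and returns false.
    public static boolean runWithTimeout(Runnable task, long timeoutMillis)
            throws InterruptedException {
        ExecutorService pool = Executors.newSingleThreadExecutor();
        Future<?> future = pool.submit(task);
        try {
            future.get(timeoutMillis, TimeUnit.MILLISECONDS);
            return true;
        } catch (TimeoutException e) {
            future.cancel(true); // interrupt the overrunning task
            return false;
        } catch (ExecutionException e) {
            return false; // the task threw; treat as failed
        } finally {
            pool.shutdownNow();
        }
    }
}
```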
[jira] [Updated] (YARN-8597) Build Worker utility for MaWo Application
[ https://issues.apache.org/jira/browse/YARN-8597?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yesha Vora updated YARN-8597: - Attachment: (was: YARN-8597.001.patch) > Build Worker utility for MaWo Application > - > > Key: YARN-8597 > URL: https://issues.apache.org/jira/browse/YARN-8597 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Yesha Vora >Assignee: Yesha Vora >Priority: Major > Attachments: YARN-8597.001.patch > > > The worker is responsible for executing Tasks. > * Worker > ** Create a worker class which drives the worker life cycle > ** Create the WorkAssignment Protocol. It should handle register/deregister of > workers and sending heartbeats > ** Lifecycle: Register worker, Run Setup Task, Get Task from master and > execute it using TaskRunner, Run Teardown Task > * TaskRunner > ** Simple Task Runner: This runner should be able to execute a simple task > ** Composite Task Runner: This runner should be able to execute a composite > task > * TaskWallTimeLimiter > ** Create a utility which can abort the task if the execution time exceeds > the task timeout.
[jira] [Commented] (YARN-8581) [AMRMProxy] Add sub-cluster timeout in LocalityMulticastAMRMProxyPolicy
[ https://issues.apache.org/jira/browse/YARN-8581?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16586549#comment-16586549 ] Giovanni Matteo Fumarola commented on YARN-8581: Thanks [~botong] . Committed to trunk. > [AMRMProxy] Add sub-cluster timeout in LocalityMulticastAMRMProxyPolicy > --- > > Key: YARN-8581 > URL: https://issues.apache.org/jira/browse/YARN-8581 > Project: Hadoop YARN > Issue Type: Sub-task > Components: amrmproxy, federation >Reporter: Botong Huang >Assignee: Botong Huang >Priority: Major > Attachments: YARN-8581-branch-2.v2.patch, YARN-8581.v1.patch, > YARN-8581.v2.patch > > > In Federation, every time an AM heartbeat comes in, > LocalityMulticastAMRMProxyPolicy in AMRMProxy splits the asks according to > the list of active and enabled sub-clusters. However, if we haven't been able > to heartbeat to a sub-cluster for some time (network issues, or we keep > hitting some exception from YarnRM, or YarnRM master-slave switch is taking a > long time etc.), we should consider the sub-cluster as unhealthy and stop > routing asks there, until the heartbeat channel becomes healthy again.
[jira] [Commented] (YARN-8581) [AMRMProxy] Add sub-cluster timeout in LocalityMulticastAMRMProxyPolicy
[ https://issues.apache.org/jira/browse/YARN-8581?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16586548#comment-16586548 ] Hudson commented on YARN-8581: -- FAILURE: Integrated in Jenkins build Hadoop-trunk-Commit #14806 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/14806/]) YARN-8581. [AMRMProxy] Add sub-cluster timeout in (gifuma: rev e0f6ffdbad6f43fd43ec57fb68ebf5275b8b9ba0) * (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/test/java/org/apache/hadoop/yarn/conf/TestYarnConfigurationFields.java * (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-common/src/main/java/org/apache/hadoop/yarn/server/federation/utils/FederationStateStoreFacade.java * (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-common/src/test/java/org/apache/hadoop/yarn/server/federation/utils/FederationPoliciesTestUtil.java * (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-common/src/test/java/org/apache/hadoop/yarn/server/federation/policies/amrmproxy/TestLocalityMulticastAMRMProxyPolicy.java * (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-common/src/main/java/org/apache/hadoop/yarn/server/federation/policies/amrmproxy/LocalityMulticastAMRMProxyPolicy.java * (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/java/org/apache/hadoop/yarn/conf/YarnConfiguration.java > [AMRMProxy] Add sub-cluster timeout in LocalityMulticastAMRMProxyPolicy > --- > > Key: YARN-8581 > URL: https://issues.apache.org/jira/browse/YARN-8581 > Project: Hadoop YARN > Issue Type: Sub-task > Components: amrmproxy, federation >Reporter: Botong Huang >Assignee: Botong Huang >Priority: Major > Attachments: YARN-8581-branch-2.v2.patch, YARN-8581.v1.patch, > YARN-8581.v2.patch > > > In Federation, every time an AM heartbeat comes in, > LocalityMulticastAMRMProxyPolicy in AMRMProxy splits the asks according to > the list of active and enabled sub-clusters. 
However, if we haven't been able > to heartbeat to a sub-cluster for some time (network issues, or we keep > hitting some exception from YarnRM, or YarnRM master-slave switch is taking a > long time etc.), we should consider the sub-cluster as unhealthy and stop > routing asks there, until the heartbeat channel becomes healthy again.
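The health tracking described above amounts to remembering the last successful heartbeat per sub-cluster and skipping any sub-cluster whose heartbeat is older than the timeout. A minimal sketch (class and method names invented, not the actual LocalityMulticastAMRMProxyPolicy code):

```java
import java.util.HashMap;
import java.util.Map;

public class SubClusterHealthSketch {
    private final long timeoutMs;
    private final Map<String, Long> lastHeartbeat = new HashMap<>();

    public SubClusterHealthSketch(long timeoutMs) {
        this.timeoutMs = timeoutMs;
    }

    // Record a successful heartbeat to a sub-cluster.
    public void recordHeartbeat(String subCluster, long nowMs) {
        lastHeartbeat.put(subCluster, nowMs);
    }

    // A sub-cluster is healthy only if it heartbeated within the timeout window;
    // unhealthy sub-clusters would be skipped when splitting asks.
    public boolean isHealthy(String subCluster, long nowMs) {
        Long last = lastHeartbeat.get(subCluster);
        return last != null && nowMs - last <= timeoutMs;
    }
}
```

Once a heartbeat succeeds again, `recordHeartbeat` makes the sub-cluster eligible for routing without any explicit reset.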
[jira] [Updated] (YARN-8298) Yarn Service Upgrade: Support express upgrade of a service
[ https://issues.apache.org/jira/browse/YARN-8298?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chandni Singh updated YARN-8298: Attachment: YARN-8298.005.patch > Yarn Service Upgrade: Support express upgrade of a service > -- > > Key: YARN-8298 > URL: https://issues.apache.org/jira/browse/YARN-8298 > Project: Hadoop YARN > Issue Type: Sub-task >Affects Versions: 3.1.1 >Reporter: Chandni Singh >Assignee: Chandni Singh >Priority: Major > Attachments: YARN-8298.001.patch, YARN-8298.002.patch, > YARN-8298.003.patch, YARN-8298.004.patch, YARN-8298.005.patch > > > Currently service upgrade involves 2 steps > * initiate upgrade by providing new spec > * trigger upgrade of each instance/component > > We need to add the ability to upgrade the service in one shot: > # Aborting the upgrade will not be supported > # Upgrade finalization will be done automatically.
[jira] [Commented] (YARN-8298) Yarn Service Upgrade: Support express upgrade of a service
[ https://issues.apache.org/jira/browse/YARN-8298?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16586541#comment-16586541 ] genericqa commented on YARN-8298: - (x) *-1 overall*

|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 0m 40s | Docker mode activated. |
|| || || || Prechecks ||
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
| +1 | test4tests | 0m 0s | The patch appears to include 5 new or modified test files. |
|| || || || trunk Compile Tests ||
| 0 | mvndep | 0m 13s | Maven dependency ordering for branch |
| +1 | mvninstall | 17m 47s | trunk passed |
| +1 | compile | 10m 49s | trunk passed |
| +1 | checkstyle | 1m 28s | trunk passed |
| +1 | mvnsite | 2m 25s | trunk passed |
| +1 | shadedclient | 14m 14s | branch has no errors when building and testing our client artifacts. |
| +1 | findbugs | 3m 22s | trunk passed |
| +1 | javadoc | 1m 53s | trunk passed |
|| || || || Patch Compile Tests ||
| 0 | mvndep | 0m 12s | Maven dependency ordering for patch |
| +1 | mvninstall | 2m 1s | the patch passed |
| +1 | compile | 9m 49s | the patch passed |
| +1 | cc | 9m 49s | the patch passed |
| +1 | javac | 9m 49s | the patch passed |
| -0 | checkstyle | 1m 21s | hadoop-yarn-project/hadoop-yarn: The patch generated 6 new + 387 unchanged - 2 fixed = 393 total (was 389) |
| +1 | mvnsite | 2m 14s | the patch passed |
| -1 | whitespace | 0m 0s | The patch has 1 line(s) that end in whitespace. Use git apply --whitespace=fix <>. Refer https://git-scm.com/docs/git-apply |
| +1 | shadedclient | 10m 58s | patch has no errors when building and testing our client artifacts. |
| +1 | findbugs | 4m 8s | the patch passed |
| +1 | javadoc | 1m 47s | the patch passed |
|| || || || Other Tests ||
| +1 | unit | 3m 26s | hadoop-yarn-common in the patch passed. |
| +1 | unit | 24m 51s | hadoop-yarn-client in the patch passed. |
| +1 | unit | 12m 48s | hadoop-yarn-services-core in the patch passed. |
| +1 | unit | 1m 49s | hadoop-yarn-services-api in the patch passed. |
| +1 | asflicense | 0m 36s | The patch does not generate ASF License warnings. |
| | | 127m 6s | |

|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:ba1ab08 |
| JIRA Issue | YARN-8298 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12936309/YARN-8298.004.patch |
| Optional Tests | asflicense
[jira] [Commented] (YARN-8581) [AMRMProxy] Add sub-cluster timeout in LocalityMulticastAMRMProxyPolicy
[ https://issues.apache.org/jira/browse/YARN-8581?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16586539#comment-16586539 ] genericqa commented on YARN-8581: - (/) *+1 overall*

|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 0m 29s | Docker mode activated. |
|| || || || Prechecks ||
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
| +1 | test4tests | 0m 0s | The patch appears to include 3 new or modified test files. |
|| || || || trunk Compile Tests ||
| 0 | mvndep | 0m 22s | Maven dependency ordering for branch |
| +1 | mvninstall | 21m 5s | trunk passed |
| +1 | compile | 14m 52s | trunk passed |
| +1 | checkstyle | 1m 49s | trunk passed |
| +1 | mvnsite | 1m 49s | trunk passed |
| +1 | shadedclient | 16m 46s | branch has no errors when building and testing our client artifacts. |
| +1 | findbugs | 3m 4s | trunk passed |
| +1 | javadoc | 1m 30s | trunk passed |
|| || || || Patch Compile Tests ||
| 0 | mvndep | 0m 19s | Maven dependency ordering for patch |
| +1 | mvninstall | 1m 16s | the patch passed |
| +1 | compile | 12m 46s | the patch passed |
| +1 | javac | 12m 46s | the patch passed |
| +1 | checkstyle | 1m 41s | the patch passed |
| +1 | mvnsite | 1m 41s | the patch passed |
| +1 | whitespace | 0m 0s | The patch has no whitespace issues. |
| +1 | shadedclient | 13m 48s | patch has no errors when building and testing our client artifacts. |
| +1 | findbugs | 3m 6s | the patch passed |
| +1 | javadoc | 1m 26s | the patch passed |
|| || || || Other Tests ||
| +1 | unit | 0m 58s | hadoop-yarn-api in the patch passed. |
| +1 | unit | 2m 43s | hadoop-yarn-server-common in the patch passed. |
| +1 | asflicense | 0m 54s | The patch does not generate ASF License warnings. |
| | | 101m 45s | |

|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:ba1ab08 |
| JIRA Issue | YARN-8581 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12936314/YARN-8581.v2.patch |
| Optional Tests | asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle |
| uname | Linux 8c46de123fc4 3.13.0-153-generic #203-Ubuntu SMP Thu Jun 14 08:52:28 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / 8736fc3 |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0_171 |
| findbugs | v3.1.0-RC1 |
| Test Results | https://builds.apache.org/job/PreCommit-YARN-Build/21636/testReport/ |
| Max. process+thread count | 301 (vs. ulimit of 1) |
| modules | C:
[jira] [Commented] (YARN-8581) [AMRMProxy] Add sub-cluster timeout in LocalityMulticastAMRMProxyPolicy
[ https://issues.apache.org/jira/browse/YARN-8581?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16586493#comment-16586493 ] genericqa commented on YARN-8581: - (/) *+1 overall*

|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 15m 38s | Docker mode activated. |
|| || || || Prechecks ||
| 0 | findbugs | 0m 0s | Findbugs executables are not available. |
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
| +1 | test4tests | 0m 0s | The patch appears to include 3 new or modified test files. |
|| || || || branch-2 Compile Tests ||
| 0 | mvndep | 0m 27s | Maven dependency ordering for branch |
| +1 | mvninstall | 14m 41s | branch-2 passed |
| +1 | compile | 7m 8s | branch-2 passed |
| +1 | checkstyle | 0m 58s | branch-2 passed |
| +1 | mvnsite | 1m 24s | branch-2 passed |
| +1 | javadoc | 1m 1s | branch-2 passed |
|| || || || Patch Compile Tests ||
| 0 | mvndep | 0m 12s | Maven dependency ordering for patch |
| +1 | mvninstall | 1m 3s | the patch passed |
| +1 | compile | 6m 16s | the patch passed |
| +1 | javac | 6m 16s | the patch passed |
| +1 | checkstyle | 0m 52s | the patch passed |
| +1 | mvnsite | 1m 17s | the patch passed |
| +1 | whitespace | 0m 0s | The patch has no whitespace issues. |
| +1 | javadoc | 0m 59s | the patch passed |
|| || || || Other Tests ||
| +1 | unit | 0m 41s | hadoop-yarn-api in the patch passed. |
| +1 | unit | 2m 28s | hadoop-yarn-server-common in the patch passed. |
| +1 | asflicense | 0m 38s | The patch does not generate ASF License warnings. |
| | | 57m 40s | |

|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:a716388 |
| JIRA Issue | YARN-8581 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12936315/YARN-8581-branch-2.v2.patch |
| Optional Tests | asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle |
| uname | Linux 503aaef576f0 3.13.0-143-generic #192-Ubuntu SMP Tue Feb 27 10:45:36 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | branch-2 / 18ebe18 |
| maven | version: Apache Maven 3.3.9 (bb52d8502b132ec0a5a3f4c09453c07478323dc5; 2015-11-10T16:41:47+00:00) |
| Default Java | 1.7.0_181 |
| Test Results | https://builds.apache.org/job/PreCommit-YARN-Build/21637/testReport/ |
| Max. process+thread count | 86 (vs. ulimit of 1) |
| modules | C: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-common U: hadoop-yarn-project/hadoop-yarn |
| Console output | https://builds.apache.org/job/PreCommit-YARN-Build/21637/console |
| Powered by | Apache Yetus 0.8.0-SNAPSHOT http://yetus.apache.org |

This message was automatically generated. > [AMRMProxy] Add sub-cluster timeout in
[jira] [Commented] (YARN-8298) Yarn Service Upgrade: Support express upgrade of a service
[ https://issues.apache.org/jira/browse/YARN-8298?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16586458#comment-16586458 ] Chandni Singh commented on YARN-8298: - {quote} We have built a sufficient number of knobs for an individual container to upgrade in a rolling fashion. However, it will depend on an external orchestrator to perform the rolling upgrade. Express upgrade is designed to be atomic; therefore, it simplifies the upgrade process by doing all instances of a component in parallel. A Docker container takes only a few seconds to stop and start, so the interruption time is minimized to a few seconds {quote} [~eyang] When an express upgrade is performed, I am of the opinion that the upgrade of a single component should be done in a rolling fashion; otherwise, if there is a failure, the service is disrupted. If we provide express upgrade, that should be the default behavior. If the upgrade of an instance fails, the other instances of the component should not be upgraded. A Docker container may take only a few seconds to stop and start, but meanwhile the other instances of the component will remain active. Besides that, with the 2nd approach, I meant that the scheduler should not do any sort of orchestration, including upgrading instances of a particular component before another. This is blocked by YARN-8665 as it needs support for cancelling an upgrade in case of failure. Given that, if you want to go with the 2nd approach, then patch 4 contains all the changes. 
> Yarn Service Upgrade: Support express upgrade of a service > -- > > Key: YARN-8298 > URL: https://issues.apache.org/jira/browse/YARN-8298 > Project: Hadoop YARN > Issue Type: Sub-task >Affects Versions: 3.1.1 >Reporter: Chandni Singh >Assignee: Chandni Singh >Priority: Major > Attachments: YARN-8298.001.patch, YARN-8298.002.patch, > YARN-8298.003.patch, YARN-8298.004.patch > > > Currently service upgrade involves 2 steps > * initiate upgrade by providing new spec > * trigger upgrade of each instance/component > > We need to add the ability to upgrade the service in one shot: > # Aborting the upgrade will not be supported > # Upgrade finalization will be done automatically.
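The rolling behavior argued for in the comments above — upgrade one instance at a time and stop at the first failure so remaining instances keep serving — can be sketched as follows. This is a hypothetical illustration of the discussed policy, not code from the YARN service framework.

```java
import java.util.List;
import java.util.function.Predicate;

// Hypothetical sketch of a rolling component upgrade: instances are
// upgraded one at a time, and a failure stops the rollout so the
// remaining instances keep serving the old version.
public class RollingUpgradeSketch {
    // Returns the number of instances successfully upgraded before
    // either finishing or hitting the first failure.
    public static int rollingUpgrade(List<String> instances,
                                     Predicate<String> upgradeOne) {
        int upgraded = 0;
        for (String instance : instances) {
            if (!upgradeOne.test(instance)) {
                break; // stop on first failure; do not touch the rest
            }
            upgraded++;
        }
        return upgraded;
    }

    public static void main(String[] args) {
        List<String> instances = List.of("comp-0", "comp-1", "comp-2");
        // Simulated upgrade that fails on comp-1: only comp-0 is upgraded,
        // comp-1 and comp-2 are left running the old version.
        int ok = rollingUpgrade(instances, id -> !id.equals("comp-1"));
        System.out.println(ok); // prints 1
    }
}
```

An express upgrade, by contrast, would apply `upgradeOne` to all instances in parallel, trading availability during a failed upgrade for a shorter total upgrade window.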
[jira] [Updated] (YARN-8673) [AMRMProxy] More robust responseId resync after an YarnRM master slave switch
[ https://issues.apache.org/jira/browse/YARN-8673?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Botong Huang updated YARN-8673: --- Attachment: YARN-8673-branch-2.v2.patch > [AMRMProxy] More robust responseId resync after an YarnRM master slave switch > - > > Key: YARN-8673 > URL: https://issues.apache.org/jira/browse/YARN-8673 > Project: Hadoop YARN > Issue Type: Sub-task > Components: amrmproxy >Reporter: Botong Huang >Assignee: Botong Huang >Priority: Major > Attachments: YARN-8673-branch-2.v2.patch, YARN-8673.v1.patch, > YARN-8673.v2.patch > > > After a master-slave switch of YarnRM, an _ApplicationNotRegisteredException_ > will be thrown from the new YarnRM. The AM will re-register and reset the > responseId to zero. _AMRMClientRelayer_ inside _FederationInterceptor_ > follows the same protocol, and does the automatic re-register and responseId > resync. However, when exceptions or a temporary network issue happen in the > allocate call after re-register, the resync logic might be broken. This patch > improves the robustness of the process by parsing the expected responseId > from the YarnRM exception message, so that whenever the responseId is out of sync > for whatever reason, we can automatically resync and move on.
[jira] [Commented] (YARN-8298) Yarn Service Upgrade: Support express upgrade of a service
[ https://issues.apache.org/jira/browse/YARN-8298?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16586442#comment-16586442 ] Eric Yang commented on YARN-8298: - [~csingh] Rolling upgrade on top of express upgrade is novelty idea, but most of the uninterrupted logic needs to come from software that hosted in the container rather than the upgrade framework itself. It would be good to proceed with option 2. We have built sufficient number of knobs for individual container to upgrade in a rolling fashion. However, it will depend on external orchestrator to perform the rolling upgrade. Express upgrade is design to be atomic, therefore, it simplifies upgrade process by doing all instances of a component in parallel. Docker container takes only a few second to stop and start, therefore, the interruption time is minimized to few seconds. By having both features built, user can choose one or the other. This approach matches perfectly with Ambari definition of rolling upgrade, and express upgrade respectively. > Yarn Service Upgrade: Support express upgrade of a service > -- > > Key: YARN-8298 > URL: https://issues.apache.org/jira/browse/YARN-8298 > Project: Hadoop YARN > Issue Type: Sub-task >Affects Versions: 3.1.1 >Reporter: Chandni Singh >Assignee: Chandni Singh >Priority: Major > Attachments: YARN-8298.001.patch, YARN-8298.002.patch, > YARN-8298.003.patch, YARN-8298.004.patch > > > Currently service upgrade involves 2 steps > * initiate upgrade by providing new spec > * trigger upgrade of each instance/component > > We need to add the ability to upgrade the service in one shot: > # Aborting the upgrade will not be supported > # Upgrade finalization will be done automatically. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-8673) [AMRMProxy] More robust responseId resync after an YarnRM master slave switch
[ https://issues.apache.org/jira/browse/YARN-8673?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16586410#comment-16586410 ] Hudson commented on YARN-8673: -- SUCCESS: Integrated in Jenkins build Hadoop-trunk-Commit #14805 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/14805/]) YARN-8673. [AMRMProxy] More robust responseId resync after an YarnRM (gifuma: rev 8736fc39ac3b3de168d2c216f3d1c0edb48fb3f9)
* (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/client/AMRMClientUtils.java
* (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-common/src/main/java/org/apache/hadoop/yarn/server/AMRMClientRelayer.java
* (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/amrmproxy/FederationInterceptor.java
* (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-common/src/main/java/org/apache/hadoop/yarn/server/uam/UnmanagedApplicationManager.java
* (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/ApplicationMasterService.java
* (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-common/src/test/java/org/apache/hadoop/yarn/server/TestAMRMClientRelayer.java

> [AMRMProxy] More robust responseId resync after an YarnRM master slave switch > - > > Key: YARN-8673 > URL: https://issues.apache.org/jira/browse/YARN-8673 > Project: Hadoop YARN > Issue Type: Sub-task > Components: amrmproxy >Reporter: Botong Huang >Assignee: Botong Huang >Priority: Major > Attachments: YARN-8673.v1.patch, YARN-8673.v2.patch > > > After a master-slave switch of YarnRM, an _ApplicationNotRegisteredException_ > will be thrown from the new YarnRM. The AM will re-register and reset the > responseId to zero. 
_AMRMClientRelayer_ inside _FederationInterceptor_ > follows the same protocol, and does the automatic re-register and responseId > resync. However, when exceptions or a temporary network issue happen in the > allocate call after re-register, the resync logic might be broken. This patch > improves the robustness of the process by parsing the expected responseId > from the YarnRM exception message, so that whenever the responseId is out of sync > for whatever reason, we can automatically resync and move on.
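The resync trick described above — recovering the responseId the RM expects from an out-of-sync exception message — can be sketched as follows. The message format and the helper name here are assumptions for illustration, not the actual wording produced by ApplicationMasterService or the parser in AMRMClientUtils.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical sketch: recover the responseId the RM expects from an
// out-of-sync exception message, so the client can resync and retry
// the allocate call instead of failing.
public class ResponseIdResyncSketch {
    // Assumed message shape, e.g.
    // "Invalid responseId in AllocateRequest. Expected responseId: 42"
    private static final Pattern EXPECTED_ID =
        Pattern.compile("Expected responseId: (\\d+)");

    // Returns the expected responseId, or -1 if the message does not match.
    public static int parseExpectedResponseId(String exceptionMessage) {
        if (exceptionMessage == null) {
            return -1;
        }
        Matcher m = EXPECTED_ID.matcher(exceptionMessage);
        return m.find() ? Integer.parseInt(m.group(1)) : -1;
    }

    public static void main(String[] args) {
        String msg = "Invalid responseId in AllocateRequest. Expected responseId: 42";
        System.out.println(parseExpectedResponseId(msg));           // prints 42
        System.out.println(parseExpectedResponseId("other error")); // prints -1
    }
}
```

On a successful parse the caller would set its local responseId to the parsed value and reissue the allocate call; on -1 it falls back to the normal re-register path.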
[jira] [Updated] (YARN-8581) [AMRMProxy] Add sub-cluster timeout in LocalityMulticastAMRMProxyPolicy
[ https://issues.apache.org/jira/browse/YARN-8581?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Botong Huang updated YARN-8581: --- Attachment: YARN-8581-branch-2.v2.patch > [AMRMProxy] Add sub-cluster timeout in LocalityMulticastAMRMProxyPolicy > --- > > Key: YARN-8581 > URL: https://issues.apache.org/jira/browse/YARN-8581 > Project: Hadoop YARN > Issue Type: Sub-task > Components: amrmproxy, federation >Reporter: Botong Huang >Assignee: Botong Huang >Priority: Major > Attachments: YARN-8581-branch-2.v2.patch, YARN-8581.v1.patch, > YARN-8581.v2.patch > > > In Federation, every time an AM heartbeat comes in, > LocalityMulticastAMRMProxyPolicy in AMRMProxy splits the asks according to > the list of active and enabled sub-clusters. However, if we haven't been able > to heartbeat to a sub-cluster for some time (network issues, or we keep > hitting some exception from YarnRM, or YarnRM master-slave switch is taking a > long time etc.), we should consider the sub-cluster as unhealthy and stop > routing asks there, until the heartbeat channel becomes healthy again.
[jira] [Updated] (YARN-8581) [AMRMProxy] Add sub-cluster timeout in LocalityMulticastAMRMProxyPolicy
[ https://issues.apache.org/jira/browse/YARN-8581?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Botong Huang updated YARN-8581: --- Attachment: YARN-8581.v2.patch > [AMRMProxy] Add sub-cluster timeout in LocalityMulticastAMRMProxyPolicy > --- > > Key: YARN-8581 > URL: https://issues.apache.org/jira/browse/YARN-8581 > Project: Hadoop YARN > Issue Type: Sub-task > Components: amrmproxy, federation >Reporter: Botong Huang >Assignee: Botong Huang >Priority: Major > Attachments: YARN-8581.v1.patch, YARN-8581.v2.patch > > > In Federation, every time an AM heartbeat comes in, > LocalityMulticastAMRMProxyPolicy in AMRMProxy splits the asks according to > the list of active and enabled sub-clusters. However, if we haven't been able > to heartbeat to a sub-cluster for some time (network issues, or we keep > hitting some exception from YarnRM, or YarnRM master-slave switch is taking a > long time etc.), we should consider the sub-cluster as unhealthy and stop > routing asks there, until the heartbeat channel becomes healthy again.
[jira] [Comment Edited] (YARN-8673) [AMRMProxy] More robust responseId resync after an YarnRM master slave switch
[ https://issues.apache.org/jira/browse/YARN-8673?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16586390#comment-16586390 ] Giovanni Matteo Fumarola edited comment on YARN-8673 at 8/20/18 7:24 PM: - LGTM +1. Committed to Trunk. Thanks [~botong] . was (Author: giovanni.fumarola): LGTM +1. Committed to Trunk. > [AMRMProxy] More robust responseId resync after an YarnRM master slave switch > - > > Key: YARN-8673 > URL: https://issues.apache.org/jira/browse/YARN-8673 > Project: Hadoop YARN > Issue Type: Sub-task > Components: amrmproxy >Reporter: Botong Huang >Assignee: Botong Huang >Priority: Major > Attachments: YARN-8673.v1.patch, YARN-8673.v2.patch > > > After a master-slave switch of YarnRM, an _ApplicationNotRegisteredException_ > will be thrown from the new YarnRM. The AM will re-register and reset the > responseId to zero. _AMRMClientRelayer_ inside _FederationInterceptor_ > follows the same protocol, and does the automatic re-register and responseId > resync. However, when exceptions or a temporary network issue happen in the > allocate call after re-register, the resync logic might be broken. This patch > improves the robustness of the process by parsing the expected responseId > from the YarnRM exception message, so that whenever the responseId is out of sync > for whatever reason, we can automatically resync and move on.
[jira] [Commented] (YARN-8673) [AMRMProxy] More robust responseId resync after an YarnRM master slave switch
[ https://issues.apache.org/jira/browse/YARN-8673?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16586390#comment-16586390 ] Giovanni Matteo Fumarola commented on YARN-8673: LGTM +1. Committed to Trunk. > [AMRMProxy] More robust responseId resync after an YarnRM master slave switch > - > > Key: YARN-8673 > URL: https://issues.apache.org/jira/browse/YARN-8673 > Project: Hadoop YARN > Issue Type: Sub-task > Components: amrmproxy >Reporter: Botong Huang >Assignee: Botong Huang >Priority: Major > Attachments: YARN-8673.v1.patch, YARN-8673.v2.patch > > > After a master-slave switch of YarnRM, an _ApplicationNotRegisteredException_ > will be thrown from the new YarnRM. The AM will re-register and reset the > responseId to zero. _AMRMClientRelayer_ inside _FederationInterceptor_ > follows the same protocol, and does the automatic re-register and responseId > resync. However, when exceptions or a temporary network issue happen in the > allocate call after re-register, the resync logic might be broken. This patch > improves the robustness of the process by parsing the expected responseId > from the YarnRM exception message, so that whenever the responseId is out of sync > for whatever reason, we can automatically resync and move on.
[jira] [Commented] (YARN-8581) [AMRMProxy] Add sub-cluster timeout in LocalityMulticastAMRMProxyPolicy
[ https://issues.apache.org/jira/browse/YARN-8581?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16586384#comment-16586384 ] Giovanni Matteo Fumarola commented on YARN-8581: LGTM +1. Do you mind rebasing? Hunk #3 FAILED at 145. 1 out of 3 hunks FAILED -- saving rejects to file hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-common/src/test/java/org/apache/hadoop/yarn/server/federation/utils/FederationPoliciesTestUtil.java.rej > [AMRMProxy] Add sub-cluster timeout in LocalityMulticastAMRMProxyPolicy > --- > > Key: YARN-8581 > URL: https://issues.apache.org/jira/browse/YARN-8581 > Project: Hadoop YARN > Issue Type: Sub-task > Components: amrmproxy, federation >Reporter: Botong Huang >Assignee: Botong Huang >Priority: Major > Attachments: YARN-8581.v1.patch > > > In Federation, every time an AM heartbeat comes in, > LocalityMulticastAMRMProxyPolicy in AMRMProxy splits the asks according to > the list of active and enabled sub-clusters. However, if we haven't been able > to heartbeat to a sub-cluster for some time (network issues, or we keep > hitting some exception from YarnRM, or YarnRM master-slave switch is taking a > long time etc.), we should consider the sub-cluster as unhealthy and stop > routing asks there, until the heartbeat channel becomes healthy again.
[jira] [Updated] (YARN-8298) Yarn Service Upgrade: Support express upgrade of a service
[ https://issues.apache.org/jira/browse/YARN-8298?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chandni Singh updated YARN-8298: Attachment: YARN-8298.004.patch > Yarn Service Upgrade: Support express upgrade of a service > -- > > Key: YARN-8298 > URL: https://issues.apache.org/jira/browse/YARN-8298 > Project: Hadoop YARN > Issue Type: Sub-task >Affects Versions: 3.1.1 >Reporter: Chandni Singh >Assignee: Chandni Singh >Priority: Major > Attachments: YARN-8298.001.patch, YARN-8298.002.patch, > YARN-8298.003.patch, YARN-8298.004.patch > > > Currently service upgrade involves 2 steps > * initiate upgrade by providing new spec > * trigger upgrade of each instance/component > > We need to add the ability to upgrade the service in one shot: > # Aborting the upgrade will not be supported > # Upgrade finalization will be done automatically.
[jira] [Comment Edited] (YARN-8298) Yarn Service Upgrade: Support express upgrade of a service
[ https://issues.apache.org/jira/browse/YARN-8298?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16586347#comment-16586347 ] Chandni Singh edited comment on YARN-8298 at 8/20/18 6:29 PM: -- [~eyang] Supporting orchestration of upgrade by ServiceMaster requires much more work and is blocked by https://issues.apache.org/jira/browse/YARN-8665 There is another major problem with express upgrade. The instances of a single component should also be done one by one (rolling fashion). Otherwise, upgrade can cause disruption in component availability. One of the main reasons behind upgrade is that service is not disrupted. However if we upgrade all the instances in parallel, then failure in upgrade causes disruption. There are 2 ways to proceed: 1. Support canceling upgrade first https://issues.apache.org/jira/browse/YARN-8665 and then re-work this jira 2. We merge this way of express upgrade where all instances are upgraded in parallel. It is a convenient way for dev testing. Work on YARN-8665 and then modify express upgrade. I am fine with either way. was (Author: csingh): [~eyang] Supporting orchestration of upgrade by ServiceMaster requires much more work and is blocked by https://issues.apache.org/jira/browse/YARN-8665 There is another major problem with express upgrade. The instances of a single component should also be done one by one (rolling fashion). Otherwise, upgrade can cause disruption in component availability. One of the main reasons behind upgrade is that service is not disrupted. However if we upgrade all the instances in parallel, then failure in upgrade causes disruption. There are 2 ways to proceed: 1. Support canceling upgrade first https://issues.apache.org/jira/browse/YARN-8665 and then re-work this jira 2. We merge this way of express upgrade where all instances are upgraded in parallel. It is a convenient way for dev testing. I am fine with either way. 
> Yarn Service Upgrade: Support express upgrade of a service > -- > > Key: YARN-8298 > URL: https://issues.apache.org/jira/browse/YARN-8298 > Project: Hadoop YARN > Issue Type: Sub-task >Affects Versions: 3.1.1 >Reporter: Chandni Singh >Assignee: Chandni Singh >Priority: Major > Attachments: YARN-8298.001.patch, YARN-8298.002.patch, > YARN-8298.003.patch > > > Currently service upgrade involves 2 steps > * initiate upgrade by providing new spec > * trigger upgrade of each instance/component > > We need to add the ability to upgrade the service in one shot: > # Aborting the upgrade will not be supported > # Upgrade finalization will be done automatically. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-8298) Yarn Service Upgrade: Support express upgrade of a service
[ https://issues.apache.org/jira/browse/YARN-8298?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16586347#comment-16586347 ] Chandni Singh commented on YARN-8298: - [~eyang] Supporting orchestration of the upgrade by the ServiceMaster requires much more work and is blocked by https://issues.apache.org/jira/browse/YARN-8665 There is another major problem with express upgrade. The instances of a single component should also be upgraded one by one (in rolling fashion). Otherwise, the upgrade can disrupt component availability. One of the main goals of an upgrade is that the service is not disrupted. However, if we upgrade all the instances in parallel, then a failure during the upgrade causes disruption. There are 2 ways to proceed: 1. Support canceling an upgrade first (https://issues.apache.org/jira/browse/YARN-8665) and then re-work this jira 2. Merge this version of express upgrade, where all instances are upgraded in parallel. It is a convenient way for dev testing. I am fine with either way. > Yarn Service Upgrade: Support express upgrade of a service > -- > > Key: YARN-8298 > URL: https://issues.apache.org/jira/browse/YARN-8298 > Project: Hadoop YARN > Issue Type: Sub-task >Affects Versions: 3.1.1 >Reporter: Chandni Singh >Assignee: Chandni Singh >Priority: Major > Attachments: YARN-8298.001.patch, YARN-8298.002.patch, > YARN-8298.003.patch > > > Currently service upgrade involves 2 steps > * initiate upgrade by providing new spec > * trigger upgrade of each instance/component > > We need to add the ability to upgrade the service in one shot: > # Aborting the upgrade will not be supported > # Upgrade finalization will be done automatically.
[jira] [Commented] (YARN-8632) No data in file realtimetrack.json after running SchedulerLoadSimulator
[ https://issues.apache.org/jira/browse/YARN-8632?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16586344#comment-16586344 ] Yufei Gu commented on YARN-8632: For that sake, we need to "setUncaughtExceptionHandler" for the thread, and provide a handler. Catching every exception in {{run()}} isn't enough. > No data in file realtimetrack.json after running SchedulerLoadSimulator > --- > > Key: YARN-8632 > URL: https://issues.apache.org/jira/browse/YARN-8632 > Project: Hadoop YARN > Issue Type: Bug > Components: scheduler-load-simulator >Reporter: Xianghao Lu >Assignee: Xianghao Lu >Priority: Major > Attachments: YARN-8632-branch-2.7.2.001.patch, YARN-8632.001.patch, > YARN-8632.002.patch > > > Recently, I have been using > [SchedulerLoadSimulator|https://hadoop.apache.org/docs/r2.7.2/hadoop-sls/SchedulerLoadSimulator.html] > to validate the impact of changes on my FairScheduler. I encountered some > problems. > Firstly, I fix a npe bug with the patch in > https://issues.apache.org/jira/browse/YARN-4302 > Secondly, everything seems to be ok, but I just get "[]" in file > realtimetrack.json. Finally, I find the MetricsLogRunnable thread will exit > because of npe, > the reason is "wrapper.getQueueSet()" is still null when executing "String > metrics = web.generateRealTimeTrackingMetrics();" > So, we should put "String metrics = web.generateRealTimeTrackingMetrics();" > in try section to avoid MetricsLogRunnable thread exit with unexpected > exception. > My hadoop version is 2.7.2, it seems that hadoop trunk branch also has the > second problem and I have made a patch to solve it. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
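Yufei's point can be illustrated with a minimal, hypothetical sketch (not the actual SLS code; the names here are invented): the metrics-logging thread needs a try/catch inside run() so a routine failure, such as an NPE while the queue set is still null, does not kill it, plus an UncaughtExceptionHandler as a safety net for anything that escapes:

```java
import java.util.Set;
import java.util.concurrent.atomic.AtomicBoolean;

public class MetricsLoggerExample {

    // Stand-in for web.generateRealTimeTrackingMetrics(): throws NPE while
    // the queue set (wrapper.getQueueSet()) is still null.
    public static String generateMetrics(Set<String> queueSet) {
        return "[" + String.join(",", queueSet) + "]";
    }

    public static void main(String[] args) throws InterruptedException {
        AtomicBoolean handlerFired = new AtomicBoolean(false);
        Thread logger = new Thread(() -> {
            try {
                // Early tick: queues not initialized yet, so this throws NPE...
                System.out.println(generateMetrics(null));
            } catch (RuntimeException e) {
                // ...but the try section keeps the logging thread alive.
                System.err.println("skipping metrics tick: " + e);
            }
            // A throw outside the try section would still kill the thread:
            generateMetrics(null);
        });
        // Hence the safety net: an UncaughtExceptionHandler sees whatever escapes.
        logger.setUncaughtExceptionHandler((t, e) -> handlerFired.set(true));
        logger.start();
        logger.join();
        System.out.println("handler fired: " + handlerFired.get()); // true
    }
}
```

With only the try/catch, an exception thrown outside the guarded section silently terminates the thread, which is exactly why realtimetrack.json ends up containing just "[]".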
[jira] [Commented] (YARN-8298) Yarn Service Upgrade: Support express upgrade of a service
[ https://issues.apache.org/jira/browse/YARN-8298?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16586330#comment-16586330 ] Eric Yang commented on YARN-8298: - [~csingh] Patch 2 was work in progress. It only figured out the dependency order and submitted the upgrade request. The enhancement depends on the ServiceScheduler walking through the component list in a for loop and checking the upgrade status of each component before proceeding to the next component. This also aligns with your suggestion to introduce the ability to cancel or abort an upgrade. > Yarn Service Upgrade: Support express upgrade of a service > -- > > Key: YARN-8298 > URL: https://issues.apache.org/jira/browse/YARN-8298 > Project: Hadoop YARN > Issue Type: Sub-task >Affects Versions: 3.1.1 >Reporter: Chandni Singh >Assignee: Chandni Singh >Priority: Major > Attachments: YARN-8298.001.patch, YARN-8298.002.patch, > YARN-8298.003.patch > > > Currently service upgrade involves 2 steps > * initiate upgrade by providing new spec > * trigger upgrade of each instance/component > > We need to add the ability to upgrade the service in one shot: > # Aborting the upgrade will not be supported > # Upgrade finalization will be done automatically.
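The component-by-component loop described above can be sketched as follows. This is a hypothetical illustration of the proposed ServiceScheduler behavior, not actual YARN code; ComponentClient, triggerUpgrade, and pollState are invented names standing in for the AM protocol:

```java
import java.util.List;

public class RollingUpgradeSketch {

    // Invented client abstraction standing in for the AM protocol calls.
    public interface ComponentClient {
        void triggerUpgrade(String component);
        String pollState(String component); // "IN_PROGRESS", "SUCCEEDED", or "FAILED"
    }

    // Upgrade components one at a time, in dependency order; stop at the
    // first failure so dependent components are never touched.
    public static boolean upgradeInOrder(List<String> orderedComponents,
                                         ComponentClient client) {
        for (String comp : orderedComponents) {
            client.triggerUpgrade(comp);
            String state = client.pollState(comp);
            while ("IN_PROGRESS".equals(state)) {
                state = client.pollState(comp); // real code would back off between polls
            }
            if (!"SUCCEEDED".equals(state)) {
                return false; // failed: do not proceed to the next component
            }
        }
        return true;
    }
}
```

Stopping at the first failed component is what makes cancel/abort support (YARN-8665) a prerequisite: the service is left partially upgraded and needs a way back.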
[jira] [Comment Edited] (YARN-8298) Yarn Service Upgrade: Support express upgrade of a service
[ https://issues.apache.org/jira/browse/YARN-8298?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16586193#comment-16586193 ] Chandni Singh edited comment on YARN-8298 at 8/20/18 6:02 PM: -- [~eyang] Please see {quote} actionUpgradeExpress computes components order before calling backend. The logic is to upgrade component by component. Instance order is irrelevant within a component in this design. I think this match up fine with how we upgrade services today in Hadoop software. {quote} The code which you pasted does not seem to upgrade component by component. 1. The below line finds all the containers that will be sent to AM for upgrades {code:java} containersToUpgrade = ServiceApiUtil .validateAndResolveCompsUpgrade(persistedService, components); {code} 2. The below request is sent to AM and the AM issues event to upgrade all the instances {code} CompInstancesUpgradeRequestProto.Builder upgradeRequestBuilder = CompInstancesUpgradeRequestProto.newBuilder(); upgradeRequestBuilder.addAllContainerIds(containerIdsToUpgrade); {code} Instances are processing these events asynchronously, so all the instances will get upgraded without any order guarantees Having to upgrade component by component based on dependencies could only be supported if we have support for canceling the upgrade when there is any failure. Also in case the upgrade is cancelled, we need a way for the user to check the status of the upgrade. was (Author: csingh): [~eyang] Please see {quote} actionUpgradeExpress computes components order before calling backend. The logic is to upgrade component by component. Instance order is irrelevant within a component in this design. I think this match up fine with how we upgrade services today in Hadoop software. {quote} The code which you pasted does not seem to upgrade component by component. 1. 
The below line finds all the containers that will be sent to AM for upgrades {code:java} containersToUpgrade = ServiceApiUtil .validateAndResolveCompsUpgrade(persistedService, components); {code} 2. The below request is sent to AM and the AM issues event to upgrade all the instances {code} CompInstancesUpgradeRequestProto.Builder upgradeRequestBuilder = CompInstancesUpgradeRequestProto.newBuilder(); upgradeRequestBuilder.addAllContainerIds(containerIdsToUpgrade); {code} Instances are processing these events asynchronously, so all the instances will get upgraded without any order guarantees Having to upgrade component by component could only be supported if we have support for canceling the upgrade when there is any failure. Also in case the upgrade is cancelled, we need a way for the user to check the status of the upgrade. > Yarn Service Upgrade: Support express upgrade of a service > -- > > Key: YARN-8298 > URL: https://issues.apache.org/jira/browse/YARN-8298 > Project: Hadoop YARN > Issue Type: Sub-task >Affects Versions: 3.1.1 >Reporter: Chandni Singh >Assignee: Chandni Singh >Priority: Major > Attachments: YARN-8298.001.patch, YARN-8298.002.patch, > YARN-8298.003.patch > > > Currently service upgrade involves 2 steps > * initiate upgrade by providing new spec > * trigger upgrade of each instance/component > > We need to add the ability to upgrade the service in one shot: > # Aborting the upgrade will not be supported > # Upgrade finalization will be done automatically. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-8687) YARN service example is out-dated
[ https://issues.apache.org/jira/browse/YARN-8687?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eric Yang updated YARN-8687: Issue Type: Sub-task (was: Bug) Parent: YARN-7054 > YARN service example is out-dated > - > > Key: YARN-8687 > URL: https://issues.apache.org/jira/browse/YARN-8687 > Project: Hadoop YARN > Issue Type: Sub-task > Components: yarn-native-services >Reporter: Eric Yang >Priority: Major > > Example for YARN service is using file type "ENV". > {code} > { > "name": "httpd-service", > "version": "1.0", > "lifetime": "3600", > "components": [ > { > "name": "httpd", > "number_of_containers": 2, > "artifact": { > "id": "centos/httpd-24-centos7:latest", > "type": "DOCKER" > }, > "launch_command": "/usr/bin/run-httpd", > "resource": { > "cpus": 1, > "memory": "1024" > }, > "configuration": { > "files": [ > { > "type": "TEMPLATE", > "dest_file": "/var/www/html/index.html", > "properties": { > "content": > "TitleHello from > ${COMPONENT_INSTANCE_NAME}!" > } > } > ] > } > }, > { > "name": "httpd-proxy", > "number_of_containers": 1, > "artifact": { > "id": "centos/httpd-24-centos7:latest", > "type": "DOCKER" > }, > "launch_command": "/usr/bin/run-httpd", > "resource": { > "cpus": 1, > "memory": "1024" > }, > "configuration": { > "files": [ > { > "type": "TEMPLATE", > "dest_file": "/etc/httpd/conf.d/httpd-proxy.conf", > "src_file": "httpd-proxy.conf" > } > ] > } > } > ], > "quicklinks": { > "Apache HTTP Server": > "http://httpd-proxy-0.${SERVICE_NAME}.${USER}.${DOMAIN}:8080; > } > } > {code} > The type has changed to "TEMPLATE" in the code. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Created] (YARN-8687) YARN service example is out-dated
Eric Yang created YARN-8687: --- Summary: YARN service example is out-dated Key: YARN-8687 URL: https://issues.apache.org/jira/browse/YARN-8687 Project: Hadoop YARN Issue Type: Bug Components: yarn-native-services Reporter: Eric Yang Example for YARN service is using file type "ENV". {code} { "name": "httpd-service", "version": "1.0", "lifetime": "3600", "components": [ { "name": "httpd", "number_of_containers": 2, "artifact": { "id": "centos/httpd-24-centos7:latest", "type": "DOCKER" }, "launch_command": "/usr/bin/run-httpd", "resource": { "cpus": 1, "memory": "1024" }, "configuration": { "files": [ { "type": "TEMPLATE", "dest_file": "/var/www/html/index.html", "properties": { "content": "TitleHello from ${COMPONENT_INSTANCE_NAME}!" } } ] } }, { "name": "httpd-proxy", "number_of_containers": 1, "artifact": { "id": "centos/httpd-24-centos7:latest", "type": "DOCKER" }, "launch_command": "/usr/bin/run-httpd", "resource": { "cpus": 1, "memory": "1024" }, "configuration": { "files": [ { "type": "TEMPLATE", "dest_file": "/etc/httpd/conf.d/httpd-proxy.conf", "src_file": "httpd-proxy.conf" } ] } } ], "quicklinks": { "Apache HTTP Server": "http://httpd-proxy-0.${SERVICE_NAME}.${USER}.${DOMAIN}:8080; } } {code} The type has changed to "TEMPLATE" in the code. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Comment Edited] (YARN-8298) Yarn Service Upgrade: Support express upgrade of a service
[ https://issues.apache.org/jira/browse/YARN-8298?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16586193#comment-16586193 ] Chandni Singh edited comment on YARN-8298 at 8/20/18 4:45 PM: -- [~eyang] Please see {quote} actionUpgradeExpress computes components order before calling backend. The logic is to upgrade component by component. Instance order is irrelevant within a component in this design. I think this match up fine with how we upgrade services today in Hadoop software. {quote} The code which you pasted does not seem to upgrade component by component. 1. The below line finds all the containers that will be sent to AM for upgrades {code:java} containersToUpgrade = ServiceApiUtil .validateAndResolveCompsUpgrade(persistedService, components); {code} 2. The below request is sent to AM and the AM issues event to upgrade all the instances {code} CompInstancesUpgradeRequestProto.Builder upgradeRequestBuilder = CompInstancesUpgradeRequestProto.newBuilder(); upgradeRequestBuilder.addAllContainerIds(containerIdsToUpgrade); {code} Instances are processing these events asynchronously, so all the instances will get upgraded without any order guarantees Having to upgrade component by component could only be supported if we have support for canceling the upgrade when there is any failure. Also in case the upgrade is cancelled, we need a way for the user to check the status of the upgrade. was (Author: csingh): [~eyang] Please see {quote} actionUpgradeExpress computes components order before calling backend. The logic is to upgrade component by component. Instance order is irrelevant within a component in this design. I think this match up fine with how we upgrade services today in Hadoop software. {quote} The code which you pasted does not seem to upgrade component by component. 1. 
The below line finds all the containers that will be sent to AM for upgrades {code:java} containersToUpgrade = ServiceApiUtil .validateAndResolveCompsUpgrade(persistedService, components); {code} 2. The below request is sent to AM and the AM issues event to upgrade all the instances {code} CompInstancesUpgradeRequestProto.Builder upgradeRequestBuilder = CompInstancesUpgradeRequestProto.newBuilder(); upgradeRequestBuilder.addAllContainerIds(containerIdsToUpgrade); {code} Having to upgrade component by component could only be supported if we have support for canceling the upgrade when there is any failure. Also in case the upgrade is cancelled, we need a way for the user to check the status of the upgrade. > Yarn Service Upgrade: Support express upgrade of a service > -- > > Key: YARN-8298 > URL: https://issues.apache.org/jira/browse/YARN-8298 > Project: Hadoop YARN > Issue Type: Sub-task >Affects Versions: 3.1.1 >Reporter: Chandni Singh >Assignee: Chandni Singh >Priority: Major > Attachments: YARN-8298.001.patch, YARN-8298.002.patch, > YARN-8298.003.patch > > > Currently service upgrade involves 2 steps > * initiate upgrade by providing new spec > * trigger upgrade of each instance/component > > We need to add the ability to upgrade the service in one shot: > # Aborting the upgrade will not be supported > # Upgrade finalization will be done automatically. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-8298) Yarn Service Upgrade: Support express upgrade of a service
[ https://issues.apache.org/jira/browse/YARN-8298?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16586193#comment-16586193 ] Chandni Singh commented on YARN-8298: - [~eyang] Please see {quote} actionUpgradeExpress computes components order before calling backend. The logic is to upgrade component by component. Instance order is irrelevant within a component in this design. I think this match up fine with how we upgrade services today in Hadoop software. {quote} The code which you pasted does not seem to upgrade component by component. 1. The below line finds all the containers that will be sent to AM for upgrades {code:java} containersToUpgrade = ServiceApiUtil .validateAndResolveCompsUpgrade(persistedService, components); {code} 2. The below request is sent to AM and the AM issues event to upgrade all the instances {code} CompInstancesUpgradeRequestProto.Builder upgradeRequestBuilder = CompInstancesUpgradeRequestProto.newBuilder(); upgradeRequestBuilder.addAllContainerIds(containerIdsToUpgrade); {code} Having to upgrade component by component could only be supported if we have support for canceling the upgrade when there is any failure. Also in case the upgrade is cancelled, we need a way for the user to check the status of the upgrade. > Yarn Service Upgrade: Support express upgrade of a service > -- > > Key: YARN-8298 > URL: https://issues.apache.org/jira/browse/YARN-8298 > Project: Hadoop YARN > Issue Type: Sub-task >Affects Versions: 3.1.1 >Reporter: Chandni Singh >Assignee: Chandni Singh >Priority: Major > Attachments: YARN-8298.001.patch, YARN-8298.002.patch, > YARN-8298.003.patch > > > Currently service upgrade involves 2 steps > * initiate upgrade by providing new spec > * trigger upgrade of each instance/component > > We need to add the ability to upgrade the service in one shot: > # Aborting the upgrade will not be supported > # Upgrade finalization will be done automatically. 
-- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-8298) Yarn Service Upgrade: Support express upgrade of a service
[ https://issues.apache.org/jira/browse/YARN-8298?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16586180#comment-16586180 ] Eric Yang commented on YARN-8298: - [~csingh] {quote} Even though the patch 2 referenced resolveCompsDependency, it wasn't being used to orchestrate upgrade of comp instances in any way. {quote} Quote from patch 2: {code} + public int actionUpgradeExpress(Service service) throws YarnException, + IOException { +int retries = 0; +ApplicationReport appReport = upgradePrecheck(service); +List components = ServiceApiUtil.resolveCompsDependency(service); +LOG.info("Upgrading {} with component list order: {}", +service.getName(), components); +ClientAMProtocol proxy = createAMProxy(service.getName(), appReport); +UpgradeServiceRequestProto.Builder requestBuilder = +UpgradeServiceRequestProto.newBuilder(); +requestBuilder.setVersion(service.getVersion()); +if (service.getState().equals(ServiceState.UPGRADING_AUTO_FINALIZE)) { + requestBuilder.setAutoFinalize(true); +} +UpgradeServiceResponseProto responseProto = proxy.upgrade( +requestBuilder.build()); +if (responseProto.hasError()) { + LOG.error("Service {} upgrade to version {} failed because {}", + service.getName(), service.getVersion(), responseProto.getError()); + throw new YarnException("Failed to upgrade service " + service.getName() + + " to version " + service.getVersion() + " because " + + responseProto.getError()); +} + +Service persistedService = getStatus(service.getName()); +List containersToUpgrade = null; +List containerIdsToUpgrade = new ArrayList<>(); +// AM cache changes might take a few seconds +while (retries < 30) { + try { +persistedService = getStatus(service.getName()); +retries++; +containersToUpgrade = ServiceApiUtil +.validateAndResolveCompsUpgrade(persistedService, components); + } catch (YarnException e) { +LOG.info("Waiting for service to become ready for upgrade, retries: {} / 30", retries); +try { + Thread.sleep(3000L); +} catch 
(InterruptedException ie) { +} + } +} +if (containersToUpgrade == null) { + LOG.error("No containers to upgrade."); + return EXIT_FALSE; +} +containersToUpgrade +.forEach(compInst -> containerIdsToUpgrade.add(compInst.getId())); +LOG.info("instances to upgrade {}", containerIdsToUpgrade); +CompInstancesUpgradeRequestProto.Builder upgradeRequestBuilder = +CompInstancesUpgradeRequestProto.newBuilder(); +upgradeRequestBuilder.addAllContainerIds(containerIdsToUpgrade); +proxy.upgrade(upgradeRequestBuilder.build()); +return EXIT_SUCCESS; + } {code} actionUpgradeExpress computes the component order before calling the backend. The logic is to upgrade component by component. Instance order is irrelevant within a component in this design. I think this matches up fine with how we upgrade services in Hadoop software today. > Yarn Service Upgrade: Support express upgrade of a service > -- > > Key: YARN-8298 > URL: https://issues.apache.org/jira/browse/YARN-8298 > Project: Hadoop YARN > Issue Type: Sub-task >Affects Versions: 3.1.1 >Reporter: Chandni Singh >Assignee: Chandni Singh >Priority: Major > Attachments: YARN-8298.001.patch, YARN-8298.002.patch, > YARN-8298.003.patch > > > Currently service upgrade involves 2 steps > * initiate upgrade by providing new spec > * trigger upgrade of each instance/component > > We need to add the ability to upgrade the service in one shot: > # Aborting the upgrade will not be supported > # Upgrade finalization will be done automatically.
[jira] [Comment Edited] (YARN-8298) Yarn Service Upgrade: Support express upgrade of a service
[ https://issues.apache.org/jira/browse/YARN-8298?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16586154#comment-16586154 ] Chandni Singh edited comment on YARN-8298 at 8/20/18 4:30 PM: -- [~eyang] Even though the patch 2 referenced resolveCompsDependency, it wasn't being used to orchestrate the upgrade of comp instances in any way. If we need to orchestrate the upgrade of instances based on component dependency, it requires support for canceling the current upgrade. For example, if the upgrade of compA fails, then the upgrade of compB will not be triggered either. was (Author: csingh): [~eyang] Even though the patch 2 referenced resolveCompsDependency, it wasn't being used to orchestrate upgrade of comp instances in any way. > Yarn Service Upgrade: Support express upgrade of a service > -- > > Key: YARN-8298 > URL: https://issues.apache.org/jira/browse/YARN-8298 > Project: Hadoop YARN > Issue Type: Sub-task >Affects Versions: 3.1.1 >Reporter: Chandni Singh >Assignee: Chandni Singh >Priority: Major > Attachments: YARN-8298.001.patch, YARN-8298.002.patch, > YARN-8298.003.patch > > > Currently service upgrade involves 2 steps > * initiate upgrade by providing new spec > * trigger upgrade of each instance/component > > We need to add the ability to upgrade the service in one shot: > # Aborting the upgrade will not be supported > # Upgrade finalization will be done automatically.
[jira] [Updated] (YARN-8686) Queue Management API - not returning JSON or XML response data when passing Accept header
[ https://issues.apache.org/jira/browse/YARN-8686?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Akhil PB updated YARN-8686: --- Priority: Critical (was: Major) > Queue Management API - not returning JSON or XML response data when passing > Accept header > - > > Key: YARN-8686 > URL: https://issues.apache.org/jira/browse/YARN-8686 > Project: Hadoop YARN > Issue Type: Sub-task > Components: yarn >Reporter: Akhil PB >Assignee: Akhil PB >Priority: Critical > > API should return JSON or XML response data based on Accept header. Instead, > API returns plain text for success as well as error scenarios. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-8298) Yarn Service Upgrade: Support express upgrade of a service
[ https://issues.apache.org/jira/browse/YARN-8298?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16586154#comment-16586154 ] Chandni Singh commented on YARN-8298: - [~eyang] Even though the patch 2 referenced resolveCompsDependency, it wasn't being used to orchestrate upgrade of comp instances in any way. > Yarn Service Upgrade: Support express upgrade of a service > -- > > Key: YARN-8298 > URL: https://issues.apache.org/jira/browse/YARN-8298 > Project: Hadoop YARN > Issue Type: Sub-task >Affects Versions: 3.1.1 >Reporter: Chandni Singh >Assignee: Chandni Singh >Priority: Major > Attachments: YARN-8298.001.patch, YARN-8298.002.patch, > YARN-8298.003.patch > > > Currently service upgrade involves 2 steps > * initiate upgrade by providing new spec > * trigger upgrade of each instance/component > > We need to add the ability to upgrade the service in one shot: > # Aborting the upgrade will not be supported > # Upgrade finalization will be done automatically. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-8242) YARN NM: OOM error while reading back the state store on recovery
[ https://issues.apache.org/jira/browse/YARN-8242?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16586123#comment-16586123 ] Hudson commented on YARN-8242: -- SUCCESS: Integrated in Jenkins build Hadoop-trunk-Commit #14804 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/14804/]) YARN-8242. YARN NM: OOM error while reading back the state store on (jlowe: rev 65e7469712be6cf393e29ef73cc94727eec81227) * (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/security/NMTokenSecretManagerInNM.java * (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/security/NMContainerTokenSecretManager.java * (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/recovery/NMLeveldbStateStoreService.java * (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/ContainerManagerImpl.java * (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/recovery/NMMemoryStateStoreService.java * (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/recovery/NMStateStoreService.java * (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/localizer/ResourceLocalizationService.java * (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/recovery/NMNullStateStoreService.java * (edit) 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/DeletionService.java * (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/recovery/TestNMLeveldbStateStoreService.java * (add) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/recovery/RecoveryIterator.java > YARN NM: OOM error while reading back the state store on recovery > - > > Key: YARN-8242 > URL: https://issues.apache.org/jira/browse/YARN-8242 > Project: Hadoop YARN > Issue Type: Improvement > Components: yarn >Affects Versions: 2.6.0, 2.9.0, 2.6.5, 2.8.3, 3.1.0, 2.7.6, 3.0.2 >Reporter: Kanwaljeet Sachdev >Assignee: Pradeep Ambati >Priority: Critical > Fix For: 3.2.0, 3.1.2 > > Attachments: YARN-8242.001.patch, YARN-8242.002.patch, > YARN-8242.003.patch, YARN-8242.004.patch, YARN-8242.005.patch, > YARN-8242.006.patch, YARN-8242.007.patch, YARN-8242.008.patch > > > On startup the NM reads its state store and builds a list of application in > the state store to process. If the number of applications in the state store > is large and have a lot of "state" connected to it the NM can run OOM and > never get to the point that it can start processing the recovery. > Since it never starts the recovery there is no way for the NM to ever pass > this point. It will require a change in heap size to get the NM started. > > Following is the stack trace > {code:java} > at java.lang.OutOfMemoryError. (OutOfMemoryError.java:48) at > com.google.protobuf.ByteString.copyFrom (ByteString.java:192) at > com.google.protobuf.CodedInputStream.readBytes (CodedInputStream.java:324) at > org.apache.hadoop.yarn.proto.YarnProtos$StringStringMapProto. > (YarnProtos.java:47069) at > org.apache.hadoop.yarn.proto.YarnProtos$StringStringMapProto. 
> (YarnProtos.java:47014) at > org.apache.hadoop.yarn.proto.YarnProtos$StringStringMapProto$1.parsePartialFrom > (YarnProtos.java:47102) at > org.apache.hadoop.yarn.proto.YarnProtos$StringStringMapProto$1.parsePartialFrom > (YarnProtos.java:47097) at com.google.protobuf.CodedInputStream.readMessage > (CodedInputStream.java:309) at > org.apache.hadoop.yarn.proto.YarnProtos$ContainerLaunchContextProto.<init> > (YarnProtos.java:41016) at > org.apache.hadoop.yarn.proto.YarnProtos$ContainerLaunchContextProto.<init> > (YarnProtos.java:40942) at > org.apache.hadoop.yarn.proto.YarnProtos$ContainerLaunchContextProto$1.parsePartialFrom > (YarnProtos.java:41080) at > org.apache.hadoop.yarn.proto.YarnProtos$ContainerLaunchContextProto$1.parsePartialFrom > (YarnProtos.java:41075) at
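The committed change above adds a RecoveryIterator so the NM streams state-store records instead of materializing them all before recovery starts. A minimal sketch of that idea, with illustrative names rather than the actual NMLeveldbStateStoreService API:

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

// Sketch of the RecoveryIterator idea behind YARN-8242: pull recovered
// records one at a time so peak heap holds one record, not the whole store.
// Class and method names here are illustrative, not the actual YARN classes.
public class RecoverySketch {

    // Stand-in for a state store that hands back entries lazily.
    static Iterator<String> openStore(List<String> backing) {
        return backing.iterator();
    }

    // Process each record as it is read; it becomes garbage immediately after.
    static int recoverCount(List<String> backing) {
        int recovered = 0;
        Iterator<String> it = openStore(backing);
        while (it.hasNext()) {
            String record = it.next(); // only one live record at a time
            recovered++;
        }
        return recovered;
    }

    public static void main(String[] args) {
        int n = recoverCount(Arrays.asList("container_1", "container_2", "container_3"));
        System.out.println("recovered=" + n); // prints recovered=3
    }
}
```

The contrast with the pre-patch behavior is the absence of any `List` that accumulates every record before processing begins.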
[jira] [Commented] (YARN-8298) Yarn Service Upgrade: Support express upgrade of a service
[ https://issues.apache.org/jira/browse/YARN-8298?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16586120#comment-16586120 ] Eric Yang commented on YARN-8298: - [~csingh] Thank you for the patch. In patch 3, it doesn't reference resolveCompsDependency in the ServiceMaster upgrade logic. How do the component dependencies get resolved? > Yarn Service Upgrade: Support express upgrade of a service > -- > > Key: YARN-8298 > URL: https://issues.apache.org/jira/browse/YARN-8298 > Project: Hadoop YARN > Issue Type: Sub-task >Affects Versions: 3.1.1 >Reporter: Chandni Singh >Assignee: Chandni Singh >Priority: Major > Attachments: YARN-8298.001.patch, YARN-8298.002.patch, > YARN-8298.003.patch > > > Currently service upgrade involves 2 steps: > * initiate upgrade by providing a new spec > * trigger upgrade of each instance/component > > We need to add the ability to upgrade the service in one shot: > # Aborting the upgrade will not be supported > # Upgrade finalization will be done automatically. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-8242) YARN NM: OOM error while reading back the state store on recovery
[ https://issues.apache.org/jira/browse/YARN-8242?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16586119#comment-16586119 ] Pradeep Ambati commented on YARN-8242: -- Thanks @jlowe for reviewing and committing the patch. > YARN NM: OOM error while reading back the state store on recovery > - > > Key: YARN-8242 > URL: https://issues.apache.org/jira/browse/YARN-8242 > Project: Hadoop YARN > Issue Type: Improvement > Components: yarn >Affects Versions: 2.6.0, 2.9.0, 2.6.5, 2.8.3, 3.1.0, 2.7.6, 3.0.2 >Reporter: Kanwaljeet Sachdev >Assignee: Pradeep Ambati >Priority: Critical > Fix For: 3.2.0, 3.1.2 > > Attachments: YARN-8242.001.patch, YARN-8242.002.patch, > YARN-8242.003.patch, YARN-8242.004.patch, YARN-8242.005.patch, > YARN-8242.006.patch, YARN-8242.007.patch, YARN-8242.008.patch > > > On startup the NM reads its state store and builds a list of applications in > the state store to process. If the number of applications in the state store > is large and each has a lot of "state" attached to it, the NM can run OOM and > never get to the point that it can start processing the recovery. > Since it never starts the recovery, there is no way for the NM to ever pass > this point. It will require a change in heap size to get the NM started. > > Following is the stack trace: > {code:java} > at java.lang.OutOfMemoryError.<init> (OutOfMemoryError.java:48) at > com.google.protobuf.ByteString.copyFrom (ByteString.java:192) at > com.google.protobuf.CodedInputStream.readBytes (CodedInputStream.java:324) at > org.apache.hadoop.yarn.proto.YarnProtos$StringStringMapProto.<init> > (YarnProtos.java:47069) at > org.apache.hadoop.yarn.proto.YarnProtos$StringStringMapProto.<init> 
> (YarnProtos.java:47014) at > org.apache.hadoop.yarn.proto.YarnProtos$StringStringMapProto$1.parsePartialFrom > (YarnProtos.java:47102) at > org.apache.hadoop.yarn.proto.YarnProtos$StringStringMapProto$1.parsePartialFrom > (YarnProtos.java:47097) at com.google.protobuf.CodedInputStream.readMessage > (CodedInputStream.java:309) at > org.apache.hadoop.yarn.proto.YarnProtos$ContainerLaunchContextProto.<init> > (YarnProtos.java:41016) at > org.apache.hadoop.yarn.proto.YarnProtos$ContainerLaunchContextProto.<init> > (YarnProtos.java:40942) at > org.apache.hadoop.yarn.proto.YarnProtos$ContainerLaunchContextProto$1.parsePartialFrom > (YarnProtos.java:41080) at > org.apache.hadoop.yarn.proto.YarnProtos$ContainerLaunchContextProto$1.parsePartialFrom > (YarnProtos.java:41075) at com.google.protobuf.CodedInputStream.readMessage > (CodedInputStream.java:309) at > org.apache.hadoop.yarn.proto.YarnServiceProtos$StartContainerRequestProto.<init> > (YarnServiceProtos.java:24517) at > org.apache.hadoop.yarn.proto.YarnServiceProtos$StartContainerRequestProto.<init> 
> (YarnServiceProtos.java:24464) at > org.apache.hadoop.yarn.proto.YarnServiceProtos$StartContainerRequestProto$1.parsePartialFrom > (YarnServiceProtos.java:24568) at > org.apache.hadoop.yarn.proto.YarnServiceProtos$StartContainerRequestProto$1.parsePartialFrom > (YarnServiceProtos.java:24563) at > com.google.protobuf.AbstractParser.parsePartialFrom (AbstractParser.java:141) > at com.google.protobuf.AbstractParser.parseFrom (AbstractParser.java:176) at > com.google.protobuf.AbstractParser.parseFrom (AbstractParser.java:188) at > com.google.protobuf.AbstractParser.parseFrom (AbstractParser.java:193) at > com.google.protobuf.AbstractParser.parseFrom (AbstractParser.java:49) at > org.apache.hadoop.yarn.proto.YarnServiceProtos$StartContainerRequestProto.parseFrom > (YarnServiceProtos.java:24739) at > org.apache.hadoop.yarn.server.nodemanager.recovery.NMLeveldbStateStoreService.loadContainerState > (NMLeveldbStateStoreService.java:217) at > org.apache.hadoop.yarn.server.nodemanager.recovery.NMLeveldbStateStoreService.loadContainersState > (NMLeveldbStateStoreService.java:170) at > org.apache.hadoop.yarn.server.nodemanager.containermanager.ContainerManagerImpl.recover > (ContainerManagerImpl.java:253) at > org.apache.hadoop.yarn.server.nodemanager.containermanager.ContainerManagerImpl.serviceInit > (ContainerManagerImpl.java:237) at > org.apache.hadoop.service.AbstractService.init (AbstractService.java:163) at > org.apache.hadoop.service.CompositeService.serviceInit > (CompositeService.java:107) at > org.apache.hadoop.yarn.server.nodemanager.NodeManager.serviceInit > (NodeManager.java:255) at org.apache.hadoop.service.AbstractService.init > (AbstractService.java:163) at > org.apache.hadoop.yarn.server.nodemanager.NodeManager.initAndStartNodeManager > (NodeManager.java:474) at > org.apache.hadoop.yarn.server.nodemanager.NodeManager.main > (NodeManager.java:521){code}
[jira] [Commented] (YARN-8648) Container cgroups are leaked when using docker
[ https://issues.apache.org/jira/browse/YARN-8648?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16586076#comment-16586076 ] Eric Yang commented on YARN-8648: - [~Jim_Brennan] {quote} I think this is mitigated if we use the "cgroup" section of the container-executor.cfg to constrain it. This is currently used to enable updating params, but I think it could be used for this as well. It already defines the CGROUPS_ROOT (e.g., /sys/fs/cgroup), and the YARN_HIERARCHY (e.g., hadoop-yarn). We could either add another config parameter to define the list of hierarchies to clean up (e.g., cpuset, freezer, hugetlb, etc...), or we can parse /proc/mounts to determine the full list. I think it's safer to add the config parameter. {quote} The proposal looks good. As long as the default list matches docker's cgroup usage when docker is enabled, instead of being an empty list, I think this solution can work. > Container cgroups are leaked when using docker > -- > > Key: YARN-8648 > URL: https://issues.apache.org/jira/browse/YARN-8648 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Jim Brennan >Assignee: Jim Brennan >Priority: Major > Labels: Docker > > When you run with docker and enable cgroups for cpu, docker creates cgroups > for all resources on the system, not just for cpu. For instance, if the > {{yarn.nodemanager.linux-container-executor.cgroups.hierarchy=/hadoop-yarn}}, > the nodemanager will create a cgroup for each container under > {{/sys/fs/cgroup/cpu/hadoop-yarn}}. In the docker case, we pass this path > via the {{--cgroup-parent}} command line argument. Docker then creates a > cgroup for the docker container under that, for instance: > {{/sys/fs/cgroup/cpu/hadoop-yarn/container_id/docker_container_id}}. > When the container exits, docker cleans up the {{docker_container_id}} > cgroup, and the nodemanager cleans up the {{container_id}} cgroup. All is > good under {{/sys/fs/cgroup/hadoop-yarn}}. 
> The problem is that docker also creates that same hierarchy under every > resource under {{/sys/fs/cgroup}}. On the rhel7 system I am using, these > are: blkio, cpuset, devices, freezer, hugetlb, memory, net_cls, net_prio, > perf_event, and systemd. So for instance, docker creates > {{/sys/fs/cgroup/cpuset/hadoop-yarn/container_id/docker_container_id}}, but > it only cleans up the leaf cgroup {{docker_container_id}}. Nobody cleans up > the {{container_id}} cgroups for these other resources. On one of our busy > clusters, we found > 100,000 of these leaked cgroups. > I found this in our 2.8-based version of hadoop, but I have been able to > repro with current hadoop.
[jira] [Commented] (YARN-8648) Container cgroups are leaked when using docker
[ https://issues.apache.org/jira/browse/YARN-8648?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16586060#comment-16586060 ] Jim Brennan commented on YARN-8648: --- Thanks [~eyang]! My main concern about the minimal fix is the security aspect, since we will need to add an option to container-executor to tell it to delete all cgroups with a particular name as root (since docker will create them as root). I think this is mitigated if we use the "cgroup" section of the container-executor.cfg to constrain it. This is currently used to enable updating params, but I think it could be used for this as well. It already defines the CGROUPS_ROOT (e.g., /sys/fs/cgroup), and the YARN_HIERARCHY (e.g., hadoop-yarn). We could either add another config parameter to define the list of hierarchies to clean up (e.g., cpuset, freezer, hugetlb, etc...), or we can parse /proc/mounts to determine the full list. I think it's safer to add the config parameter. I will start working on this version unless there are objections. > Container cgroups are leaked when using docker > -- > > Key: YARN-8648 > URL: https://issues.apache.org/jira/browse/YARN-8648 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Jim Brennan >Assignee: Jim Brennan >Priority: Major > Labels: Docker > > When you run with docker and enable cgroups for cpu, docker creates cgroups > for all resources on the system, not just for cpu. For instance, if the > {{yarn.nodemanager.linux-container-executor.cgroups.hierarchy=/hadoop-yarn}}, > the nodemanager will create a cgroup for each container under > {{/sys/fs/cgroup/cpu/hadoop-yarn}}. In the docker case, we pass this path > via the {{--cgroup-parent}} command line argument. Docker then creates a > cgroup for the docker container under that, for instance: > {{/sys/fs/cgroup/cpu/hadoop-yarn/container_id/docker_container_id}}. 
> When the container exits, docker cleans up the {{docker_container_id}} > cgroup, and the nodemanager cleans up the {{container_id}} cgroup. All is > good under {{/sys/fs/cgroup/hadoop-yarn}}. > The problem is that docker also creates that same hierarchy under every > resource under {{/sys/fs/cgroup}}. On the rhel7 system I am using, these > are: blkio, cpuset, devices, freezer, hugetlb, memory, net_cls, net_prio, > perf_event, and systemd. So for instance, docker creates > {{/sys/fs/cgroup/cpuset/hadoop-yarn/container_id/docker_container_id}}, but > it only cleans up the leaf cgroup {{docker_container_id}}. Nobody cleans up > the {{container_id}} cgroups for these other resources. On one of our busy > clusters, we found > 100,000 of these leaked cgroups. > I found this in our 2.8-based version of hadoop, but I have been able to > repro with current hadoop.
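The cleanup being discussed (remove the leaked {{container_id}} directory under each configured controller hierarchy) can be sketched as follows. This is an illustration only: the real change would live in the native container-executor and run as root, the controller list stands in for the proposed config parameter, and a temp directory stands in for CGROUPS_ROOT:

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;

// Hedged sketch of the proposed leak cleanup: for each controller hierarchy,
// remove <root>/<controller>/<yarn-hierarchy>/<container_id>. A temp dir
// stands in for /sys/fs/cgroup; names are illustrative, not the YARN patch.
public class CgroupCleanupSketch {
    static int cleanup(Path cgroupsRoot, String yarnHierarchy,
                       String containerId, String[] controllers) throws IOException {
        int removed = 0;
        for (String controller : controllers) {
            Path leaked = cgroupsRoot.resolve(controller)
                                     .resolve(yarnHierarchy)
                                     .resolve(containerId);
            if (Files.isDirectory(leaked)) {
                // Leaked cgroup dirs are empty leaves: an rmdir-style delete,
                // never a recursive remove.
                Files.delete(leaked);
                removed++;
            }
        }
        return removed;
    }

    // Build a mock leaked hierarchy and clean it; returns how many were removed.
    static int demo() {
        try {
            Path root = Files.createTempDirectory("mock-cgroup-root");
            String[] controllers = {"cpu", "cpuset", "freezer", "hugetlb"};
            for (String c : controllers) {
                Files.createDirectories(
                    root.resolve(c).resolve("hadoop-yarn").resolve("container_42"));
            }
            return cleanup(root, "hadoop-yarn", "container_42", controllers);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("removed=" + demo()); // prints removed=4
    }
}
```

Constraining both the root and the hierarchy name, as the container-executor.cfg "cgroup" section already does, is what keeps a root-privileged delete like this from being abusable.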
[jira] [Commented] (YARN-8664) ApplicationMasterProtocolPBServiceImpl#allocate throw NPE when NM losting
[ https://issues.apache.org/jira/browse/YARN-8664?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16585921#comment-16585921 ] genericqa commented on YARN-8664: - | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 0s{color} | {color:blue} Docker mode activated. {color} | | {color:red}-1{color} | {color:red} docker {color} | {color:red} 0m 6s{color} | {color:red} Docker failed to build yetus/hadoop:749e106. {color} | \\ \\ || Subsystem || Report/Notes || | JIRA Issue | YARN-8664 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12936272/YARN-8664-branch-2.8.01.patch | | Console output | https://builds.apache.org/job/PreCommit-YARN-Build/21634/console | | Powered by | Apache Yetus 0.8.0-SNAPSHOT http://yetus.apache.org | This message was automatically generated. > ApplicationMasterProtocolPBServiceImpl#allocate throw NPE when NM losting > - > > Key: YARN-8664 > URL: https://issues.apache.org/jira/browse/YARN-8664 > Project: Hadoop YARN > Issue Type: Bug > Components: resourcemanager >Affects Versions: 2.8.2 > Environment: >Reporter: Jiandan Yang >Assignee: Jiandan Yang >Priority: Major > Attachments: YARN-8664-branch-2.8.001.pathch, > YARN-8664-branch-2.8.01.patch, YARN-8664-branch-2.8.2.001.patch, > YARN-8664-branch-2.8.2.002.patch > > > ResourceManager logs about exception is: > {code:java} > 2018-08-09 00:52:30,746 WARN [IPC Server handler 5 on 8030] > org.apache.hadoop.ipc.Server: IPC Server handler 5 on 8030, call Call#305638 > Retry#0 org.apache.hadoop.yarn.api.ApplicationMasterProtocolPB.allocate from > 11.13.73.101:51083 > java.lang.NullPointerException > at > org.apache.hadoop.yarn.proto.YarnProtos$ResourceProto.isInitialized(YarnProtos.java:6402) > at > org.apache.hadoop.yarn.proto.YarnProtos$ResourceProto$Builder.build(YarnProtos.java:6642) > at > 
org.apache.hadoop.yarn.api.records.impl.pb.ResourcePBImpl.mergeLocalToProto(ResourcePBImpl.java:254) > at > org.apache.hadoop.yarn.api.records.impl.pb.ResourcePBImpl.getProto(ResourcePBImpl.java:61) > at > org.apache.hadoop.yarn.api.records.impl.pb.NodeReportPBImpl.convertToProtoFormat(NodeReportPBImpl.java:313) > at > org.apache.hadoop.yarn.api.records.impl.pb.NodeReportPBImpl.mergeLocalToBuilder(NodeReportPBImpl.java:264) > at > org.apache.hadoop.yarn.api.records.impl.pb.NodeReportPBImpl.mergeLocalToProto(NodeReportPBImpl.java:287) > at > org.apache.hadoop.yarn.api.records.impl.pb.NodeReportPBImpl.getProto(NodeReportPBImpl.java:224) > at > org.apache.hadoop.yarn.api.protocolrecords.impl.pb.AllocateResponsePBImpl.convertToProtoFormat(AllocateResponsePBImpl.java:714) > at > org.apache.hadoop.yarn.api.protocolrecords.impl.pb.AllocateResponsePBImpl.access$400(AllocateResponsePBImpl.java:69) > at > org.apache.hadoop.yarn.api.protocolrecords.impl.pb.AllocateResponsePBImpl$6$1.next(AllocateResponsePBImpl.java:680) > at > org.apache.hadoop.yarn.api.protocolrecords.impl.pb.AllocateResponsePBImpl$6$1.next(AllocateResponsePBImpl.java:669) > at > com.google.protobuf.AbstractMessageLite$Builder.checkForNullValues(AbstractMessageLite.java:336) > at > com.google.protobuf.AbstractMessageLite$Builder.addAll(AbstractMessageLite.java:323) > at > org.apache.hadoop.yarn.proto.YarnServiceProtos$AllocateResponseProto$Builder.addAllUpdatedNodes(YarnServiceProtos.java:12846) > at > org.apache.hadoop.yarn.api.protocolrecords.impl.pb.AllocateResponsePBImpl.mergeLocalToBuilder(AllocateResponsePBImpl.java:145) > at > org.apache.hadoop.yarn.api.protocolrecords.impl.pb.AllocateResponsePBImpl.mergeLocalToProto(AllocateResponsePBImpl.java:176) > at > org.apache.hadoop.yarn.api.protocolrecords.impl.pb.AllocateResponsePBImpl.getProto(AllocateResponsePBImpl.java:97) > at > 
org.apache.hadoop.yarn.api.impl.pb.service.ApplicationMasterProtocolPBServiceImpl.allocate(ApplicationMasterProtocolPBServiceImpl.java:61) > at > org.apache.hadoop.yarn.proto.ApplicationMasterProtocol$ApplicationMasterProtocolService$2.callBlockingMethod(ApplicationMasterProtocol.java:99) > at > org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:447) > at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:989) > at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:846) > at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:789) > at java.security.AccessController.doPrivileged(Native Method) > at javax.security.auth.Subject.doAs(Subject.java:422) >
[jira] [Commented] (YARN-8685) Add containers query support for nodes/node REST API in RMWebServices
[ https://issues.apache.org/jira/browse/YARN-8685?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16585910#comment-16585910 ] Weiwei Yang commented on YARN-8685: --- Hi [~Tao Yang] Does it make more sense to extend the NM containers REST API for this? E.g., add a filter (by state). If we maintain two classes for container info, we'll end up modifying both if anything changes. Better to avoid that. What do you think? > Add containers query support for nodes/node REST API in RMWebServices > - > > Key: YARN-8685 > URL: https://issues.apache.org/jira/browse/YARN-8685 > Project: Hadoop YARN > Issue Type: Improvement > Components: restapi >Affects Versions: 3.2.0 >Reporter: Tao Yang >Assignee: Tao Yang >Priority: Major > Attachments: YARN-8685.001.patch > > > Currently we can only query running containers from the NM containers REST API, > but can't get the valid containers which are in ALLOCATED/ACQUIRED state. We > have the requirement to get all containers allocated on specified nodes for > debugging. I want to add an "includeContainers" query param (default false) > for the nodes/node REST API in RMWebServices, so that we can get valid containers > on nodes if "includeContainers=true" is specified.
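The alternative suggested above, one endpoint filtering its existing container list by requested states rather than a second container-info class, can be sketched like this. ContainerView and the state strings are illustrative stand-ins for the web-service DAO types, not actual YARN classes:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hedged sketch of a state filter on one container list, avoiding a parallel
// info class that would have to be kept in sync.
public class ContainerFilterSketch {
    static class ContainerView {
        final String id;
        final String state;
        ContainerView(String id, String state) { this.id = id; this.state = state; }
    }

    // An empty filter means "no filter": return everything.
    static List<ContainerView> byState(List<ContainerView> all, Set<String> wanted) {
        if (wanted.isEmpty()) return all;
        List<ContainerView> out = new ArrayList<>();
        for (ContainerView c : all) {
            if (wanted.contains(c.state)) out.add(c);
        }
        return out;
    }

    // Ask for the ALLOCATED/ACQUIRED containers the issue says are missing today.
    static int demo() {
        List<ContainerView> all = Arrays.asList(
            new ContainerView("container_1", "RUNNING"),
            new ContainerView("container_2", "ALLOCATED"),
            new ContainerView("container_3", "ACQUIRED"));
        Set<String> wanted = new HashSet<>(Arrays.asList("ALLOCATED", "ACQUIRED"));
        return byState(all, wanted).size();
    }

    public static void main(String[] args) {
        System.out.println("matched=" + demo()); // prints matched=2
    }
}
```

A query param would then just select the state set, whether the endpoint lives on the NM or in RMWebServices.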
[jira] [Commented] (YARN-8685) Add containers query support for nodes/node REST API in RMWebServices
[ https://issues.apache.org/jira/browse/YARN-8685?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16585907#comment-16585907 ] genericqa commented on YARN-8685: - | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 24s{color} | {color:blue} Docker mode activated. {color} | || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 7 new or modified test files. {color} | || || || || {color:brown} trunk Compile Tests {color} || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 53s{color} | {color:blue} Maven dependency ordering for branch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 19m 28s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 3m 10s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 1m 13s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 36s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 15m 23s{color} | {color:green} branch has no errors when building and testing our client artifacts. 
{color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 11s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 0s{color} | {color:green} trunk passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 13s{color} | {color:blue} Maven dependency ordering for patch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 1m 28s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 3m 11s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 3m 11s{color} | {color:green} the patch passed {color} | | {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange} 1m 4s{color} | {color:orange} hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server: The patch generated 4 new + 42 unchanged - 0 fixed = 46 total (was 42) {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 24s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 14m 16s{color} | {color:green} patch has no errors when building and testing our client artifacts. 
{color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 23s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 0s{color} | {color:green} the patch passed {color} | || || || || {color:brown} Other Tests {color} || | {color:red}-1{color} | {color:red} unit {color} | {color:red} 78m 53s{color} | {color:red} hadoop-yarn-server-resourcemanager in the patch failed. {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 1m 42s{color} | {color:green} hadoop-yarn-server-router in the patch passed. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 34s{color} | {color:green} The patch does not generate ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black}150m 28s{color} | {color:black} {color} | \\ \\ || Reason || Tests || | Failed junit tests | hadoop.yarn.server.resourcemanager.applicationsmanager.TestAMRestart | | | hadoop.yarn.server.resourcemanager.scheduler.capacity.TestIncreaseAllocationExpirer | \\ \\ || Subsystem || Report/Notes || | Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:ba1ab08 | | JIRA Issue | YARN-8685 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12936257/YARN-8685.001.patch | | Optional Tests | asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle | | uname | Linux 321bb3677aa6 3.13.0-153-generic #203-Ubuntu SMP Thu Jun 14 08:52:28 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality |
[jira] [Updated] (YARN-8685) Add containers query support for nodes/node REST API in RMWebServices
[ https://issues.apache.org/jira/browse/YARN-8685?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Weiwei Yang updated YARN-8685: -- Issue Type: Improvement (was: Bug) > Add containers query support for nodes/node REST API in RMWebServices > - > > Key: YARN-8685 > URL: https://issues.apache.org/jira/browse/YARN-8685 > Project: Hadoop YARN > Issue Type: Improvement > Components: restapi >Affects Versions: 3.2.0 >Reporter: Tao Yang >Assignee: Tao Yang >Priority: Major > Attachments: YARN-8685.001.patch > > > Currently we can only query running containers from the NM containers REST API, > but can't get the valid containers which are in ALLOCATED/ACQUIRED state. We > have the requirement to get all containers allocated on specified nodes for > debugging. I want to add an "includeContainers" query param (default false) > for the nodes/node REST API in RMWebServices, so that we can get valid containers > on nodes if "includeContainers=true" is specified.
[jira] [Commented] (YARN-8686) Queue Management API - not returning JSON or XML response data when passing Accept header
[ https://issues.apache.org/jira/browse/YARN-8686?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16585898#comment-16585898 ] Weiwei Yang commented on YARN-8686: --- Hmm, it is not? I thought I used this before. Have you tried the "contentType" header? > Queue Management API - not returning JSON or XML response data when passing Accept header > - > > Key: YARN-8686 > URL: https://issues.apache.org/jira/browse/YARN-8686 > Project: Hadoop YARN > Issue Type: Sub-task > Components: yarn >Reporter: Akhil PB >Assignee: Akhil PB >Priority: Major > > The API should return JSON or XML response data based on the Accept header. Instead, > the API returns plain text for success as well as error scenarios.
[jira] [Updated] (YARN-8664) ApplicationMasterProtocolPBServiceImpl#allocate throw NPE when NM losting
[ https://issues.apache.org/jira/browse/YARN-8664?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Weiwei Yang updated YARN-8664: -- Attachment: YARN-8664-branch-2.8.01.patch > ApplicationMasterProtocolPBServiceImpl#allocate throw NPE when NM losting > - > > Key: YARN-8664 > URL: https://issues.apache.org/jira/browse/YARN-8664 > Project: Hadoop YARN > Issue Type: Bug > Components: resourcemanager >Affects Versions: 2.8.2 > Environment: >Reporter: Jiandan Yang >Assignee: Jiandan Yang >Priority: Major > Attachments: YARN-8664-branch-2.8.001.pathch, > YARN-8664-branch-2.8.01.patch, YARN-8664-branch-2.8.2.001.patch, > YARN-8664-branch-2.8.2.002.patch > > > ResourceManager logs about exception is: > {code:java} > 2018-08-09 00:52:30,746 WARN [IPC Server handler 5 on 8030] > org.apache.hadoop.ipc.Server: IPC Server handler 5 on 8030, call Call#305638 > Retry#0 org.apache.hadoop.yarn.api.ApplicationMasterProtocolPB.allocate from > 11.13.73.101:51083 > java.lang.NullPointerException > at > org.apache.hadoop.yarn.proto.YarnProtos$ResourceProto.isInitialized(YarnProtos.java:6402) > at > org.apache.hadoop.yarn.proto.YarnProtos$ResourceProto$Builder.build(YarnProtos.java:6642) > at > org.apache.hadoop.yarn.api.records.impl.pb.ResourcePBImpl.mergeLocalToProto(ResourcePBImpl.java:254) > at > org.apache.hadoop.yarn.api.records.impl.pb.ResourcePBImpl.getProto(ResourcePBImpl.java:61) > at > org.apache.hadoop.yarn.api.records.impl.pb.NodeReportPBImpl.convertToProtoFormat(NodeReportPBImpl.java:313) > at > org.apache.hadoop.yarn.api.records.impl.pb.NodeReportPBImpl.mergeLocalToBuilder(NodeReportPBImpl.java:264) > at > org.apache.hadoop.yarn.api.records.impl.pb.NodeReportPBImpl.mergeLocalToProto(NodeReportPBImpl.java:287) > at > org.apache.hadoop.yarn.api.records.impl.pb.NodeReportPBImpl.getProto(NodeReportPBImpl.java:224) > at > org.apache.hadoop.yarn.api.protocolrecords.impl.pb.AllocateResponsePBImpl.convertToProtoFormat(AllocateResponsePBImpl.java:714) > at > 
org.apache.hadoop.yarn.api.protocolrecords.impl.pb.AllocateResponsePBImpl.access$400(AllocateResponsePBImpl.java:69) > at > org.apache.hadoop.yarn.api.protocolrecords.impl.pb.AllocateResponsePBImpl$6$1.next(AllocateResponsePBImpl.java:680) > at > org.apache.hadoop.yarn.api.protocolrecords.impl.pb.AllocateResponsePBImpl$6$1.next(AllocateResponsePBImpl.java:669) > at > com.google.protobuf.AbstractMessageLite$Builder.checkForNullValues(AbstractMessageLite.java:336) > at > com.google.protobuf.AbstractMessageLite$Builder.addAll(AbstractMessageLite.java:323) > at > org.apache.hadoop.yarn.proto.YarnServiceProtos$AllocateResponseProto$Builder.addAllUpdatedNodes(YarnServiceProtos.java:12846) > at > org.apache.hadoop.yarn.api.protocolrecords.impl.pb.AllocateResponsePBImpl.mergeLocalToBuilder(AllocateResponsePBImpl.java:145) > at > org.apache.hadoop.yarn.api.protocolrecords.impl.pb.AllocateResponsePBImpl.mergeLocalToProto(AllocateResponsePBImpl.java:176) > at > org.apache.hadoop.yarn.api.protocolrecords.impl.pb.AllocateResponsePBImpl.getProto(AllocateResponsePBImpl.java:97) > at > org.apache.hadoop.yarn.api.impl.pb.service.ApplicationMasterProtocolPBServiceImpl.allocate(ApplicationMasterProtocolPBServiceImpl.java:61) > at > org.apache.hadoop.yarn.proto.ApplicationMasterProtocol$ApplicationMasterProtocolService$2.callBlockingMethod(ApplicationMasterProtocol.java:99) > at > org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:447) > at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:989) > at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:846) > at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:789) > at java.security.AccessController.doPrivileged(Native Method) > at javax.security.auth.Subject.doAs(Subject.java:422) > at > org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1804) > at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2457) > {code} > ApplicationMasterService#allocate will call 
AllocateResponse#setUpdatedNodes > when an NM is lost, and AllocateResponse#getProto will call > ResourcePBImpl#getProto to transform NodeReportPBImpl#capacity into PB > format. Because ResourcePBImpl is not thread safe and > multiple AMs may call allocate at the same time, ResourcePBImpl#getProto may > throw NullPointerException or UnsupportedOperationException. > I wrote test code which can reproduce the exception. > {code:java} > @Test > public void testResource1() throws InterruptedException { >
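The hazard described here, getProto() lazily rebuilding shared state while several allocate() calls read it, can be illustrated with a toy record. This mirrors the clear-then-refill pattern and the style of fix (serializing the rebuild); it is not the actual ResourcePBImpl code, and Jiandan's reproducer test is truncated above:

```java
import java.util.ArrayList;
import java.util.List;

// Hedged sketch of the race: a lazily rebuilt record shared across threads.
// Without the lock, a reader can observe the list mid-rebuild and hit an NPE
// or a half-built value, like the allocate() NPE in the stack trace above.
public class LazyProtoSketch {
    static class LazyRecord {
        private List<Integer> proto;      // rebuilt on every read, like mergeLocalToProto
        private final int memory;
        LazyRecord(int memory) { this.memory = memory; }

        // Remove `synchronized` and concurrent callers can see proto half-built.
        synchronized int read() {
            proto = new ArrayList<>();    // clear...
            proto.add(memory);            // ...then refill: the unsafe window
            return proto.get(0);
        }
    }

    // Hammer one shared record from several threads; true if every read was intact.
    static boolean hammer() {
        final LazyRecord shared = new LazyRecord(4096);
        final boolean[] ok = {true};
        Thread[] threads = new Thread[8];
        for (int i = 0; i < threads.length; i++) {
            threads[i] = new Thread(() -> {
                for (int j = 0; j < 10_000; j++) {
                    try {
                        if (shared.read() != 4096) ok[0] = false;
                    } catch (RuntimeException e) { // NPE etc. if unsynchronized
                        ok[0] = false;
                    }
                }
            });
            threads[i].start();
        }
        try {
            for (Thread t : threads) t.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return ok[0];
    }

    public static void main(String[] args) {
        System.out.println(hammer() ? "consistent" : "corrupted");
    }
}
```

With the rebuild synchronized every read is intact; dropping the keyword recreates the window the bug report describes.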
[jira] [Commented] (YARN-8683) Support scheduling request for outstanding requests info in RMAppAttemptBlock
[ https://issues.apache.org/jira/browse/YARN-8683?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16585870#comment-16585870 ] Weiwei Yang commented on YARN-8683: --- Hi [~Tao Yang] Thanks for the patch, it looks good. Do you have a screenshot of how this looks on the UI? I would like to take a look. Just want to make sure the look and feel won't confuse people that are not using {{SchedulingRequest}}. Thanks! > Support scheduling request for outstanding requests info in RMAppAttemptBlock > - > > Key: YARN-8683 > URL: https://issues.apache.org/jira/browse/YARN-8683 > Project: Hadoop YARN > Issue Type: Bug > Components: webapp >Affects Versions: 3.2.0 >Reporter: Tao Yang >Assignee: Tao Yang >Priority: Major > Attachments: YARN-8683.001.patch, YARN-8683.002.patch > > > Currently the outstanding requests info in the app attempt page only shows pending > resource requests; pending scheduling requests should be shown here too.
[jira] [Commented] (YARN-7494) Add muti node lookup support for better placement
[ https://issues.apache.org/jira/browse/YARN-7494?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16585857#comment-16585857 ] Weiwei Yang commented on YARN-7494: --- Hi [~sunilg], looks good, +1 once the checkstyle issues are fixed. > Add muti node lookup support for better placement > - > > Key: YARN-7494 > URL: https://issues.apache.org/jira/browse/YARN-7494 > Project: Hadoop YARN > Issue Type: Sub-task > Components: capacity scheduler >Reporter: Sunil Govindan >Assignee: Sunil Govindan >Priority: Major > Attachments: YARN-7494.001.patch, YARN-7494.002.patch, > YARN-7494.003.patch, YARN-7494.004.patch, YARN-7494.005.patch, > YARN-7494.006.patch, YARN-7494.007.patch, YARN-7494.008.patch, > YARN-7494.009.patch, YARN-7494.010.patch, YARN-7494.11.patch, > YARN-7494.12.patch, YARN-7494.13.patch, YARN-7494.14.patch, > YARN-7494.15.patch, YARN-7494.16.patch, YARN-7494.17.patch, > YARN-7494.18.patch, YARN-7494.v0.patch, YARN-7494.v1.patch, > multi-node-designProposal.png > > > Instead of single node, for effectiveness we can consider a multi node lookup > based on partition to start with. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-8683) Support scheduling request for outstanding requests info in RMAppAttemptBlock
[ https://issues.apache.org/jira/browse/YARN-8683?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16585846#comment-16585846 ] genericqa commented on YARN-8683: - | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 23s{color} | {color:blue} Docker mode activated. {color} | || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:red}-1{color} | {color:red} test4tests {color} | {color:red} 0m 0s{color} | {color:red} The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color} | || || || || {color:brown} trunk Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 26m 48s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 51s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 44s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 54s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 13m 31s{color} | {color:green} branch has no errors when building and testing our client artifacts. 
{color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 22s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 32s{color} | {color:green} trunk passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 47s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 39s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 39s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 34s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 42s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 12m 31s{color} | {color:green} patch has no errors when building and testing our client artifacts. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 16s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 26s{color} | {color:green} the patch passed {color} | || || || || {color:brown} Other Tests {color} || | {color:red}-1{color} | {color:red} unit {color} | {color:red} 80m 38s{color} | {color:red} hadoop-yarn-server-resourcemanager in the patch failed. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 39s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} | | {color:black}{color} | {color:black} {color} | {color:black}142m 50s{color} | {color:black} {color} | \\ \\ || Reason || Tests || | Failed junit tests | hadoop.yarn.server.resourcemanager.scheduler.capacity.TestIncreaseAllocationExpirer | | | hadoop.yarn.server.resourcemanager.scheduler.capacity.TestCapacitySchedulerSchedulingRequestUpdate | | | hadoop.yarn.server.resourcemanager.scheduler.fair.TestFairScheduler | | | hadoop.yarn.server.resourcemanager.TestApplicationMasterService | | | hadoop.yarn.server.resourcemanager.applicationsmanager.TestAMRestart | \\ \\ || Subsystem || Report/Notes || | Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:ba1ab08 | | JIRA Issue | YARN-8683 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12936254/YARN-8683.002.patch | | Optional Tests | asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle | | uname | Linux a8b161ef3702 3.13.0-143-generic #192-Ubuntu SMP Tue Feb 27 10:45:36 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | /testptch/patchprocess/precommit/personality/provided.sh | | git revision | trunk / e3d73bb | | maven | version: Apache Maven 3.3.9 | | Default Java | 1.8.0_181 | | findbugs | v3.1.0-RC1 | | unit |
[jira] [Updated] (YARN-8686) Queue Management API - not returning JSON or XML response data when passing Accept header
[ https://issues.apache.org/jira/browse/YARN-8686?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Akhil PB updated YARN-8686: --- Summary: Queue Management API - not returning JSON or XML response data when passing Accept header (was: Queue Management API - not returning JSON or XML response data when passing accept header) > Queue Management API - not returning JSON or XML response data when passing > Accept header > - > > Key: YARN-8686 > URL: https://issues.apache.org/jira/browse/YARN-8686 > Project: Hadoop YARN > Issue Type: Sub-task > Components: yarn >Reporter: Akhil PB >Assignee: Akhil PB >Priority: Major > > API should return JSON or XML response data based on Accept header. Instead, > API returns plain text for success as well as error scenarios. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Created] (YARN-8686) Queue Management API - not returning JSON or XML response data when passing accept header
Akhil PB created YARN-8686: -- Summary: Queue Management API - not returning JSON or XML response data when passing accept header Key: YARN-8686 URL: https://issues.apache.org/jira/browse/YARN-8686 Project: Hadoop YARN Issue Type: Sub-task Components: yarn Reporter: Akhil PB Assignee: Akhil PB API should return JSON or XML response data based on Accept header. Instead, API returns plain text for success as well as error scenarios. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
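The behaviour YARN-8686 asks for is ordinary HTTP content negotiation: pick the response media type from the request's Accept header instead of always returning plain text. A minimal, self-contained sketch of that selection logic follows; the class and method names are hypothetical and this is not the actual RMWebServices code:

```java
public class AcceptHeaderNegotiator {
    // Returns the media type to use for the response body given the
    // client's Accept header. Falls back to JSON when nothing matches,
    // rather than emitting text/plain as the bug describes.
    public static String negotiate(String acceptHeader) {
        if (acceptHeader == null || acceptHeader.isEmpty()) {
            return "application/json";
        }
        for (String part : acceptHeader.split(",")) {
            // drop any quality parameters such as ";q=0.9"
            String type = part.trim().split(";")[0].trim().toLowerCase();
            if (type.equals("application/xml") || type.equals("text/xml")) {
                return "application/xml";
            }
            if (type.equals("application/json") || type.equals("*/*")) {
                return "application/json";
            }
        }
        return "application/json";
    }

    public static void main(String[] args) {
        System.out.println(negotiate("application/xml")); // application/xml
        System.out.println(negotiate("text/plain"));      // application/json
    }
}
```

In a JAX-RS resource the same effect is usually achieved declaratively with @Produces({MediaType.APPLICATION_JSON, MediaType.APPLICATION_XML}) on the endpoint methods, letting the framework negotiate.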
[jira] [Commented] (YARN-8129) Improve error message for invalid value in fields attribute
[ https://issues.apache.org/jira/browse/YARN-8129?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16585825#comment-16585825 ] Abhishek Modi commented on YARN-8129: - Thanks [~suma.shivaprasad] for the review. [~suma.shivaprasad] [~rohithsharma] [~vrushalic] could you please commit it if it looks good. > Improve error message for invalid value in fields attribute > --- > > Key: YARN-8129 > URL: https://issues.apache.org/jira/browse/YARN-8129 > Project: Hadoop YARN > Issue Type: Sub-task > Components: ATSv2 >Reporter: Charan Hebri >Assignee: Abhishek Modi >Priority: Minor > Attachments: YARN-8129.001.patch > > > Query with invalid values for the 'fields' attributes throws a message that > isn't very informative. > Reader log, > {noformat} > 2018-04-09 08:59:46,069 INFO reader.TimelineReaderWebServices > (TimelineReaderWebServices.java:getEntities(595)) - Received URL > /ws/v2/timeline/users/hrt_qa/flows/test_flow/apps?limit=3=INFOS from > user hrt_qa > 2018-04-09 08:59:46,070 INFO reader.TimelineReaderWebServices > (TimelineReaderWebServices.java:handleException(173)) - Processed URL > /ws/v2/timeline/users/hrt_qa/flows/test_flow/apps?limit=3=INFOS but > encountered exception (Took 1 ms.){noformat} > Here INFOS is the invalid value for the fields attribute. > Response, > {noformat} > { > "exception": "BadRequestException", > "message": "java.lang.Exception: No enum constant > org.apache.hadoop.yarn.server.timelineservice.storage.TimelineReader.Field.INFOS", > "javaClassName": "org.apache.hadoop.yarn.webapp.BadRequestException" > }{noformat} > The message shouldn't ideally contain the enum information. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
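The improvement YARN-8129 asks for — not leaking the raw "No enum constant ..." text to the client — amounts to catching the IllegalArgumentException from Enum.valueOf and rethrowing with a human-readable message. A sketch under assumptions: the class name is hypothetical and the enum constants below are illustrative, not the exact TimelineReader.Field values:

```java
import java.util.Arrays;

public class FieldsParamParser {
    // Illustrative stand-in for the reader's Field enum.
    public enum Field { ALL, EVENTS, INFO, CONFIGS, METRICS }

    // Parses a "fields" query-param value, producing a friendly error
    // that lists the valid values instead of the raw enum exception.
    public static Field parseField(String raw) {
        try {
            return Field.valueOf(raw.trim().toUpperCase());
        } catch (IllegalArgumentException e) {
            throw new IllegalArgumentException(
                "Invalid fields value: " + raw
                + ". Valid values are: " + Arrays.toString(Field.values()));
        }
    }
}
```

With this shape, a query like fields=INFOS yields "Invalid fields value: INFOS. Valid values are: [ALL, EVENTS, INFO, CONFIGS, METRICS]" rather than the enum-internal message shown in the report.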
[jira] [Commented] (YARN-7494) Add muti node lookup support for better placement
[ https://issues.apache.org/jira/browse/YARN-7494?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16585789#comment-16585789 ] genericqa commented on YARN-7494: - | (/) *{color:green}+1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 11m 53s{color} | {color:blue} Docker mode activated. {color} | || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 6 new or modified test files. {color} | || || || || {color:brown} trunk Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 22m 42s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 41s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 41s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 46s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 11m 17s{color} | {color:green} branch has no errors when building and testing our client artifacts. 
{color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 16s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 26s{color} | {color:green} trunk passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 44s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 40s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 40s{color} | {color:green} the patch passed {color} | | {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange} 0m 40s{color} | {color:orange} hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager: The patch generated 35 new + 671 unchanged - 4 fixed = 706 total (was 675) {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 45s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 10m 49s{color} | {color:green} patch has no errors when building and testing our client artifacts. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 12s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 22s{color} | {color:green} the patch passed {color} | || || || || {color:brown} Other Tests {color} || | {color:green}+1{color} | {color:green} unit {color} | {color:green} 64m 46s{color} | {color:green} hadoop-yarn-server-resourcemanager in the patch passed. 
{color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 23s{color} | {color:green} The patch does not generate ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black}129m 31s{color} | {color:black} {color} | \\ \\ || Subsystem || Report/Notes || | Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:ba1ab08 | | JIRA Issue | YARN-7494 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12936240/YARN-7494.18.patch | | Optional Tests | asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle | | uname | Linux 33645c430996 4.4.0-43-generic #63-Ubuntu SMP Wed Oct 12 13:48:03 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | /testptch/patchprocess/precommit/personality/provided.sh | | git revision | trunk / e3d73bb | | maven | version: Apache Maven 3.3.9 | | Default Java | 1.8.0_181 | | findbugs | v3.1.0-RC1 | | checkstyle | https://builds.apache.org/job/PreCommit-YARN-Build/21631/artifact/out/diff-checkstyle-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-resourcemanager.txt | | Test Results | https://builds.apache.org/job/PreCommit-YARN-Build/21631/testReport/ | | Max. process+thread count | 903 (vs. ulimit of 1) | | modules | C: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager U:
[jira] [Commented] (YARN-8685) Add containers query support for nodes/node REST API in RMWebServices
[ https://issues.apache.org/jira/browse/YARN-8685?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16585762#comment-16585762 ] Tao Yang commented on YARN-8685: Attached v1 patch for review. > Add containers query support for nodes/node REST API in RMWebServices > - > > Key: YARN-8685 > URL: https://issues.apache.org/jira/browse/YARN-8685 > Project: Hadoop YARN > Issue Type: Bug > Components: restapi >Affects Versions: 3.2.0 >Reporter: Tao Yang >Assignee: Tao Yang >Priority: Major > Attachments: YARN-8685.001.patch > > > Currently we can only query running containers from NM containers REST API, > but can't get the valid containers which are in ALLOCATED/ACQUIRED state. We > have the requirements to get all containers allocated on specified nodes for > debugging. I want to add a "includeContainers" query param (default false) > for nodes/node REST API in RMWebServices, so that we can get valid containers > on nodes if "includeContainers=true" specified. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-8685) Add containers query support for nodes/node REST API in RMWebServices
[ https://issues.apache.org/jira/browse/YARN-8685?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tao Yang updated YARN-8685: --- Attachment: YARN-8685.001.patch > Add containers query support for nodes/node REST API in RMWebServices > - > > Key: YARN-8685 > URL: https://issues.apache.org/jira/browse/YARN-8685 > Project: Hadoop YARN > Issue Type: Bug > Components: restapi >Affects Versions: 3.2.0 >Reporter: Tao Yang >Assignee: Tao Yang >Priority: Major > Attachments: YARN-8685.001.patch > > > Currently we can only query running containers from NM containers REST API, > but can't get the valid containers which are in ALLOCATED/ACQUIRED state. We > have the requirements to get all containers allocated on specified nodes for > debugging. I want to add a "includeContainers" query param (default false) > for nodes/node REST API in RMWebServices, so that we can get valid containers > on nodes if "includeContainers=true" specified. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-8683) Support scheduling request for outstanding requests info in RMAppAttemptBlock
[ https://issues.apache.org/jira/browse/YARN-8683?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tao Yang updated YARN-8683: --- Attachment: YARN-8683.002.patch > Support scheduling request for outstanding requests info in RMAppAttemptBlock > - > > Key: YARN-8683 > URL: https://issues.apache.org/jira/browse/YARN-8683 > Project: Hadoop YARN > Issue Type: Bug > Components: webapp >Affects Versions: 3.2.0 >Reporter: Tao Yang >Assignee: Tao Yang >Priority: Major > Attachments: YARN-8683.001.patch, YARN-8683.002.patch > > > Currently outstanding requests info in app attempt page only show pending > resource requests, pending scheduling requests should be shown here too. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-8683) Support scheduling request for outstanding requests info in RMAppAttemptBlock
[ https://issues.apache.org/jira/browse/YARN-8683?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16585709#comment-16585709 ] Tao Yang commented on YARN-8683: The failed UT seems unrelated to this patch. Attached v2 patch to correct the checkstyle. > Support scheduling request for outstanding requests info in RMAppAttemptBlock > - > > Key: YARN-8683 > URL: https://issues.apache.org/jira/browse/YARN-8683 > Project: Hadoop YARN > Issue Type: Bug > Components: webapp >Affects Versions: 3.2.0 >Reporter: Tao Yang >Assignee: Tao Yang >Priority: Major > Attachments: YARN-8683.001.patch, YARN-8683.002.patch > > > Currently outstanding requests info in app attempt page only show pending > resource requests, pending scheduling requests should be shown here too. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-8683) Support scheduling request for outstanding requests info in RMAppAttemptBlock
[ https://issues.apache.org/jira/browse/YARN-8683?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tao Yang updated YARN-8683: --- Attachment: (was: YARN-8683.002.patch) > Support scheduling request for outstanding requests info in RMAppAttemptBlock > - > > Key: YARN-8683 > URL: https://issues.apache.org/jira/browse/YARN-8683 > Project: Hadoop YARN > Issue Type: Bug > Components: webapp >Affects Versions: 3.2.0 >Reporter: Tao Yang >Assignee: Tao Yang >Priority: Major > Attachments: YARN-8683.001.patch, YARN-8683.002.patch > > > Currently outstanding requests info in app attempt page only show pending > resource requests, pending scheduling requests should be shown here too. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-8683) Support scheduling request for outstanding requests info in RMAppAttemptBlock
[ https://issues.apache.org/jira/browse/YARN-8683?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tao Yang updated YARN-8683: --- Attachment: YARN-8683.002.patch > Support scheduling request for outstanding requests info in RMAppAttemptBlock > - > > Key: YARN-8683 > URL: https://issues.apache.org/jira/browse/YARN-8683 > Project: Hadoop YARN > Issue Type: Bug > Components: webapp >Affects Versions: 3.2.0 >Reporter: Tao Yang >Assignee: Tao Yang >Priority: Major > Attachments: YARN-8683.001.patch, YARN-8683.002.patch > > > Currently outstanding requests info in app attempt page only show pending > resource requests, pending scheduling requests should be shown here too. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-8685) Add containers query support for nodes/node REST API in RMWebServices
[ https://issues.apache.org/jira/browse/YARN-8685?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tao Yang updated YARN-8685: --- Description: Currently we can only query running containers from NM containers REST API, but can't get the valid containers which are in ALLOCATED/ACQUIRED state. We have the requirements to get all containers allocated on specified nodes for debugging. I want to add a "includeContainers" query param (default false) for nodes/node REST API in RMWebServices, so that we can get valid containers on nodes if "includeContainers=true" specified. (was: Currently we can only query running containers from NM containers REST API, but can't get the valid containers which are in ALLOCATED/ACQUIRED state. We have the requirements to get all containers allocated on specified nodes for debugging or managing. I think we can add a "includeContainers" query param (default false) for nodes/node REST API in RMWebServices, so that we can get valid containers on nodes if "includeContainers=true" specified.) > Add containers query support for nodes/node REST API in RMWebServices > - > > Key: YARN-8685 > URL: https://issues.apache.org/jira/browse/YARN-8685 > Project: Hadoop YARN > Issue Type: Bug > Components: restapi >Affects Versions: 3.2.0 >Reporter: Tao Yang >Assignee: Tao Yang >Priority: Major > > Currently we can only query running containers from NM containers REST API, > but can't get the valid containers which are in ALLOCATED/ACQUIRED state. We > have the requirements to get all containers allocated on specified nodes for > debugging. I want to add a "includeContainers" query param (default false) > for nodes/node REST API in RMWebServices, so that we can get valid containers > on nodes if "includeContainers=true" specified. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Created] (YARN-8685) Add containers query support for nodes/node REST API in RMWebServices
Tao Yang created YARN-8685: -- Summary: Add containers query support for nodes/node REST API in RMWebServices Key: YARN-8685 URL: https://issues.apache.org/jira/browse/YARN-8685 Project: Hadoop YARN Issue Type: Bug Components: restapi Affects Versions: 3.2.0 Reporter: Tao Yang Assignee: Tao Yang Currently we can only query running containers from NM containers REST API, but can't get the valid containers which are in ALLOCATED/ACQUIRED state. We have the requirements to get all containers allocated on specified nodes for debugging or managing. I think we can add a "includeContainers" query param (default false) for nodes/node REST API in RMWebServices, so that we can get valid containers on nodes if "includeContainers=true" specified. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
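The "includeContainers" flag proposed in YARN-8685 is essentially a filter widening: false keeps today's running-only view, true also surfaces containers still in ALLOCATED/ACQUIRED state. A minimal sketch of that filter, with hypothetical class names (not the actual RMWebServices types):

```java
import java.util.ArrayList;
import java.util.List;

public class NodeContainersSketch {
    public enum State { ALLOCATED, ACQUIRED, RUNNING, COMPLETED }

    public static class Container {
        final String id;
        final State state;
        public Container(String id, State state) {
            this.id = id;
            this.state = state;
        }
    }

    // includeContainers=false reproduces the current behaviour (running
    // containers only); true also returns still-valid ALLOCATED/ACQUIRED
    // containers, while finished ones stay excluded.
    public static List<Container> visible(List<Container> all,
                                          boolean includeContainers) {
        List<Container> out = new ArrayList<>();
        for (Container c : all) {
            boolean keep = includeContainers
                ? c.state != State.COMPLETED
                : c.state == State.RUNNING;
            if (keep) {
                out.add(c);
            }
        }
        return out;
    }
}
```

For a node with one RUNNING, one ALLOCATED, and one COMPLETED container, the legacy view returns one container and the includeContainers=true view returns two.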
[jira] [Commented] (YARN-8683) Support scheduling request for outstanding requests info in RMAppAttemptBlock
[ https://issues.apache.org/jira/browse/YARN-8683?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16585662#comment-16585662 ] genericqa commented on YARN-8683: - | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 38s{color} | {color:blue} Docker mode activated. {color} | || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:red}-1{color} | {color:red} test4tests {color} | {color:red} 0m 0s{color} | {color:red} The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color} | || || || || {color:brown} trunk Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 19m 48s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 45s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 38s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 49s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 12m 35s{color} | {color:green} branch has no errors when building and testing our client artifacts. 
{color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 31s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 31s{color} | {color:green} trunk passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 52s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 50s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 50s{color} | {color:green} the patch passed {color} | | {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange} 0m 37s{color} | {color:orange} hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager: The patch generated 3 new + 126 unchanged - 0 fixed = 129 total (was 126) {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 59s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 13m 5s{color} | {color:green} patch has no errors when building and testing our client artifacts. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 18s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 27s{color} | {color:green} the patch passed {color} | || || || || {color:brown} Other Tests {color} || | {color:red}-1{color} | {color:red} unit {color} | {color:red} 74m 27s{color} | {color:red} hadoop-yarn-server-resourcemanager in the patch failed. 
{color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 45s{color} | {color:green} The patch does not generate ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black}130m 23s{color} | {color:black} {color} | \\ \\ || Reason || Tests || | Failed junit tests | hadoop.yarn.server.resourcemanager.reservation.TestCapacityOverTimePolicy | \\ \\ || Subsystem || Report/Notes || | Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:ba1ab08 | | JIRA Issue | YARN-8683 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12936228/YARN-8683.001.patch | | Optional Tests | asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle | | uname | Linux 4b6344a3ccc5 3.13.0-153-generic #203-Ubuntu SMP Thu Jun 14 08:52:28 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | /testptch/patchprocess/precommit/personality/provided.sh | | git revision | trunk / 4aacbff | | maven | version: Apache Maven 3.3.9 | | Default Java | 1.8.0_171 | | findbugs | v3.1.0-RC1 | | checkstyle | https://builds.apache.org/job/PreCommit-YARN-Build/21630/artifact/out/diff-checkstyle-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-resourcemanager.txt | | unit |
[jira] [Commented] (YARN-7494) Add muti node lookup support for better placement
[ https://issues.apache.org/jira/browse/YARN-7494?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16585620#comment-16585620 ] Sunil Govindan commented on YARN-7494: -- Fixed test case. Attaching new patch, > Add muti node lookup support for better placement > - > > Key: YARN-7494 > URL: https://issues.apache.org/jira/browse/YARN-7494 > Project: Hadoop YARN > Issue Type: Sub-task > Components: capacity scheduler >Reporter: Sunil Govindan >Assignee: Sunil Govindan >Priority: Major > Attachments: YARN-7494.001.patch, YARN-7494.002.patch, > YARN-7494.003.patch, YARN-7494.004.patch, YARN-7494.005.patch, > YARN-7494.006.patch, YARN-7494.007.patch, YARN-7494.008.patch, > YARN-7494.009.patch, YARN-7494.010.patch, YARN-7494.11.patch, > YARN-7494.12.patch, YARN-7494.13.patch, YARN-7494.14.patch, > YARN-7494.15.patch, YARN-7494.16.patch, YARN-7494.17.patch, > YARN-7494.18.patch, YARN-7494.v0.patch, YARN-7494.v1.patch, > multi-node-designProposal.png > > > Instead of single node, for effectiveness we can consider a multi node lookup > based on partition to start with. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-7494) Add muti node lookup support for better placement
[ https://issues.apache.org/jira/browse/YARN-7494?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sunil Govindan updated YARN-7494: - Attachment: YARN-7494.18.patch > Add muti node lookup support for better placement > - > > Key: YARN-7494 > URL: https://issues.apache.org/jira/browse/YARN-7494 > Project: Hadoop YARN > Issue Type: Sub-task > Components: capacity scheduler >Reporter: Sunil Govindan >Assignee: Sunil Govindan >Priority: Major > Attachments: YARN-7494.001.patch, YARN-7494.002.patch, > YARN-7494.003.patch, YARN-7494.004.patch, YARN-7494.005.patch, > YARN-7494.006.patch, YARN-7494.007.patch, YARN-7494.008.patch, > YARN-7494.009.patch, YARN-7494.010.patch, YARN-7494.11.patch, > YARN-7494.12.patch, YARN-7494.13.patch, YARN-7494.14.patch, > YARN-7494.15.patch, YARN-7494.16.patch, YARN-7494.17.patch, > YARN-7494.18.patch, YARN-7494.v0.patch, YARN-7494.v1.patch, > multi-node-designProposal.png > > > Instead of single node, for effectiveness we can consider a multi node lookup > based on partition to start with. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
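The idea in YARN-7494 — considering a sorted list of a partition's candidate nodes rather than a single node per scheduling attempt — can be sketched as follows. This is an illustrative simplification under assumptions (memory-only resources, a hypothetical Node type), not the capacity scheduler's actual placement code:

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

public class MultiNodeLookupSketch {
    public static class Node {
        final String name;
        final long availableMb;
        public Node(String name, long availableMb) {
            this.name = name;
            this.availableMb = availableMb;
        }
    }

    // Instead of testing a single node, sort the partition's candidates
    // by available resource (descending) and take the first node that
    // can satisfy the request. Returns null if no candidate fits.
    public static Node lookup(List<Node> partitionNodes, long requestMb) {
        List<Node> sorted = new ArrayList<>(partitionNodes);
        sorted.sort(Comparator.comparingLong((Node n) -> n.availableMb)
                              .reversed());
        for (Node n : sorted) {
            if (n.availableMb >= requestMb) {
                return n;
            }
        }
        return null;
    }
}
```

Real multi-node policies can plug in other orderings (e.g. dominant resource, utilization) in place of the descending-memory comparator used here.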
[jira] [Commented] (YARN-8298) Yarn Service Upgrade: Support express upgrade of a service
[ https://issues.apache.org/jira/browse/YARN-8298?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16585607#comment-16585607 ] genericqa commented on YARN-8298: -

-1 overall

|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 0m 35s | Docker mode activated. |
|| || || || Prechecks ||
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
| +1 | test4tests | 0m 0s | The patch appears to include 3 new or modified test files. |
|| || || || trunk Compile Tests ||
| 0 | mvndep | 1m 8s | Maven dependency ordering for branch |
| +1 | mvninstall | 17m 50s | trunk passed |
| +1 | compile | 8m 6s | trunk passed |
| +1 | checkstyle | 1m 9s | trunk passed |
| +1 | mvnsite | 2m 13s | trunk passed |
| +1 | shadedclient | 12m 36s | branch has no errors when building and testing our client artifacts. |
| +1 | findbugs | 3m 14s | trunk passed |
| +1 | javadoc | 1m 37s | trunk passed |
|| || || || Patch Compile Tests ||
| 0 | mvndep | 0m 12s | Maven dependency ordering for patch |
| +1 | mvninstall | 1m 56s | the patch passed |
| +1 | compile | 7m 12s | the patch passed |
| +1 | cc | 7m 12s | the patch passed |
| +1 | javac | 7m 12s | the patch passed |
| -0 | checkstyle | 1m 7s | hadoop-yarn-project/hadoop-yarn: The patch generated 9 new + 136 unchanged - 0 fixed = 145 total (was 136) |
| +1 | mvnsite | 2m 6s | the patch passed |
| -1 | whitespace | 0m 0s | The patch has 3 line(s) that end in whitespace. Use git apply --whitespace=fix <>. Refer https://git-scm.com/docs/git-apply |
| +1 | shadedclient | 9m 52s | patch has no errors when building and testing our client artifacts. |
| +1 | findbugs | 3m 43s | the patch passed |
| -1 | javadoc | 0m 40s | hadoop-yarn-project_hadoop-yarn_hadoop-yarn-common generated 1 new + 4189 unchanged - 0 fixed = 4190 total (was 4189) |
|| || || || Other Tests ||
| +1 | unit | 3m 13s | hadoop-yarn-common in the patch passed. |
| -1 | unit | 24m 17s | hadoop-yarn-client in the patch failed. |
| -1 | unit | 15m 41s | hadoop-yarn-services-core in the patch failed. |
| +1 | unit | 1m 50s | hadoop-yarn-services-api in the patch passed. |
| +1 | asflicense | 0m 33s | The patch does not generate ASF License warnings. |
| | | 120m 17s | |

|| Reason || Tests ||
| Failed junit tests | hadoop.yarn.client.cli.TestYarnCLI |
| | hadoop.yarn.service.TestServiceManager |

|| Subsystem || Report/Notes ||
| Docker |
[jira] [Created] (YARN-8684) Support for setting priority for services in spec file
Rohith Sharma K S created YARN-8684: --- Summary: Support for setting priority for services in spec file Key: YARN-8684 URL: https://issues.apache.org/jira/browse/YARN-8684 Project: Hadoop YARN Issue Type: Improvement Reporter: Rohith Sharma K S The YARN service spec file doesn't allow setting a priority. It would be nice if YARN services allowed setting a priority so that critical services, such as system-services, get higher preference.
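For illustration, a hypothetical service spec fragment showing where such a field might live. The {{priority}} field below is the proposed addition and does not exist in the current API; the surrounding layout follows the YARN service JSON spec, with illustrative values:

```json
{
  "name": "sleeper-service",
  "version": "1.0.0",
  "priority": 10,
  "components": [
    {
      "name": "sleeper",
      "number_of_containers": 2,
      "launch_command": "sleep 900000",
      "resource": {
        "cpus": 1,
        "memory": "256"
      }
    }
  ]
}
```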
[jira] [Updated] (YARN-8683) Support scheduling request for outstanding requests info in RMAppAttemptBlock
[ https://issues.apache.org/jira/browse/YARN-8683?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tao Yang updated YARN-8683: --- Attachment: YARN-8683.001.patch > Support scheduling request for outstanding requests info in RMAppAttemptBlock > - > > Key: YARN-8683 > URL: https://issues.apache.org/jira/browse/YARN-8683 > Project: Hadoop YARN > Issue Type: Bug > Components: webapp >Affects Versions: 3.2.0 >Reporter: Tao Yang >Assignee: Tao Yang >Priority: Major > Attachments: YARN-8683.001.patch > > > Currently, the outstanding requests info in the app attempt page only shows pending > resource requests; pending scheduling requests should be shown here too.
[jira] [Commented] (YARN-8683) Support scheduling request for outstanding requests info in RMAppAttemptBlock
[ https://issues.apache.org/jira/browse/YARN-8683?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16585487#comment-16585487 ] Tao Yang commented on YARN-8683: Attached v1 patch for review. Updates: (1) Total outstanding resource requests info on the app attempt page: added pending scheduling requests and "ExecutionType"/"AllocationTags"/"PlacementConstraint" columns. (2) Removed the redundant fields ("executionType" & "enforceExecutionType"), which are replaced by executionTypeRequest in ResourceRequestInfo. > Support scheduling request for outstanding requests info in RMAppAttemptBlock > - > > Key: YARN-8683 > URL: https://issues.apache.org/jira/browse/YARN-8683 > Project: Hadoop YARN > Issue Type: Bug > Components: webapp >Affects Versions: 3.2.0 >Reporter: Tao Yang >Assignee: Tao Yang >Priority: Major > Attachments: YARN-8683.001.patch > > > Currently, the outstanding requests info in the app attempt page only shows pending > resource requests; pending scheduling requests should be shown here too.
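A simplified, standalone sketch of the consolidation described in update (2): the two separate fields are replaced by a single nested object. The class bodies below are hypothetical illustrations that only mirror the shape of the YARN webapp DAO and {{ExecutionTypeRequest}} API, not the actual hadoop-yarn code:

```java
// Hypothetical sketch: a single executionTypeRequest object replaces the
// redundant top-level "executionType" and "enforceExecutionType" fields.

enum ExecutionType { GUARANTEED, OPPORTUNISTIC }

class ExecutionTypeRequestInfo {
    private final ExecutionType executionType;
    private final boolean enforceExecutionType;

    ExecutionTypeRequestInfo(ExecutionType type, boolean enforce) {
        this.executionType = type;
        this.enforceExecutionType = enforce;
    }

    ExecutionType getExecutionType() { return executionType; }
    boolean getEnforceExecutionType() { return enforceExecutionType; }
}

class ResourceRequestInfo {
    // One nested object instead of two loose fields: both pieces of
    // information travel together and cannot get out of sync.
    private final ExecutionTypeRequestInfo executionTypeRequest;

    ResourceRequestInfo(ExecutionTypeRequestInfo etr) {
        this.executionTypeRequest = etr;
    }

    ExecutionTypeRequestInfo getExecutionTypeRequest() {
        return executionTypeRequest;
    }
}
```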
[jira] [Created] (YARN-8683) Support scheduling request for outstanding requests info in RMAppAttemptBlock
Tao Yang created YARN-8683: -- Summary: Support scheduling request for outstanding requests info in RMAppAttemptBlock Key: YARN-8683 URL: https://issues.apache.org/jira/browse/YARN-8683 Project: Hadoop YARN Issue Type: Bug Components: webapp Affects Versions: 3.2.0 Reporter: Tao Yang Assignee: Tao Yang Currently, the outstanding requests info in the app attempt page only shows pending resource requests; pending scheduling requests should be shown here too.
[jira] [Commented] (YARN-8298) Yarn Service Upgrade: Support express upgrade of a service
[ https://issues.apache.org/jira/browse/YARN-8298?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16585469#comment-16585469 ] Chandni Singh commented on YARN-8298: - Patch 3 adds an {{EXPRESS_UPGRADING}} state that lets the ServiceMaster know it should upgrade all instances of the service. > Yarn Service Upgrade: Support express upgrade of a service > -- > > Key: YARN-8298 > URL: https://issues.apache.org/jira/browse/YARN-8298 > Project: Hadoop YARN > Issue Type: Sub-task >Affects Versions: 3.1.1 >Reporter: Chandni Singh >Assignee: Chandni Singh >Priority: Major > Attachments: YARN-8298.001.patch, YARN-8298.002.patch, > YARN-8298.003.patch > > > Currently, service upgrade involves 2 steps: > * initiate upgrade by providing a new spec > * trigger upgrade of each instance/component > > We need to add the ability to upgrade the service in one shot: > # Aborting the upgrade will not be supported > # Upgrade finalization will be done automatically.
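The express-upgrade flow can be sketched as a minimal state machine: one new state signals "upgrade everything, then finalize automatically", with no per-instance trigger and no abort path. Class and state names below are illustrative stand-ins, not the actual hadoop-yarn-services-core implementation:

```java
import java.util.ArrayList;
import java.util.List;

enum ServiceState { STABLE, UPGRADING, EXPRESS_UPGRADING }

class ComponentInstance {
    String version;
    ComponentInstance(String version) { this.version = version; }
}

class ServiceMaster {
    ServiceState state = ServiceState.STABLE;
    final List<ComponentInstance> instances = new ArrayList<>();

    // Express upgrade: unlike the two-step flow, every instance is
    // upgraded in one shot and finalization is automatic (no abort).
    void expressUpgrade(String newVersion) {
        state = ServiceState.EXPRESS_UPGRADING;
        for (ComponentInstance ci : instances) {
            ci.version = newVersion;
        }
        state = ServiceState.STABLE;  // automatic finalization
    }
}
```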
[jira] [Updated] (YARN-8298) Yarn Service Upgrade: Support express upgrade of a service
[ https://issues.apache.org/jira/browse/YARN-8298?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chandni Singh updated YARN-8298: Attachment: YARN-8298.003.patch > Yarn Service Upgrade: Support express upgrade of a service > -- > > Key: YARN-8298 > URL: https://issues.apache.org/jira/browse/YARN-8298 > Project: Hadoop YARN > Issue Type: Sub-task >Affects Versions: 3.1.1 >Reporter: Chandni Singh >Assignee: Chandni Singh >Priority: Major > Attachments: YARN-8298.001.patch, YARN-8298.002.patch, > YARN-8298.003.patch > > > Currently service upgrade involves 2 steps > * initiate upgrade by providing new spec > * trigger upgrade of each instance/component > > We need to add the ability to upgrade the service in one shot: > # Aborting the upgrade will not be supported > # Upgrade finalization will be done automatically.