[jira] [Commented] (YARN-4373) Jobs can be temporarily forgotten during recovery
[ https://issues.apache.org/jira/browse/YARN-4373?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15067712#comment-15067712 ] Rohith Sharma K S commented on YARN-4373: - I am quite surprised by this issue, where applications cannot be found in {{rmcontext}} during recovery. Currently, active services are not started until recovery finishes, which means none of the ports are open to listen. Once applications are recovered, whether completed or still running, they are added to {{rmcontext}}. Could you provide the full RM logs for this issue? > Jobs can be temporarily forgotten during recovery > - > > Key: YARN-4373 > URL: https://issues.apache.org/jira/browse/YARN-4373 > Project: Hadoop YARN > Issue Type: Bug >Affects Versions: 2.7.1 >Reporter: Daniel Templeton >Assignee: Daniel Templeton >Priority: Critical > > The RM becomes available to service requests before state store recovery is > started. Before recovery and during the recovery period, it's possible for a > client to request an application report for a running application to which > the RM will respond that the application is unknown. > I'm seeing this issue with Oozie during an RM failover. Until the active > finishes recovery, it reports erroneous information to Oozie, which doesn't > have context to know that it should just try again later. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-4477) FairScheduler: Handle condition which can result in an infinite loop in attemptScheduling.
[ https://issues.apache.org/jira/browse/YARN-4477?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15067646#comment-15067646 ] Hudson commented on YARN-4477: -- FAILURE: Integrated in Hadoop-trunk-Commit #9012 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/9012/]) YARN-4477. FairScheduler: Handle condition which can result in an (arun suresh: rev e88422df45550f788ae8dd73aec84bde28012aeb) * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/TestFairScheduler.java * hadoop-yarn-project/CHANGES.txt * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/FSAppAttempt.java > FairScheduler: Handle condition which can result in an infinite loop in > attemptScheduling. > -- > > Key: YARN-4477 > URL: https://issues.apache.org/jira/browse/YARN-4477 > Project: Hadoop YARN > Issue Type: Bug > Components: fairscheduler >Reporter: Tao Jie >Assignee: Tao Jie > Fix For: 2.8.0 > > Attachments: YARN-4477.001.patch, YARN-4477.002.patch, > YARN-4477.003.patch, YARN-4477.004.patch > > > This problem is introduced by YARN-4270, which adds a limitation on reservations. > In FSAppAttempt.reserve(): > {code} > if (!reservationExceedsThreshold(node, type)) { > LOG.info("Making reservation: node=" + node.getNodeName() + > " app_id=" + getApplicationId()); > if (!alreadyReserved) { > getMetrics().reserveResource(getUser(), container.getResource()); > RMContainer rmContainer = > super.reserve(node, priority, null, container); > node.reserveResource(this, priority, rmContainer); > setReservation(node); > } else { > RMContainer rmContainer = node.getReservedContainer(); > super.reserve(node, priority, rmContainer, container); > node.reserveResource(this, priority, rmContainer); > setReservation(node); > } > } > {code} > If the reservation exceeds the threshold, the current node will not set the reservation. > But in attemptScheduling in FairScheduler: > {code} > while (node.getReservedContainer() == null) { > boolean assignedContainer = false; > if (!queueMgr.getRootQueue().assignContainer(node).equals( > Resources.none())) { > assignedContainers++; > assignedContainer = true; > > } > > if (!assignedContainer) { break; } > if (!assignMultiple) { break; } > if ((assignedContainers >= maxAssign) && (maxAssign > 0)) { break; } > } > {code} > assignContainer(node) still returns FairScheduler.CONTAINER_RESERVED, which is not > equal to Resources.none(). > As a result, if multiple assign is enabled and maxAssign is unlimited, this > while loop will never break. > I suppose that assignContainer(node) should return Resources.none() rather than > CONTAINER_RESERVED when the attempt doesn't take the reservation because of > the limitation. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
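To make the proposed fix concrete, a rough sketch of the idea (illustrative only, not the committed YARN-4477 patch; the surrounding method structure is simplified) is to have the reservation path report whether a reservation was actually made, and to return Resources.none() from assignContainer when it was not:
{code}
// Hypothetical sketch of the fix idea discussed above (not the actual patch).
// Track whether reserve() really made a reservation under the YARN-4270 threshold.
boolean reserved = false;
if (!reservationExceedsThreshold(node, type)) {
  // ... existing reservation logic from FSAppAttempt.reserve() shown above ...
  reserved = true;
}

// In FSAppAttempt.assignContainer(...): only advertise the reservation when it happened.
return reserved ? FairScheduler.CONTAINER_RESERVED : Resources.none();

// Effect: when the reservation is skipped, attemptScheduling() sees Resources.none(),
// sets assignedContainer = false, and breaks out of the while loop instead of spinning.
{code}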
[jira] [Commented] (YARN-4477) FairScheduler: Handle condition which can result in an infinite loop in attemptScheduling.
[ https://issues.apache.org/jira/browse/YARN-4477?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15067629#comment-15067629 ] Arun Suresh commented on YARN-4477: --- Thanks for updating the patch, [~Tao Jie]. Verified that the tests run fine locally. Committed to trunk, branch-2, and branch-2.8. > FairScheduler: Handle condition which can result in an infinite loop in > attemptScheduling. > -- > > Key: YARN-4477 > URL: https://issues.apache.org/jira/browse/YARN-4477 > Project: Hadoop YARN > Issue Type: Bug > Components: fairscheduler >Reporter: Tao Jie >Assignee: Tao Jie > Fix For: 2.8.0 > > Attachments: YARN-4477.001.patch, YARN-4477.002.patch, > YARN-4477.003.patch, YARN-4477.004.patch > > > This problem is introduced by YARN-4270, which adds a limitation on reservations. > In FSAppAttempt.reserve(): > {code} > if (!reservationExceedsThreshold(node, type)) { > LOG.info("Making reservation: node=" + node.getNodeName() + > " app_id=" + getApplicationId()); > if (!alreadyReserved) { > getMetrics().reserveResource(getUser(), container.getResource()); > RMContainer rmContainer = > super.reserve(node, priority, null, container); > node.reserveResource(this, priority, rmContainer); > setReservation(node); > } else { > RMContainer rmContainer = node.getReservedContainer(); > super.reserve(node, priority, rmContainer, container); > node.reserveResource(this, priority, rmContainer); > setReservation(node); > } > } > {code} > If the reservation exceeds the threshold, the current node will not set the reservation. > But in attemptScheduling in FairScheduler: > {code} > while (node.getReservedContainer() == null) { > boolean assignedContainer = false; > if (!queueMgr.getRootQueue().assignContainer(node).equals( > Resources.none())) { > assignedContainers++; > assignedContainer = true; > > } > > if (!assignedContainer) { break; } > if (!assignMultiple) { break; } > if ((assignedContainers >= maxAssign) && (maxAssign > 0)) { break; } > } > {code} > assignContainer(node) still returns FairScheduler.CONTAINER_RESERVED, which is not > equal to Resources.none(). > As a result, if multiple assign is enabled and maxAssign is unlimited, this > while loop will never break. > I suppose that assignContainer(node) should return Resources.none() rather than > CONTAINER_RESERVED when the attempt doesn't take the reservation because of > the limitation. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-4477) FairScheduler: Handle condition which can result in an infinite loop in attemptScheduling.
[ https://issues.apache.org/jira/browse/YARN-4477?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arun Suresh updated YARN-4477: -- Summary: FairScheduler: Handle condition which can result in an infinite loop in attemptScheduling. (was: FairScheduler: encounter infinite loop in attemptScheduling) > FairScheduler: Handle condition which can result in an infinite loop in > attemptScheduling. > -- > > Key: YARN-4477 > URL: https://issues.apache.org/jira/browse/YARN-4477 > Project: Hadoop YARN > Issue Type: Bug > Components: fairscheduler >Reporter: Tao Jie >Assignee: Tao Jie > Attachments: YARN-4477.001.patch, YARN-4477.002.patch, > YARN-4477.003.patch, YARN-4477.004.patch > > > This problem is introduced by YARN-4270, which adds a limitation on reservations. > In FSAppAttempt.reserve(): > {code} > if (!reservationExceedsThreshold(node, type)) { > LOG.info("Making reservation: node=" + node.getNodeName() + > " app_id=" + getApplicationId()); > if (!alreadyReserved) { > getMetrics().reserveResource(getUser(), container.getResource()); > RMContainer rmContainer = > super.reserve(node, priority, null, container); > node.reserveResource(this, priority, rmContainer); > setReservation(node); > } else { > RMContainer rmContainer = node.getReservedContainer(); > super.reserve(node, priority, rmContainer, container); > node.reserveResource(this, priority, rmContainer); > setReservation(node); > } > } > {code} > If the reservation exceeds the threshold, the current node will not set the reservation. > But in attemptScheduling in FairScheduler: > {code} > while (node.getReservedContainer() == null) { > boolean assignedContainer = false; > if (!queueMgr.getRootQueue().assignContainer(node).equals( > Resources.none())) { > assignedContainers++; > assignedContainer = true; > > } > > if (!assignedContainer) { break; } > if (!assignMultiple) { break; } > if ((assignedContainers >= maxAssign) && (maxAssign > 0)) { break; } > } > {code} > assignContainer(node) still returns FairScheduler.CONTAINER_RESERVED, which is not > equal to Resources.none(). > As a result, if multiple assign is enabled and maxAssign is unlimited, this > while loop will never break. > I suppose that assignContainer(node) should return Resources.none() rather than > CONTAINER_RESERVED when the attempt doesn't take the reservation because of > the limitation. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-4062) Add the flush and compaction functionality via coprocessors and scanners for flow run table
[ https://issues.apache.org/jira/browse/YARN-4062?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15067616#comment-15067616 ] Hadoop QA commented on YARN-4062: - | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:red}-1{color} | {color:red} patch {color} | {color:red} 0m 4s {color} | {color:red} YARN-4062 does not apply to YARN-2928. Rebase required? Wrong Branch? See https://wiki.apache.org/hadoop/HowToContribute for help. {color} | \\ \\ || Subsystem || Report/Notes || | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12778990/YARN-4062-YARN-2928.1.patch | | JIRA Issue | YARN-4062 | | Powered by | Apache Yetus 0.2.0-SNAPSHOT http://yetus.apache.org | | Console output | https://builds.apache.org/job/PreCommit-YARN-Build/10070/console | This message was automatically generated. > Add the flush and compaction functionality via coprocessors and scanners for > flow run table > --- > > Key: YARN-4062 > URL: https://issues.apache.org/jira/browse/YARN-4062 > Project: Hadoop YARN > Issue Type: Sub-task > Components: timelineserver >Reporter: Vrushali C >Assignee: Vrushali C > Labels: yarn-2928-1st-milestone > Attachments: YARN-4062-YARN-2928.1.patch > > > As part of YARN-3901, a coprocessor and scanner are being added for storing into > the flow_run table. It also needs flush & compaction processing in the > coprocessor and perhaps a new scanner to deal with the data during the flushing > and compaction stages. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
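For readers less familiar with the HBase hooks involved, the sketch below shows roughly where flush and compaction processing plugs in via a region coprocessor. It assumes the HBase 1.x BaseRegionObserver API; the class name and the idea of wrapping the scanner are illustrative, not taken from the attached patch:
{code}
import java.io.IOException;
import org.apache.hadoop.hbase.coprocessor.BaseRegionObserver;
import org.apache.hadoop.hbase.coprocessor.ObserverContext;
import org.apache.hadoop.hbase.coprocessor.RegionCoprocessorEnvironment;
import org.apache.hadoop.hbase.regionserver.InternalScanner;
import org.apache.hadoop.hbase.regionserver.ScanType;
import org.apache.hadoop.hbase.regionserver.Store;

/** Illustrative only: where flush/compaction handling hooks into the flow run region. */
public class FlowRunObserverSketch extends BaseRegionObserver {
  @Override
  public InternalScanner preFlush(ObserverContext<RegionCoprocessorEnvironment> c,
      Store store, InternalScanner scanner) throws IOException {
    // A real implementation would wrap 'scanner' with one that coalesces flow-run
    // cells (e.g. summing metrics) while memstore data is flushed to an HFile.
    return scanner;
  }

  @Override
  public InternalScanner preCompact(ObserverContext<RegionCoprocessorEnvironment> c,
      Store store, InternalScanner scanner, ScanType scanType) throws IOException {
    // Same idea during minor/major compaction, where old cells can be merged or dropped.
    return scanner;
  }
}
{code}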
[jira] [Updated] (YARN-4495) add a way to tell AM container increase/decrease request is invalid
[ https://issues.apache.org/jira/browse/YARN-4495?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] sandflee updated YARN-4495: --- Description: now the RM may pass an InvalidResourceRequestException to the AM or just ignore the change request; the former will cause AMRMClientAsync to go down, and the latter will leave the AM waiting for the reply. (was: now RM may pass InvalidResourceRequestException to AM or just ignore the change request, the former will cause AMRMClientAsync down. and the latter will leave am waiting for the relay. ) > add a way to tell AM container increase/decrease request is invalid > --- > > Key: YARN-4495 > URL: https://issues.apache.org/jira/browse/YARN-4495 > Project: Hadoop YARN > Issue Type: Improvement >Reporter: sandflee > > now the RM may pass an InvalidResourceRequestException to the AM or just ignore the > change request; the former will cause AMRMClientAsync to go down, and the latter > will leave the AM waiting for the reply. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
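For context, the AM-side failure mode described here reaches the application through the AMRMClientAsync callback handler. A minimal sketch (the handling shown is illustrative, not a fix):
{code}
import java.util.List;
import org.apache.hadoop.yarn.api.records.Container;
import org.apache.hadoop.yarn.api.records.ContainerStatus;
import org.apache.hadoop.yarn.api.records.NodeReport;
import org.apache.hadoop.yarn.client.api.AMRMClient.ContainerRequest;
import org.apache.hadoop.yarn.client.api.async.AMRMClientAsync;

public class AmErrorHandlingSketch {
  public static AMRMClientAsync<ContainerRequest> create() {
    AMRMClientAsync.CallbackHandler handler = new AMRMClientAsync.CallbackHandler() {
      @Override public void onContainersCompleted(List<ContainerStatus> statuses) { }
      @Override public void onContainersAllocated(List<Container> containers) { }
      @Override public void onShutdownRequest() { }
      @Override public void onNodesUpdated(List<NodeReport> updatedNodes) { }
      @Override public float getProgress() { return 0.5f; }
      @Override public void onError(Throwable e) {
        // An InvalidResourceRequestException triggered by a container resize request
        // lands here, indistinguishable from a fatal allocate() failure, and the
        // async client stops heartbeating ("AMRMClientAsync goes down").
        System.err.println("AMRMClientAsync stopped after error: " + e);
      }
    };
    return AMRMClientAsync.createAMRMClientAsync(1000, handler);
  }
}
{code}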
[jira] [Updated] (YARN-4062) Add the flush and compaction functionality via coprocessors and scanners for flow run table
[ https://issues.apache.org/jira/browse/YARN-4062?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vrushali C updated YARN-4062: - Attachment: YARN-4062-YARN-2928.1.patch Attaching patch v1. > Add the flush and compaction functionality via coprocessors and scanners for > flow run table > --- > > Key: YARN-4062 > URL: https://issues.apache.org/jira/browse/YARN-4062 > Project: Hadoop YARN > Issue Type: Sub-task > Components: timelineserver >Reporter: Vrushali C >Assignee: Vrushali C > Labels: yarn-2928-1st-milestone > Attachments: YARN-4062-YARN-2928.1.patch > > > As part of YARN-3901, a coprocessor and scanner are being added for storing into > the flow_run table. It also needs flush & compaction processing in the > coprocessor and perhaps a new scanner to deal with the data during the flushing > and compaction stages. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-4183) Enabling generic application history forces every job to get a timeline service delegation token
[ https://issues.apache.org/jira/browse/YARN-4183?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15067602#comment-15067602 ] Naganarasimha G R commented on YARN-4183: - [~sjlee0], sorry, I missed this comment earlier. My thoughts are also in line with yours, but the documentation does not capture this. Existing: {quote} Indicate to clients whether timeline service is enabled or not. If enabled, clients will put entities and events to the timeline server {quote} I feel we need to capture it as: {quote} On the server side it indicates whether the timeline service is enabled or not. On the client side, users can enable it to indicate whether they want to use the timeline service. If enabled on the client side and security is also enabled, then the YARN client tries to fetch delegation tokens for the timeline server. {quote} Modifications are welcome. Also, {{yarn.timeline-service.client.best-effort}} is wrongly documented as {{yarn.timeline-service.best-effort}}. So shall I get these things corrected as part of this JIRA? > Enabling generic application history forces every job to get a timeline > service delegation token > > > Key: YARN-4183 > URL: https://issues.apache.org/jira/browse/YARN-4183 > Project: Hadoop YARN > Issue Type: Bug >Affects Versions: 2.7.1 >Reporter: Mit Desai >Assignee: Mit Desai > Attachments: YARN-4183.1.patch > > > When enabling just the Generic History Server and not the timeline server, > the system metrics publisher will not publish the events to the timeline > store as it checks if the timeline server and system metrics publisher are > enabled before creating a timeline client. > To make it work, if the timeline service flag is turned on, it will force > every YARN application to get a delegation token. > Instead of checking if the timeline service is enabled, we should be checking if > the application history server is enabled. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
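As a concrete illustration of the two properties under discussion (a minimal sketch; the property names are taken from the comment above and are set here on a plain Configuration object):
{code}
import org.apache.hadoop.conf.Configuration;

public class TimelineClientConfigSketch {
  public static void main(String[] args) {
    Configuration conf = new Configuration();
    // Client side: opt in to the timeline service. With security enabled, the YARN
    // client will then also try to fetch a timeline delegation token at submission.
    conf.setBoolean("yarn.timeline-service.enabled", true);
    // Don't fail the application if the timeline delegation token cannot be fetched.
    conf.setBoolean("yarn.timeline-service.client.best-effort", true);
    System.out.println(conf.getBoolean("yarn.timeline-service.enabled", false));
  }
}
{code}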
[jira] [Commented] (YARN-4494) Recover completed apps asynchronously
[ https://issues.apache.org/jira/browse/YARN-4494?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15067599#comment-15067599 ] Jun Gong commented on YARN-4494: Thanks for all the suggestions. [~Naganarasimha] I plan to keep completed apps in a map and recover them asynchronously after recovering the apps that are not yet completed; this is the same as solution 2 which you mentioned. IMO solution 1 is more complex than that. [~templedf] It is a really good question. I did not notice the problem when I created the issue because I thought it was not a big problem. I prefer the method that forces immediate recovery of that app, or blocks until it is recovered, because it makes no change to the behavior the client sees. > Recover completed apps asynchronously > - > > Key: YARN-4494 > URL: https://issues.apache.org/jira/browse/YARN-4494 > Project: Hadoop YARN > Issue Type: Improvement > Components: resourcemanager >Reporter: Jun Gong >Assignee: Jun Gong > > With RM HA enabled, when recovering apps, recover completed apps > asynchronously. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
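A minimal sketch of the approach being discussed (purely illustrative; the class and method names are hypothetical and not from any attached patch): completed apps are parked in a map, drained by a background thread, and a client lookup for one of them forces its recovery instead of reporting it as unknown:
{code}
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical sketch of asynchronous recovery of completed applications. */
public class CompletedAppRecoverySketch {
  // Completed apps read from the state store but not yet replayed into RMContext.
  private final Map<String, Object> pendingCompletedApps = new ConcurrentHashMap<>();

  /** Started after running (not yet completed) apps have been recovered. */
  public void recoverCompletedAppsAsync() {
    new Thread(() -> pendingCompletedApps.keySet().forEach(this::recoverNow),
        "completed-app-recovery").start();
  }

  /** Client-facing lookup: recover on demand rather than answering "unknown app". */
  public Object getApp(String appId) {
    if (pendingCompletedApps.containsKey(appId)) {
      recoverNow(appId);            // blocks the caller until this app is recovered
    }
    return lookupInRMContext(appId);
  }

  private synchronized void recoverNow(String appId) {
    Object storedState = pendingCompletedApps.remove(appId);
    if (storedState != null) {
      addToRMContext(appId, storedState);   // replay the stored app into RMContext
    }
  }

  private void addToRMContext(String appId, Object state) { /* ... */ }
  private Object lookupInRMContext(String appId) { return null; /* ... */ }
}
{code}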
[jira] [Commented] (YARN-4494) Recover completed apps asynchronously
[ https://issues.apache.org/jira/browse/YARN-4494?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15067592#comment-15067592 ] Naganarasimha G R commented on YARN-4494: - [~templedf], Thanks for elaborating your thoughts bq. block until it's recovered. I think this option is better > Recover completed apps asynchronously > - > > Key: YARN-4494 > URL: https://issues.apache.org/jira/browse/YARN-4494 > Project: Hadoop YARN > Issue Type: Improvement > Components: resourcemanager >Reporter: Jun Gong >Assignee: Jun Gong > > With RM HA enabled, when recovering apps, recover completed apps > asynchronously. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (YARN-4495) add a way to tell AM container increase/decrease request is invalid
sandflee created YARN-4495: -- Summary: add a way to tell AM container increase/decrease request is invalid Key: YARN-4495 URL: https://issues.apache.org/jira/browse/YARN-4495 Project: Hadoop YARN Issue Type: Improvement Reporter: sandflee now the RM may pass an InvalidResourceRequestException to the AM or just ignore the change request; the former will cause AMRMClientAsync to go down, and the latter will leave the AM waiting for the reply. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-1856) cgroups based memory monitoring for containers
[ https://issues.apache.org/jira/browse/YARN-1856?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15067584#comment-15067584 ] Varun Vasudev commented on YARN-1856: - [~kasha] - we haven't provided a flag for using oom_control, but we did provide a control to set swappiness (which is currently a config setting). Ideally, oom_control and swappiness would be set by the AM/YARN client and should be container-specific settings. In general, we need an API to set container-executor-specific settings - we've seen a need for this when adding Docker support and now for CGroups settings as well. If you'd like to work on it, is it possible to come up with an abstraction that'll solve the Docker issues as well? [~sidharta-s] and I can help provide context on the Docker use case. > cgroups based memory monitoring for containers > -- > > Key: YARN-1856 > URL: https://issues.apache.org/jira/browse/YARN-1856 > Project: Hadoop YARN > Issue Type: Sub-task > Components: nodemanager >Affects Versions: 2.3.0 >Reporter: Karthik Kambatla >Assignee: Varun Vasudev > Fix For: 2.9.0 > > Attachments: YARN-1856.001.patch, YARN-1856.002.patch, > YARN-1856.003.patch, YARN-1856.004.patch > > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
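To make the kind of per-container CGroups knob under discussion concrete, here is a tiny illustrative sketch of setting memory.swappiness for one container's memory cgroup (the mount point and hierarchy names are assumptions for illustration, not how the NodeManager necessarily lays them out):
{code}
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

/** Illustrative only: write memory.swappiness into a container's memory cgroup. */
public class CgroupSwappinessSketch {
  public static void setSwappiness(String yarnHierarchy, String containerId,
      int swappiness) throws IOException {
    // e.g. /sys/fs/cgroup/memory/hadoop-yarn/container_.../memory.swappiness
    Path file = Paths.get("/sys/fs/cgroup/memory", yarnHierarchy, containerId,
        "memory.swappiness");
    Files.write(file, String.valueOf(swappiness).getBytes(StandardCharsets.UTF_8));
  }

  public static void main(String[] args) throws IOException {
    // 0 = avoid swapping this container's pages as much as possible.
    setSwappiness("hadoop-yarn", "container_1450000000000_0001_01_000001", 0);
  }
}
{code}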
[jira] [Commented] (YARN-4494) Recover completed apps asynchronously
[ https://issues.apache.org/jira/browse/YARN-4494?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15067576#comment-15067576 ] Daniel Templeton commented on YARN-4494: [~Naganarasimha], except in the case of Oozie, which really doesn't like it when the RM says its jobs don't exist. And I'm not saying that it's an either-or. You can mitigate the issue by having a call that requests an app which hasn't been recovered yet either force immediate recovery of that app or block until it's recovered. > Recover completed apps asynchronously > - > > Key: YARN-4494 > URL: https://issues.apache.org/jira/browse/YARN-4494 > Project: Hadoop YARN > Issue Type: Improvement > Components: resourcemanager >Reporter: Jun Gong >Assignee: Jun Gong > > With RM HA enabled, when recovering apps, recover completed apps > asynchronously. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-4494) Recover completed apps asynchronously
[ https://issues.apache.org/jira/browse/YARN-4494?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15067574#comment-15067574 ] Naganarasimha G R commented on YARN-4494: - bq. One reason why the app recovery is synchronous is that asynchronous recovery can cause the RM to tell a client that a job doesn't exist, when it really just hasn't been recovered yet, which is an issue even with completed jobs. How are you planning to handle that? Well, yes, this will be a limitation of the solution, but when the timeline service is enabled, data that is not in the RM can still be served from there, which mitigates the issue. Also, when we compare the two evils, *RM failover being slower* and *the RM telling a client that a job doesn't exist momentarily*, I feel the former is more serious and the latter can be avoided using the ATS. > Recover completed apps asynchronously > - > > Key: YARN-4494 > URL: https://issues.apache.org/jira/browse/YARN-4494 > Project: Hadoop YARN > Issue Type: Improvement > Components: resourcemanager >Reporter: Jun Gong >Assignee: Jun Gong > > With RM HA enabled, when recovering apps, recover completed apps > asynchronously. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-4100) Add Documentation for Distributed and Delegated-Centralized Node Labels feature
[ https://issues.apache.org/jira/browse/YARN-4100?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15067568#comment-15067568 ] Hadoop QA commented on YARN-4100: - | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 0s {color} | {color:blue} Docker mode activated. {color} | | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s {color} | {color:green} The patch does not contain any @author tags. {color} | | {color:red}-1{color} | {color:red} test4tests {color} | {color:red} 0m 0s {color} | {color:red} The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 7m 54s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 47s {color} | {color:green} trunk passed with JDK v1.8.0_66 {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 2m 9s {color} | {color:green} trunk passed with JDK v1.7.0_91 {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 50s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 24s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 39s {color} | {color:green} trunk passed with JDK v1.8.0_66 {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 46s {color} | {color:green} trunk passed with JDK v1.7.0_91 {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 41s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 49s {color} | {color:green} the patch passed with JDK v1.8.0_66 {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 1m 49s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 2m 11s {color} | {color:green} the patch passed with JDK v1.7.0_91 {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 2m 11s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 47s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 25s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s {color} | {color:green} Patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} xml {color} | {color:green} 0m 1s {color} | {color:green} The patch has no ill-formed XML file. 
{color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 37s {color} | {color:green} the patch passed with JDK v1.8.0_66 {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 44s {color} | {color:green} the patch passed with JDK v1.7.0_91 {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 1m 53s {color} | {color:green} hadoop-yarn-common in the patch passed with JDK v1.8.0_66. {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 0m 9s {color} | {color:green} hadoop-yarn-site in the patch passed with JDK v1.8.0_66. {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 2m 11s {color} | {color:green} hadoop-yarn-common in the patch passed with JDK v1.7.0_91. {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 0m 9s {color} | {color:green} hadoop-yarn-site in the patch passed with JDK v1.7.0_91. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 24s {color} | {color:green} Patch does not generate ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black} 27m 26s {color} | {color:black} {color} | \\ \\ || Subsystem || Report/Notes || | Docker | Image:yetus/hadoop:0ca8df7 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12778981/YARN-4100.v1.004.patch | | JIRA Issue | YARN-4100 | | Optional Tests | asflicense compile javac javadoc mvninstall mvnsite unit xml | | uname | Linux 5267cac0d62f 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh | | git revision | trunk / 114b590 |
[jira] [Commented] (YARN-4494) Recover completed apps asynchronously
[ https://issues.apache.org/jira/browse/YARN-4494?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15067567#comment-15067567 ] Rohith Sharma K S commented on YARN-4494: - +1 for the idea. It really helps make the RM service available faster. In our test cluster, we see it take around 15 or more seconds to bring the RM service up when the state store holds around 100K completed apps. The best way to make the service come up faster is to configure fewer max-completed apps and use the Timeline Server to store history data. > Recover completed apps asynchronously > - > > Key: YARN-4494 > URL: https://issues.apache.org/jira/browse/YARN-4494 > Project: Hadoop YARN > Issue Type: Improvement > Components: resourcemanager >Reporter: Jun Gong >Assignee: Jun Gong > > With RM HA enabled, when recovering apps, recover completed apps > asynchronously. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
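For reference, the knobs mentioned above map to configuration along these lines (a hedged sketch; verify the exact property names against the yarn-default.xml of your release):
{code}
import org.apache.hadoop.conf.Configuration;

public class CompletedAppsConfigSketch {
  public static void main(String[] args) {
    Configuration conf = new Configuration();
    // How many completed applications the RM retains in memory.
    conf.setInt("yarn.resourcemanager.max-completed-applications", 1000);
    // How many completed applications are kept in the RM state store, and therefore
    // must be read back (and replayed) on recovery/failover.
    conf.setInt("yarn.resourcemanager.state-store.max-completed-applications", 1000);
    System.out.println(conf.getInt("yarn.resourcemanager.max-completed-applications", -1));
  }
}
{code}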
[jira] [Updated] (YARN-4100) Add Documentation for Distributed and Delegated-Centralized Node Labels feature
[ https://issues.apache.org/jira/browse/YARN-4100?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Naganarasimha G R updated YARN-4100: Attachment: NodeLabel.html YARN-4100.v1.004.patch Thanks for the feedback, [~leftnoteasy]. I have updated the patch covering most of your comments. bq. for the first 2 points ... IMO the indexing order should be as follows: {code} Overview Features Configurations Setting up ResourceManager to enable Node Labels: Basic configurations to enable Node Labels Add/modify node labels list to YARN Add/modify node-to-labels mapping to YARN Configuration of Schedulers for node labels Capacity Scheduler Configuration Specifying node label for application Monitoring Monitoring through web UI Monitoring through commandline Useful links {code} Considering this, I felt it was still under *Setting up ResourceManager to enable Node Labels*; thoughts? I have also added the minimal configuration in the first part (Basic configurations) in this patch. > Add Documentation for Distributed and Delegated-Centralized Node Labels > feature > --- > > Key: YARN-4100 > URL: https://issues.apache.org/jira/browse/YARN-4100 > Project: Hadoop YARN > Issue Type: Sub-task > Components: api, client, resourcemanager >Reporter: Naganarasimha G R >Assignee: Naganarasimha G R > Attachments: NodeLabel.html, YARN-4100.v1.001.patch, > YARN-4100.v1.002.patch, YARN-4100.v1.003.patch, YARN-4100.v1.004.patch > > > Add Documentation for Distributed Node Labels feature -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-4100) Add Documentation for Distributed and Delegated-Centralized Node Labels feature
[ https://issues.apache.org/jira/browse/YARN-4100?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Naganarasimha G R updated YARN-4100: Attachment: (was: NodeLabel.html) > Add Documentation for Distributed and Delegated-Centralized Node Labels > feature > --- > > Key: YARN-4100 > URL: https://issues.apache.org/jira/browse/YARN-4100 > Project: Hadoop YARN > Issue Type: Sub-task > Components: api, client, resourcemanager >Reporter: Naganarasimha G R >Assignee: Naganarasimha G R > Attachments: YARN-4100.v1.001.patch, YARN-4100.v1.002.patch, > YARN-4100.v1.003.patch > > > Add Documentation for Distributed Node Labels feature -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-4098) Document ApplicationPriority feature
[ https://issues.apache.org/jira/browse/YARN-4098?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15067521#comment-15067521 ] Hadoop QA commented on YARN-4098: - | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 0s {color} | {color:blue} Docker mode activated. {color} | | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s {color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 15s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 15s {color} | {color:green} the patch passed {color} | | {color:red}-1{color} | {color:red} whitespace {color} | {color:red} 0m 0s {color} | {color:red} The patch has 4 line(s) that end in whitespace. Use git apply --whitespace=fix. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 18s {color} | {color:green} Patch does not generate ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black} 0m 57s {color} | {color:black} {color} | \\ \\ || Subsystem || Report/Notes || | Docker | Image:yetus/hadoop:0ca8df7 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12778978/0001-YARN-4098.patch | | JIRA Issue | YARN-4098 | | Optional Tests | asflicense mvnsite | | uname | Linux 47188a197ee4 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh | | git revision | trunk / 114b590 | | whitespace | https://builds.apache.org/job/PreCommit-YARN-Build/10068/artifact/patchprocess/whitespace-eol.txt | | modules | C: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site U: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site | | Max memory used | 30MB | | Powered by | Apache Yetus 0.2.0-SNAPSHOT http://yetus.apache.org | | Console output | https://builds.apache.org/job/PreCommit-YARN-Build/10068/console | This message was automatically generated. > Document ApplicationPriority feature > > > Key: YARN-4098 > URL: https://issues.apache.org/jira/browse/YARN-4098 > Project: Hadoop YARN > Issue Type: Sub-task > Components: resourcemanager >Reporter: Rohith Sharma K S >Assignee: Rohith Sharma K S > Attachments: 0001-YARN-4098.patch, 0001-YARN-4098.patch, YARN-4098.rar > > > This JIRA is to track documentation of application priority and its user, > admin and REST interfaces. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-4098) Document ApplicationPriority feature
[ https://issues.apache.org/jira/browse/YARN-4098?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15067514#comment-15067514 ] Rohith Sharma K S commented on YARN-4098: - Updated the documentation patch, adding the client interfaces. Attached the "mvn site" output to view the content as an HTML page. [~sunilg]/[~jianhe], kindly review the documentation patch. > Document ApplicationPriority feature > > > Key: YARN-4098 > URL: https://issues.apache.org/jira/browse/YARN-4098 > Project: Hadoop YARN > Issue Type: Sub-task > Components: resourcemanager >Reporter: Rohith Sharma K S >Assignee: Rohith Sharma K S > Attachments: 0001-YARN-4098.patch, 0001-YARN-4098.patch, YARN-4098.rar > > > This JIRA is to track documentation of application priority and its user, > admin and REST interfaces. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-4098) Document ApplicationPriority feature
[ https://issues.apache.org/jira/browse/YARN-4098?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rohith Sharma K S updated YARN-4098: Attachment: YARN-4098.rar > Document ApplicationPriority feature > > > Key: YARN-4098 > URL: https://issues.apache.org/jira/browse/YARN-4098 > Project: Hadoop YARN > Issue Type: Sub-task > Components: resourcemanager >Reporter: Rohith Sharma K S >Assignee: Rohith Sharma K S > Attachments: 0001-YARN-4098.patch, 0001-YARN-4098.patch, YARN-4098.rar > > > This JIRA is to track documentation of application priority and its user, > admin and REST interfaces. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-4098) Document ApplicationPriority feature
[ https://issues.apache.org/jira/browse/YARN-4098?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rohith Sharma K S updated YARN-4098: Attachment: 0001-YARN-4098.patch > Document ApplicationPriority feature > > > Key: YARN-4098 > URL: https://issues.apache.org/jira/browse/YARN-4098 > Project: Hadoop YARN > Issue Type: Sub-task > Components: resourcemanager >Reporter: Rohith Sharma K S >Assignee: Rohith Sharma K S > Attachments: 0001-YARN-4098.patch, 0001-YARN-4098.patch, YARN-4098.rar > > > This JIRA is to track documentation of application priority and its user, > admin and REST interfaces. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-4195) Support of node-labels in the ReservationSystem "Plan"
[ https://issues.apache.org/jira/browse/YARN-4195?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15067503#comment-15067503 ] Carlo Curino commented on YARN-4195: [~leftnoteasy], good questions: # We can honor the request in best-effort mode. We can't guarantee we will find nodes that match, but if one is found (and not promised to others) we can give it to the user. # The assignment is independent of capacity promises, so it goes through normally (am I missing something in this question?). The reservation-system has mechanisms to react to changes in the amount of capacity under each partition/label; handling the case of a dynamically changing label _after_ a job grabbed the container is out-of-scope. The combinatorial explosion is even worse, I think it is the powerset of labels, so {{|partitions| = 2^N}}, _where N is the number of base labels_. The good news is that we do not automatically explode all combinations, but only focus on *"active" partitions*, i.e., unique combinations of labels that are associated with at least one node. This means that we have a hard upper-bound {{|partitions| <= K}} where _K is the number of nodes in the cluster_. We should run some tests, but it is possible that YARN-4476's algos are efficient enough to deal with this even for large clusters (e.g., {{K=5000}}). {{|partitions| = K}} would be the norm if we decide to unify the notion of labels with the one of locality (i.e., a machine name is nothing but a label, and so is the rack). If, however, we do not converge node-labels and locality, I would expect that in most clusters we could find groups of nodes which are fungible, i.e., they form an equivalence-class (w.r.t. the set of labels they share) of size >1. This is like saying that those nodes are "indistinguishable" for the user (this used to be true for all nodes in a cluster bar locality). In some of our data centers we saw unique partitions in the order of {{K/100 < |partitions| < K/20}} for the set of labels we cared about in those settings. (Mileage will heavily vary based on the semantics/use of labels you consider). > Support of node-labels in the ReservationSystem "Plan" > -- > > Key: YARN-4195 > URL: https://issues.apache.org/jira/browse/YARN-4195 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Carlo Curino >Assignee: Carlo Curino > Attachments: YARN-4195.patch > > > As part of YARN-4193 we need to enhance the InMemoryPlan (and related > classes) to track the per-label available resources, as well as the per-label > reservation-allocations. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
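A small sketch of the "active partitions" bound described above (illustrative Java, not from the attached patch): even though the powerset of N labels has 2^N elements, the distinct label-sets actually carried by nodes number at most K:
{code}
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

/** Illustrative: derive the "active" partitions from node-to-labels mappings. */
public class ActivePartitionsSketch {
  public static void main(String[] args) {
    Map<String, Set<String>> nodeToLabels = new HashMap<>();
    nodeToLabels.put("node1", new HashSet<>(Arrays.asList("gpu", "ssd")));
    nodeToLabels.put("node2", new HashSet<>(Arrays.asList("gpu", "ssd"))); // same partition as node1
    nodeToLabels.put("node3", new HashSet<>(Arrays.asList("ssd")));
    nodeToLabels.put("node4", new HashSet<>());                            // the "no label" partition

    // Each unique combination of labels carried by some node is one active partition;
    // nodes sharing the same set are fungible w.r.t. labels (an equivalence class).
    Set<Set<String>> activePartitions = new HashSet<>(nodeToLabels.values());

    // |activePartitions| <= number of nodes (4 here), regardless of the 2^N subsets.
    System.out.println("active partitions = " + activePartitions.size()); // prints 3
  }
}
{code}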
[jira] [Commented] (YARN-3480) Recovery may get very slow with lots of services with lots of app-attempts
[ https://issues.apache.org/jira/browse/YARN-3480?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15067480#comment-15067480 ] Jun Gong commented on YARN-3480: Thanks for explaining. These cases make removing attempts complex. We are removing attempts asynchronously. If the RMStateStore does not transition to 'FENCED' on failed operations, we might fail to remove some attempts while succeeding in removing others. Suppose there were 4 attempts: attempt01, attempt02, attempt03 and attempt04, and we wanted to remove 2 attempts (attempt01 and attempt02) but failed to remove attempt01; then the remaining attempts are attempt01, attempt03 and attempt04, which is not consistent. When recovering these attempts after an RM restart, recovery will fail because attempt02 cannot be recovered. To make things simple, how about only removing attempts if HA is enabled (or 'RMFailFast' is set)? > Recovery may get very slow with lots of services with lots of app-attempts > -- > > Key: YARN-3480 > URL: https://issues.apache.org/jira/browse/YARN-3480 > Project: Hadoop YARN > Issue Type: Sub-task > Components: resourcemanager >Affects Versions: 2.6.0 >Reporter: Jun Gong >Assignee: Jun Gong > Attachments: YARN-3480.01.patch, YARN-3480.02.patch, > YARN-3480.03.patch, YARN-3480.04.patch, YARN-3480.05.patch, > YARN-3480.06.patch, YARN-3480.07.patch, YARN-3480.08.patch, > YARN-3480.09.patch, YARN-3480.10.patch, YARN-3480.11.patch > > > When RM HA is enabled and running containers are kept across attempts, apps > are more likely to finish successfully with more retries (attempts), so it > will be better to set 'yarn.resourcemanager.am.max-attempts' larger. However, > it will make the RMStateStore (FileSystem/HDFS/ZK) store more attempts and make > the RM recovery process much slower. It might be better to set a maximum number of > attempts to be stored in the RMStateStore. > BTW: When 'attemptFailuresValidityInterval' (introduced in YARN-611) is set to > a small value, the number of retried attempts might be very large, so we need to delete > some of the attempts stored in the RMStateStore. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
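One hypothetical way to avoid the inconsistency described above (purely an illustration, not what the attached patches do) is to remove attempts strictly oldest-first and stop at the first failed removal, so the attempts that survive always form a contiguous range:
{code}
import java.util.List;

/** Illustrative only: remove old attempts in order, stopping at the first failure. */
public class AttemptRemovalSketch {
  interface StateStore {
    /** Returns false (or throws) if the removal did not reach the store. */
    boolean removeAttempt(String attemptId);
  }

  /** attempts must be ordered oldest-first, e.g. [attempt01, attempt02, ...]. */
  public static int removeOldest(StateStore store, List<String> attempts, int toRemove) {
    int removed = 0;
    for (String attemptId : attempts) {
      if (removed >= toRemove || !store.removeAttempt(attemptId)) {
        break;   // stopping at the first failure never leaves a gap like {01, 03, 04}
      }
      removed++;
    }
    return removed;
  }
}
{code}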
[jira] [Commented] (YARN-4290) Add -showDetails option to YARN Nodes CLI to print all nodes reports information
[ https://issues.apache.org/jira/browse/YARN-4290?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15067471#comment-15067471 ] Sunil G commented on YARN-4290: --- Thank you [~leftnoteasy] for the review and commit, and thank you Naga for the comments.!! > Add -showDetails option to YARN Nodes CLI to print all nodes reports > information > > > Key: YARN-4290 > URL: https://issues.apache.org/jira/browse/YARN-4290 > Project: Hadoop YARN > Issue Type: Improvement > Components: client >Reporter: Wangda Tan >Assignee: Sunil G > Fix For: 2.8.0 > > Attachments: 0002-YARN-4290.patch, 0003-YARN-4290.patch > > > Currently, "yarn nodes -list" command only shows > - "Node-Id", > - "Node-State", > - "Node-Http-Address", > - "Number-of-Running-Containers" > I think we need to show more information such as used resource, just like > "yarn nodes -status" command. > Maybe we can add a parameter to -list, such as "-show-details" to enable > printing all detailed information. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-4462) FairScheduler: Disallow preemption from a queue
[ https://issues.apache.org/jira/browse/YARN-4462?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15067466#comment-15067466 ] Tao Jie commented on YARN-4462: --- In the attached patch, we add a disable-preemption tag for queues in fair-scheduler.xml, e.g.: {code} 1024mb,0vcores {code} With the disablePreemption tag, this queue and its descendants cannot be preempted. [~kasha], would you give it a quick review? > FairScheduler: Disallow preemption from a queue > --- > > Key: YARN-4462 > URL: https://issues.apache.org/jira/browse/YARN-4462 > Project: Hadoop YARN > Issue Type: Improvement > Components: fairscheduler >Affects Versions: 2.6.0 >Reporter: Tao Jie >Assignee: Tao Jie > Attachments: YARN-4462.001.patch > > > When scheduler preemption is enabled, applications can be preempted if they > obtain more resources than they should take. > When a MapReduce application has some of its resources preempted, it just runs slower. > However, when the preempted application is a long-running service, such as Tomcat > running in Slider, the service would fail. > So we should have a flag for applications to indicate to the scheduler that those > applications should not be preempted. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-4462) FairScheduler: Disallow preemption from a queue
[ https://issues.apache.org/jira/browse/YARN-4462?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tao Jie updated YARN-4462: -- Attachment: YARN-4462.001.patch > FairScheduler: Disallow preemption from a queue > --- > > Key: YARN-4462 > URL: https://issues.apache.org/jira/browse/YARN-4462 > Project: Hadoop YARN > Issue Type: Improvement > Components: fairscheduler >Affects Versions: 2.6.0 >Reporter: Tao Jie >Assignee: Tao Jie > Attachments: YARN-4462.001.patch > > > When scheduler preemption is enabled, applications can be preempted if they > obtain more resources than they should take. > When a MapReduce application has some of its resources preempted, it just runs slower. > However, when the preempted application is a long-running service, such as Tomcat > running in Slider, the service would fail. > So we should have a flag for applications to indicate to the scheduler that those > applications should not be preempted. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-4465) SchedulerUtils#validateRequest for Label check should happen only when nodelabel enabled
[ https://issues.apache.org/jira/browse/YARN-4465?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15067444#comment-15067444 ] Bibin A Chundatt commented on YARN-4465: [~leftnoteasy]/[~sunilg] I hope only the above 2 changes are required. Will update the patch soon. > SchedulerUtils#validateRequest for Label check should happen only when > nodelabel enabled > > > Key: YARN-4465 > URL: https://issues.apache.org/jira/browse/YARN-4465 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Bibin A Chundatt >Assignee: Bibin A Chundatt > > Disable labels from the RM side: yarn.nodelabel.enable=false > Capacity scheduler label configuration for the queue is as below: > the default label for queue b1 is 3, and the accessible labels are 1,3 > Submit an application to queue A. > {noformat} > Caused by: > org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.yarn.exceptions.InvalidResourceRequestException): > Invalid resource request, queue=b1 doesn't have permission to access all > labels in resource request. labelExpression of resource request=3. Queue > labels=1,3 > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.SchedulerUtils.validateResourceRequest(SchedulerUtils.java:304) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.SchedulerUtils.normalizeAndValidateRequest(SchedulerUtils.java:234) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.SchedulerUtils.normalizeAndValidateRequest(SchedulerUtils.java:216) > at > org.apache.hadoop.yarn.server.resourcemanager.RMAppManager.validateAndCreateResourceRequest(RMAppManager.java:401) > at > org.apache.hadoop.yarn.server.resourcemanager.RMAppManager.createAndPopulateNewRMApp(RMAppManager.java:340) > at > org.apache.hadoop.yarn.server.resourcemanager.RMAppManager.submitApplication(RMAppManager.java:283) > at > org.apache.hadoop.yarn.server.resourcemanager.ClientRMService.submitApplication(ClientRMService.java:602) > at > org.apache.hadoop.yarn.api.impl.pb.service.ApplicationClientProtocolPBServiceImpl.submitApplication(ApplicationClientProtocolPBServiceImpl.java:247) > {noformat} > # Ignore the default label expression when labels are disabled *or* > # In NormalizeResourceRequest we can set the label expression to empty > when node labels are not enabled *or* > # Improve the message -- This message was sent by Atlassian JIRA (v6.3.4#6332)
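A rough sketch of option 2 from the list above (illustrative only; the exact place this belongs, SchedulerUtils versus RMAppManager, and the way the flag is read are assumptions): when node labels are disabled, blank out the label expression during normalization so the queue-label permission check is never reached:
{code}
// Hypothetical sketch of option 2 above; not the actual patch.
boolean nodeLabelsEnabled = conf.getBoolean("yarn.node-labels.enabled", false);
if (!nodeLabelsEnabled) {
  // "" is the no-label partition; any default/queue label expression is ignored,
  // so validateResourceRequest cannot reject requests on a label-disabled cluster.
  resReq.setNodeLabelExpression("");
}
// Only perform the queue-label permission check when node labels are enabled.
{code}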
[jira] [Commented] (YARN-3367) Replace starting a separate thread for post entity with event loop in TimelineClient
[ https://issues.apache.org/jira/browse/YARN-3367?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15067436#comment-15067436 ] Hadoop QA commented on YARN-3367: - | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 0s {color} | {color:blue} Docker mode activated. {color} | | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s {color} | {color:green} The patch does not contain any @author tags. {color} | | {color:red}-1{color} | {color:red} test4tests {color} | {color:red} 0m 0s {color} | {color:red} The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 9m 9s {color} | {color:green} feature-YARN-2928 passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 2m 11s {color} | {color:green} feature-YARN-2928 passed with JDK v1.8.0_66 {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 2m 30s {color} | {color:green} feature-YARN-2928 passed with JDK v1.7.0_91 {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 30s {color} | {color:green} feature-YARN-2928 passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 33s {color} | {color:green} feature-YARN-2928 passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 41s {color} | {color:green} feature-YARN-2928 passed {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 3m 43s {color} | {color:green} feature-YARN-2928 passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 29s {color} | {color:green} feature-YARN-2928 passed with JDK v1.8.0_66 {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 3m 45s {color} | {color:green} feature-YARN-2928 passed with JDK v1.7.0_91 {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 1m 28s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 2m 16s {color} | {color:green} the patch passed with JDK v1.8.0_66 {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 2m 16s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 2m 28s {color} | {color:green} the patch passed with JDK v1.7.0_91 {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 2m 28s {color} | {color:green} the patch passed {color} | | {color:red}-1{color} | {color:red} checkstyle {color} | {color:red} 0m 32s {color} | {color:red} Patch generated 13 new checkstyle issues in hadoop-yarn-project/hadoop-yarn (total was 49, now 59). {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 35s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 38s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s {color} | {color:green} Patch has no whitespace issues. 
{color} | | {color:red}-1{color} | {color:red} findbugs {color} | {color:red} 1m 34s {color} | {color:red} hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common introduced 4 new FindBugs issues. {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 28s {color} | {color:green} the patch passed with JDK v1.8.0_66 {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 3m 49s {color} | {color:green} the patch passed with JDK v1.7.0_91 {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 0m 26s {color} | {color:green} hadoop-yarn-api in the patch passed with JDK v1.8.0_66. {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 2m 3s {color} | {color:green} hadoop-yarn-common in the patch passed with JDK v1.8.0_66. {color} | | {color:red}-1{color} | {color:red} unit {color} | {color:red} 7m 56s {color} | {color:red} hadoop-yarn-server-nodemanager in the patch failed with JDK v1.8.0_66. {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 0m 27s {color} | {color:green} hadoop-yarn-api in the patch passed with JDK v1.7.0_91. {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 2m 14s {color} | {color:green} hadoop-yarn-common in the patch passed with JDK v1.7.0_91. {color} | | {color:red}-1{color} | {color:red} unit {color
[jira] [Commented] (YARN-4234) New put APIs in TimelineClient for ats v1.5
[ https://issues.apache.org/jira/browse/YARN-4234?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15067394#comment-15067394 ] Hadoop QA commented on YARN-4234: - | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 0s {color} | {color:blue} Docker mode activated. {color} | | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s {color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s {color} | {color:green} The patch appears to include 4 new or modified test files. {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 8m 38s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 2m 11s {color} | {color:green} trunk passed with JDK v1.8.0_66 {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 2m 25s {color} | {color:green} trunk passed with JDK v1.7.0_91 {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 30s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 40s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 46s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 3m 41s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 32s {color} | {color:green} trunk passed with JDK v1.8.0_66 {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 3m 59s {color} | {color:green} trunk passed with JDK v1.7.0_91 {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 1m 31s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 2m 11s {color} | {color:green} the patch passed with JDK v1.8.0_66 {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 2m 11s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 2m 32s {color} | {color:green} the patch passed with JDK v1.7.0_91 {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 2m 32s {color} | {color:green} the patch passed {color} | | {color:red}-1{color} | {color:red} checkstyle {color} | {color:red} 0m 29s {color} | {color:red} Patch generated 3 new checkstyle issues in hadoop-yarn-project/hadoop-yarn (total was 237, now 237). {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 38s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 45s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s {color} | {color:green} Patch has no whitespace issues. 
{color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 4m 1s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 33s {color} | {color:green} the patch passed with JDK v1.8.0_66 {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 4m 3s {color} | {color:green} the patch passed with JDK v1.7.0_91 {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 0m 24s {color} | {color:green} hadoop-yarn-api in the patch passed with JDK v1.8.0_66. {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 2m 12s {color} | {color:green} hadoop-yarn-common in the patch passed with JDK v1.8.0_66. {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 4m 17s {color} | {color:green} hadoop-yarn-server-applicationhistoryservice in the patch passed with JDK v1.8.0_66. {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 0m 27s {color} | {color:green} hadoop-yarn-api in the patch passed with JDK v1.7.0_91. {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 2m 25s {color} | {color:green} hadoop-yarn-common in the patch passed with JDK v1.7.0_91. {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 4m 16s {color} | {color:green} hadoop-yarn-server-applicationhistoryservice in the patch passed with JDK v1.7.0_91. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 28s {color} | {color:green} Patch does not g
[jira] [Commented] (YARN-3458) CPU resource monitoring in Windows
[ https://issues.apache.org/jira/browse/YARN-3458?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15067397#comment-15067397 ] Hudson commented on YARN-3458: -- FAILURE: Integrated in Hadoop-trunk-Commit #9010 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/9010/]) YARN-3458. CPU resource monitoring in Windows. Contributed by Inigo (cnauroth: rev 114b59095540bb80db5153c816f9d285e4029031) * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/util/WindowsBasedProcessTree.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/test/java/org/apache/hadoop/yarn/util/TestWindowsBasedProcessTree.java * hadoop-yarn-project/CHANGES.txt > CPU resource monitoring in Windows > -- > > Key: YARN-3458 > URL: https://issues.apache.org/jira/browse/YARN-3458 > Project: Hadoop YARN > Issue Type: New Feature > Components: nodemanager >Affects Versions: 2.7.0 > Environment: Windows >Reporter: Inigo Goiri >Assignee: Inigo Goiri >Priority: Minor > Labels: BB2015-05-TBR, containers, metrics, windows > Fix For: 2.8.0 > > Attachments: YARN-3458-1.patch, YARN-3458-2.patch, YARN-3458-3.patch, > YARN-3458-4.patch, YARN-3458-5.patch, YARN-3458-6.patch, YARN-3458-7.patch, > YARN-3458-8.patch, YARN-3458-9.patch > > Original Estimate: 168h > Remaining Estimate: 168h > > The current implementation of getCpuUsagePercent() for > WindowsBasedProcessTree is left as unavailable. Attached a proposal of how to > do it. I reused the CpuTimeTracker using 1 jiffy=1ms. > This was left open by YARN-3122. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
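To illustrate the kind of computation getCpuUsagePercent() needs (a generic sketch of the cumulative-CPU-time approach with 1 jiffy = 1 ms, not the code from the attached patches):
{code}
/** Illustrative: CPU usage percent from two samples of cumulative CPU time. */
public class CpuUsageSketch {
  public static void main(String[] args) {
    // Sample 1: cumulative CPU time (ms, i.e. jiffies at 1 jiffy = 1 ms) and wall-clock time.
    long cpuMs1 = 12_000, wallMs1 = 100_000;
    // Sample 2, taken one second later.
    long cpuMs2 = 13_500, wallMs2 = 101_000;

    // Percent of a single core used by the process tree between the two samples.
    float cpuUsagePercent = 100f * (cpuMs2 - cpuMs1) / (wallMs2 - wallMs1);
    System.out.println(cpuUsagePercent);  // 150.0, i.e. 1.5 cores busy on average
  }
}
{code}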
[jira] [Commented] (YARN-4234) New put APIs in TimelineClient for ats v1.5
[ https://issues.apache.org/jira/browse/YARN-4234?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15067385#comment-15067385 ] Hadoop QA commented on YARN-4234: - | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 0s {color} | {color:blue} Docker mode activated. {color} | | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s {color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s {color} | {color:green} The patch appears to include 4 new or modified test files. {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 8m 3s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 2m 8s {color} | {color:green} trunk passed with JDK v1.8.0_66 {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 2m 22s {color} | {color:green} trunk passed with JDK v1.7.0_91 {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 29s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 29s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 40s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 3m 26s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 29s {color} | {color:green} trunk passed with JDK v1.8.0_66 {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 3m 48s {color} | {color:green} trunk passed with JDK v1.7.0_91 {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 1m 21s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 2m 5s {color} | {color:green} the patch passed with JDK v1.8.0_66 {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 2m 5s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 2m 16s {color} | {color:green} the patch passed with JDK v1.7.0_91 {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 2m 16s {color} | {color:green} the patch passed {color} | | {color:red}-1{color} | {color:red} checkstyle {color} | {color:red} 0m 28s {color} | {color:red} Patch generated 3 new checkstyle issues in hadoop-yarn-project/hadoop-yarn (total was 237, now 237). {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 29s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 41s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s {color} | {color:green} Patch has no whitespace issues. 
{color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 3m 52s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 29s {color} | {color:green} the patch passed with JDK v1.8.0_66 {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 3m 46s {color} | {color:green} the patch passed with JDK v1.7.0_91 {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 0m 25s {color} | {color:green} hadoop-yarn-api in the patch passed with JDK v1.8.0_66. {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 2m 5s {color} | {color:green} hadoop-yarn-common in the patch passed with JDK v1.8.0_66. {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 3m 44s {color} | {color:green} hadoop-yarn-server-applicationhistoryservice in the patch passed with JDK v1.8.0_66. {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 0m 26s {color} | {color:green} hadoop-yarn-api in the patch passed with JDK v1.7.0_91. {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 2m 17s {color} | {color:green} hadoop-yarn-common in the patch passed with JDK v1.7.0_91. {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 3m 57s {color} | {color:green} hadoop-yarn-server-applicationhistoryservice in the patch passed with JDK v1.7.0_91. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 23s {color} | {color:green} Patch does not gene
[jira] [Commented] (YARN-3586) RM only get back addresses of Collectors that NM needs to know.
[ https://issues.apache.org/jira/browse/YARN-3586?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15067372#comment-15067372 ] Sangjin Lee commented on YARN-3586: --- The latest patch LGTM. Thanks [~djp]! > RM only get back addresses of Collectors that NM needs to know. > --- > > Key: YARN-3586 > URL: https://issues.apache.org/jira/browse/YARN-3586 > Project: Hadoop YARN > Issue Type: Sub-task > Components: resourcemanager, timelineserver >Reporter: Junping Du >Assignee: Junping Du >Priority: Critical > Labels: yarn-2928-1st-milestone > Attachments: YARN-3586-demo.patch, YARN-3586-feature-YARN-2928.patch, > YARN-3586-feature-YARN-2928.v2.patch > > > After YARN-3445, RM cache runningApps for each NM. So RM heartbeat back to NM > should only include collectors' address for running applications against > specific NM. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-3367) Replace starting a separate thread for post entity with event loop in TimelineClient
[ https://issues.apache.org/jira/browse/YARN-3367?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Naganarasimha G R updated YARN-3367: Attachment: YARN-3367-feature-YARN-2928.003.patch [~djp], [~sjlee0] & [~gtCarrera], Attached a WIP patch; I have considered two things here: 1. the order of sync and async events is guaranteed; 2. async events are merged and then pushed to the server in one call (so that the load is reduced). A few points: * I could reuse/extend AsyncDispatcher after YARN-4400 is committed to trunk. * I think it can be more organized if I move all of this related code (dispatcher code) to a new class. * I will work on the other locations (removing the thread pools on the caller side) once the approach is finalized. > Replace starting a separate thread for post entity with event loop in > TimelineClient > > > Key: YARN-3367 > URL: https://issues.apache.org/jira/browse/YARN-3367 > Project: Hadoop YARN > Issue Type: Sub-task > Components: timelineserver >Affects Versions: YARN-2928 >Reporter: Junping Du >Assignee: Naganarasimha G R > Labels: yarn-2928-1st-milestone > Attachments: YARN-3367-feature-YARN-2928.003.patch, > YARN-3367-feature-YARN-2928.v1.002.patch, YARN-3367.YARN-2928.001.patch > > > Since YARN-3039, we added a loop in TimelineClient to wait for > collectorServiceAddress to be ready before posting any entity. In consumers of > TimelineClient (like the AM), we are starting a new thread for each call to get > rid of a potential deadlock in the main thread. This approach has at least 3 major > defects: > 1. The consumer needs some additional code to wrap a thread before calling > putEntities() in TimelineClient. > 2. It costs many thread resources, which is unnecessary. > 3. The sequence of events could be out of order because each posting > operation thread gets out of the waiting loop randomly. > We should have something like an event loop on the TimelineClient side: > putEntities() only puts the related entities into a queue of entities, and a > separate thread handles delivering the queued entities to the collector via REST > calls. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
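For readers following the YARN-3367 discussion above, here is a minimal sketch of the queue-plus-single-dispatcher-thread idea (putEntities() only enqueues; one thread drains the queue and delivers to the collector, preserving order). It is not the attached patch; all names are illustrative assumptions, and the merging of async events into batches mentioned above is omitted for brevity.
{code}
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

// Illustrative sketch only: a single dispatcher thread drains a queue of
// entities and delivers them to the collector, so callers never block and
// ordering is preserved. Not the actual TimelineClient patch.
public class TimelineEntityDispatcher<E> {
  public interface Sender<E> { void send(E entity); }   // e.g. one REST call

  private final BlockingQueue<E> queue = new LinkedBlockingQueue<>();
  private final Thread dispatcher;
  private volatile boolean stopped = false;

  public TimelineEntityDispatcher(Sender<E> sender) {
    dispatcher = new Thread(() -> {
      while (!stopped) {
        try {
          E entity = queue.take();   // blocks until an entity is enqueued
          sender.send(entity);       // delivery happens off the caller thread
        } catch (InterruptedException ie) {
          Thread.currentThread().interrupt();
          return;
        }
      }
    }, "TimelineEntityDispatcher");
    dispatcher.start();
  }

  // Asynchronous put: just enqueue and return immediately.
  public void putEntityAsync(E entity) {
    queue.add(entity);
  }

  public void stop() throws InterruptedException {
    stopped = true;
    dispatcher.interrupt();
    dispatcher.join();
  }
}
{code}
Synchronous puts could be layered on top by having the caller wait on a per-entity future, and small batches could be drained with BlockingQueue#drainTo to reduce the number of REST calls, which corresponds to the merging of async events described in the comment.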
[jira] [Commented] (YARN-4234) New put APIs in TimelineClient for ats v1.5
[ https://issues.apache.org/jira/browse/YARN-4234?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15067352#comment-15067352 ] Junping Du commented on YARN-4234: -- bq. Once this patch is committed, would anyone of you please ping me so that I can quickly rebase YARN-4265? I think you will receive notification automatically if you watch on this JIRA. > New put APIs in TimelineClient for ats v1.5 > --- > > Key: YARN-4234 > URL: https://issues.apache.org/jira/browse/YARN-4234 > Project: Hadoop YARN > Issue Type: Sub-task > Components: timelineserver >Reporter: Xuan Gong >Assignee: Xuan Gong > Attachments: YARN-4234-2015-11-13.1.patch, > YARN-4234-2015-11-16.1.patch, YARN-4234-2015-11-16.2.patch, > YARN-4234-2015.2.patch, YARN-4234.1.patch, YARN-4234.2.patch, > YARN-4234.2015-11-12.1.patch, YARN-4234.2015-11-12.1.patch, > YARN-4234.2015-11-18.1.patch, YARN-4234.2015-11-18.2.patch, > YARN-4234.2015-11-18.patch, YARN-4234.2015-12-09.patch, > YARN-4234.2015-12-09.patch, YARN-4234.2015-12-17.1.patch, > YARN-4234.2015-12-18.1.patch, YARN-4234.2015-12-18.patch, > YARN-4234.2015-12-21.1.patch, YARN-4234.20151109.patch, > YARN-4234.20151110.1.patch, YARN-4234.2015.1.patch, YARN-4234.3.patch > > > In this ticket, we will add new put APIs in timelineClient to let > clients/applications have the option to use ATS v1.5 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-4454) NM to nodelabel mapping going wrong after RM restart
[ https://issues.apache.org/jira/browse/YARN-4454?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15067347#comment-15067347 ] Bibin A Chundatt commented on YARN-4454: [~leftnoteasy] Thank you for review and commit. > NM to nodelabel mapping going wrong after RM restart > > > Key: YARN-4454 > URL: https://issues.apache.org/jira/browse/YARN-4454 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Bibin A Chundatt >Assignee: Bibin A Chundatt >Priority: Critical > Fix For: 2.8.0 > > Attachments: 0001-YARN-4454.patch, test.patch > > > *Nodelabel mapping with NodeManager is going wrong if combination of > hostname and then NodeId is used to update nodelabel mapping* > *Steps to reproduce* > 1.Create cluster with 2 NM > 2.Add label X,Y to cluster > 3.replace Label of node 1 using ,x > 4.replace label for node 1 by ,y > 5.Again replace label of node 1 by ,x > Check cluster label mapping HOSTNAME1 will be mapped with X > Now restart RM 2 times NODE LABEL mapping of HOSTNAME1:PORT changes to Y > {noformat} > 2015-12-14 17:17:54,901 INFO > org.apache.hadoop.yarn.nodelabels.CommonNodeLabelsManager: Add labels: > [,] > 2015-12-14 17:17:54,905 INFO > org.apache.hadoop.yarn.nodelabels.CommonNodeLabelsManager: REPLACE labels on > nodes: > 2015-12-14 17:17:54,906 INFO > org.apache.hadoop.yarn.nodelabels.CommonNodeLabelsManager: > NM=host-10-19-92-188:64318, labels=[ResourcePool_1] > 2015-12-14 17:17:54,906 INFO > org.apache.hadoop.yarn.nodelabels.CommonNodeLabelsManager: > NM=host-10-19-92-188:0, labels=[ResourcePool_null] > 2015-12-14 17:17:54,906 INFO > org.apache.hadoop.yarn.nodelabels.CommonNodeLabelsManager: > NM=host-10-19-92-187:64318, labels=[ResourcePool_null] > {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
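As a toy illustration of the failure mode described in the YARN-4454 steps above (not the actual CommonNodeLabelsManager code, and it does not model the recovery ordering itself), mixing hostname-only keys with hostname:port NodeId keys in the label mapping lets the same NM resolve to different labels depending on which key is consulted:
{code}
import java.util.HashMap;
import java.util.Map;

// Toy example only: a label map updated once by hostname and once by NodeId
// (host:port) ends up with two entries for the same NM, so lookups disagree.
public class NodeLabelLookupDemo {
  public static void main(String[] args) {
    Map<String, String> labels = new HashMap<>();
    labels.put("host-10-19-92-188", "X");        // replace-label by hostname
    labels.put("host-10-19-92-188:64318", "Y");  // replace-label by NodeId

    System.out.println(labels.get("host-10-19-92-188"));        // X
    System.out.println(labels.get("host-10-19-92-188:64318"));  // Y -- inconsistent
  }
}
{code}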
[jira] [Commented] (YARN-3816) [Aggregation] App-level aggregation and accumulation for YARN system metrics
[ https://issues.apache.org/jira/browse/YARN-3816?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15067340#comment-15067340 ] Sangjin Lee commented on YARN-3816: --- Right. There was a discussion around YARN-4053 on the column names for metrics, because we felt that there were multiple cases that may require encoding more information into the metric column names. The "toAggregate" flag was one of them. But depending on how we do this, it can make things like filtering tricky. Furthermore, if we have to add multiple dimensions to the column names, then we need to be REAL careful to do it in a manner that doesn't destroy usability or performance. You might want to check out the comments Varun referenced. At that time, we said we should explore ways to handle the information whether to aggregate certain metrics outside the HBase column names (e.g. separate configuration or properties, etc.). We can discuss this more here. > [Aggregation] App-level aggregation and accumulation for YARN system metrics > > > Key: YARN-3816 > URL: https://issues.apache.org/jira/browse/YARN-3816 > Project: Hadoop YARN > Issue Type: Sub-task > Components: timelineserver >Reporter: Junping Du >Assignee: Junping Du > Labels: yarn-2928-1st-milestone > Attachments: Application Level Aggregation of Timeline Data.pdf, > YARN-3816-YARN-2928-v1.patch, YARN-3816-YARN-2928-v2.1.patch, > YARN-3816-YARN-2928-v2.2.patch, YARN-3816-YARN-2928-v2.3.patch, > YARN-3816-YARN-2928-v2.patch, YARN-3816-YARN-2928-v3.1.patch, > YARN-3816-YARN-2928-v3.patch, YARN-3816-YARN-2928-v4.patch, > YARN-3816-feature-YARN-2928.v4.1.patch, YARN-3816-poc-v1.patch, > YARN-3816-poc-v2.patch > > > We need application level aggregation of Timeline data: > - To present end user aggregated states for each application, include: > resource (CPU, Memory) consumption across all containers, number of > containers launched/completed/failed, etc. We need this for apps while they > are running as well as when they are done. > - Also, framework specific metrics, e.g. HDFS_BYTES_READ, should be > aggregated to show details of states in framework level. > - Other level (Flow/User/Queue) aggregation can be more efficient to be based > on Application-level aggregations rather than raw entity-level data as much > less raws need to scan (with filter out non-aggregated entities, like: > events, configurations, etc.). -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-4492) Add documentation for queue level preemption which is supported in Capacity scheduler
[ https://issues.apache.org/jira/browse/YARN-4492?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15067337#comment-15067337 ] Hadoop QA commented on YARN-4492: - | (/) *{color:green}+1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 0s {color} | {color:blue} Docker mode activated. {color} | | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s {color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 17s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 16s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s {color} | {color:green} Patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 20s {color} | {color:green} Patch does not generate ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black} 1m 6s {color} | {color:black} {color} | \\ \\ || Subsystem || Report/Notes || | Docker | Image:yetus/hadoop:0ca8df7 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12778943/YARN-4492.v1.003.patch | | JIRA Issue | YARN-4492 | | Optional Tests | asflicense mvnsite | | uname | Linux 0be32395cbd3 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh | | git revision | trunk / a0de702 | | modules | C: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site U: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site | | Max memory used | 29MB | | Powered by | Apache Yetus 0.2.0-SNAPSHOT http://yetus.apache.org | | Console output | https://builds.apache.org/job/PreCommit-YARN-Build/10066/console | This message was automatically generated. > Add documentation for queue level preemption which is supported in Capacity > scheduler > - > > Key: YARN-4492 > URL: https://issues.apache.org/jira/browse/YARN-4492 > Project: Hadoop YARN > Issue Type: Bug > Components: capacity scheduler >Reporter: Naganarasimha G R >Assignee: Naganarasimha G R >Priority: Minor > Attachments: CapacityScheduler.html, YARN-4492.v1.001.patch, > YARN-4492.v1.002.patch, YARN-4492.v1.003.patch > > > As part of YARN-2056, Support has been added to disable preemption for a > specific queue. This is a useful feature in a multiload cluster but currently > missing documentation. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-4470) Application Master in-place upgrade
[ https://issues.apache.org/jira/browse/YARN-4470?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15067338#comment-15067338 ] Marco Rabozzi commented on YARN-4470: - Thanks [~ste...@apache.com] for your comments and for taking the time to review our proposal. [~giovanni.fumarola] and I are going to evaluate SLIDER and get back to you. > Application Master in-place upgrade > --- > > Key: YARN-4470 > URL: https://issues.apache.org/jira/browse/YARN-4470 > Project: Hadoop YARN > Issue Type: New Feature > Components: resourcemanager >Reporter: Giovanni Matteo Fumarola >Assignee: Giovanni Matteo Fumarola > Attachments: AM in-place upgrade design. rev1.pdf > > > It would be nice if clients could ask for an AM in-place upgrade. > It will give YARN the possibility to upgrade the AM without losing the > work > done within its containers. This allows deploying bug-fixes and new versions > of the AM without incurring long service downtimes. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3816) [Aggregation] App-level aggregation and accumulation for YARN system metrics
[ https://issues.apache.org/jira/browse/YARN-3816?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15067330#comment-15067330 ] Sangjin Lee commented on YARN-3816: --- Regarding the new entity type "YARN_APPLICATION_AGGREGATION", I think I have raised this topic before and so have others, but at least I cannot find answers in this JIRA. Is there a strong reason to introduce a separate entity type just for this purpose, rather than reusing the existing YARN_APPLICATION type (and the application table)? If so, could you elaborate on why? This would create a complete separation between any normal metrics that may be stored in the application table and the aggregated metrics handled in this JIRA. It has a number of implications. First, if you query normally for applications, the aggregated metrics would *not* be included in the reader queries (I guess that's why a separate REST end point was introduced?). Furthermore, the current app-to-flow-run aggregation looks only at the application table, and the aggregated metrics in this manner would *not* be rolled up to the flow run, flow, and so on unless we make an explicit change to look at the entity table with that entity type. Making that change also sounds very much like a non-trivial change (cc [~vrushalic]). > [Aggregation] App-level aggregation and accumulation for YARN system metrics > > > Key: YARN-3816 > URL: https://issues.apache.org/jira/browse/YARN-3816 > Project: Hadoop YARN > Issue Type: Sub-task > Components: timelineserver >Reporter: Junping Du >Assignee: Junping Du > Labels: yarn-2928-1st-milestone > Attachments: Application Level Aggregation of Timeline Data.pdf, > YARN-3816-YARN-2928-v1.patch, YARN-3816-YARN-2928-v2.1.patch, > YARN-3816-YARN-2928-v2.2.patch, YARN-3816-YARN-2928-v2.3.patch, > YARN-3816-YARN-2928-v2.patch, YARN-3816-YARN-2928-v3.1.patch, > YARN-3816-YARN-2928-v3.patch, YARN-3816-YARN-2928-v4.patch, > YARN-3816-feature-YARN-2928.v4.1.patch, YARN-3816-poc-v1.patch, > YARN-3816-poc-v2.patch > > > We need application level aggregation of Timeline data: > - To present end user aggregated states for each application, include: > resource (CPU, Memory) consumption across all containers, number of > containers launched/completed/failed, etc. We need this for apps while they > are running as well as when they are done. > - Also, framework specific metrics, e.g. HDFS_BYTES_READ, should be > aggregated to show details of states in framework level. > - Other level (Flow/User/Queue) aggregation can be more efficient to be based > on Application-level aggregations rather than raw entity-level data as much > less raws need to scan (with filter out non-aggregated entities, like: > events, configurations, etc.). -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-4234) New put APIs in TimelineClient for ats v1.5
[ https://issues.apache.org/jira/browse/YARN-4234?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15067327#comment-15067327 ] Li Lu commented on YARN-4234: - Thanks for the work [~xgong] and [~djp]! Once this patch is committed, would anyone of you please ping me so that I can quickly rebase YARN-4265? > New put APIs in TimelineClient for ats v1.5 > --- > > Key: YARN-4234 > URL: https://issues.apache.org/jira/browse/YARN-4234 > Project: Hadoop YARN > Issue Type: Sub-task > Components: timelineserver >Reporter: Xuan Gong >Assignee: Xuan Gong > Attachments: YARN-4234-2015-11-13.1.patch, > YARN-4234-2015-11-16.1.patch, YARN-4234-2015-11-16.2.patch, > YARN-4234-2015.2.patch, YARN-4234.1.patch, YARN-4234.2.patch, > YARN-4234.2015-11-12.1.patch, YARN-4234.2015-11-12.1.patch, > YARN-4234.2015-11-18.1.patch, YARN-4234.2015-11-18.2.patch, > YARN-4234.2015-11-18.patch, YARN-4234.2015-12-09.patch, > YARN-4234.2015-12-09.patch, YARN-4234.2015-12-17.1.patch, > YARN-4234.2015-12-18.1.patch, YARN-4234.2015-12-18.patch, > YARN-4234.2015-12-21.1.patch, YARN-4234.20151109.patch, > YARN-4234.20151110.1.patch, YARN-4234.2015.1.patch, YARN-4234.3.patch > > > In this ticket, we will add new put APIs in timelineClient to let > clients/applications have the option to use ATS v1.5 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-4492) Add documentation for queue level preemption which is supported in Capacity scheduler
[ https://issues.apache.org/jira/browse/YARN-4492?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Naganarasimha G R updated YARN-4492: Attachment: CapacityScheduler.html > Add documentation for queue level preemption which is supported in Capacity > scheduler > - > > Key: YARN-4492 > URL: https://issues.apache.org/jira/browse/YARN-4492 > Project: Hadoop YARN > Issue Type: Bug > Components: capacity scheduler >Reporter: Naganarasimha G R >Assignee: Naganarasimha G R >Priority: Minor > Attachments: CapacityScheduler.html, YARN-4492.v1.001.patch, > YARN-4492.v1.002.patch, YARN-4492.v1.003.patch > > > As part of YARN-2056, Support has been added to disable preemption for a > specific queue. This is a useful feature in a multiload cluster but currently > missing documentation. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3816) [Aggregation] App-level aggregation and accumulation for YARN system metrics
[ https://issues.apache.org/jira/browse/YARN-3816?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15067321#comment-15067321 ] Sangjin Lee commented on YARN-3816: --- It seems the latest patch (v.4.1) is mostly a rebase change, so I'll wait for an updated patch that addresses the comments. To comment on some of the questions and comments, {quote} That sounds a reasonable concern here. I agree that we should get rid of metrics get messed up between system metrics and application's metrics. However, I think our goal here is not just aggregate/accumulate container metrics, but also provide aggregation service to applications' metrics (other than MR). Isn't it? If so, may be a better way is to aggregate metrcis along not only metric name but also its original entity type (so memory metrics for ContainerEntity won't be aggregated against memory metrics from Application Entity). Sangjin Lee, What do you think? {quote} If I understood your suggestion correctly, you're talking about qualifying (or scoping) the metric with the entity type so that they don't get mixed up, right? I still see that this can be problematic. Let me illustrate an example. Suppose there is an app framework called "Foo". Let's suppose Foo has a notion of "jobs" (entity type = "FooJob"), "tasks" (entity type = "FooTask") and "subtasks" (entity type = "FooSubTask"), so that a job is made up of a bunch of tasks, and each task can be made up of subtasks. Furthermore, suppose all of them emit metrics called "MEMORY" where the sum of all subtasks' memory is the same as the parent task's memory, and the sum of all tasks' memory is the same as the parent job's memory. With the idea of qualifying metrics with the entity type, still all these types will contribute MEMORY to aggregation (FooJob-to-application, FooTask-to-application, and FooSubTask-to-application), in addition to the YARN-generic container-to-application aggregation. But given their nature, things like FooSubTask-to-application and FooTask-to-application aggregation are very much redundant and thus wasteful. It's basically doing the same summation multiple times. As you suggested later, we could utilize the "toAggregate" flag for applications to exclude certain metrics from aggregation (in this case FOO would need to set toAggregate = false for all its types). But I think we need to determine how valuable it is to open this up to app-specific metrics. Also, if we were to qualify the metric names with the entity type, another complicating factor is the HBase column names for metrics. Now the aggregated metric names in the application table would need to be prefixed (or encoded in some form) with the entity type. We need to think about the implication of queries, filters, etc. To me, the most important thing we need to get right is the *YARN-generic container-to-application aggregation*. That needs to be correct and perform well in all cases. Supporting \*-to-application aggregation for app-specific metrics is somewhat secondary IMO. How about keeping it simple, and focusing on the container-to-application aggregation? 
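To make the suggestion concrete, a minimal sketch of the YARN-generic container-to-application aggregation (sum each metric name across a set of container entities); the types are simplified assumptions rather than the actual timeline service classes:
{code}
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Simplified sketch: the application-level value of each metric is the sum of
// the same metric name across all containers. Not the actual aggregation code.
public class ContainerToAppAggregator {
  public static Map<String, Long> aggregate(List<Map<String, Long>> containerMetrics) {
    Map<String, Long> appMetrics = new HashMap<>();
    for (Map<String, Long> metrics : containerMetrics) {
      for (Map.Entry<String, Long> e : metrics.entrySet()) {
        appMetrics.merge(e.getKey(), e.getValue(), Long::sum); // e.g. "MEMORY"
      }
    }
    return appMetrics;
  }
}
{code}
Qualifying metric names with the originating entity type would effectively turn the key above into something like "FooTask/MEMORY", which is exactly where the redundant FooTask-to-application and FooSubTask-to-application sums described in the comment come from.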
> [Aggregation] App-level aggregation and accumulation for YARN system metrics > > > Key: YARN-3816 > URL: https://issues.apache.org/jira/browse/YARN-3816 > Project: Hadoop YARN > Issue Type: Sub-task > Components: timelineserver >Reporter: Junping Du >Assignee: Junping Du > Labels: yarn-2928-1st-milestone > Attachments: Application Level Aggregation of Timeline Data.pdf, > YARN-3816-YARN-2928-v1.patch, YARN-3816-YARN-2928-v2.1.patch, > YARN-3816-YARN-2928-v2.2.patch, YARN-3816-YARN-2928-v2.3.patch, > YARN-3816-YARN-2928-v2.patch, YARN-3816-YARN-2928-v3.1.patch, > YARN-3816-YARN-2928-v3.patch, YARN-3816-YARN-2928-v4.patch, > YARN-3816-feature-YARN-2928.v4.1.patch, YARN-3816-poc-v1.patch, > YARN-3816-poc-v2.patch > > > We need application level aggregation of Timeline data: > - To present end user aggregated states for each application, include: > resource (CPU, Memory) consumption across all containers, number of > containers launched/completed/failed, etc. We need this for apps while they > are running as well as when they are done. > - Also, framework specific metrics, e.g. HDFS_BYTES_READ, should be > aggregated to show details of states in framework level. > - Other level (Flow/User/Queue) aggregation can be more efficient to be based > on Application-level aggregations rather than raw entity-level data as much > less raws need to scan (with filter out non-aggregated entities, like: > events, configurations, etc.). -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-4492) Add documentation for queue level preemption which is supported in Capacity scheduler
[ https://issues.apache.org/jira/browse/YARN-4492?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Naganarasimha G R updated YARN-4492: Attachment: (was: CapacityScheduler.html) > Add documentation for queue level preemption which is supported in Capacity > scheduler > - > > Key: YARN-4492 > URL: https://issues.apache.org/jira/browse/YARN-4492 > Project: Hadoop YARN > Issue Type: Bug > Components: capacity scheduler >Reporter: Naganarasimha G R >Assignee: Naganarasimha G R >Priority: Minor > Attachments: YARN-4492.v1.001.patch, YARN-4492.v1.002.patch, > YARN-4492.v1.003.patch > > > As part of YARN-2056, Support has been added to disable preemption for a > specific queue. This is a useful feature in a multiload cluster but currently > missing documentation. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-4492) Add documentation for queue level preemption which is supported in Capacity scheduler
[ https://issues.apache.org/jira/browse/YARN-4492?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Naganarasimha G R updated YARN-4492: Attachment: YARN-4492.v1.003.patch updated with your last comment ! > Add documentation for queue level preemption which is supported in Capacity > scheduler > - > > Key: YARN-4492 > URL: https://issues.apache.org/jira/browse/YARN-4492 > Project: Hadoop YARN > Issue Type: Bug > Components: capacity scheduler >Reporter: Naganarasimha G R >Assignee: Naganarasimha G R >Priority: Minor > Attachments: CapacityScheduler.html, YARN-4492.v1.001.patch, > YARN-4492.v1.002.patch, YARN-4492.v1.003.patch > > > As part of YARN-2056, Support has been added to disable preemption for a > specific queue. This is a useful feature in a multiload cluster but currently > missing documentation. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-4234) New put APIs in TimelineClient for ats v1.5
[ https://issues.apache.org/jira/browse/YARN-4234?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xuan Gong updated YARN-4234: Attachment: YARN-4234.2015-12-21.1.patch Uploaded a new patch to fix checkstyle warning > New put APIs in TimelineClient for ats v1.5 > --- > > Key: YARN-4234 > URL: https://issues.apache.org/jira/browse/YARN-4234 > Project: Hadoop YARN > Issue Type: Sub-task > Components: timelineserver >Reporter: Xuan Gong >Assignee: Xuan Gong > Attachments: YARN-4234-2015-11-13.1.patch, > YARN-4234-2015-11-16.1.patch, YARN-4234-2015-11-16.2.patch, > YARN-4234-2015.2.patch, YARN-4234.1.patch, YARN-4234.2.patch, > YARN-4234.2015-11-12.1.patch, YARN-4234.2015-11-12.1.patch, > YARN-4234.2015-11-18.1.patch, YARN-4234.2015-11-18.2.patch, > YARN-4234.2015-11-18.patch, YARN-4234.2015-12-09.patch, > YARN-4234.2015-12-09.patch, YARN-4234.2015-12-17.1.patch, > YARN-4234.2015-12-18.1.patch, YARN-4234.2015-12-18.patch, > YARN-4234.2015-12-21.1.patch, YARN-4234.20151109.patch, > YARN-4234.20151110.1.patch, YARN-4234.2015.1.patch, YARN-4234.3.patch > > > In this ticket, we will add new put APIs in timelineClient to let > clients/applications have the option to use ATS v1.5 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-4194) Extend Reservation Definition Langauge (RDL) extensions to support node labels
[ https://issues.apache.org/jira/browse/YARN-4194?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15067275#comment-15067275 ] Carlo Curino commented on YARN-4194: It was supposed to read {{+1}} from me. > Extend Reservation Definition Langauge (RDL) extensions to support node labels > -- > > Key: YARN-4194 > URL: https://issues.apache.org/jira/browse/YARN-4194 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Carlo Curino >Assignee: Alexey Tumanov > Attachments: YARN-4194-v1.patch, YARN-4194-v2.patch > > > This JIRA tracks changes to the APIs to the reservation system to support > the expressivity of node-labels. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-4194) Extend Reservation Definition Langauge (RDL) extensions to support node labels
[ https://issues.apache.org/jira/browse/YARN-4194?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15067274#comment-15067274 ] Carlo Curino commented on YARN-4194: [~atumanov], makes sense. If I read you right, you mean that the validation would depend on YARN-4476 and in any case resides outside the scope of this patch. If so, and provided you address [~subru]'s doc request, the patch is +1 from me. I can commit it as soon as you send the new version (+ holidays lag). > Extend Reservation Definition Langauge (RDL) extensions to support node labels > -- > > Key: YARN-4194 > URL: https://issues.apache.org/jira/browse/YARN-4194 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Carlo Curino >Assignee: Alexey Tumanov > Attachments: YARN-4194-v1.patch, YARN-4194-v2.patch > > > This JIRA tracks changes to the APIs to the reservation system to support > the expressivity of node-labels. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-4003) ReservationQueue inherit getAMResourceLimit() from LeafQueue, but behavior is not consistent
[ https://issues.apache.org/jira/browse/YARN-4003?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15067269#comment-15067269 ] Carlo Curino commented on YARN-4003: [~sunilg], I think what you propose is possible, but the semantics would be a bit unpleasant. Assume a reservation queue R1 launches tons of AMs, and now another reservation queue R2 is stuck not being able to run any job. I wouldn't want that... I would rather have a reservation burning its entire capacity in AMs, but allow other reservation queues to launch their jobs. I think the cleaner solution (but definitely longer term) would be to treat the RM scheduling bandwidth as a separate (reservable) resource. So a queue (and similarly a reservation) can be configured to allow up to a certain number of AMs (which in turn bounds how much RM scheduling bandwidth I am devoting to this queue). This would also make lots of sense for the federation effort: YARN-2915 (where we need to partition jobs across sub-clusters to protect the RMs from excessive AM-RM traffic due to the scale-out nature of federation). What are folks generally thinking about explicitly capturing the cost of scheduler bandwidth (e.g., a service that launches 10 tasks and never asks for anything again is much less work for the RM than an MR job running many short-lived tasks)? > ReservationQueue inherit getAMResourceLimit() from LeafQueue, but behavior is > not consistent > > > Key: YARN-4003 > URL: https://issues.apache.org/jira/browse/YARN-4003 > Project: Hadoop YARN > Issue Type: Sub-task > Components: resourcemanager >Reporter: Carlo Curino >Assignee: Carlo Curino > Attachments: YARN-4003.patch > > > The inherited behavior from LeafQueue (limit AM % based on capacity) is not a > good fit for ReservationQueue (that have highly dynamic capacity). -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-4224) Change the ATSv2 reader side REST interface to conform to current REST APIs' in YARN
[ https://issues.apache.org/jira/browse/YARN-4224?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15067263#comment-15067263 ] Li Lu commented on YARN-4224: - Thanks for the work [~varun_saxena]! A few quick concerns: 1. I understand we would like to have plural forms for listing (like /flows, /apps) and singular forms for detail (like /flow/\{uid\}). But then, why do we need both /runs/\{uid\} and /run/\{uid\}? The same question also applies to apps. 2. For endpoints with UIDs, we need to work with flow, flowrun, app, and entity. I notice we have such code for flowrun (run) and app. For entities, we require both UID and type. Why is type not a part of the UID (which means the UID is not sufficient to identify an entity)? Or, are you planning to support operations like "list all entities in a given entity type"? If it is the latter, then do we want to consider putting type into the query parameters on the entities endpoint? For flows, why are we not including a UID endpoint to locate one flow? This poses a challenge when we'd like to list all flow runs within one flow (or do we have any other endpoints to do this work?). 3. Seems like there is no full path to locate one entity from the cluster, user, flow, run, app, entity type, and entity id. Are we omitting this endpoint deliberately? 4. As a side note, in this patch there are 3 types of "shortcuts" in the URL: omit the cluster id (with default cluster id), omit user id (with default user id) and directly access app id. I'm OK with directly accessing app ids (with the cluster id), but do we want to omit the other two? Comments are more than welcome. bq. 3. We have 2 options. Either set UID in TimelineReaderManager or in the storage implementation . Advantage of former is that we are delinking UID implementation from backend storage implementation. Disadvantage is that we need to iterate over all the entities again to set UID. If we choose latter, it is the reverse. We can set UID while creating entities. But any new storage implementation needs to take care of filling UID then. I have as of now implemented the second option. Not yet added that UID needs to be filled in javadoc. bq. 4. Also UID is being returned as of now in both UID endpoint queries and non UID endpoint queries. Send UID only for former ? I'm also debating with myself on this. Right now I'm leaning towards making the UIDs transparent to the storage layer. Since UIDs will be added as an info field, it's more like an attachment to the original entities, but not a part of them. This can also keep writers simple (forcing writers to add some data to all written entities looks a little awkward?). Thoughts? > Change the ATSv2 reader side REST interface to conform to current REST APIs' > in YARN > > > Key: YARN-4224 > URL: https://issues.apache.org/jira/browse/YARN-4224 > Project: Hadoop YARN > Issue Type: Sub-task > Components: timelineserver >Affects Versions: YARN-2928 >Reporter: Varun Saxena >Assignee: Varun Saxena > Labels: yarn-2928-1st-milestone > Attachments: YARN-4224-YARN-2928.01.patch, > YARN-4224-feature-YARN-2928.wip.02.patch > > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
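As a purely hypothetical illustration of the reader-side UID idea being discussed in YARN-4224 above (composing the cluster/user/flow/run/app hierarchy into one opaque string outside the storage backend, e.g. in TimelineReaderManager); the '!' delimiter and the format are assumptions, not the encoding used by the actual patch:
{code}
import java.util.Arrays;
import java.util.List;

// Hypothetical only: the delimiter and format are illustrative assumptions.
// A real implementation would also need to escape the delimiter in each part.
public final class TimelineUIDDemo {
  private static final String DELIM = "!";

  static String encode(String... parts) {
    return String.join(DELIM, parts);
  }

  static List<String> decode(String uid) {
    return Arrays.asList(uid.split(DELIM));
  }

  public static void main(String[] args) {
    String uid = encode("yarn-cluster", "alice", "wordcount-flow",
        "1450000000000", "application_1450000000000_0001");
    // prints: yarn-cluster!alice!wordcount-flow!1450000000000!application_1450000000000_0001
    System.out.println(uid);
    System.out.println(decode(uid));
  }
}
{code}
Generating such UIDs at the reader layer (option 1 in the quoted comment) keeps storage implementations unaware of them, at the cost of a second pass over the returned entities to attach the UID info field.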
[jira] [Commented] (YARN-3863) Enhance filters in TimelineReader
[ https://issues.apache.org/jira/browse/YARN-3863?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15067175#comment-15067175 ] Hadoop QA commented on YARN-3863: - | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 0s {color} | {color:blue} Docker mode activated. {color} | | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s {color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s {color} | {color:green} The patch appears to include 5 new or modified test files. {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 13m 57s {color} | {color:green} feature-YARN-2928 passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 16s {color} | {color:green} feature-YARN-2928 passed with JDK v1.8.0_66 {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 20s {color} | {color:green} feature-YARN-2928 passed with JDK v1.7.0_91 {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 13s {color} | {color:green} feature-YARN-2928 passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 29s {color} | {color:green} feature-YARN-2928 passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 21s {color} | {color:green} feature-YARN-2928 passed {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 0m 38s {color} | {color:green} feature-YARN-2928 passed {color} | | {color:red}-1{color} | {color:red} javadoc {color} | {color:red} 0m 16s {color} | {color:red} hadoop-yarn-server-timelineservice in feature-YARN-2928 failed with JDK v1.8.0_66. {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 19s {color} | {color:green} feature-YARN-2928 passed with JDK v1.7.0_91 {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 23s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 17s {color} | {color:green} the patch passed with JDK v1.8.0_66 {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 17s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 20s {color} | {color:green} the patch passed with JDK v1.7.0_91 {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 20s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 10s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 26s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 16s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s {color} | {color:green} Patch has no whitespace issues. 
{color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 0m 41s {color} | {color:green} the patch passed {color} | | {color:red}-1{color} | {color:red} javadoc {color} | {color:red} 0m 15s {color} | {color:red} hadoop-yarn-server-timelineservice in the patch failed with JDK v1.8.0_66. {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 18s {color} | {color:green} the patch passed with JDK v1.7.0_91 {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 3m 5s {color} | {color:green} hadoop-yarn-server-timelineservice in the patch passed with JDK v1.8.0_66. {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 2m 53s {color} | {color:green} hadoop-yarn-server-timelineservice in the patch passed with JDK v1.7.0_91. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 25s {color} | {color:green} Patch does not generate ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black} 27m 35s {color} | {color:black} {color} | \\ \\ || Subsystem || Report/Notes || | Docker | Image:yetus/hadoop:7c86163 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12778914/YARN-3863-feature-YARN-2928.wip.05.patch | | JIRA Issue | YARN-3863 | | Optional Tests | asflicense compile javac javadoc mvninstall mvnsite unit findbugs checkstyle | | uname | Linux 97ce19b1b675 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_6
[jira] [Commented] (YARN-4224) Change the ATSv2 reader side REST interface to conform to current REST APIs' in YARN
[ https://issues.apache.org/jira/browse/YARN-4224?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15067160#comment-15067160 ] Hadoop QA commented on YARN-4224: - | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 0s {color} | {color:blue} Docker mode activated. {color} | | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s {color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s {color} | {color:green} The patch appears to include 4 new or modified test files. {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 8m 33s {color} | {color:green} feature-YARN-2928 passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 2m 12s {color} | {color:green} feature-YARN-2928 passed with JDK v1.8.0_66 {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 2m 33s {color} | {color:green} feature-YARN-2928 passed with JDK v1.7.0_91 {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 32s {color} | {color:green} feature-YARN-2928 passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 1s {color} | {color:green} feature-YARN-2928 passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 26s {color} | {color:green} feature-YARN-2928 passed {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 54s {color} | {color:green} feature-YARN-2928 passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 23s {color} | {color:green} feature-YARN-2928 passed with JDK v1.8.0_66 {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 5m 46s {color} | {color:green} feature-YARN-2928 passed with JDK v1.7.0_91 {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 56s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 2m 15s {color} | {color:green} the patch passed with JDK v1.8.0_66 {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 2m 15s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 2m 27s {color} | {color:green} the patch passed with JDK v1.7.0_91 {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 2m 27s {color} | {color:green} the patch passed {color} | | {color:red}-1{color} | {color:red} checkstyle {color} | {color:red} 0m 30s {color} | {color:red} Patch generated 6 new checkstyle issues in hadoop-yarn-project/hadoop-yarn (total was 72, now 77). {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 0s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 25s {color} | {color:green} the patch passed {color} | | {color:red}-1{color} | {color:red} whitespace {color} | {color:red} 0m 0s {color} | {color:red} The patch has 2 line(s) that end in whitespace. Use git apply --whitespace=fix. 
{color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 3m 8s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 22s {color} | {color:green} the patch passed with JDK v1.8.0_66 {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 5m 43s {color} | {color:green} the patch passed with JDK v1.7.0_91 {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 0m 26s {color} | {color:green} hadoop-yarn-api in the patch passed with JDK v1.8.0_66. {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 0m 25s {color} | {color:green} hadoop-yarn-api in the patch passed with JDK v1.8.0_66. {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 0m 26s {color} | {color:green} hadoop-yarn-api in the patch passed with JDK v1.7.0_91. {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 0m 26s {color} | {color:green} hadoop-yarn-api in the patch passed with JDK v1.7.0_91. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 22s {color} | {color:green} Patch does not generate ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black} 46m 35s {color} | {color:black} {color} | \\ \\ || Subsystem || Report/Notes || | Docker | Image:yetus/hadoop:7c86163 | | JIRA Patch URL | https://is
[jira] [Commented] (YARN-3480) Recovery may get very slow with lots of services with lots of app-attempts
[ https://issues.apache.org/jira/browse/YARN-3480?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15067143#comment-15067143 ] Jian He commented on YARN-3480: --- Hi [~hex108], thanks for updating. bq. If RMStateStore fails to persist any attempt, it will transition to state 'RMStateStoreState.FENCED'. I think this is not true if HA is not enabled. If HA is not enabled and fail-fast is false, the state store will remain in the ACTIVE state; see the code below in the RMStateStore class: {code} } else if (YarnConfiguration.shouldRMFailFast(getConfig())) { {code} > Recovery may get very slow with lots of services with lots of app-attempts > -- > > Key: YARN-3480 > URL: https://issues.apache.org/jira/browse/YARN-3480 > Project: Hadoop YARN > Issue Type: Sub-task > Components: resourcemanager >Affects Versions: 2.6.0 >Reporter: Jun Gong >Assignee: Jun Gong > Attachments: YARN-3480.01.patch, YARN-3480.02.patch, > YARN-3480.03.patch, YARN-3480.04.patch, YARN-3480.05.patch, > YARN-3480.06.patch, YARN-3480.07.patch, YARN-3480.08.patch, > YARN-3480.09.patch, YARN-3480.10.patch, YARN-3480.11.patch > > > When RM HA is enabled and running containers are kept across attempts, apps > are more likely to finish successfully with more retries (attempts), so it > will be better to set 'yarn.resourcemanager.am.max-attempts' larger. However, > it will make the RMStateStore (FileSystem/HDFS/ZK) store more attempts, and make > the RM recovery process much slower. It might be better to set a maximum number of attempts to be > stored in the RMStateStore. > BTW: when 'attemptFailuresValidityInterval' (introduced in YARN-611) is set to > a small value, the number of retried attempts might be very large. So we need to delete > some of the attempts stored in the RMStateStore. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
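For context on the branch quoted in the YARN-3480 comment above, here is a simplified, self-contained sketch of the failure-handling decision being described; only the shouldRMFailFast() check mirrors the quoted fragment, and the remaining names are illustrative stand-ins rather than the actual RMStateStore code:
{code}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.yarn.conf.YarnConfiguration;

// Sketch only: HA enabled -> fence the store so a standby can take over;
// no HA but fail-fast -> bring the RM down; otherwise stay ACTIVE and just
// report the error (the case Jian He points out above).
public class StoreFailureHandlingSketch {
  void onStoreOperationFailed(Exception cause, boolean haEnabled, Configuration conf) {
    if (haEnabled) {
      transitionToFenced();
    } else if (YarnConfiguration.shouldRMFailFast(conf)) {
      exitRM(cause);
    } else {
      logAndContinue(cause);
    }
  }

  private void transitionToFenced() { /* illustrative stub */ }
  private void exitRM(Exception cause) { /* illustrative stub */ }
  private void logAndContinue(Exception cause) { /* illustrative stub */ }
}
{code}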
[jira] [Commented] (YARN-4476) Matcher for complex node label expresions
[ https://issues.apache.org/jira/browse/YARN-4476?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15067126#comment-15067126 ] Wangda Tan commented on YARN-4476: -- [~chris.douglas], bq. So I put in in the nodelabels package, but don't have a strong opinion. It sounds good to me. Thanks, > Matcher for complex node label expresions > - > > Key: YARN-4476 > URL: https://issues.apache.org/jira/browse/YARN-4476 > Project: Hadoop YARN > Issue Type: Sub-task > Components: scheduler >Reporter: Chris Douglas >Assignee: Chris Douglas > Attachments: YARN-4476-0.patch, YARN-4476-1.patch > > > Implementation of a matcher for complex node label expressions based on a > [paper|http://dl.acm.org/citation.cfm?id=1807171] from SIGMOD 2010. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-3863) Enhance filters in TimelineReader
[ https://issues.apache.org/jira/browse/YARN-3863?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Varun Saxena updated YARN-3863: --- Attachment: YARN-3863-feature-YARN-2928.wip.05.patch > Enhance filters in TimelineReader > - > > Key: YARN-3863 > URL: https://issues.apache.org/jira/browse/YARN-3863 > Project: Hadoop YARN > Issue Type: Sub-task >Affects Versions: YARN-2928 >Reporter: Varun Saxena >Assignee: Varun Saxena > Labels: yarn-2928-1st-milestone > Attachments: YARN-3863-feature-YARN-2928.wip.003.patch, > YARN-3863-feature-YARN-2928.wip.01.patch, > YARN-3863-feature-YARN-2928.wip.02.patch, > YARN-3863-feature-YARN-2928.wip.04.patch, > YARN-3863-feature-YARN-2928.wip.05.patch > > > Currently filters in timeline reader will return an entity only if all the > filter conditions hold true i.e. only AND operation is supported. We can > support OR operation for the filters as well. Additionally as primary backend > implementation is HBase, we can design our filters in a manner, where they > closely resemble HBase Filters. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-4304) AM max resource configuration per partition to be displayed/updated correctly in UI and in various partition related metrics
[ https://issues.apache.org/jira/browse/YARN-4304?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15067091#comment-15067091 ] Hadoop QA commented on YARN-4304: - | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 0s {color} | {color:blue} Docker mode activated. {color} | | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s {color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s {color} | {color:green} The patch appears to include 3 new or modified test files. {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 8m 10s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 30s {color} | {color:green} trunk passed with JDK v1.8.0_66 {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 32s {color} | {color:green} trunk passed with JDK v1.7.0_91 {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 13s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 39s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 15s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 16s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 24s {color} | {color:green} trunk passed with JDK v1.8.0_66 {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 28s {color} | {color:green} trunk passed with JDK v1.7.0_91 {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 37s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 29s {color} | {color:green} the patch passed with JDK v1.8.0_66 {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 29s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 33s {color} | {color:green} the patch passed with JDK v1.7.0_91 {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 33s {color} | {color:green} the patch passed {color} | | {color:red}-1{color} | {color:red} checkstyle {color} | {color:red} 0m 14s {color} | {color:red} Patch generated 19 new checkstyle issues in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager (total was 175, now 186). {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 38s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 16s {color} | {color:green} the patch passed {color} | | {color:red}-1{color} | {color:red} whitespace {color} | {color:red} 0m 0s {color} | {color:red} The patch has 2 line(s) with tabs. 
{color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 25s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 23s {color} | {color:green} the patch passed with JDK v1.8.0_66 {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 29s {color} | {color:green} the patch passed with JDK v1.7.0_91 {color} | | {color:red}-1{color} | {color:red} unit {color} | {color:red} 65m 43s {color} | {color:red} hadoop-yarn-server-resourcemanager in the patch failed with JDK v1.8.0_66. {color} | | {color:red}-1{color} | {color:red} unit {color} | {color:red} 66m 42s {color} | {color:red} hadoop-yarn-server-resourcemanager in the patch failed with JDK v1.7.0_91. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 23s {color} | {color:green} Patch does not generate ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black} 151m 23s {color} | {color:black} {color} | \\ \\ || Reason || Tests || | JDK v1.8.0_66 Failed junit tests | hadoop.yarn.server.resourcemanager.TestClientRMTokens | | | hadoop.yarn.server.resourcemanager.TestAMAuthorization | | | hadoop.yarn.server.resourcemanager.scheduler.fifo.TestFifoScheduler | | JDK v1.7.0_91 Failed junit tests | hadoop.yarn.server.resourcemanager.TestClientRMTokens | | | hadoop.yarn.server.resourcemanager.TestAMAuthorization | \\ \\ || Subsystem || Report/Notes || | Docker | Image:yetus/hadoop:0ca8df7 | | JIRA Patch URL |
[jira] [Commented] (YARN-3586) RM only get back addresses of Collectors that NM needs to know.
[ https://issues.apache.org/jira/browse/YARN-3586?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15067082#comment-15067082 ] Varun Saxena commented on YARN-3586: Thanks [~djp] for updating the patch. +1 on the latest patch. Will wait for a day before committing it in case others have comments. > RM only get back addresses of Collectors that NM needs to know. > --- > > Key: YARN-3586 > URL: https://issues.apache.org/jira/browse/YARN-3586 > Project: Hadoop YARN > Issue Type: Sub-task > Components: resourcemanager, timelineserver >Reporter: Junping Du >Assignee: Junping Du >Priority: Critical > Labels: yarn-2928-1st-milestone > Attachments: YARN-3586-demo.patch, YARN-3586-feature-YARN-2928.patch, > YARN-3586-feature-YARN-2928.v2.patch > > > After YARN-3445, RM cache runningApps for each NM. So RM heartbeat back to NM > should only include collectors' address for running applications against > specific NM. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
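To make the intended behavior concrete, here is a minimal sketch of the filtering described above: the RM keeps a map from application to collector address and, on a heartbeat, returns only the entries for applications running on that NM. The class and method names are hypothetical and are not the actual RM/NM heartbeat code.
{code}
import java.util.HashMap;
import java.util.Map;
import java.util.Set;

// Hypothetical stand-in for the RM-side bookkeeping: all known
// app -> collector addresses, and the per-NM set of running apps.
public class CollectorAddressFilter {

  /** Returns only the collector addresses for apps running on the given NM. */
  public static Map<String, String> collectorsForNode(
      Map<String, String> allCollectors, Set<String> runningAppsOnNode) {
    Map<String, String> forNode = new HashMap<>();
    for (String appId : runningAppsOnNode) {
      String address = allCollectors.get(appId);
      if (address != null) {
        forNode.put(appId, address);
      }
    }
    return forNode;
  }
}
{code}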
[jira] [Commented] (YARN-3586) RM only get back addresses of Collectors that NM needs to know.
[ https://issues.apache.org/jira/browse/YARN-3586?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15067079#comment-15067079 ] Hadoop QA commented on YARN-3586: - | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 0s {color} | {color:blue} Docker mode activated. {color} | | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s {color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s {color} | {color:green} The patch appears to include 1 new or modified test files. {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 8m 53s {color} | {color:green} feature-YARN-2928 passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 30s {color} | {color:green} feature-YARN-2928 passed with JDK v1.8.0_66 {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 35s {color} | {color:green} feature-YARN-2928 passed with JDK v1.7.0_91 {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 17s {color} | {color:green} feature-YARN-2928 passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 45s {color} | {color:green} feature-YARN-2928 passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 19s {color} | {color:green} feature-YARN-2928 passed {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 27s {color} | {color:green} feature-YARN-2928 passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 25s {color} | {color:green} feature-YARN-2928 passed with JDK v1.8.0_66 {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 32s {color} | {color:green} feature-YARN-2928 passed with JDK v1.7.0_91 {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 41s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 34s {color} | {color:green} the patch passed with JDK v1.8.0_66 {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 34s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 36s {color} | {color:green} the patch passed with JDK v1.7.0_91 {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 36s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 17s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 43s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 18s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s {color} | {color:green} Patch has no whitespace issues. 
{color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 34s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 29s {color} | {color:green} the patch passed with JDK v1.8.0_66 {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 34s {color} | {color:green} the patch passed with JDK v1.7.0_91 {color} | | {color:red}-1{color} | {color:red} unit {color} | {color:red} 61m 17s {color} | {color:red} hadoop-yarn-server-resourcemanager in the patch failed with JDK v1.8.0_66. {color} | | {color:red}-1{color} | {color:red} unit {color} | {color:red} 61m 32s {color} | {color:red} hadoop-yarn-server-resourcemanager in the patch failed with JDK v1.7.0_91. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 25s {color} | {color:green} Patch does not generate ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black} 143m 54s {color} | {color:black} {color} | \\ \\ || Reason || Tests || | JDK v1.8.0_66 Failed junit tests | hadoop.yarn.server.resourcemanager.TestAMAuthorization | | | hadoop.yarn.server.resourcemanager.TestClientRMTokens | | JDK v1.7.0_91 Failed junit tests | hadoop.yarn.server.resourcemanager.TestAMAuthorization | | | hadoop.yarn.server.resourcemanager.TestClientRMTokens | \\ \\ || Subsystem || Report/Notes || | Docker | Image:yetus/hadoop:7c86163 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12778866/YARN-3586-feature-YARN-2928.v2.patch
[jira] [Comment Edited] (YARN-4100) Add Documentation for Distributed and Delegated-Centralized Node Labels feature
[ https://issues.apache.org/jira/browse/YARN-4100?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15067070#comment-15067070 ] Wangda Tan edited comment on YARN-4100 at 12/21/15 9:15 PM: Thanks [~Naganarasimha] and reviews from [~dian.fu], Some suggestions: - {{configuring labels to nodes in Distributed/(Delegated-Centralized) NodeLabel setup}} should be a part of {{Setting up ResourceManager to enable Node Labels}}. - If you agree with {{Add/modify node-to-labels mapping to YARN}} should not include all the options. You can include an example like configuring ConfigurationNodeLabelsProvider to show how to get node-to-labels mapping to YARN. - {{configuring labels to ..}} -> {{Configuring...}} - {{yarn.node-labels.configuration-type | Set configuration type for node labels. Administrators can specify *"centralized"*, *"delegated-centralized"* or *"distributed"*. Default value is *"centralized"* and *"delegated-centralized"* needs to be set to fetch the labels from a interface in RM.}}. "Default value is *"centralized"* and *"delegated-centralized"* needs to be set to fetch.." This is a little confusing to me: default value should be "centralized" only, and explanation of "delegated-centralized" should in the {{Setting up ResourceManager to enable Node Labels}}. - {{**Distributed :** Mapping can be done through NM exposed interface *"NodeLabelsProvider"*...}}: This is too implementation-detailed to me, do you think is it better to say "Mapping will be set by configured NodeLabelsProvider" in NM, and then describe two different kinds of Providers - {{**Delegated-Centralized :** Mapping can be done through RM exposed interface *"RMNodeLabelsMappingProvider"*...}} like above, could we say: "Mapping will be set by configured NodeLabelsProvider" in RM. - For above two, {{Mapping}} -> {{Node-to-labels mapping}} - {{By default 2 implementations are supported:}} -> "We have two different providers in YARN: ..." Thoughts? was (Author: leftnoteasy): Thanks [~Naganarasimha] and reviews from [~dian.fu], some suggestions: Some suggestions: - {{configuring labels to nodes in Distributed/(Delegated-Centralized) NodeLabel setup}} should be a part of {{Setting up ResourceManager to enable Node Labels}}. - If you agree with {{Add/modify node-to-labels mapping to YARN}} should not include all the options. You can include an example like configuring ConfigurationNodeLabelsProvider to show how to get node-to-labels mapping to YARN. - {{configuring labels to ..}} -> {{Configuring...}} - {{yarn.node-labels.configuration-type | Set configuration type for node labels. Administrators can specify *"centralized"*, *"delegated-centralized"* or *"distributed"*. Default value is *"centralized"* and *"delegated-centralized"* needs to be set to fetch the labels from a interface in RM.}}. "Default value is *"centralized"* and *"delegated-centralized"* needs to be set to fetch.." This is a little confusing to me: default value should be "centralized" only, and explanation of "delegated-centralized" should in the {{Setting up ResourceManager to enable Node Labels}}. 
- {{**Distributed :** Mapping can be done through NM exposed interface *"NodeLabelsProvider"*...}}: This is too implementation-detailed to me, do you think is it better to say "Mapping will be set by configured NodeLabelsProvider" in NM, and then describe two different kinds of Providers - {{**Delegated-Centralized :** Mapping can be done through RM exposed interface *"RMNodeLabelsMappingProvider"*...}} like above, could we say: "Mapping will be set by configured NodeLabelsProvider" in RM. - For above two, {{Mapping}} -> {{Node-to-labels mapping}} - {{By default 2 implementations are supported:}} -> "We have two different providers in YARN: ..." > Add Documentation for Distributed and Delegated-Centralized Node Labels > feature > --- > > Key: YARN-4100 > URL: https://issues.apache.org/jira/browse/YARN-4100 > Project: Hadoop YARN > Issue Type: Sub-task > Components: api, client, resourcemanager >Reporter: Naganarasimha G R >Assignee: Naganarasimha G R > Attachments: NodeLabel.html, YARN-4100.v1.001.patch, > YARN-4100.v1.002.patch, YARN-4100.v1.003.patch > > > Add Documentation for Distributed Node Labels feature -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-4100) Add Documentation for Distributed and Delegated-Centralized Node Labels feature
[ https://issues.apache.org/jira/browse/YARN-4100?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15067070#comment-15067070 ] Wangda Tan commented on YARN-4100: -- Thanks [~Naganarasimha] and reviews from [~dian.fu], some suggestions: Some suggestions: - {{configuring labels to nodes in Distributed/(Delegated-Centralized) NodeLabel setup}} should be a part of {{Setting up ResourceManager to enable Node Labels}}. - If you agree with {{Add/modify node-to-labels mapping to YARN}} should not include all the options. You can include an example like configuring ConfigurationNodeLabelsProvider to show how to get node-to-labels mapping to YARN. - {{configuring labels to ..}} -> {{Configuring...}} - {{yarn.node-labels.configuration-type | Set configuration type for node labels. Administrators can specify *"centralized"*, *"delegated-centralized"* or *"distributed"*. Default value is *"centralized"* and *"delegated-centralized"* needs to be set to fetch the labels from a interface in RM.}}. "Default value is *"centralized"* and *"delegated-centralized"* needs to be set to fetch.." This is a little confusing to me: default value should be "centralized" only, and explanation of "delegated-centralized" should in the {{Setting up ResourceManager to enable Node Labels}}. - {{**Distributed :** Mapping can be done through NM exposed interface *"NodeLabelsProvider"*...}}: This is too implementation-detailed to me, do you think is it better to say "Mapping will be set by configured NodeLabelsProvider" in NM, and then describe two different kinds of Providers - {{**Delegated-Centralized :** Mapping can be done through RM exposed interface *"RMNodeLabelsMappingProvider"*...}} like above, could we say: "Mapping will be set by configured NodeLabelsProvider" in RM. - For above two, {{Mapping}} -> {{Node-to-labels mapping}} - {{By default 2 implementations are supported:}} -> "We have two different providers in YARN: ..." > Add Documentation for Distributed and Delegated-Centralized Node Labels > feature > --- > > Key: YARN-4100 > URL: https://issues.apache.org/jira/browse/YARN-4100 > Project: Hadoop YARN > Issue Type: Sub-task > Components: api, client, resourcemanager >Reporter: Naganarasimha G R >Assignee: Naganarasimha G R > Attachments: NodeLabel.html, YARN-4100.v1.001.patch, > YARN-4100.v1.002.patch, YARN-4100.v1.003.patch > > > Add Documentation for Distributed Node Labels feature -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-4224) Change the ATSv2 reader side REST interface to conform to current REST APIs' in YARN
[ https://issues.apache.org/jira/browse/YARN-4224?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15067056#comment-15067056 ] Varun Saxena commented on YARN-4224: Updated a WIP patch. A few points; I would like to get views on them so that we can reach agreement. 1. The UID is filled in info even if the fields do not indicate that INFO has to be returned. Moreover, the key is fixed as "UID". Should we make it configurable, or would documenting it be enough? The user should not send any info key named UID then. 2. The same goes for the UID delimiter: should it be configurable? 3. We have 2 options: either set the UID in TimelineReaderManager or in the storage implementation. The advantage of the former is that we delink the UID implementation from the backend storage implementation; the disadvantage is that we need to iterate over all the entities again to set the UID. If we choose the latter, it is the reverse: we can set the UID while creating entities, but any new storage implementation then needs to take care of filling the UID. For now I have implemented the second option. I have not yet noted in the javadoc that the UID needs to be filled. 4. Also, the UID is currently returned in both UID endpoint queries and non-UID endpoint queries. Should we send the UID only for the former? > Change the ATSv2 reader side REST interface to conform to current REST APIs' > in YARN > > > Key: YARN-4224 > URL: https://issues.apache.org/jira/browse/YARN-4224 > Project: Hadoop YARN > Issue Type: Sub-task > Components: timelineserver >Affects Versions: YARN-2928 >Reporter: Varun Saxena >Assignee: Varun Saxena > Labels: yarn-2928-1st-milestone > Attachments: YARN-4224-YARN-2928.01.patch, > YARN-4224-feature-YARN-2928.wip.02.patch > > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
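To visualize the two open questions above (the fixed "UID" info key and the delimiter), here is a minimal, self-contained sketch of the kind of UID construction being discussed; the delimiter, the key name, and the helper itself are illustrative assumptions, not the actual TimelineReader code.
{code}
import java.util.HashMap;
import java.util.Map;

public class EntityUidSketch {

  // Question 1 above: should this key stay fixed or be configurable?
  static final String UID_KEY = "UID";
  // Question 2 above: should this delimiter be configurable?
  static final String DELIMITER = "!";

  // Joins the hierarchical parts (cluster, user, flow, flow run, app, ...)
  // into a single UID string.
  static String buildUid(String... parts) {
    return String.join(DELIMITER, parts);
  }

  public static void main(String[] args) {
    Map<String, Object> info = new HashMap<>();
    info.put(UID_KEY, buildUid("cluster1", "user1", "flow1", "1", "application_1"));
    System.out.println(info); // {UID=cluster1!user1!flow1!1!application_1}
  }
}
{code}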
[jira] [Updated] (YARN-4224) Change the ATSv2 reader side REST interface to conform to current REST APIs' in YARN
[ https://issues.apache.org/jira/browse/YARN-4224?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Varun Saxena updated YARN-4224: --- Attachment: YARN-4224-feature-YARN-2928.wip.02.patch > Change the ATSv2 reader side REST interface to conform to current REST APIs' > in YARN > > > Key: YARN-4224 > URL: https://issues.apache.org/jira/browse/YARN-4224 > Project: Hadoop YARN > Issue Type: Sub-task > Components: timelineserver >Affects Versions: YARN-2928 >Reporter: Varun Saxena >Assignee: Varun Saxena > Labels: yarn-2928-1st-milestone > Attachments: YARN-4224-YARN-2928.01.patch, > YARN-4224-feature-YARN-2928.wip.02.patch > > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-4290) Add -showDetails option to YARN Nodes CLI to print all nodes reports information
[ https://issues.apache.org/jira/browse/YARN-4290?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15066986#comment-15066986 ] Hudson commented on YARN-4290: -- FAILURE: Integrated in Hadoop-trunk-Commit #9009 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/9009/]) YARN-4290. Add -showDetails option to YARN Nodes CLI to print all nodes (wangda: rev a0de7028515eebe1c526cc42808cdbc8ed6b4e2a) * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client/src/test/java/org/apache/hadoop/yarn/client/cli/TestYarnCLI.java * hadoop-yarn-project/CHANGES.txt * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client/src/main/java/org/apache/hadoop/yarn/client/cli/NodeCLI.java > Add -showDetails option to YARN Nodes CLI to print all nodes reports > information > > > Key: YARN-4290 > URL: https://issues.apache.org/jira/browse/YARN-4290 > Project: Hadoop YARN > Issue Type: Improvement > Components: client >Reporter: Wangda Tan >Assignee: Sunil G > Fix For: 2.8.0 > > Attachments: 0002-YARN-4290.patch, 0003-YARN-4290.patch > > > Currently, "yarn nodes -list" command only shows > - "Node-Id", > - "Node-State", > - "Node-Http-Address", > - "Number-of-Running-Containers" > I think we need to show more information such as used resource, just like > "yarn nodes -status" command. > Maybe we can add a parameter to -list, such as "-show-details" to enable > printing all detailed information. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-4454) NM to nodelabel mapping going wrong after RM restart
[ https://issues.apache.org/jira/browse/YARN-4454?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15066987#comment-15066987 ] Hudson commented on YARN-4454: -- FAILURE: Integrated in Hadoop-trunk-Commit #9009 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/9009/]) YARN-4454. NM to nodelabel mapping going wrong after RM restart. (Bibin (wangda: rev bc038b382cb2ce561ce718405fbcee4382f3b204) * hadoop-yarn-project/CHANGES.txt * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/TestRMRestart.java * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/nodelabels/CommonNodeLabelsManager.java > NM to nodelabel mapping going wrong after RM restart > > > Key: YARN-4454 > URL: https://issues.apache.org/jira/browse/YARN-4454 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Bibin A Chundatt >Assignee: Bibin A Chundatt >Priority: Critical > Fix For: 2.8.0 > > Attachments: 0001-YARN-4454.patch, test.patch > > > *Nodelabel mapping with NodeManager is going wrong if combination of > hostname and then NodeId is used to update nodelabel mapping* > *Steps to reproduce* > 1.Create cluster with 2 NM > 2.Add label X,Y to cluster > 3.replace Label of node 1 using ,x > 4.replace label for node 1 by ,y > 5.Again replace label of node 1 by ,x > Check cluster label mapping HOSTNAME1 will be mapped with X > Now restart RM 2 times NODE LABEL mapping of HOSTNAME1:PORT changes to Y > {noformat} > 2015-12-14 17:17:54,901 INFO > org.apache.hadoop.yarn.nodelabels.CommonNodeLabelsManager: Add labels: > [,] > 2015-12-14 17:17:54,905 INFO > org.apache.hadoop.yarn.nodelabels.CommonNodeLabelsManager: REPLACE labels on > nodes: > 2015-12-14 17:17:54,906 INFO > org.apache.hadoop.yarn.nodelabels.CommonNodeLabelsManager: > NM=host-10-19-92-188:64318, labels=[ResourcePool_1] > 2015-12-14 17:17:54,906 INFO > org.apache.hadoop.yarn.nodelabels.CommonNodeLabelsManager: > NM=host-10-19-92-188:0, labels=[ResourcePool_null] > 2015-12-14 17:17:54,906 INFO > org.apache.hadoop.yarn.nodelabels.CommonNodeLabelsManager: > NM=host-10-19-92-187:64318, labels=[ResourcePool_null] > {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3863) Enhance filters in TimelineReader
[ https://issues.apache.org/jira/browse/YARN-3863?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15066962#comment-15066962 ] Hadoop QA commented on YARN-3863: - | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 0s {color} | {color:blue} Docker mode activated. {color} | | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s {color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s {color} | {color:green} The patch appears to include 5 new or modified test files. {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 8m 33s {color} | {color:green} feature-YARN-2928 passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 18s {color} | {color:green} feature-YARN-2928 passed with JDK v1.8.0_66 {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 20s {color} | {color:green} feature-YARN-2928 passed with JDK v1.7.0_91 {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 11s {color} | {color:green} feature-YARN-2928 passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 28s {color} | {color:green} feature-YARN-2928 passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 17s {color} | {color:green} feature-YARN-2928 passed {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 0m 36s {color} | {color:green} feature-YARN-2928 passed {color} | | {color:red}-1{color} | {color:red} javadoc {color} | {color:red} 0m 16s {color} | {color:red} hadoop-yarn-server-timelineservice in feature-YARN-2928 failed with JDK v1.8.0_66. {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 18s {color} | {color:green} feature-YARN-2928 passed with JDK v1.7.0_91 {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 25s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 19s {color} | {color:green} the patch passed with JDK v1.8.0_66 {color} | | {color:red}-1{color} | {color:red} javac {color} | {color:red} 2m 1s {color} | {color:red} hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-timelineservice-jdk1.8.0_66 with JDK v1.8.0_66 generated 1 new issues (was 0, now 1). {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 19s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 21s {color} | {color:green} the patch passed with JDK v1.7.0_91 {color} | | {color:red}-1{color} | {color:red} javac {color} | {color:red} 2m 22s {color} | {color:red} hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-timelineservice-jdk1.7.0_91 with JDK v1.7.0_91 generated 1 new issues (was 0, now 1). 
{color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 21s {color} | {color:green} the patch passed {color} | | {color:red}-1{color} | {color:red} checkstyle {color} | {color:red} 0m 11s {color} | {color:red} Patch generated 1 new checkstyle issues in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-timelineservice (total was 157, now 151). {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 29s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 16s {color} | {color:green} the patch passed {color} | | {color:red}-1{color} | {color:red} whitespace {color} | {color:red} 0m 0s {color} | {color:red} The patch has 2 line(s) that end in whitespace. Use git apply --whitespace=fix. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 0m 45s {color} | {color:green} the patch passed {color} | | {color:red}-1{color} | {color:red} javadoc {color} | {color:red} 0m 16s {color} | {color:red} hadoop-yarn-server-timelineservice in the patch failed with JDK v1.8.0_66. {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 20s {color} | {color:green} the patch passed with JDK v1.7.0_91 {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 3m 38s {color} | {color:green} hadoop-yarn-server-timelineservice in the patch passed with JDK v1.8.0_66. {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 3m 36s {color} | {color:green} hadoop-yarn-server-timelineservice in the patch passed with JDK v1.7.0_91. {color} | | {color:green}+1{color} | {col
[jira] [Commented] (YARN-4454) NM to nodelabel mapping going wrong after RM restart
[ https://issues.apache.org/jira/browse/YARN-4454?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15066945#comment-15066945 ] Wangda Tan commented on YARN-4454: -- Looks good, +1. thanks [~bibinchundatt]! Committing.. > NM to nodelabel mapping going wrong after RM restart > > > Key: YARN-4454 > URL: https://issues.apache.org/jira/browse/YARN-4454 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Bibin A Chundatt >Assignee: Bibin A Chundatt >Priority: Critical > Attachments: 0001-YARN-4454.patch, test.patch > > > *Nodelabel mapping with NodeManager is going wrong if combination of > hostname and then NodeId is used to update nodelabel mapping* > *Steps to reproduce* > 1.Create cluster with 2 NM > 2.Add label X,Y to cluster > 3.replace Label of node 1 using ,x > 4.replace label for node 1 by ,y > 5.Again replace label of node 1 by ,x > Check cluster label mapping HOSTNAME1 will be mapped with X > Now restart RM 2 times NODE LABEL mapping of HOSTNAME1:PORT changes to Y > {noformat} > 2015-12-14 17:17:54,901 INFO > org.apache.hadoop.yarn.nodelabels.CommonNodeLabelsManager: Add labels: > [,] > 2015-12-14 17:17:54,905 INFO > org.apache.hadoop.yarn.nodelabels.CommonNodeLabelsManager: REPLACE labels on > nodes: > 2015-12-14 17:17:54,906 INFO > org.apache.hadoop.yarn.nodelabels.CommonNodeLabelsManager: > NM=host-10-19-92-188:64318, labels=[ResourcePool_1] > 2015-12-14 17:17:54,906 INFO > org.apache.hadoop.yarn.nodelabels.CommonNodeLabelsManager: > NM=host-10-19-92-188:0, labels=[ResourcePool_null] > 2015-12-14 17:17:54,906 INFO > org.apache.hadoop.yarn.nodelabels.CommonNodeLabelsManager: > NM=host-10-19-92-187:64318, labels=[ResourcePool_null] > {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-4468) Document the general ReservationSystem functionality, and the REST API
[ https://issues.apache.org/jira/browse/YARN-4468?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15066955#comment-15066955 ] Carlo Curino commented on YARN-4468: The second attachment includes a new ReservationSystem.md description and links from the YARN.md page. Documentation in CapacityScheduler.md and FairScheduler.md on how to configure the reservation system is still missing. > Document the general ReservationSystem functionality, and the REST API > -- > > Key: YARN-4468 > URL: https://issues.apache.org/jira/browse/YARN-4468 > Project: Hadoop YARN > Issue Type: Sub-task > Components: capacityscheduler, fairscheduler, resourcemanager >Reporter: Carlo Curino > Attachments: YARN-4468.1.patch, YARN-4468.rest-only.patch > > > This JIRA tracks effort to document the ReservationSystem functionality, and > the REST API access to it. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-4468) Document the general ReservationSystem functionality, and the REST API
[ https://issues.apache.org/jira/browse/YARN-4468?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Carlo Curino updated YARN-4468: --- Attachment: YARN-4468.1.patch > Document the general ReservationSystem functionality, and the REST API > -- > > Key: YARN-4468 > URL: https://issues.apache.org/jira/browse/YARN-4468 > Project: Hadoop YARN > Issue Type: Sub-task > Components: capacityscheduler, fairscheduler, resourcemanager >Reporter: Carlo Curino > Attachments: YARN-4468.1.patch, YARN-4468.rest-only.patch > > > This JIRA tracks effort to document the ReservationSystem functionality, and > the REST API access to it. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-4290) Add -showDetails option to YARN Nodes CLI to print all nodes reports information
[ https://issues.apache.org/jira/browse/YARN-4290?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wangda Tan updated YARN-4290: - Summary: Add -showDetails option to YARN Nodes CLI to print all nodes reports information (was: "yarn nodes -list" should print all nodes reports information) > Add -showDetails option to YARN Nodes CLI to print all nodes reports > information > > > Key: YARN-4290 > URL: https://issues.apache.org/jira/browse/YARN-4290 > Project: Hadoop YARN > Issue Type: Improvement > Components: client >Reporter: Wangda Tan >Assignee: Sunil G > Attachments: 0002-YARN-4290.patch, 0003-YARN-4290.patch > > > Currently, "yarn nodes -list" command only shows > - "Node-Id", > - "Node-State", > - "Node-Http-Address", > - "Number-of-Running-Containers" > I think we need to show more information such as used resource, just like > "yarn nodes -status" command. > Maybe we can add a parameter to -list, such as "-show-details" to enable > printing all detailed information. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-110) AM releases too many containers due to the protocol
[ https://issues.apache.org/jira/browse/YARN-110?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15066946#comment-15066946 ] Giovanni Matteo Fumarola commented on YARN-110: --- [~kasha] My idea is to keep a list of requested containers in AppSchedulingInfo. When the RM sends containers to the AM and, in the same heartbeat, the AM asks for containers, the added check forwards the correct number of containers to the capacity scheduler. After the vacation I will rebase my patch and push it. > AM releases too many containers due to the protocol > --- > > Key: YARN-110 > URL: https://issues.apache.org/jira/browse/YARN-110 > Project: Hadoop YARN > Issue Type: Bug > Components: resourcemanager, scheduler >Reporter: Arun C Murthy >Assignee: Arun C Murthy > Attachments: YARN-110.patch > > > - AM sends request asking 4 containers on host H1. > - Asynchronously, host H1 reaches RM and gets assigned 4 containers. RM at > this point, sets the value against H1 to > zero in its aggregate request-table for all apps. > - In the mean-while AM gets to need 3 more containers, so a total of 7 > including the 4 from previous request. > - Today, AM sends the absolute number of 7 against H1 to RM as part of its > request table. > - RM seems to be overriding its earlier value of zero against H1 to 7 against > H1. And thus allocating 7 more > containers. > - AM already gets 4 in this scheduling iteration, but gets 7 more, a total of > 11 instead of the required 7. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
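The numbers in the description above can be walked through directly; the following toy calculation shows both today's behavior and the same-heartbeat adjustment proposed in the comment. The numbers come from the description, and everything else (names, structure) is illustrative.
{code}
public class AskAdjustmentSketch {

  public static void main(String[] args) {
    int allocatedThisHeartbeat = 4; // RM already assigned 4 on H1 and zeroed its table
    int newTotalNeed = 7;           // AM now needs 7 in total (4 old + 3 new)

    // Today: the AM's absolute ask of 7 overrides the zero, so 7 more are allocated.
    int totalWithoutFix = allocatedThisHeartbeat + newTotalNeed;          // 11

    // Proposed: subtract what was already handed out in the same heartbeat.
    int adjustedAsk = Math.max(0, newTotalNeed - allocatedThisHeartbeat); // 3
    int totalWithFix = allocatedThisHeartbeat + adjustedAsk;              // 7

    System.out.println("without adjustment: " + totalWithoutFix + " containers");
    System.out.println("with adjustment:    " + totalWithFix + " containers");
  }
}
{code}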
[jira] [Commented] (YARN-4224) Change the ATSv2 reader side REST interface to conform to current REST APIs' in YARN
[ https://issues.apache.org/jira/browse/YARN-4224?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15066917#comment-15066917 ] Li Lu commented on YARN-4224: - bq. We will pass optional parameters as part of query param. Correct ? I think that's the case. [~sjlee0], did you mean providing shortcuts to things like applications (instead of cluster, user, flow, flowrun, app, we can directly have cluster and app)? bq. I am frankly fine with making the delimiter and how we construct UIDs' public. I'm also fine with it after giving it some thought. It looks inevitable, since we need to expose the way we form the UIDs to users anyway. Since we're reaching agreement on most of the important factors of this issue, maybe we can kick off the work on this JIRA? Thanks! > Change the ATSv2 reader side REST interface to conform to current REST APIs' > in YARN > > > Key: YARN-4224 > URL: https://issues.apache.org/jira/browse/YARN-4224 > Project: Hadoop YARN > Issue Type: Sub-task > Components: timelineserver >Affects Versions: YARN-2928 >Reporter: Varun Saxena >Assignee: Varun Saxena > Labels: yarn-2928-1st-milestone > Attachments: YARN-4224-YARN-2928.01.patch > > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3863) Enhance filters in TimelineReader
[ https://issues.apache.org/jira/browse/YARN-3863?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15066910#comment-15066910 ] Varun Saxena commented on YARN-3863: Updated another WIP patch with filters for created and modified time. > Enhance filters in TimelineReader > - > > Key: YARN-3863 > URL: https://issues.apache.org/jira/browse/YARN-3863 > Project: Hadoop YARN > Issue Type: Sub-task >Affects Versions: YARN-2928 >Reporter: Varun Saxena >Assignee: Varun Saxena > Labels: yarn-2928-1st-milestone > Attachments: YARN-3863-feature-YARN-2928.wip.003.patch, > YARN-3863-feature-YARN-2928.wip.01.patch, > YARN-3863-feature-YARN-2928.wip.02.patch, > YARN-3863-feature-YARN-2928.wip.04.patch > > > Currently filters in timeline reader will return an entity only if all the > filter conditions hold true i.e. only AND operation is supported. We can > support OR operation for the filters as well. Additionally as primary backend > implementation is HBase, we can design our filters in a manner, where they > closely resemble HBase Filters. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-4304) AM max resource configuration per partition to be displayed/updated correctly in UI and in various partition related metrics
[ https://issues.apache.org/jira/browse/YARN-4304?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15066912#comment-15066912 ] Wangda Tan commented on YARN-4304: -- Thanks for update [~sunilg], Some minor comments: 1) CapacitySchedulerPage: - You can fetch {{lqinfo.getUsers().getUsersList()}} only once. - {{(resourceUsages.getAmUsed() == null) ? "N/A"}}, is it better to use Resource.None() instead of N/A? 2) LeafQueue: - I'm not sure if this is required:
{code}
public synchronized Resource getAMResourceLimit() {
  // Ensure we calculate limit when its not pre-computed
  if (queueUsage.getAMLimit().equals(Resources.none())) {
{code}
Since calculateAndGetAMResourceLimit is called by activateApplications, and activateApplications is called by updateClusterResource. It will be updated when cluster resource changed or queue configuration changed (initialized). I think the getAMResourceLimit should safely return queueUsage.getAMLimit directly. - getAMResourceLimit doesn't need synchronized lock - getUserAMResourceLimit is used by Tests and REST API only. I think REST API can use Resource from UsersInfo and AMResourceLimit, no need to access queue's synchronized lock. And I think you can move following code to CapacitySchedulerLeafQueueInfo:
{code}
// Get UserInfo from first user to calculate AM Resource Limit per user.
ResourceInfo userAMResourceLimit = null;
if (lqinfo.getUsers().getUsersList().isEmpty()) {
  // If no users are present, consider AM Limit for that queue.
  userAMResourceLimit = resourceUsages.getAMResourceLimit();
} else {
  userAMResourceLimit = lqinfo.getUsers().getUsersList().get(0)
      .getResourceUsageInfo().getPartitionResourceUsageInfo(label)
      .getAMResourceLimit();
}
{code}
> AM max resource configuration per partition to be displayed/updated correctly > in UI and in various partition related metrics > > > Key: YARN-4304 > URL: https://issues.apache.org/jira/browse/YARN-4304 > Project: Hadoop YARN > Issue Type: Sub-task > Components: webapp >Affects Versions: 2.7.1 >Reporter: Sunil G >Assignee: Sunil G > Attachments: 0001-YARN-4304.patch, 0002-YARN-4304.patch, > 0003-YARN-4304.patch, 0004-YARN-4304.patch, 0005-YARN-4304.patch, > 0005-YARN-4304.patch, 0006-YARN-4304.patch, REST_and_UI.zip > > > As we are supporting per-partition level max AM resource percentage > configuration, UI and various metrics also need to display correct > configurations related to same. > For eg: Current UI still shows am-resource percentage per queue level. This > is to be updated correctly when label config is used. > - Display max-am-percentage per-partition in Scheduler UI (label also) and in > ClusterMetrics page > - Update queue/partition related metrics w.r.t per-partition > am-resource-percentage -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-3863) Enhance filters in TimelineReader
[ https://issues.apache.org/jira/browse/YARN-3863?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Varun Saxena updated YARN-3863: --- Attachment: YARN-3863-feature-YARN-2928.wip.04.patch > Enhance filters in TimelineReader > - > > Key: YARN-3863 > URL: https://issues.apache.org/jira/browse/YARN-3863 > Project: Hadoop YARN > Issue Type: Sub-task >Affects Versions: YARN-2928 >Reporter: Varun Saxena >Assignee: Varun Saxena > Labels: yarn-2928-1st-milestone > Attachments: YARN-3863-feature-YARN-2928.wip.003.patch, > YARN-3863-feature-YARN-2928.wip.01.patch, > YARN-3863-feature-YARN-2928.wip.02.patch, > YARN-3863-feature-YARN-2928.wip.04.patch > > > Currently filters in timeline reader will return an entity only if all the > filter conditions hold true i.e. only AND operation is supported. We can > support OR operation for the filters as well. Additionally as primary backend > implementation is HBase, we can design our filters in a manner, where they > closely resemble HBase Filters. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-4492) Add documentation for queue level preemption which is supported in Capacity scheduler
[ https://issues.apache.org/jira/browse/YARN-4492?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15066887#comment-15066887 ] Hadoop QA commented on YARN-4492: - | (/) *{color:green}+1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 0s {color} | {color:blue} Docker mode activated. {color} | | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s {color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 14s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 14s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s {color} | {color:green} Patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 18s {color} | {color:green} Patch does not generate ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black} 0m 56s {color} | {color:black} {color} | \\ \\ || Subsystem || Report/Notes || | Docker | Image:yetus/hadoop:0ca8df7 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12778872/YARN-4492.v1.002.patch | | JIRA Issue | YARN-4492 | | Optional Tests | asflicense mvnsite | | uname | Linux a443b8cfe42f 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh | | git revision | trunk / 2cb5aff | | modules | C: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site U: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site | | Max memory used | 29MB | | Powered by | Apache Yetus 0.2.0-SNAPSHOT http://yetus.apache.org | | Console output | https://builds.apache.org/job/PreCommit-YARN-Build/10060/console | This message was automatically generated. > Add documentation for queue level preemption which is supported in Capacity > scheduler > - > > Key: YARN-4492 > URL: https://issues.apache.org/jira/browse/YARN-4492 > Project: Hadoop YARN > Issue Type: Bug > Components: capacity scheduler >Reporter: Naganarasimha G R >Assignee: Naganarasimha G R >Priority: Minor > Attachments: CapacityScheduler.html, YARN-4492.v1.001.patch, > YARN-4492.v1.002.patch > > > As part of YARN-2056, Support has been added to disable preemption for a > specific queue. This is a useful feature in a multiload cluster but currently > missing documentation. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-2882) Add ExecutionType to denote if a container execution is GUARANTEED or OPPORTUNISTIC
[ https://issues.apache.org/jira/browse/YARN-2882?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15066850#comment-15066850 ] Wangda Tan commented on YARN-2882: -- [~kasha], [~asuresh]. For the summary you posted: https://issues.apache.org/jira/browse/YARN-2882?focusedCommentId=15065547&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-15065547. I agree with removing ExecutionType from the client API, and also: bq. Additional policies on execution - queueable or over-subscription - is determined by the node's configuration. YARN-2877 would add the queueable flag and logic. YARN-1011 would add the over-subscription flag and logic. This logic may include having to monitor the usage of the node. However, I think we still need a flag in ResourceRequest to describe whether it's an "opportunistic" container or not, correct? Otherwise the RM/LocalRM cannot decide whether it can take the risk of allocating a queueable/oversubscribed container. Thoughts? > Add ExecutionType to denote if a container execution is GUARANTEED or > OPPORTUNISTIC > --- > > Key: YARN-2882 > URL: https://issues.apache.org/jira/browse/YARN-2882 > Project: Hadoop YARN > Issue Type: Sub-task > Components: nodemanager, resourcemanager >Reporter: Konstantinos Karanasos >Assignee: Konstantinos Karanasos > Attachments: YARN-2882-yarn-2877.001.patch, > YARN-2882-yarn-2877.002.patch, YARN-2882-yarn-2877.003.patch, > YARN-2882-yarn-2877.004.patch, yarn-2882.patch > > > This JIRA introduces the notion of container types. > We propose two initial types of containers: guaranteed-start and queueable > containers. > Guaranteed-start are the existing containers, which are allocated by the > central RM and are instantaneously started, once allocated. > Queueable is a new type of container, which allows containers to be queued in > the NM, thus their execution may be arbitrarily delayed. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
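As a rough illustration of the flag being argued for, the sketch below pairs an ask with an execution type so a scheduler could tell whether it may queue or oversubscribe it. The enum values come from this JIRA's summary, while the request class and its fields are made up for the example and are not the proposed API.
{code}
public class ExecutionTypeSketch {

  enum ExecutionType { GUARANTEED, OPPORTUNISTIC }

  // Hypothetical request-like holder; not ResourceRequest itself.
  static class ContainerAsk {
    final int numContainers;
    final ExecutionType executionType;

    ContainerAsk(int numContainers, ExecutionType executionType) {
      this.numContainers = numContainers;
      this.executionType = executionType;
    }

    // The scheduler may only queue or oversubscribe opportunistic asks.
    boolean mayBeQueuedOrOversubscribed() {
      return executionType == ExecutionType.OPPORTUNISTIC;
    }
  }

  public static void main(String[] args) {
    ContainerAsk ask = new ContainerAsk(3, ExecutionType.OPPORTUNISTIC);
    System.out.println(ask.mayBeQueuedOrOversubscribed()); // true
  }
}
{code}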
[jira] [Updated] (YARN-4304) AM max resource configuration per partition to be displayed/updated correctly in UI and in various partition related metrics
[ https://issues.apache.org/jira/browse/YARN-4304?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sunil G updated YARN-4304: -- Attachment: 0006-YARN-4304.patch Attaching new patch and screen shots. > AM max resource configuration per partition to be displayed/updated correctly > in UI and in various partition related metrics > > > Key: YARN-4304 > URL: https://issues.apache.org/jira/browse/YARN-4304 > Project: Hadoop YARN > Issue Type: Sub-task > Components: webapp >Affects Versions: 2.7.1 >Reporter: Sunil G >Assignee: Sunil G > Attachments: 0001-YARN-4304.patch, 0002-YARN-4304.patch, > 0003-YARN-4304.patch, 0004-YARN-4304.patch, 0005-YARN-4304.patch, > 0005-YARN-4304.patch, 0006-YARN-4304.patch, REST_and_UI.zip > > > As we are supporting per-partition level max AM resource percentage > configuration, UI and various metrics also need to display correct > configurations related to same. > For eg: Current UI still shows am-resource percentage per queue level. This > is to be updated correctly when label config is used. > - Display max-am-percentage per-partition in Scheduler UI (label also) and in > ClusterMetrics page > - Update queue/partition related metrics w.r.t per-partition > am-resource-percentage -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-4304) AM max resource configuration per partition to be displayed/updated correctly in UI and in various partition related metrics
[ https://issues.apache.org/jira/browse/YARN-4304?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sunil G updated YARN-4304: -- Attachment: REST_and_UI.zip > AM max resource configuration per partition to be displayed/updated correctly > in UI and in various partition related metrics > > > Key: YARN-4304 > URL: https://issues.apache.org/jira/browse/YARN-4304 > Project: Hadoop YARN > Issue Type: Sub-task > Components: webapp >Affects Versions: 2.7.1 >Reporter: Sunil G >Assignee: Sunil G > Attachments: 0001-YARN-4304.patch, 0002-YARN-4304.patch, > 0003-YARN-4304.patch, 0004-YARN-4304.patch, 0005-YARN-4304.patch, > 0005-YARN-4304.patch, 0006-YARN-4304.patch, REST_and_UI.zip > > > As we are supporting per-partition level max AM resource percentage > configuration, UI and various metrics also need to display correct > configurations related to same. > For eg: Current UI still shows am-resource percentage per queue level. This > is to be updated correctly when label config is used. > - Display max-am-percentage per-partition in Scheduler UI (label also) and in > ClusterMetrics page > - Update queue/partition related metrics w.r.t per-partition > am-resource-percentage -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-4304) AM max resource configuration per partition to be displayed/updated correctly in UI and in various partition related metrics
[ https://issues.apache.org/jira/browse/YARN-4304?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sunil G updated YARN-4304: -- Attachment: (was: REST_and_UI.zip) > AM max resource configuration per partition to be displayed/updated correctly > in UI and in various partition related metrics > > > Key: YARN-4304 > URL: https://issues.apache.org/jira/browse/YARN-4304 > Project: Hadoop YARN > Issue Type: Sub-task > Components: webapp >Affects Versions: 2.7.1 >Reporter: Sunil G >Assignee: Sunil G > Attachments: 0001-YARN-4304.patch, 0002-YARN-4304.patch, > 0003-YARN-4304.patch, 0004-YARN-4304.patch, 0005-YARN-4304.patch, > 0005-YARN-4304.patch > > > As we are supporting per-partition level max AM resource percentage > configuration, UI and various metrics also need to display correct > configurations related to same. > For eg: Current UI still shows am-resource percentage per queue level. This > is to be updated correctly when label config is used. > - Display max-am-percentage per-partition in Scheduler UI (label also) and in > ClusterMetrics page > - Update queue/partition related metrics w.r.t per-partition > am-resource-percentage -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-4492) Add documentation for queue level preemption which is supported in Capacity scheduler
[ https://issues.apache.org/jira/browse/YARN-4492?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15066830#comment-15066830 ] Daniel Templeton commented on YARN-4492: Excellent. There was one more change in there that I think was missed. {{This will take impact only when system wide preemption is enabled}} should be {{This property applies only when system wide preemption is enabled}}. > Add documentation for queue level preemption which is supported in Capacity > scheduler > - > > Key: YARN-4492 > URL: https://issues.apache.org/jira/browse/YARN-4492 > Project: Hadoop YARN > Issue Type: Bug > Components: capacity scheduler >Reporter: Naganarasimha G R >Assignee: Naganarasimha G R >Priority: Minor > Attachments: CapacityScheduler.html, YARN-4492.v1.001.patch, > YARN-4492.v1.002.patch > > > As part of YARN-2056, Support has been added to disable preemption for a > specific queue. This is a useful feature in a multiload cluster but currently > missing documentation. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-4492) Add documentation for queue level preemption which is supported in Capacity scheduler
[ https://issues.apache.org/jira/browse/YARN-4492?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Naganarasimha G R updated YARN-4492: Attachment: YARN-4492.v1.002.patch Thanks [~templedf]. Yes, I am fine with your last suggestion; updating the patch with your review comments addressed. > Add documentation for queue level preemption which is supported in Capacity > scheduler > - > > Key: YARN-4492 > URL: https://issues.apache.org/jira/browse/YARN-4492 > Project: Hadoop YARN > Issue Type: Bug > Components: capacity scheduler >Reporter: Naganarasimha G R >Assignee: Naganarasimha G R >Priority: Minor > Attachments: CapacityScheduler.html, YARN-4492.v1.001.patch, > YARN-4492.v1.002.patch > > > As part of YARN-2056, Support has been added to disable preemption for a > specific queue. This is a useful feature in a multiload cluster but currently > missing documentation. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-914) (Umbrella) Support graceful decommission of nodemanager
[ https://issues.apache.org/jira/browse/YARN-914?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Junping Du updated YARN-914: Issue Type: New Feature (was: Improvement) > (Umbrella) Support graceful decommission of nodemanager > --- > > Key: YARN-914 > URL: https://issues.apache.org/jira/browse/YARN-914 > Project: Hadoop YARN > Issue Type: New Feature > Components: graceful >Affects Versions: 2.0.4-alpha >Reporter: Luke Lu >Assignee: Junping Du > Attachments: Gracefully Decommission of NodeManager (v1).pdf, > Gracefully Decommission of NodeManager (v2).pdf, > GracefullyDecommissionofNodeManagerv3.pdf > > > When NMs are decommissioned for non-fault reasons (capacity change etc.), > it's desirable to minimize the impact to running applications. > Currently if a NM is decommissioned, all running containers on the NM need to > be rescheduled on other NMs. Further more, for finished map tasks, if their > map output are not fetched by the reducers of the job, these map tasks will > need to be rerun as well. > We propose to introduce a mechanism to optionally gracefully decommission a > node manager. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-3586) RM only get back addresses of Collectors that NM needs to know.
[ https://issues.apache.org/jira/browse/YARN-3586?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Junping Du updated YARN-3586: - Attachment: YARN-3586-feature-YARN-2928.v2.patch Incorporate previous comments in v2 patch, also fix a whitespace issue. > RM only get back addresses of Collectors that NM needs to know. > --- > > Key: YARN-3586 > URL: https://issues.apache.org/jira/browse/YARN-3586 > Project: Hadoop YARN > Issue Type: Sub-task > Components: resourcemanager, timelineserver >Reporter: Junping Du >Assignee: Junping Du >Priority: Critical > Labels: yarn-2928-1st-milestone > Attachments: YARN-3586-demo.patch, YARN-3586-feature-YARN-2928.patch, > YARN-3586-feature-YARN-2928.v2.patch > > > After YARN-3445, RM cache runningApps for each NM. So RM heartbeat back to NM > should only include collectors' address for running applications against > specific NM. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3586) RM only get back addresses of Collectors that NM needs to know.
[ https://issues.apache.org/jira/browse/YARN-3586?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15066727#comment-15066727 ] Junping Du commented on YARN-3586: -- Thanks Varun for review and comments. Will incorporate your comments in next patch. > RM only get back addresses of Collectors that NM needs to know. > --- > > Key: YARN-3586 > URL: https://issues.apache.org/jira/browse/YARN-3586 > Project: Hadoop YARN > Issue Type: Sub-task > Components: resourcemanager, timelineserver >Reporter: Junping Du >Assignee: Junping Du >Priority: Critical > Labels: yarn-2928-1st-milestone > Attachments: YARN-3586-demo.patch, YARN-3586-feature-YARN-2928.patch > > > After YARN-3445, RM cache runningApps for each NM. So RM heartbeat back to NM > should only include collectors' address for running applications against > specific NM. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-2882) Add ExecutionType to denote if a container execution is GUARANTEED or OPPORTUNISTIC
[ https://issues.apache.org/jira/browse/YARN-2882?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=1502#comment-1502 ] Hadoop QA commented on YARN-2882: - | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 0s {color} | {color:blue} Docker mode activated. {color} | | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s {color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s {color} | {color:green} The patch appears to include 3 new or modified test files. {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 13m 56s {color} | {color:green} yarn-2877 passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 2m 10s {color} | {color:green} yarn-2877 passed with JDK v1.8.0_66 {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 2m 22s {color} | {color:green} yarn-2877 passed with JDK v1.7.0_91 {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 33s {color} | {color:green} yarn-2877 passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 2m 21s {color} | {color:green} yarn-2877 passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 1m 2s {color} | {color:green} yarn-2877 passed {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 5m 15s {color} | {color:green} yarn-2877 passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 55s {color} | {color:green} yarn-2877 passed with JDK v1.8.0_66 {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 4m 16s {color} | {color:green} yarn-2877 passed with JDK v1.7.0_91 {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 2m 7s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 2m 3s {color} | {color:green} the patch passed with JDK v1.8.0_66 {color} | | {color:green}+1{color} | {color:green} cc {color} | {color:green} 2m 3s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 2m 3s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 2m 21s {color} | {color:green} the patch passed with JDK v1.7.0_91 {color} | | {color:green}+1{color} | {color:green} cc {color} | {color:green} 2m 21s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 2m 21s {color} | {color:green} the patch passed {color} | | {color:red}-1{color} | {color:red} checkstyle {color} | {color:red} 0m 31s {color} | {color:red} Patch generated 3 new checkstyle issues in hadoop-yarn-project/hadoop-yarn (total was 136, now 137). 
{color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 2m 15s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 55s {color} | {color:green} the patch passed {color} | | {color:red}-1{color} | {color:red} whitespace {color} | {color:red} 0m 0s {color} | {color:red} The patch has 2 line(s) that end in whitespace. Use git apply --whitespace=fix. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 5m 41s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 57s {color} | {color:green} the patch passed with JDK v1.8.0_66 {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 4m 25s {color} | {color:green} the patch passed with JDK v1.7.0_91 {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 0m 25s {color} | {color:green} hadoop-yarn-api in the patch passed with JDK v1.8.0_66. {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 2m 3s {color} | {color:green} hadoop-yarn-common in the patch passed with JDK v1.8.0_66. {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 9m 5s {color} | {color:green} hadoop-yarn-server-nodemanager in the patch passed with JDK v1.8.0_66. {color} | | {color:red}-1{color} | {color:red} unit {color} | {color:red} 65m 49s {color} | {color:red} hadoop-yarn-server-resourcemanager in the patch failed with JDK v1.8.0_66. {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 0m 27s {color} | {color:green} hadoop-yarn-api in the patch passed with JDK v1.7.0_91. {color} |
[jira] [Commented] (YARN-4492) Add documentation for queue level preemption which is supported in Capacity scheduler
[ https://issues.apache.org/jira/browse/YARN-4492?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15066625#comment-15066625 ] Daniel Templeton commented on YARN-4492: Too many "it"s with unclear antecedents. :) How about: {{If this property is not set for a queue, then the property value is inherited from the queue's parent}} ? > Add documentation for queue level preemption which is supported in Capacity > scheduler > - > > Key: YARN-4492 > URL: https://issues.apache.org/jira/browse/YARN-4492 > Project: Hadoop YARN > Issue Type: Bug > Components: capacity scheduler >Reporter: Naganarasimha G R >Assignee: Naganarasimha G R >Priority: Minor > Attachments: CapacityScheduler.html, YARN-4492.v1.001.patch > > > As part of YARN-2056, Support has been added to disable preemption for a > specific queue. This is a useful feature in a multiload cluster but currently > missing documentation. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-4492) Add documentation for queue level preemption which is supported in Capacity scheduler
[ https://issues.apache.org/jira/browse/YARN-4492?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15066618#comment-15066618 ] Naganarasimha G R commented on YARN-4492: - Well, I think it's almost there, but "property" is used twice! How about {{If this property is not set for a queue, then it inherits from its parent}}? > Add documentation for queue level preemption which is supported in Capacity > scheduler > - > > Key: YARN-4492 > URL: https://issues.apache.org/jira/browse/YARN-4492 > Project: Hadoop YARN > Issue Type: Bug > Components: capacity scheduler >Reporter: Naganarasimha G R >Assignee: Naganarasimha G R >Priority: Minor > Attachments: CapacityScheduler.html, YARN-4492.v1.001.patch > > > As part of YARN-2056, Support has been added to disable preemption for a > specific queue. This is a useful feature in a multiload cluster but currently > missing documentation. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-4492) Add documentation for queue level preemption which is supported in Capacity scheduler
[ https://issues.apache.org/jira/browse/YARN-4492?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15066614#comment-15066614 ] Daniel Templeton commented on YARN-4492: I like the third one, too. Let's maybe say: {{If this property is not set for a queue, that queue inherits the value for this property from its parent.}} Does that work? > Add documentation for queue level preemption which is supported in Capacity > scheduler > - > > Key: YARN-4492 > URL: https://issues.apache.org/jira/browse/YARN-4492 > Project: Hadoop YARN > Issue Type: Bug > Components: capacity scheduler >Reporter: Naganarasimha G R >Assignee: Naganarasimha G R >Priority: Minor > Attachments: CapacityScheduler.html, YARN-4492.v1.001.patch > > > As part of YARN-2056, Support has been added to disable preemption for a > specific queue. This is a useful feature in a multiload cluster but currently > missing documentation. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-4492) Add documentation for queue level preemption which is supported in Capacity scheduler
[ https://issues.apache.org/jira/browse/YARN-4492?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15066593#comment-15066593 ] Naganarasimha G R commented on YARN-4492: - Thanks for the comments [~templedf], bq. A queue's ability to be preempted is inherited from its parent unless explicitly overridden for that queue. Will this sound as if the queue itself is preemptable? Could this be {{A queue's ability for its apps resources to be preempted is inherited from its parent unless explicitly overridden for that queue.}} or {{Preempt-ability of Queue's application resources will be inherited from its parent unless explicitly overridden}} or {{This property of a Queue is inherited from its parent unless explicitly overridden}}? Thoughts? My choice is the 3rd! > Add documentation for queue level preemption which is supported in Capacity > scheduler > - > > Key: YARN-4492 > URL: https://issues.apache.org/jira/browse/YARN-4492 > Project: Hadoop YARN > Issue Type: Bug > Components: capacity scheduler >Reporter: Naganarasimha G R >Assignee: Naganarasimha G R >Priority: Minor > Attachments: CapacityScheduler.html, YARN-4492.v1.001.patch > > > As part of YARN-2056, Support has been added to disable preemption for a > specific queue. This is a useful feature in a multiload cluster but currently > missing documentation. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-4492) Add documentation for queue level preemption which is supported in Capacity scheduler
[ https://issues.apache.org/jira/browse/YARN-4492?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15066578#comment-15066578 ] Daniel Templeton commented on YARN-4492: Minor language edits: {noformat} The `CapacityScheduler` supports the following parameters to control the preemption of containers of applications submitted to a queue: {noformat} should be: {noformat} The `CapacityScheduler` supports the following parameters to control the preemption of application containers submitted to a queue: {noformat} and: {noformat} | `yarn.scheduler.capacity.<queue-path>.disable_preemption` | This configuration can be set true, to selectively disable preemption of application containers submitted to a given queue. This will take impact only when system wide preemption is enabled by configuring `yarn.resourcemanager.scheduler.monitor.enable` to *true* and `yarn.resourcemanager.scheduler.monitor.policies` to *ProportionalCapacityPreemptionPolicy*. Preemptability will be inherited from the parent's hierarchy unless explicitly overridden by a queue. Default value is false. {noformat} should maybe be: {noformat} | `yarn.scheduler.capacity.<queue-path>.disable_preemption` | This configuration can be set true to selectively disable preemption of application containers submitted to a given queue. This will take effect only when system wide preemption is enabled by configuring `yarn.resourcemanager.scheduler.monitor.enable` to *true* and `yarn.resourcemanager.scheduler.monitor.policies` to *ProportionalCapacityPreemptionPolicy*. A queue's ability to be preempted is inherited from its parent unless explicitly overridden for that queue. The default value is false. {noformat} > Add documentation for queue level preemption which is supported in Capacity > scheduler > - > > Key: YARN-4492 > URL: https://issues.apache.org/jira/browse/YARN-4492 > Project: Hadoop YARN > Issue Type: Bug > Components: capacity scheduler >Reporter: Naganarasimha G R >Assignee: Naganarasimha G R >Priority: Minor > Attachments: CapacityScheduler.html, YARN-4492.v1.001.patch > > > As part of YARN-2056, Support has been added to disable preemption for a > specific queue. This is a useful feature in a multiload cluster but currently > missing documentation. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
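For readers trying this out before the documentation lands, here is a minimal sketch of the three settings discussed above, expressed against a {{YarnConfiguration}}. The queue path {{root.default}} is only an example; in practice these values normally live in yarn-site.xml and capacity-scheduler.xml rather than being set in code.
{code}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.yarn.conf.YarnConfiguration;

public class PreemptionConfigSketch {
  /** Enable system-wide preemption but exempt one queue (example: root.default). */
  public static Configuration preemptionEnabledExceptDefault() {
    Configuration conf = new YarnConfiguration();
    // Both RM-level settings are required before any per-queue flag has an effect.
    conf.setBoolean("yarn.resourcemanager.scheduler.monitor.enable", true);
    conf.set("yarn.resourcemanager.scheduler.monitor.policies",
        "org.apache.hadoop.yarn.server.resourcemanager.monitor.capacity."
            + "ProportionalCapacityPreemptionPolicy");
    // Per-queue override: apps submitted to root.default keep their containers.
    // Child queues inherit this value unless they override it themselves.
    conf.setBoolean("yarn.scheduler.capacity.root.default.disable_preemption",
        true);
    return conf;
  }
}
{code}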
[jira] [Commented] (YARN-4479) Retrospect app-priority in pendingOrderingPolicy during recovering applications
[ https://issues.apache.org/jira/browse/YARN-4479?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15066571#comment-15066571 ] Naganarasimha G R commented on YARN-4479: - Thanks [~sunilg], bq. Its debatable and I think with discussion we can conclude the approach here. True, it's debatable, but one more thing to be considered (and not missed) here: A4 and A5 get activated even before A2 (as per the correction I mentioned). bq. {{All containers which were running earlier will still continue}} I mistook what you meant; it seems you got what I wanted to convey. > Retrospect app-priority in pendingOrderingPolicy during recovering > applications > --- > > Key: YARN-4479 > URL: https://issues.apache.org/jira/browse/YARN-4479 > Project: Hadoop YARN > Issue Type: Sub-task > Components: api, resourcemanager >Reporter: Rohith Sharma K S >Assignee: Rohith Sharma K S > Attachments: 0001-YARN-4479.patch > > > Currently, the same ordering policy is used for pending applications and active > applications. When priority is configured for applications, high priority > applications get activated first during recovery. It is possible that a > low priority job was submitted and is in running state. > This causes the low priority job to starve after recovery -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-4469) yarn application -status should not show a stack trace for an unknown application ID
[ https://issues.apache.org/jira/browse/YARN-4469?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jason Lowe updated YARN-4469: - Fix Version/s: (was: 2.7.0) > yarn application -status should not show a stack trace for an unknown > application ID > > > Key: YARN-4469 > URL: https://issues.apache.org/jira/browse/YARN-4469 > Project: Hadoop YARN > Issue Type: Improvement > Components: client >Affects Versions: 2.7.1 >Reporter: Daniel Templeton >Assignee: Daniel Templeton > > For example: > {noformat} > # yarn application -status application_1234567890_12345 > Exception in thread "main" > org.apache.hadoop.yarn.exceptions.ApplicationNotFoundException: Application > with id 'application_1234567890_12345' doesn't exist in RM. > at > org.apache.hadoop.yarn.server.resourcemanager.ClientRMService.getApplicationReport(ClientRMService.java:324) > at > org.apache.hadoop.yarn.api.impl.pb.service.ApplicationClientProtocolPBServiceImpl.getApplicationReport(ApplicationClientProtocolPBServiceImpl.java:170) > at > org.apache.hadoop.yarn.proto.ApplicationClientProtocol$ApplicationClientProtocolService$2.callBlockingMethod(ApplicationClientProtocol.java:401) > at > org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:617) > at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1060) > at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2086) > at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2082) > at java.security.AccessController.doPrivileged(Native Method) > at javax.security.auth.Subject.doAs(Subject.java:415) > at > org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1671) > at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2080) > at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method) > at > sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57) > at > sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45) > at java.lang.reflect.Constructor.newInstance(Constructor.java:526) > at > org.apache.hadoop.yarn.ipc.RPCUtil.instantiateException(RPCUtil.java:53) > at > org.apache.hadoop.yarn.ipc.RPCUtil.unwrapAndThrowException(RPCUtil.java:101) > at > org.apache.hadoop.yarn.api.impl.pb.client.ApplicationClientProtocolPBClientImpl.getApplicationReport(ApplicationClientProtocolPBClientImpl.java:190) > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:606) > at > org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:256) > at > org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:104) > at com.sun.proxy.$Proxy12.getApplicationReport(Unknown Source) > at > org.apache.hadoop.yarn.client.api.impl.YarnClientImpl.getApplicationReport(YarnClientImpl.java:399) > at > org.apache.hadoop.yarn.client.cli.ApplicationCLI.printApplicationReport(ApplicationCLI.java:429) > at > org.apache.hadoop.yarn.client.cli.ApplicationCLI.run(ApplicationCLI.java:154) > at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70) > at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:84) > at > org.apache.hadoop.yarn.client.cli.ApplicationCLI.main(ApplicationCLI.java:77) > Caused by: > 
org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.yarn.exceptions.ApplicationNotFoundException): > Application with id 'application_1234567890_12345' doesn't exist in RM. > at > org.apache.hadoop.yarn.server.resourcemanager.ClientRMService.getApplicationReport(ClientRMService.java:324) > at > org.apache.hadoop.yarn.api.impl.pb.service.ApplicationClientProtocolPBServiceImpl.getApplicationReport(ApplicationClientProtocolPBServiceImpl.java:170) > at > org.apache.hadoop.yarn.proto.ApplicationClientProtocol$ApplicationClientProtocolService$2.callBlockingMethod(ApplicationClientProtocol.java:401) > at > org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:617) > at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1060) > at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2086) > at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2082) > at java.security.AccessController.doPrivileged(Native Method) > at javax.security.auth.Subject.doAs(Sub
[jira] [Resolved] (YARN-4469) yarn application -status should not show a stack trace for an unknown application ID
[ https://issues.apache.org/jira/browse/YARN-4469?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jason Lowe resolved YARN-4469. -- Resolution: Duplicate Fixing resolution type. Normally fix version is only set for JIRAs that have code changes associated with them. > yarn application -status should not show a stack trace for an unknown > application ID > > > Key: YARN-4469 > URL: https://issues.apache.org/jira/browse/YARN-4469 > Project: Hadoop YARN > Issue Type: Improvement > Components: client >Affects Versions: 2.7.1 >Reporter: Daniel Templeton >Assignee: Daniel Templeton > Fix For: 2.7.0 > > > For example: > {noformat} > # yarn application -status application_1234567890_12345 > Exception in thread "main" > org.apache.hadoop.yarn.exceptions.ApplicationNotFoundException: Application > with id 'application_1234567890_12345' doesn't exist in RM. > at > org.apache.hadoop.yarn.server.resourcemanager.ClientRMService.getApplicationReport(ClientRMService.java:324) > at > org.apache.hadoop.yarn.api.impl.pb.service.ApplicationClientProtocolPBServiceImpl.getApplicationReport(ApplicationClientProtocolPBServiceImpl.java:170) > at > org.apache.hadoop.yarn.proto.ApplicationClientProtocol$ApplicationClientProtocolService$2.callBlockingMethod(ApplicationClientProtocol.java:401) > at > org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:617) > at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1060) > at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2086) > at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2082) > at java.security.AccessController.doPrivileged(Native Method) > at javax.security.auth.Subject.doAs(Subject.java:415) > at > org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1671) > at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2080) > at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method) > at > sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57) > at > sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45) > at java.lang.reflect.Constructor.newInstance(Constructor.java:526) > at > org.apache.hadoop.yarn.ipc.RPCUtil.instantiateException(RPCUtil.java:53) > at > org.apache.hadoop.yarn.ipc.RPCUtil.unwrapAndThrowException(RPCUtil.java:101) > at > org.apache.hadoop.yarn.api.impl.pb.client.ApplicationClientProtocolPBClientImpl.getApplicationReport(ApplicationClientProtocolPBClientImpl.java:190) > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:606) > at > org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:256) > at > org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:104) > at com.sun.proxy.$Proxy12.getApplicationReport(Unknown Source) > at > org.apache.hadoop.yarn.client.api.impl.YarnClientImpl.getApplicationReport(YarnClientImpl.java:399) > at > org.apache.hadoop.yarn.client.cli.ApplicationCLI.printApplicationReport(ApplicationCLI.java:429) > at > org.apache.hadoop.yarn.client.cli.ApplicationCLI.run(ApplicationCLI.java:154) > at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70) > at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:84) > at > 
org.apache.hadoop.yarn.client.cli.ApplicationCLI.main(ApplicationCLI.java:77) > Caused by: > org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.yarn.exceptions.ApplicationNotFoundException): > Application with id 'application_1234567890_12345' doesn't exist in RM. > at > org.apache.hadoop.yarn.server.resourcemanager.ClientRMService.getApplicationReport(ClientRMService.java:324) > at > org.apache.hadoop.yarn.api.impl.pb.service.ApplicationClientProtocolPBServiceImpl.getApplicationReport(ApplicationClientProtocolPBServiceImpl.java:170) > at > org.apache.hadoop.yarn.proto.ApplicationClientProtocol$ApplicationClientProtocolService$2.callBlockingMethod(ApplicationClientProtocol.java:401) > at > org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:617) > at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1060) > at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2086) > at org.apache.hadoop.ipc.Server$Handler$1.r
[jira] [Reopened] (YARN-4469) yarn application -status should not show a stack trace for an unknown application ID
[ https://issues.apache.org/jira/browse/YARN-4469?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jason Lowe reopened YARN-4469: -- > yarn application -status should not show a stack trace for an unknown > application ID > > > Key: YARN-4469 > URL: https://issues.apache.org/jira/browse/YARN-4469 > Project: Hadoop YARN > Issue Type: Improvement > Components: client >Affects Versions: 2.7.1 >Reporter: Daniel Templeton >Assignee: Daniel Templeton > Fix For: 2.7.0 > > > For example: > {noformat} > # yarn application -status application_1234567890_12345 > Exception in thread "main" > org.apache.hadoop.yarn.exceptions.ApplicationNotFoundException: Application > with id 'application_1234567890_12345' doesn't exist in RM. > at > org.apache.hadoop.yarn.server.resourcemanager.ClientRMService.getApplicationReport(ClientRMService.java:324) > at > org.apache.hadoop.yarn.api.impl.pb.service.ApplicationClientProtocolPBServiceImpl.getApplicationReport(ApplicationClientProtocolPBServiceImpl.java:170) > at > org.apache.hadoop.yarn.proto.ApplicationClientProtocol$ApplicationClientProtocolService$2.callBlockingMethod(ApplicationClientProtocol.java:401) > at > org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:617) > at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1060) > at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2086) > at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2082) > at java.security.AccessController.doPrivileged(Native Method) > at javax.security.auth.Subject.doAs(Subject.java:415) > at > org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1671) > at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2080) > at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method) > at > sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57) > at > sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45) > at java.lang.reflect.Constructor.newInstance(Constructor.java:526) > at > org.apache.hadoop.yarn.ipc.RPCUtil.instantiateException(RPCUtil.java:53) > at > org.apache.hadoop.yarn.ipc.RPCUtil.unwrapAndThrowException(RPCUtil.java:101) > at > org.apache.hadoop.yarn.api.impl.pb.client.ApplicationClientProtocolPBClientImpl.getApplicationReport(ApplicationClientProtocolPBClientImpl.java:190) > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:606) > at > org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:256) > at > org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:104) > at com.sun.proxy.$Proxy12.getApplicationReport(Unknown Source) > at > org.apache.hadoop.yarn.client.api.impl.YarnClientImpl.getApplicationReport(YarnClientImpl.java:399) > at > org.apache.hadoop.yarn.client.cli.ApplicationCLI.printApplicationReport(ApplicationCLI.java:429) > at > org.apache.hadoop.yarn.client.cli.ApplicationCLI.run(ApplicationCLI.java:154) > at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70) > at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:84) > at > org.apache.hadoop.yarn.client.cli.ApplicationCLI.main(ApplicationCLI.java:77) > Caused by: > 
org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.yarn.exceptions.ApplicationNotFoundException): > Application with id 'application_1234567890_12345' doesn't exist in RM. > at > org.apache.hadoop.yarn.server.resourcemanager.ClientRMService.getApplicationReport(ClientRMService.java:324) > at > org.apache.hadoop.yarn.api.impl.pb.service.ApplicationClientProtocolPBServiceImpl.getApplicationReport(ApplicationClientProtocolPBServiceImpl.java:170) > at > org.apache.hadoop.yarn.proto.ApplicationClientProtocol$ApplicationClientProtocolService$2.callBlockingMethod(ApplicationClientProtocol.java:401) > at > org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:617) > at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1060) > at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2086) > at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2082) > at java.security.AccessController.doPrivileged(Native Method) > at javax.security.auth.Subject.doAs(Subjec
[jira] [Commented] (YARN-4479) Retrospect app-priority in pendingOrderingPolicy during recovering applications
[ https://issues.apache.org/jira/browse/YARN-4479?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15066553#comment-15066553 ] Sunil G commented on YARN-4479: --- Thanks [~Naganarasimha Garla] for the comments. bq.This patch tries to activate all applications which were running before RM restart happened That being said, yes, it definitely depends on the available AM limit after restart (I meant the positive case in my earlier comment where all cluster resources were available). I did think about the case where some NMs have not registered back and the limit is lower. In that case, we will have app-A1 pending in the list to get activated. And this application will be the one which gets activated first if any space is available. This ensures that high priority apps which were in the pending list will get containers, and app-A1, which was lower in priority, will wait. Even though A1 is activated, it has to wait till other high priority apps are done with their requests. So A1 in the pending list may be fine provided other apps complete sooner or the failed NMs come back up. But I am not saying it's correct. It's debatable and I think with discussion we can conclude the approach here. Also, about {{All containers which were running earlier will still continue}}, I meant the live containers of apps which were running prior to restart. After restart, even for the pending apps (apps like A1) as mentioned in your scenario, their running containers won't be killed. Am I missing something? > Retrospect app-priority in pendingOrderingPolicy during recovering > applications > --- > > Key: YARN-4479 > URL: https://issues.apache.org/jira/browse/YARN-4479 > Project: Hadoop YARN > Issue Type: Sub-task > Components: api, resourcemanager >Reporter: Rohith Sharma K S >Assignee: Rohith Sharma K S > Attachments: 0001-YARN-4479.patch > > > Currently, the same ordering policy is used for pending applications and active > applications. When priority is configured for applications, high priority > applications get activated first during recovery. It is possible that a > low priority job was submitted and is in running state. > This causes the low priority job to starve after recovery -- This message was sent by Atlassian JIRA (v6.3.4#6332)
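To make the activation order being debated here concrete, below is a hypothetical comparator that activates previously running recovered apps ahead of higher-priority apps that were still pending at restart. The class and field names are invented for this sketch and are not taken from the YARN-4479 patch.
{code}
import java.util.Comparator;

/** Hypothetical pending-app ordering for recovery; not the actual patch. */
class PendingAppInfo {
  final boolean wasRunningBeforeRestart; // recovered as running from the state store
  final int priority;                    // larger value means higher priority
  final long submitTime;                 // FIFO tie-breaker

  PendingAppInfo(boolean wasRunningBeforeRestart, int priority, long submitTime) {
    this.wasRunningBeforeRestart = wasRunningBeforeRestart;
    this.priority = priority;
    this.submitTime = submitTime;
  }

  /** Recovered-running apps first, then priority (descending), then FIFO. */
  static final Comparator<PendingAppInfo> RECOVERY_AWARE_ORDER =
      Comparator.comparing((PendingAppInfo a) -> !a.wasRunningBeforeRestart)
          .thenComparingInt(a -> -a.priority)
          .thenComparingLong(a -> a.submitTime);
}
{code}
Under such an ordering, a recovered low-priority app like A1 would not be pushed behind newly activated high-priority apps, which is the starvation scenario this thread is about.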
[jira] [Resolved] (YARN-4469) yarn application -status should not show a stack trace for an unknown application ID
[ https://issues.apache.org/jira/browse/YARN-4469?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Daniel Templeton resolved YARN-4469. Resolution: Not A Problem Fix Version/s: 2.7.0 > yarn application -status should not show a stack trace for an unknown > application ID > > > Key: YARN-4469 > URL: https://issues.apache.org/jira/browse/YARN-4469 > Project: Hadoop YARN > Issue Type: Improvement > Components: client >Affects Versions: 2.7.1 >Reporter: Daniel Templeton >Assignee: Daniel Templeton > Fix For: 2.7.0 > > > For example: > {noformat} > # yarn application -status application_1234567890_12345 > Exception in thread "main" > org.apache.hadoop.yarn.exceptions.ApplicationNotFoundException: Application > with id 'application_1234567890_12345' doesn't exist in RM. > at > org.apache.hadoop.yarn.server.resourcemanager.ClientRMService.getApplicationReport(ClientRMService.java:324) > at > org.apache.hadoop.yarn.api.impl.pb.service.ApplicationClientProtocolPBServiceImpl.getApplicationReport(ApplicationClientProtocolPBServiceImpl.java:170) > at > org.apache.hadoop.yarn.proto.ApplicationClientProtocol$ApplicationClientProtocolService$2.callBlockingMethod(ApplicationClientProtocol.java:401) > at > org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:617) > at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1060) > at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2086) > at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2082) > at java.security.AccessController.doPrivileged(Native Method) > at javax.security.auth.Subject.doAs(Subject.java:415) > at > org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1671) > at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2080) > at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method) > at > sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57) > at > sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45) > at java.lang.reflect.Constructor.newInstance(Constructor.java:526) > at > org.apache.hadoop.yarn.ipc.RPCUtil.instantiateException(RPCUtil.java:53) > at > org.apache.hadoop.yarn.ipc.RPCUtil.unwrapAndThrowException(RPCUtil.java:101) > at > org.apache.hadoop.yarn.api.impl.pb.client.ApplicationClientProtocolPBClientImpl.getApplicationReport(ApplicationClientProtocolPBClientImpl.java:190) > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:606) > at > org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:256) > at > org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:104) > at com.sun.proxy.$Proxy12.getApplicationReport(Unknown Source) > at > org.apache.hadoop.yarn.client.api.impl.YarnClientImpl.getApplicationReport(YarnClientImpl.java:399) > at > org.apache.hadoop.yarn.client.cli.ApplicationCLI.printApplicationReport(ApplicationCLI.java:429) > at > org.apache.hadoop.yarn.client.cli.ApplicationCLI.run(ApplicationCLI.java:154) > at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70) > at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:84) > at > org.apache.hadoop.yarn.client.cli.ApplicationCLI.main(ApplicationCLI.java:77) > Caused by: > 
org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.yarn.exceptions.ApplicationNotFoundException): > Application with id 'application_1234567890_12345' doesn't exist in RM. > at > org.apache.hadoop.yarn.server.resourcemanager.ClientRMService.getApplicationReport(ClientRMService.java:324) > at > org.apache.hadoop.yarn.api.impl.pb.service.ApplicationClientProtocolPBServiceImpl.getApplicationReport(ApplicationClientProtocolPBServiceImpl.java:170) > at > org.apache.hadoop.yarn.proto.ApplicationClientProtocol$ApplicationClientProtocolService$2.callBlockingMethod(ApplicationClientProtocol.java:401) > at > org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:617) > at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1060) > at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2086) > at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2082) > at java.security.AccessController.doPrivil
[jira] [Commented] (YARN-4469) yarn application -status should not show a stack trace for an unknown application ID
[ https://issues.apache.org/jira/browse/YARN-4469?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15066539#comment-15066539 ] Daniel Templeton commented on YARN-4469: Thanks, [~Prabhu Joseph]. > yarn application -status should not show a stack trace for an unknown > application ID > > > Key: YARN-4469 > URL: https://issues.apache.org/jira/browse/YARN-4469 > Project: Hadoop YARN > Issue Type: Improvement > Components: client >Affects Versions: 2.7.1 >Reporter: Daniel Templeton >Assignee: Daniel Templeton > > For example: > {noformat} > # yarn application -status application_1234567890_12345 > Exception in thread "main" > org.apache.hadoop.yarn.exceptions.ApplicationNotFoundException: Application > with id 'application_1234567890_12345' doesn't exist in RM. > at > org.apache.hadoop.yarn.server.resourcemanager.ClientRMService.getApplicationReport(ClientRMService.java:324) > at > org.apache.hadoop.yarn.api.impl.pb.service.ApplicationClientProtocolPBServiceImpl.getApplicationReport(ApplicationClientProtocolPBServiceImpl.java:170) > at > org.apache.hadoop.yarn.proto.ApplicationClientProtocol$ApplicationClientProtocolService$2.callBlockingMethod(ApplicationClientProtocol.java:401) > at > org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:617) > at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1060) > at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2086) > at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2082) > at java.security.AccessController.doPrivileged(Native Method) > at javax.security.auth.Subject.doAs(Subject.java:415) > at > org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1671) > at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2080) > at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method) > at > sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57) > at > sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45) > at java.lang.reflect.Constructor.newInstance(Constructor.java:526) > at > org.apache.hadoop.yarn.ipc.RPCUtil.instantiateException(RPCUtil.java:53) > at > org.apache.hadoop.yarn.ipc.RPCUtil.unwrapAndThrowException(RPCUtil.java:101) > at > org.apache.hadoop.yarn.api.impl.pb.client.ApplicationClientProtocolPBClientImpl.getApplicationReport(ApplicationClientProtocolPBClientImpl.java:190) > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:606) > at > org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:256) > at > org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:104) > at com.sun.proxy.$Proxy12.getApplicationReport(Unknown Source) > at > org.apache.hadoop.yarn.client.api.impl.YarnClientImpl.getApplicationReport(YarnClientImpl.java:399) > at > org.apache.hadoop.yarn.client.cli.ApplicationCLI.printApplicationReport(ApplicationCLI.java:429) > at > org.apache.hadoop.yarn.client.cli.ApplicationCLI.run(ApplicationCLI.java:154) > at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70) > at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:84) > at > org.apache.hadoop.yarn.client.cli.ApplicationCLI.main(ApplicationCLI.java:77) > Caused 
by: > org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.yarn.exceptions.ApplicationNotFoundException): > Application with id 'application_1234567890_12345' doesn't exist in RM. > at > org.apache.hadoop.yarn.server.resourcemanager.ClientRMService.getApplicationReport(ClientRMService.java:324) > at > org.apache.hadoop.yarn.api.impl.pb.service.ApplicationClientProtocolPBServiceImpl.getApplicationReport(ApplicationClientProtocolPBServiceImpl.java:170) > at > org.apache.hadoop.yarn.proto.ApplicationClientProtocol$ApplicationClientProtocolService$2.callBlockingMethod(ApplicationClientProtocol.java:401) > at > org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:617) > at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1060) > at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2086) > at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2082) > at java.security.AccessController.doPrivileged(Na
[jira] [Commented] (YARN-4234) New put APIs in TimelineClient for ats v1.5
[ https://issues.apache.org/jira/browse/YARN-4234?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15066537#comment-15066537 ] Junping Du commented on YARN-4234: -- Hi Xuan, thanks for updating the patch! There are still some checkstyle issues reported related to your patch. I know some of them are invalid, like "File length is 2,385 lines" or "More than 7 parameters", but some are valid, like the naming issue for "timelineEntityGroupIdStrPrefix". Can you fix the remaining issues? We are getting quite close now. Thanks for the patience. > New put APIs in TimelineClient for ats v1.5 > --- > > Key: YARN-4234 > URL: https://issues.apache.org/jira/browse/YARN-4234 > Project: Hadoop YARN > Issue Type: Sub-task > Components: timelineserver >Reporter: Xuan Gong >Assignee: Xuan Gong > Attachments: YARN-4234-2015-11-13.1.patch, > YARN-4234-2015-11-16.1.patch, YARN-4234-2015-11-16.2.patch, > YARN-4234-2015.2.patch, YARN-4234.1.patch, YARN-4234.2.patch, > YARN-4234.2015-11-12.1.patch, YARN-4234.2015-11-12.1.patch, > YARN-4234.2015-11-18.1.patch, YARN-4234.2015-11-18.2.patch, > YARN-4234.2015-11-18.patch, YARN-4234.2015-12-09.patch, > YARN-4234.2015-12-09.patch, YARN-4234.2015-12-17.1.patch, > YARN-4234.2015-12-18.1.patch, YARN-4234.2015-12-18.patch, > YARN-4234.20151109.patch, YARN-4234.20151110.1.patch, > YARN-4234.2015.1.patch, YARN-4234.3.patch > > > In this ticket, we will add new put APIs in timelineClient to let > clients/applications have the option to use ATS v1.5 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-4494) Recover completed apps asynchronously
[ https://issues.apache.org/jira/browse/YARN-4494?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15066536#comment-15066536 ] Daniel Templeton commented on YARN-4494: Thanks for filing the JIRA, [~hex108]. One reason why the app recovery is synchronous is that asynchronous recovery can cause the RM to tell a client that a job doesn't exist, when it really just hasn't been recovered yet, which is an issue even with completed jobs. How are you planning to handle that? > Recover completed apps asynchronously > - > > Key: YARN-4494 > URL: https://issues.apache.org/jira/browse/YARN-4494 > Project: Hadoop YARN > Issue Type: Improvement > Components: resourcemanager >Reporter: Jun Gong >Assignee: Jun Gong > > With RM HA enabled, when recovering apps, recover completed apps > asynchronously. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-4494) Recover completed apps asynchronously
[ https://issues.apache.org/jira/browse/YARN-4494?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15066534#comment-15066534 ] Naganarasimha G R commented on YARN-4494: - +1 for this approach. Possibly we can either place completed apps in another ZooKeeper node/hierarchy, or just keep completed apps in some data structure and recover them asynchronously after recovering the apps which are not yet completed. > Recover completed apps asynchronously > - > > Key: YARN-4494 > URL: https://issues.apache.org/jira/browse/YARN-4494 > Project: Hadoop YARN > Issue Type: Improvement > Components: resourcemanager >Reporter: Jun Gong >Assignee: Jun Gong > > With RM HA enabled, when recovering apps, recover completed apps > asynchronously. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
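Below is a rough sketch of the split being suggested in this thread, assuming that running apps must be recovered before the RM starts serving requests while completed apps only need to become visible eventually. All type and method names are invented for illustration; this is not the actual RMAppManager code.
{code}
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

/** Hypothetical split recovery: running apps first, completed apps in the background. */
class SplitRecoverySketch {
  interface StoredApp { boolean isCompleted(); }
  interface Recoverer { void recover(StoredApp app); }

  void recoverAll(List<StoredApp> storedApps, Recoverer recoverer) {
    ExecutorService background = Executors.newSingleThreadExecutor();
    for (StoredApp app : storedApps) {
      if (app.isCompleted()) {
        // Completed apps only need to answer queries such as
        // "yarn application -status", so defer them to a background thread.
        background.submit(() -> recoverer.recover(app));
      } else {
        // Running apps must be back in the RM context before active services
        // start, otherwise the scheduler would lose track of them.
        recoverer.recover(app);
      }
    }
    background.shutdown();
  }
}
{code}
The concern raised above would still need an answer in a real implementation, for example by having application-report requests retry or block until the background pass has recovered the requested id.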