[jira] [Commented] (YARN-4062) Add the flush and compaction functionality via coprocessors and scanners for flow run table
[ https://issues.apache.org/jira/browse/YARN-4062?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15173396#comment-15173396 ] Hadoop QA commented on YARN-4062:
(/) +1 overall
|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 0m 16s | Docker mode activated. |
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
| +1 | test4tests | 0m 0s | The patch appears to include 3 new or modified test files. |
| 0 | mvndep | 3m 50s | Maven dependency ordering for branch |
| +1 | mvninstall | 8m 3s | YARN-2928 passed |
| +1 | compile | 1m 56s | YARN-2928 passed with JDK v1.8.0_72 |
| +1 | compile | 2m 18s | YARN-2928 passed with JDK v1.7.0_95 |
| +1 | checkstyle | 0m 38s | YARN-2928 passed |
| +1 | mvnsite | 0m 56s | YARN-2928 passed |
| +1 | mvneclipse | 0m 29s | YARN-2928 passed |
| +1 | findbugs | 1m 57s | YARN-2928 passed |
| +1 | javadoc | 0m 54s | YARN-2928 passed with JDK v1.8.0_72 |
| +1 | javadoc | 3m 15s | YARN-2928 passed with JDK v1.7.0_95 |
| 0 | mvndep | 0m 12s | Maven dependency ordering for patch |
| +1 | mvninstall | 0m 45s | the patch passed |
| +1 | compile | 1m 52s | the patch passed with JDK v1.8.0_72 |
| +1 | javac | 1m 52s | the patch passed |
| +1 | compile | 2m 17s | the patch passed with JDK v1.7.0_95 |
| +1 | javac | 2m 17s | the patch passed |
| +1 | checkstyle | 0m 35s | hadoop-yarn-project/hadoop-yarn: patch generated 0 new + 210 unchanged - 1 fixed = 210 total (was 211) |
| +1 | mvnsite | 0m 51s | the patch passed |
| +1 | mvneclipse | 0m 25s | the patch passed |
| +1 | whitespace | 0m 0s | Patch has no whitespace issues. |
| +1 | findbugs | 2m 15s | the patch passed |
| +1 | javadoc | 0m 47s | the patch passed with JDK v1.8.0_72 |
| +1 | javadoc | 3m 5s | the patch passed with JDK v1.7.0_95 |
| +1 | unit | 0m 20s | hadoop-yarn-api in the patch passed with JDK v1.8.0_72. |
| +1 | unit | 4m 29s | hadoop-yarn-server-timelineservice in the patch passed with JDK v1.8.0_72. |
| +1 | unit | 0m 23s | hadoop-yarn-api in the patch passed with JDK v1.7.0_95. |
| +1 | unit | 4m 41s | hadoop-yarn-server-timelineservice in the patch passed with JDK v1.7.0_95. |
| +1 | asflicense | 0m 20s | Patch does not generate ASF License warnings. |
[jira] [Commented] (YARN-4741) RM is flooded with RMNodeFinishedContainersPulledByAMEvents in the async dispatcher event queue
[ https://issues.apache.org/jira/browse/YARN-4741?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15173395#comment-15173395 ] sandflee commented on YARN-4741: Without the fixes from YARN-3990 and YARN-3896, our RM was flooded by node up/down events and nodes were resynced. We saw the same output in the NM:
{quote}
2016-02-18 01:39:43,217 WARN org.apache.hadoop.yarn.server.nodemanager.NodeStatusUpdaterImpl: Node is out of sync with ResourceManager, hence resyncing.
2016-02-18 01:39:43,217 WARN org.apache.hadoop.yarn.server.nodemanager.NodeStatusUpdaterImpl: Message from ResourceManager: Too far behind rm response id:100314 nm response id:0
{quote}
The sequence is likely:
1. The NM restarts; ResourceTrackerService sends a NodeReconnectEvent to reset the response id to 0.
2. A nodeHeartBeat is processed before the NodeReconnectEvent is handled (the dispatcher is flooded by RMAppNodeUpdateEvents), so the RM sends a resync command to the NM because of the response id mismatch.
3. The rmNode transitions to the REBOOTED state and is removed from rmContext.activeNodes.
4. The NM registers again; a new rmNode is created, added to rmContext.activeNodes, and a NodeStartEvent is sent.
5. The scheduler completes the containers running on the node and reports them to the AM; when the AM pulls them, FINISHED_CONTAINERS_PULLED_BY_AM events are sent to the RMNode, but the RMNode is in the NEW state and couldn't handle FINISHED_CONTAINERS_PULLED_BY_AM. > RM is flooded with RMNodeFinishedContainersPulledByAMEvents in the async > dispatcher event queue > --- > > Key: YARN-4741 > URL: https://issues.apache.org/jira/browse/YARN-4741 > Project: Hadoop YARN > Issue Type: Bug > Components: resourcemanager >Affects Versions: 2.6.0 >Reporter: Sangjin Lee >Priority: Critical > Attachments: nm.log > > > We had a pretty major incident with the RM where it was continually flooded > with RMNodeFinishedContainersPulledByAMEvents in the async dispatcher event > queue. > In our setup, we had the RM HA or stateful restart *disabled*, but NM > work-preserving restart *enabled*. Due to other issues, we did a cluster-wide > NM restart. > Some time during the restart (which took multiple hours), we started seeing > the async dispatcher event queue building. Normally it would log 1,000. In > this case, it climbed all the way up to tens of millions of events. > When we looked at the RM log, it was full of the following messages: > {noformat} > 2016-02-18 01:47:29,530 ERROR > org.apache.hadoop.yarn.server.resourcemanager.rmnode.RMNodeImpl: Invalid > event FINISHED_CONTAINERS_PULLED_BY_AM on Node worker-node-foo.bar.net:8041 > 2016-02-18 01:47:29,535 ERROR > org.apache.hadoop.yarn.server.resourcemanager.rmnode.RMNodeImpl: Can't handle > this event at current state > 2016-02-18 01:47:29,535 ERROR > org.apache.hadoop.yarn.server.resourcemanager.rmnode.RMNodeImpl: Invalid > event FINISHED_CONTAINERS_PULLED_BY_AM on Node worker-node-foo.bar.net:8041 > 2016-02-18 01:47:29,538 ERROR > org.apache.hadoop.yarn.server.resourcemanager.rmnode.RMNodeImpl: Can't handle > this event at current state > 2016-02-18 01:47:29,538 ERROR > org.apache.hadoop.yarn.server.resourcemanager.rmnode.RMNodeImpl: Invalid > event FINISHED_CONTAINERS_PULLED_BY_AM on Node worker-node-foo.bar.net:8041 > {noformat} > And that node in question was restarted a few minutes earlier. > When we inspected the RM heap, it was full of > RMNodeFinishedContainersPulledByAMEvents. > Suspecting the NM work-preserving restart, we disabled it and did another > cluster-wide rolling restart. 
Initially that seemed to have helped reduce the > queue size, but the queue built back up to several millions and continued for > an extended period. We had to restart the RM to resolve the problem. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
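For readers following sandflee's sequence above, here is a minimal standalone sketch of the response-id check behind the "Too far behind" resync in step 2. All class and field names are hypothetical; the real (and more involved) logic lives in ResourceTrackerService:
{code:java}
/**
 * Minimal sketch of the heartbeat response-id check described above.
 * Hypothetical names; the real logic lives in ResourceTrackerService.
 */
public class ResponseIdCheckSketch {

  /** Last response id the RM recorded for this node; reset to 0 on reconnect. */
  private int lastRmResponseId = 100314;

  /** Returns true if the NM must resync (wipe state and re-register). */
  boolean heartbeat(int nmResponseId) {
    if (nmResponseId == lastRmResponseId) {
      lastRmResponseId++; // normal case: accept the heartbeat and advance
      return false;
    }
    if (nmResponseId == lastRmResponseId - 1) {
      return false; // duplicate of the previous heartbeat: ignore it
    }
    // Mismatch: e.g. a heartbeat from the restarted NM (id 0) is processed
    // before the NodeReconnectEvent resets the RM-side id, so the RM still
    // holds the old id and orders a resync.
    System.out.printf("Too far behind rm response id:%d nm response id:%d%n",
        lastRmResponseId, nmResponseId);
    return true;
  }

  public static void main(String[] args) {
    ResponseIdCheckSketch rm = new ResponseIdCheckSketch();
    System.out.println("resync = " + rm.heartbeat(0)); // id 0 vs 100314
  }
}
{code}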
[jira] [Updated] (YARN-4750) App metrics may not be correct when an app is recovered
[ https://issues.apache.org/jira/browse/YARN-4750?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lavkesh Lahngir updated YARN-4750: -- Description: App metrics (rather, app attempt metrics) like Vcore-seconds and MB-seconds are saved in the state store when there is an attempt state transition. Values for running attempts will be in memory and will not be saved when there is an RM restart/failover. For a recovered app, the metrics values will be reset. In that case, these values will be incomplete. Was this intentional or have we not found a correct way to fix it? was: App metrics(rather app attempt metrics) like Vcore-seconds and MB-seconds are saved in the state store when there is an attempt state transition. Values for running attempts will be in memory and will not be saved when there is an RM restart/failover. For recovered app metrics value will be reset. In that case, the value will be incomplete. Was this intentional or have we not found a correct way to fix it ? > App metrics may not be correct when an app is recovered > --- > > Key: YARN-4750 > URL: https://issues.apache.org/jira/browse/YARN-4750 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Lavkesh Lahngir >Assignee: Lavkesh Lahngir > > App metrics (rather, app attempt metrics) like Vcore-seconds and MB-seconds are > saved in the state store when there is an attempt state transition. Values > for running attempts will be in memory and will not be saved when there is an > RM restart/failover. For a recovered app, the metrics values will be reset. In that > case, these values will be incomplete. > Was this intentional or have we not found a correct way to fix it? -- This message was sent by Atlassian JIRA (v6.3.4#6332)
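To make the gap in the description above concrete: only metrics persisted at attempt state transitions survive a restart, while the running attempt's in-memory accumulation is lost. A minimal standalone sketch (hypothetical names, not the actual RMAppAttemptMetrics code):
{code:java}
import java.util.HashMap;
import java.util.Map;

/** Sketch of the recovered-metrics gap described above (hypothetical names). */
public class RecoveredMetricsSketch {
  /** Vcore-seconds persisted to the state store at attempt transitions. */
  private final Map<String, Long> storedVcoreSeconds = new HashMap<>();
  /** Vcore-seconds accumulated in memory for the running attempt. */
  private long runningAttemptVcoreSeconds;

  void finishAttempt(String attemptId) {
    // Persisted on the state transition, so it survives an RM restart.
    storedVcoreSeconds.put(attemptId, runningAttemptVcoreSeconds);
    runningAttemptVcoreSeconds = 0;
  }

  /** What a recovered app can report: stored values only. */
  long recoveredTotal() {
    return storedVcoreSeconds.values().stream()
        .mapToLong(Long::longValue).sum();
  }

  public static void main(String[] args) {
    RecoveredMetricsSketch app = new RecoveredMetricsSketch();
    app.runningAttemptVcoreSeconds = 500;
    app.finishAttempt("attempt_1");       // persisted: 500
    app.runningAttemptVcoreSeconds = 300; // attempt_2 still running
    // RM restarts here: the in-memory 300 vcore-seconds are lost.
    System.out.println("recovered = " + app.recoveredTotal()); // 500, not 800
  }
}
{code}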
[jira] [Updated] (YARN-4750) App metrics may not be correct when an app is recovered
[ https://issues.apache.org/jira/browse/YARN-4750?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lavkesh Lahngir updated YARN-4750: -- Summary: App metrics may not be correct when an app is recovered (was: App metrics may not be correct when and app is recovered) > App metrics may not be correct when an app is recovered > --- > > Key: YARN-4750 > URL: https://issues.apache.org/jira/browse/YARN-4750 > Project: Hadoop YARN > Issue Type: Bug >Reporter: Lavkesh Lahngir >Assignee: Lavkesh Lahngir > > App metrics(rather app attempt metrics) like Vcore-seconds and MB-seconds are > saved in the state store when there is an attempt state transition. Values > for running attempts will be in memory and will not be saved when there is an > RM restart/failover. For recovered app metrics value will be reset. In that > case, the value will be incomplete. > Was this intentional or have we not found a correct way to fix it ? -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (YARN-4750) App metrics may not be correct when and app is recovered
Lavkesh Lahngir created YARN-4750: - Summary: App metrics may not be correct when and app is recovered Key: YARN-4750 URL: https://issues.apache.org/jira/browse/YARN-4750 Project: Hadoop YARN Issue Type: Bug Reporter: Lavkesh Lahngir Assignee: Lavkesh Lahngir App metrics(rather app attempt metrics) like Vcore-seconds and MB-seconds are saved in the state store when there is an attempt state transition. Values for running attempts will be in memory and will not be saved when there is an RM restart/failover. For recovered app metrics value will be reset. In that case, the value will be incomplete. Was this intentional or have we not found a correct way to fix it ? -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-4734) Merge branch:YARN-3368 to trunk
[ https://issues.apache.org/jira/browse/YARN-4734?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15173332#comment-15173332 ] Wangda Tan commented on YARN-4734: -- Thanks for looking at this, [~chris.douglas]. Yes, this is a WIP patch; I was pulled into other work, so I haven't finished the whole patch yet. Will keep this JIRA updated. Regarding merging it at the top level: did you mean LICENSE.txt and BUILDING.txt? Are there any other files I need to change? > Merge branch:YARN-3368 to trunk > --- > > Key: YARN-4734 > URL: https://issues.apache.org/jira/browse/YARN-4734 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Wangda Tan >Assignee: Wangda Tan > Attachments: YARN-4734.1.patch, YARN-4734.2.patch, YARN-4734.3.patch > > > YARN-2928 branch is planned to merge back to trunk shortly, it depends on > changes of YARN-3368. This JIRA is to track the merging task. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-4715) Add support to read resource types from a config file
[ https://issues.apache.org/jira/browse/YARN-4715?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15173327#comment-15173327 ] Hadoop QA commented on YARN-4715:
(x) -1 overall
|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 0m 17s | Docker mode activated. |
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
| +1 | test4tests | 0m 0s | The patch appears to include 9 new or modified test files. |
| 0 | mvndep | 0m 24s | Maven dependency ordering for branch |
| +1 | mvninstall | 7m 3s | YARN-3926 passed |
| +1 | compile | 1m 51s | YARN-3926 passed with JDK v1.8.0_72 |
| +1 | compile | 2m 10s | YARN-3926 passed with JDK v1.7.0_95 |
| +1 | checkstyle | 0m 35s | YARN-3926 passed |
| +1 | mvnsite | 1m 2s | YARN-3926 passed |
| +1 | mvneclipse | 0m 26s | YARN-3926 passed |
| +1 | findbugs | 2m 38s | YARN-3926 passed |
| +1 | javadoc | 1m 9s | YARN-3926 passed with JDK v1.8.0_72 |
| +1 | javadoc | 3m 21s | YARN-3926 passed with JDK v1.7.0_95 |
| 0 | mvndep | 0m 10s | Maven dependency ordering for patch |
| +1 | mvninstall | 0m 54s | the patch passed |
| +1 | compile | 1m 47s | the patch passed with JDK v1.8.0_72 |
| +1 | javac | 1m 47s | the patch passed |
| +1 | compile | 2m 9s | the patch passed with JDK v1.7.0_95 |
| +1 | javac | 2m 9s | the patch passed |
| -1 | checkstyle | 0m 33s | hadoop-yarn-project/hadoop-yarn: patch generated 1 new + 228 unchanged - 3 fixed = 229 total (was 231) |
| +1 | mvnsite | 0m 57s | the patch passed |
| +1 | mvneclipse | 0m 22s | the patch passed |
| +1 | whitespace | 0m 0s | Patch has no whitespace issues. |
| +1 | xml | 0m 5s | The patch has no ill-formed XML file. |
| +1 | findbugs | 3m 7s | the patch passed |
| +1 | javadoc | 1m 2s | the patch passed with JDK v1.8.0_72 |
| +1 | javadoc | 3m 18s | the patch passed with JDK v1.7.0_95 |
| +1 | unit | 0m 22s | hadoop-yarn-api in the patch passed with JDK v1.8.0_72. |
| +1 | unit | 1m 57s | hadoop-yarn-common in the patch passed with JDK v1.8.0_72. |
| +1 | unit | 0m 25s | hadoop-yarn-api in the patch passed with JDK v1.7.0_95. |
| +1 | unit | 2m 15s | hadoop-yarn-common in the patch passed with JDK v1.7.0_95. |
| +1 | asflicense |
[jira] [Commented] (YARN-4719) Add a helper library to maintain node state and allows common queries
[ https://issues.apache.org/jira/browse/YARN-4719?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15173319#comment-15173319 ] Wangda Tan commented on YARN-4719: -- Hi [~kasha], Thanks for working on this patch, it is very useful. I took a quick look at the patch; a few comments:
- For ClusterNodeTracker#nodes, can we use a lock-free data structure to avoid copying the whole set?
- We had better not add addBlacklistedNodeIdsToList to ClusterNodeTracker since it calls into application logic; we should only include node-related state in ClusterNodeTracker.
> Add a helper library to maintain node state and allows common queries > - > > Key: YARN-4719 > URL: https://issues.apache.org/jira/browse/YARN-4719 > Project: Hadoop YARN > Issue Type: Improvement > Components: scheduler >Affects Versions: 2.8.0 >Reporter: Karthik Kambatla >Assignee: Karthik Kambatla > Attachments: yarn-4719-1.patch, yarn-4719-2.patch > > > The scheduler could use a helper library to maintain node state and allowing > matching/sorting queries. Several reasons for this: > # Today, a lot of the node state management is done separately in each > scheduler. Having a single library will take us that much closer to reducing > duplication among schedulers. > # Adding a filtering/matching API would simplify node labels and locality > significantly. > # An API that returns a sorted list for a custom comparator would help > YARN-1011 where we want to sort by allocation and utilization for > continuous/asynchronous and opportunistic scheduling respectively. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
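As an illustration of the lock-free suggestion above: a tracker backed by ConcurrentHashMap can serve point lookups, filtered queries, and comparator-sorted views without copying the whole node set under a lock. This is a sketch with hypothetical names, not the actual ClusterNodeTracker API:
{code:java}
import java.util.Comparator;
import java.util.List;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Predicate;
import java.util.stream.Collectors;

/** Sketch of a node tracker with filter/sort queries (hypothetical names). */
public class NodeTrackerSketch<N> {
  // Lock-free reads: no synchronized copy of the whole node set needed.
  private final ConcurrentHashMap<String, N> nodes = new ConcurrentHashMap<>();

  void addNode(String nodeId, N node) { nodes.put(nodeId, node); }
  void removeNode(String nodeId)      { nodes.remove(nodeId); }
  N getNode(String nodeId)            { return nodes.get(nodeId); }

  /** Filtering/matching query, e.g. for node labels or locality. */
  List<N> getNodes(Predicate<N> filter) {
    return nodes.values().stream().filter(filter).collect(Collectors.toList());
  }

  /** Sorted view for a custom comparator, e.g. allocation or utilization. */
  List<N> sortedNodes(Comparator<N> order) {
    return nodes.values().stream().sorted(order).collect(Collectors.toList());
  }

  public static void main(String[] args) {
    NodeTrackerSketch<Integer> t = new NodeTrackerSketch<>();
    t.addNode("n1", 4);   // e.g. available vcores per node
    t.addNode("n2", 16);
    System.out.println(t.getNodes(v -> v >= 8));                  // [16]
    System.out.println(t.sortedNodes(Comparator.reverseOrder())); // [16, 4]
  }
}
{code}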
[jira] [Updated] (YARN-4062) Add the flush and compaction functionality via coprocessors and scanners for flow run table
[ https://issues.apache.org/jira/browse/YARN-4062?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vrushali C updated YARN-4062: - Attachment: YARN-4062-YARN-2928.07.patch > Add the flush and compaction functionality via coprocessors and scanners for > flow run table > --- > > Key: YARN-4062 > URL: https://issues.apache.org/jira/browse/YARN-4062 > Project: Hadoop YARN > Issue Type: Sub-task > Components: timelineserver >Reporter: Vrushali C >Assignee: Vrushali C > Labels: yarn-2928-1st-milestone > Attachments: YARN-4062-YARN-2928.04.patch, > YARN-4062-YARN-2928.05.patch, YARN-4062-YARN-2928.06.patch, > YARN-4062-YARN-2928.07.patch, YARN-4062-YARN-2928.1.patch, > YARN-4062-feature-YARN-2928.01.patch, YARN-4062-feature-YARN-2928.02.patch, > YARN-4062-feature-YARN-2928.03.patch > > > As part of YARN-3901, coprocessor and scanner is being added for storing into > the flow_run table. It also needs a flush & compaction processing in the > coprocessor and perhaps a new scanner to deal with the data during flushing > and compaction stages. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
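At its core, the flush/compaction scanner described above has to collapse the many cells written per metric column into a single aggregate while HBase rewrites the data. A self-contained sketch of that coalescing step for SUM-type columns (hypothetical names; the real code wraps an HBase InternalScanner inside the coprocessor):
{code:java}
import java.util.AbstractMap.SimpleEntry;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/**
 * Sketch of the per-column coalescing a flow-run flush/compaction scanner
 * performs (hypothetical names; the real code wraps an InternalScanner).
 */
public class FlowRunCoalesceSketch {

  /** Collapse SUM-type columns: one output value per column qualifier. */
  static Map<String, Long> coalesce(List<SimpleEntry<String, Long>> rawCells) {
    Map<String, Long> out = new LinkedHashMap<>();
    for (SimpleEntry<String, Long> cell : rawCells) {
      Long prev = out.get(cell.getKey());
      out.put(cell.getKey(),
          prev == null ? cell.getValue() : prev + cell.getValue());
    }
    return out;
  }

  public static void main(String[] args) {
    // Three apps each wrote a cpu metric cell; flush emits a single sum.
    List<SimpleEntry<String, Long>> cells = Arrays.asList(
        new SimpleEntry<>("metric:cpu", 10L),
        new SimpleEntry<>("metric:cpu", 25L),
        new SimpleEntry<>("metric:cpu", 7L));
    System.out.println(coalesce(cells)); // {metric:cpu=42}
  }
}
{code}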
[jira] [Commented] (YARN-4062) Add the flush and compaction functionality via coprocessors and scanners for flow run table
[ https://issues.apache.org/jira/browse/YARN-4062?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15173303#comment-15173303 ] Hadoop QA commented on YARN-4062:
(x) -1 overall
|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 0m 13s | Docker mode activated. |
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
| +1 | test4tests | 0m 0s | The patch appears to include 3 new or modified test files. |
| 0 | mvndep | 1m 30s | Maven dependency ordering for branch |
| +1 | mvninstall | 8m 30s | YARN-2928 passed |
| +1 | compile | 2m 14s | YARN-2928 passed with JDK v1.8.0_72 |
| +1 | compile | 2m 33s | YARN-2928 passed with JDK v1.7.0_95 |
| +1 | checkstyle | 0m 38s | YARN-2928 passed |
| +1 | mvnsite | 0m 58s | YARN-2928 passed |
| +1 | mvneclipse | 0m 30s | YARN-2928 passed |
| +1 | findbugs | 2m 4s | YARN-2928 passed |
| +1 | javadoc | 0m 56s | YARN-2928 passed with JDK v1.8.0_72 |
| +1 | javadoc | 3m 23s | YARN-2928 passed with JDK v1.7.0_95 |
| 0 | mvndep | 0m 12s | Maven dependency ordering for patch |
| +1 | mvninstall | 0m 47s | the patch passed |
| +1 | compile | 2m 9s | the patch passed with JDK v1.8.0_72 |
| +1 | javac | 2m 9s | the patch passed |
| +1 | compile | 2m 29s | the patch passed with JDK v1.7.0_95 |
| +1 | javac | 2m 29s | the patch passed |
| -1 | checkstyle | 0m 38s | hadoop-yarn-project/hadoop-yarn: patch generated 2 new + 210 unchanged - 1 fixed = 212 total (was 211) |
| +1 | mvnsite | 0m 54s | the patch passed |
| +1 | mvneclipse | 0m 26s | the patch passed |
| +1 | whitespace | 0m 0s | Patch has no whitespace issues. |
| +1 | findbugs | 2m 32s | the patch passed |
| +1 | javadoc | 0m 54s | the patch passed with JDK v1.8.0_72 |
| +1 | javadoc | 3m 20s | the patch passed with JDK v1.7.0_95 |
| +1 | unit | 0m 23s | hadoop-yarn-api in the patch passed with JDK v1.8.0_72. |
| +1 | unit | 4m 36s | hadoop-yarn-server-timelineservice in the patch passed with JDK v1.8.0_72. |
| +1 | unit | 0m 25s | hadoop-yarn-api in the patch passed with JDK v1.7.0_95. |
| +1 | unit | 4m 26s | hadoop-yarn-server-timelineservice in the patch passed with JDK v1.7.0_95. |
| +1 | asflicense | 0m 21s | Patch does not generate ASF License warnings. |
[jira] [Commented] (YARN-4744) Too many signal to container failure in case of LCE
[ https://issues.apache.org/jira/browse/YARN-4744?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15173294#comment-15173294 ] Sidharta Seethana commented on YARN-4744: - Thanks, I'll assign the issue to myself - I hope to be able to get to this soon. > Too many signal to container failure in case of LCE > --- > > Key: YARN-4744 > URL: https://issues.apache.org/jira/browse/YARN-4744 > Project: Hadoop YARN > Issue Type: Bug >Affects Versions: 2.9.0 >Reporter: Bibin A Chundatt > > Install HA cluster in secure mode > Enable LCE with cgroups > Start server with dsperf user > Submit mapreduce application terasort/teragen with user yarn/dsperf > Too many signal to container failure > Submit with user the exception is thrown > {noformat} > 2014-03-02 09:20:38,689 INFO > SecurityLogger.org.apache.hadoop.security.authorize.ServiceAuthorizationManager: > Authorization successful for testing (auth:TOKEN) for protocol=interface > org.apache.hadoop.yarn.server.nodemanager.api.LocalizationProtocolPB > 2014-03-02 09:20:40,158 WARN > org.apache.hadoop.yarn.server.nodemanager.containermanager.ContainerManagerImpl: > Event EventType: KILL_CONTAINER sent to absent container > container_e02_1393731146548_0001_01_13 > 2014-03-02 09:20:43,071 INFO > org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch: > Container container_e02_1393731146548_0001_01_09 succeeded > 2014-03-02 09:20:43,072 INFO > org.apache.hadoop.yarn.server.nodemanager.containermanager.container.ContainerImpl: > Container container_e02_1393731146548_0001_01_09 transitioned from > RUNNING to EXITED_WITH_SUCCESS > 2014-03-02 09:20:43,073 INFO > org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch: > Cleaning up container container_e02_1393731146548_0001_01_09 > 2014-03-02 09:20:43,075 INFO > org.apache.hadoop.yarn.server.nodemanager.containermanager.linux.runtime.DelegatingLinuxContainerRuntime: > Using container runtime: DefaultLinuxContainerRuntime > 2014-03-02 09:20:43,081 WARN > org.apache.hadoop.yarn.server.nodemanager.containermanager.linux.privileged.PrivilegedOperationExecutor: > Shell execution returned exit code: 9. Privileged Execution Operation Output: > main : command provided 2 > main : run as user is yarn > main : requested yarn user is yarn > Full command array for failed execution: > [/opt/bibin/dsperf/HAINSTALL/install/hadoop/nodemanager/bin/container-executor, > yarn, yarn, 2, 9370, 15] > 2014-03-02 09:20:43,081 WARN > org.apache.hadoop.yarn.server.nodemanager.containermanager.linux.runtime.DefaultLinuxContainerRuntime: > Signal container failed. 
Exception: > org.apache.hadoop.yarn.server.nodemanager.containermanager.linux.privileged.PrivilegedOperationException: > ExitCodeException exitCode=9: > at > org.apache.hadoop.yarn.server.nodemanager.containermanager.linux.privileged.PrivilegedOperationExecutor.executePrivilegedOperation(PrivilegedOperationExecutor.java:173) > at > org.apache.hadoop.yarn.server.nodemanager.containermanager.linux.runtime.DefaultLinuxContainerRuntime.signalContainer(DefaultLinuxContainerRuntime.java:132) > at > org.apache.hadoop.yarn.server.nodemanager.containermanager.linux.runtime.DelegatingLinuxContainerRuntime.signalContainer(DelegatingLinuxContainerRuntime.java:109) > at > org.apache.hadoop.yarn.server.nodemanager.LinuxContainerExecutor.signalContainer(LinuxContainerExecutor.java:513) > at > org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.cleanupContainer(ContainerLaunch.java:520) > at > org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainersLauncher.handle(ContainersLauncher.java:139) > at > org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainersLauncher.handle(ContainersLauncher.java:55) > at > org.apache.hadoop.yarn.event.AsyncDispatcher.dispatch(AsyncDispatcher.java:184) > at > org.apache.hadoop.yarn.event.AsyncDispatcher$1.run(AsyncDispatcher.java:110) > at java.lang.Thread.run(Thread.java:745) > Caused by: ExitCodeException exitCode=9: > at org.apache.hadoop.util.Shell.runCommand(Shell.java:927) > at org.apache.hadoop.util.Shell.run(Shell.java:838) > at > org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:1117) > at > org.apache.hadoop.yarn.server.nodemanager.containermanager.linux.privileged.PrivilegedOperationExecutor.executePrivilegedOperation(PrivilegedOperationExecutor.java:150) > ... 9 more > 2014-03-02 09:20:43,113 INFO > org.apache.hadoop.yarn.server.nodemanager.NMAuditLogger: USER=yarn > OPERATION=Container Finished - SucceededTARGET=ContainerImpl > RESULT=SUC
[jira] [Assigned] (YARN-4744) Too many signal to container failure in case of LCE
[ https://issues.apache.org/jira/browse/YARN-4744?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sidharta Seethana reassigned YARN-4744: --- Assignee: Sidharta Seethana > Too many signal to container failure in case of LCE > --- > > Key: YARN-4744 > URL: https://issues.apache.org/jira/browse/YARN-4744 > Project: Hadoop YARN > Issue Type: Bug >Affects Versions: 2.9.0 >Reporter: Bibin A Chundatt >Assignee: Sidharta Seethana > > Install HA cluster in secure mode > Enable LCE with cgroups > Start server with dsperf user > Submit mapreduce application terasort/teragen with user yarn/dsperf > Too many signal to container failure > Submit with user the exception is thrown > {noformat} > 2014-03-02 09:20:38,689 INFO > SecurityLogger.org.apache.hadoop.security.authorize.ServiceAuthorizationManager: > Authorization successful for testing (auth:TOKEN) for protocol=interface > org.apache.hadoop.yarn.server.nodemanager.api.LocalizationProtocolPB > 2014-03-02 09:20:40,158 WARN > org.apache.hadoop.yarn.server.nodemanager.containermanager.ContainerManagerImpl: > Event EventType: KILL_CONTAINER sent to absent container > container_e02_1393731146548_0001_01_13 > 2014-03-02 09:20:43,071 INFO > org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch: > Container container_e02_1393731146548_0001_01_09 succeeded > 2014-03-02 09:20:43,072 INFO > org.apache.hadoop.yarn.server.nodemanager.containermanager.container.ContainerImpl: > Container container_e02_1393731146548_0001_01_09 transitioned from > RUNNING to EXITED_WITH_SUCCESS > 2014-03-02 09:20:43,073 INFO > org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch: > Cleaning up container container_e02_1393731146548_0001_01_09 > 2014-03-02 09:20:43,075 INFO > org.apache.hadoop.yarn.server.nodemanager.containermanager.linux.runtime.DelegatingLinuxContainerRuntime: > Using container runtime: DefaultLinuxContainerRuntime > 2014-03-02 09:20:43,081 WARN > org.apache.hadoop.yarn.server.nodemanager.containermanager.linux.privileged.PrivilegedOperationExecutor: > Shell execution returned exit code: 9. Privileged Execution Operation Output: > main : command provided 2 > main : run as user is yarn > main : requested yarn user is yarn > Full command array for failed execution: > [/opt/bibin/dsperf/HAINSTALL/install/hadoop/nodemanager/bin/container-executor, > yarn, yarn, 2, 9370, 15] > 2014-03-02 09:20:43,081 WARN > org.apache.hadoop.yarn.server.nodemanager.containermanager.linux.runtime.DefaultLinuxContainerRuntime: > Signal container failed. 
Exception: > org.apache.hadoop.yarn.server.nodemanager.containermanager.linux.privileged.PrivilegedOperationException: > ExitCodeException exitCode=9: > at > org.apache.hadoop.yarn.server.nodemanager.containermanager.linux.privileged.PrivilegedOperationExecutor.executePrivilegedOperation(PrivilegedOperationExecutor.java:173) > at > org.apache.hadoop.yarn.server.nodemanager.containermanager.linux.runtime.DefaultLinuxContainerRuntime.signalContainer(DefaultLinuxContainerRuntime.java:132) > at > org.apache.hadoop.yarn.server.nodemanager.containermanager.linux.runtime.DelegatingLinuxContainerRuntime.signalContainer(DelegatingLinuxContainerRuntime.java:109) > at > org.apache.hadoop.yarn.server.nodemanager.LinuxContainerExecutor.signalContainer(LinuxContainerExecutor.java:513) > at > org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.cleanupContainer(ContainerLaunch.java:520) > at > org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainersLauncher.handle(ContainersLauncher.java:139) > at > org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainersLauncher.handle(ContainersLauncher.java:55) > at > org.apache.hadoop.yarn.event.AsyncDispatcher.dispatch(AsyncDispatcher.java:184) > at > org.apache.hadoop.yarn.event.AsyncDispatcher$1.run(AsyncDispatcher.java:110) > at java.lang.Thread.run(Thread.java:745) > Caused by: ExitCodeException exitCode=9: > at org.apache.hadoop.util.Shell.runCommand(Shell.java:927) > at org.apache.hadoop.util.Shell.run(Shell.java:838) > at > org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:1117) > at > org.apache.hadoop.yarn.server.nodemanager.containermanager.linux.privileged.PrivilegedOperationExecutor.executePrivilegedOperation(PrivilegedOperationExecutor.java:150) > ... 9 more > 2014-03-02 09:20:43,113 INFO > org.apache.hadoop.yarn.server.nodemanager.NMAuditLogger: USER=yarn > OPERATION=Container Finished - SucceededTARGET=ContainerImpl > RESULT=SUCCESS APPID=application_1393731146548_0001 > CONTAINERI
[jira] [Updated] (YARN-4715) Add support to read resource types from a config file
[ https://issues.apache.org/jira/browse/YARN-4715?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Varun Vasudev updated YARN-4715: Attachment: YARN-4715-YARN-3926.005.patch Uploaded an earlier patch by mistake; attaching the right version. > Add support to read resource types from a config file > - > > Key: YARN-4715 > URL: https://issues.apache.org/jira/browse/YARN-4715 > Project: Hadoop YARN > Issue Type: Sub-task > Components: nodemanager, resourcemanager >Reporter: Varun Vasudev >Assignee: Varun Vasudev > Attachments: YARN-4715-YARN-3926.001.patch, > YARN-4715-YARN-3926.002.patch, YARN-4715-YARN-3926.003.patch, > YARN-4715-YARN-3926.004.patch, YARN-4715-YARN-3926.005.patch > > > This ticket is to add support to allow the RM to read the resource types to > be used for scheduling from a config file. I'll file follow up tickets to add > similar support in the NM as well as to handle the RM-NM handshake protocol > issues. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-4715) Add support to read resource types from a config file
[ https://issues.apache.org/jira/browse/YARN-4715?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Varun Vasudev updated YARN-4715: Attachment: (was: YARN-4715-YARN-3926.004.patch) > Add support to read resource types from a config file > - > > Key: YARN-4715 > URL: https://issues.apache.org/jira/browse/YARN-4715 > Project: Hadoop YARN > Issue Type: Sub-task > Components: nodemanager, resourcemanager >Reporter: Varun Vasudev >Assignee: Varun Vasudev > Attachments: YARN-4715-YARN-3926.001.patch, > YARN-4715-YARN-3926.002.patch, YARN-4715-YARN-3926.003.patch, > YARN-4715-YARN-3926.004.patch > > > This ticket is to add support to allow the RM to read the resource types to > be used for scheduling from a config file. I'll file follow up tickets to add > similar support in the NM as well as to handle the RM-NM handshake protocol > issues. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-4715) Add support to read resource types from a config file
[ https://issues.apache.org/jira/browse/YARN-4715?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Varun Vasudev updated YARN-4715: Attachment: YARN-4715-YARN-3926.004.patch Uploaded a new patch with checkstyle fixes. > Add support to read resource types from a config file > - > > Key: YARN-4715 > URL: https://issues.apache.org/jira/browse/YARN-4715 > Project: Hadoop YARN > Issue Type: Sub-task > Components: nodemanager, resourcemanager >Reporter: Varun Vasudev >Assignee: Varun Vasudev > Attachments: YARN-4715-YARN-3926.001.patch, > YARN-4715-YARN-3926.002.patch, YARN-4715-YARN-3926.003.patch, > YARN-4715-YARN-3926.004.patch > > > This ticket is to add support to allow the RM to read the resource types to > be used for scheduling from a config file. I'll file follow up tickets to add > similar support in the NM as well as to handle the RM-NM handshake protocol > issues. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
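For context on what reading resource types from a config file involves, here is a minimal standalone sketch of parsing a resource-types list plus per-resource units from a properties-style config. The property keys and format are hypothetical illustrations, not the patch's actual keys:
{code:java}
import java.io.StringReader;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Properties;

/** Sketch of reading resource types from a config file (hypothetical keys). */
public class ResourceTypesConfigSketch {

  /** Parse a "yarn.resource-types" list plus per-type ".units" sub-keys. */
  static Map<String, String> loadResourceTypes(Properties conf) {
    Map<String, String> typeToUnits = new LinkedHashMap<>();
    String list = conf.getProperty("yarn.resource-types", "");
    for (String type : list.split(",")) {
      String name = type.trim();
      if (name.isEmpty()) {
        continue; // tolerate empty entries / a missing property
      }
      String units = conf.getProperty(
          "yarn.resource-types." + name + ".units", "");
      typeToUnits.put(name, units);
    }
    return typeToUnits;
  }

  public static void main(String[] args) throws Exception {
    Properties conf = new Properties();
    conf.load(new StringReader(
        "yarn.resource-types=gpu,fpga\n"
        + "yarn.resource-types.gpu.units=G\n"));
    System.out.println(loadResourceTypes(conf)); // {gpu=G, fpga=}
  }
}
{code}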
[jira] [Updated] (YARN-4062) Add the flush and compaction functionality via coprocessors and scanners for flow run table
[ https://issues.apache.org/jira/browse/YARN-4062?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vrushali C updated YARN-4062: - Attachment: YARN-4062-YARN-2928.06.patch Uploading one more patch to fix checkstyle warnings > Add the flush and compaction functionality via coprocessors and scanners for > flow run table > --- > > Key: YARN-4062 > URL: https://issues.apache.org/jira/browse/YARN-4062 > Project: Hadoop YARN > Issue Type: Sub-task > Components: timelineserver >Reporter: Vrushali C >Assignee: Vrushali C > Labels: yarn-2928-1st-milestone > Attachments: YARN-4062-YARN-2928.04.patch, > YARN-4062-YARN-2928.05.patch, YARN-4062-YARN-2928.06.patch, > YARN-4062-YARN-2928.1.patch, YARN-4062-feature-YARN-2928.01.patch, > YARN-4062-feature-YARN-2928.02.patch, YARN-4062-feature-YARN-2928.03.patch > > > As part of YARN-3901, coprocessor and scanner is being added for storing into > the flow_run table. It also needs a flush & compaction processing in the > coprocessor and perhaps a new scanner to deal with the data during flushing > and compaction stages. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-4734) Merge branch:YARN-3368 to trunk
[ https://issues.apache.org/jira/browse/YARN-4734?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15173255#comment-15173255 ] Chris Douglas commented on YARN-4734: - {{LICENSE.txt}} looks like it is based on, or copied from Apache Tez. Could you double-check the set of modules to ensure it's correct for Hadoop? We'll also need to merge it at the top level. > Merge branch:YARN-3368 to trunk > --- > > Key: YARN-4734 > URL: https://issues.apache.org/jira/browse/YARN-4734 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Wangda Tan >Assignee: Wangda Tan > Attachments: YARN-4734.1.patch, YARN-4734.2.patch, YARN-4734.3.patch > > > YARN-2928 branch is planned to merge back to trunk shortly, it depends on > changes of YARN-3368. This JIRA is to track the merging task. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-4744) Too many signal to container failure in case of LCE
[ https://issues.apache.org/jira/browse/YARN-4744?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15173251#comment-15173251 ] Bibin A Chundatt commented on YARN-4744: Hi [~sidharta-s], thank you for looking into the issue.
# Is security enabled? Yes.
# Is this problem reproducible? Yes, always; submit a MapReduce job from the CLI.
{noformat} 2014-03-02 09:20:43,073 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch: Cleaning up container container_e02_1393731146548_0001_01_09 {noformat} Container cleanup is also getting called for containers that are already in EXITED_WITH_SUCCESS. Updated the logs in the description too. > Too many signal to container failure in case of LCE > --- > > Key: YARN-4744 > URL: https://issues.apache.org/jira/browse/YARN-4744 > Project: Hadoop YARN > Issue Type: Bug >Affects Versions: 2.9.0 >Reporter: Bibin A Chundatt > > Install HA cluster in secure mode > Enable LCE with cgroups > Start server with dsperf user > Submit mapreduce application terasort/teragen with user yarn/dsperf > Too many signal to container failure > Submit with user the exception is thrown > {noformat} > 2014-03-02 09:20:38,689 INFO > SecurityLogger.org.apache.hadoop.security.authorize.ServiceAuthorizationManager: > Authorization successful for testing (auth:TOKEN) for protocol=interface > org.apache.hadoop.yarn.server.nodemanager.api.LocalizationProtocolPB > 2014-03-02 09:20:40,158 WARN > org.apache.hadoop.yarn.server.nodemanager.containermanager.ContainerManagerImpl: > Event EventType: KILL_CONTAINER sent to absent container > container_e02_1393731146548_0001_01_13 > 2014-03-02 09:20:43,071 INFO > org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch: > Container container_e02_1393731146548_0001_01_09 succeeded > 2014-03-02 09:20:43,072 INFO > org.apache.hadoop.yarn.server.nodemanager.containermanager.container.ContainerImpl: > Container container_e02_1393731146548_0001_01_09 transitioned from > RUNNING to EXITED_WITH_SUCCESS > 2014-03-02 09:20:43,073 INFO > org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch: > Cleaning up container container_e02_1393731146548_0001_01_09 > 2014-03-02 09:20:43,075 INFO > org.apache.hadoop.yarn.server.nodemanager.containermanager.linux.runtime.DelegatingLinuxContainerRuntime: > Using container runtime: DefaultLinuxContainerRuntime > 2014-03-02 09:20:43,081 WARN > org.apache.hadoop.yarn.server.nodemanager.containermanager.linux.privileged.PrivilegedOperationExecutor: > Shell execution returned exit code: 9. Privileged Execution Operation Output: > main : command provided 2 > main : run as user is yarn > main : requested yarn user is yarn > Full command array for failed execution: > [/opt/bibin/dsperf/HAINSTALL/install/hadoop/nodemanager/bin/container-executor, > yarn, yarn, 2, 9370, 15] > 2014-03-02 09:20:43,081 WARN > org.apache.hadoop.yarn.server.nodemanager.containermanager.linux.runtime.DefaultLinuxContainerRuntime: > Signal container failed. 
Exception: > org.apache.hadoop.yarn.server.nodemanager.containermanager.linux.privileged.PrivilegedOperationException: > ExitCodeException exitCode=9: > at > org.apache.hadoop.yarn.server.nodemanager.containermanager.linux.privileged.PrivilegedOperationExecutor.executePrivilegedOperation(PrivilegedOperationExecutor.java:173) > at > org.apache.hadoop.yarn.server.nodemanager.containermanager.linux.runtime.DefaultLinuxContainerRuntime.signalContainer(DefaultLinuxContainerRuntime.java:132) > at > org.apache.hadoop.yarn.server.nodemanager.containermanager.linux.runtime.DelegatingLinuxContainerRuntime.signalContainer(DelegatingLinuxContainerRuntime.java:109) > at > org.apache.hadoop.yarn.server.nodemanager.LinuxContainerExecutor.signalContainer(LinuxContainerExecutor.java:513) > at > org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.cleanupContainer(ContainerLaunch.java:520) > at > org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainersLauncher.handle(ContainersLauncher.java:139) > at > org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainersLauncher.handle(ContainersLauncher.java:55) > at > org.apache.hadoop.yarn.event.AsyncDispatcher.dispatch(AsyncDispatcher.java:184) > at > org.apache.hadoop.yarn.event.AsyncDispatcher$1.run(AsyncDispatcher.java:110) > at java.lang.Thread.run(Thread.java:745) > Caused by: ExitCodeException exitCode=9: > at org.apache.hadoop.util.Shell.runCommand(Shell.java:927) > at org.apache.hadoop.util.Shell.run(Shell.java:838) > at > org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:11
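Bibin's observation above, that cleanup signals containers already in EXITED_WITH_SUCCESS, suggests the shape of a fix: skip or tolerate the signal when the process is already gone. A standalone sketch of that guard, with hypothetical names (not the actual ContainerLaunch code); exit code 9 is what container-executor returned in the logs:
{code:java}
/**
 * Sketch of guarding container cleanup against already-exited processes.
 * Hypothetical names; not the actual ContainerLaunch code.
 */
public class CleanupGuardSketch {

  enum ContainerState { RUNNING, EXITED_WITH_SUCCESS, DONE }

  /** Simulates container-executor: non-zero exit when the PID is gone. */
  static int signal(boolean processAlive) {
    return processAlive ? 0 : 9;
  }

  static void cleanupContainer(ContainerState state, boolean processAlive) {
    if (state != ContainerState.RUNNING) {
      // Already exited: nothing to signal, so avoid the noisy WARN entirely.
      System.out.println("skip signal, container state=" + state);
      return;
    }
    int rc = signal(processAlive);
    if (rc != 0) {
      System.out.println("Signal container failed, exit code=" + rc);
    }
  }

  public static void main(String[] args) {
    cleanupContainer(ContainerState.EXITED_WITH_SUCCESS, false); // skipped
    cleanupContainer(ContainerState.RUNNING, true);              // signalled
  }
}
{code}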
[jira] [Updated] (YARN-4744) Too many signal to container failure in case of LCE
[ https://issues.apache.org/jira/browse/YARN-4744?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bibin A Chundatt updated YARN-4744: --- Description: Install HA cluster in secure mode Enable LCE with cgroups Start server with dsperf user Submit mapreduce application terasort/teragen with user yarn/dsperf Too many signal to container failure Submit with user the exception is thrown {noformat} 2014-03-02 09:20:38,689 INFO SecurityLogger.org.apache.hadoop.security.authorize.ServiceAuthorizationManager: Authorization successful for testing (auth:TOKEN) for protocol=interface org.apache.hadoop.yarn.server.nodemanager.api.LocalizationProtocolPB 2014-03-02 09:20:40,158 WARN org.apache.hadoop.yarn.server.nodemanager.containermanager.ContainerManagerImpl: Event EventType: KILL_CONTAINER sent to absent container container_e02_1393731146548_0001_01_13 2014-03-02 09:20:43,071 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch: Container container_e02_1393731146548_0001_01_09 succeeded 2014-03-02 09:20:43,072 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.container.ContainerImpl: Container container_e02_1393731146548_0001_01_09 transitioned from RUNNING to EXITED_WITH_SUCCESS 2014-03-02 09:20:43,073 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch: Cleaning up container container_e02_1393731146548_0001_01_09 2014-03-02 09:20:43,075 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.linux.runtime.DelegatingLinuxContainerRuntime: Using container runtime: DefaultLinuxContainerRuntime 2014-03-02 09:20:43,081 WARN org.apache.hadoop.yarn.server.nodemanager.containermanager.linux.privileged.PrivilegedOperationExecutor: Shell execution returned exit code: 9. Privileged Execution Operation Output: main : command provided 2 main : run as user is yarn main : requested yarn user is yarn Full command array for failed execution: [/opt/bibin/dsperf/HAINSTALL/install/hadoop/nodemanager/bin/container-executor, yarn, yarn, 2, 9370, 15] 2014-03-02 09:20:43,081 WARN org.apache.hadoop.yarn.server.nodemanager.containermanager.linux.runtime.DefaultLinuxContainerRuntime: Signal container failed. 
Exception: org.apache.hadoop.yarn.server.nodemanager.containermanager.linux.privileged.PrivilegedOperationException: ExitCodeException exitCode=9: at org.apache.hadoop.yarn.server.nodemanager.containermanager.linux.privileged.PrivilegedOperationExecutor.executePrivilegedOperation(PrivilegedOperationExecutor.java:173) at org.apache.hadoop.yarn.server.nodemanager.containermanager.linux.runtime.DefaultLinuxContainerRuntime.signalContainer(DefaultLinuxContainerRuntime.java:132) at org.apache.hadoop.yarn.server.nodemanager.containermanager.linux.runtime.DelegatingLinuxContainerRuntime.signalContainer(DelegatingLinuxContainerRuntime.java:109) at org.apache.hadoop.yarn.server.nodemanager.LinuxContainerExecutor.signalContainer(LinuxContainerExecutor.java:513) at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.cleanupContainer(ContainerLaunch.java:520) at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainersLauncher.handle(ContainersLauncher.java:139) at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainersLauncher.handle(ContainersLauncher.java:55) at org.apache.hadoop.yarn.event.AsyncDispatcher.dispatch(AsyncDispatcher.java:184) at org.apache.hadoop.yarn.event.AsyncDispatcher$1.run(AsyncDispatcher.java:110) at java.lang.Thread.run(Thread.java:745) Caused by: ExitCodeException exitCode=9: at org.apache.hadoop.util.Shell.runCommand(Shell.java:927) at org.apache.hadoop.util.Shell.run(Shell.java:838) at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:1117) at org.apache.hadoop.yarn.server.nodemanager.containermanager.linux.privileged.PrivilegedOperationExecutor.executePrivilegedOperation(PrivilegedOperationExecutor.java:150) ... 9 more 2014-03-02 09:20:43,113 INFO org.apache.hadoop.yarn.server.nodemanager.NMAuditLogger: USER=yarn OPERATION=Container Finished - SucceededTARGET=ContainerImpl RESULT=SUCCESS APPID=application_1393731146548_0001 CONTAINERID=container_e02_1393731146548_0001_01_09 2014-03-02 09:20:43,115 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.container.ContainerImpl: Container container_e02_1393731146548_0001_01_09 transitioned from EXITED_WITH_SUCCESS to DONE 2014-03-02 09:20:43,115 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.application.ApplicationImpl: Removing container_e02_1393731146548_0001_01_09 from application application_1393731146548_0001 {noformat} Checked the same scenario in 2.7.2 version (not available)
[jira] [Commented] (YARN-2883) Queuing of container requests in the NM
[ https://issues.apache.org/jira/browse/YARN-2883?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15173238#comment-15173238 ] Hadoop QA commented on YARN-2883:
(x) -1 overall
|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 0m 11s | Docker mode activated. |
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
| +1 | test4tests | 0m 0s | The patch appears to include 4 new or modified test files. |
| 0 | mvndep | 1m 39s | Maven dependency ordering for branch |
| +1 | mvninstall | 8m 37s | yarn-2877 passed |
| +1 | compile | 2m 5s | yarn-2877 passed with JDK v1.8.0_72 |
| +1 | compile | 2m 20s | yarn-2877 passed with JDK v1.7.0_95 |
| +1 | checkstyle | 0m 41s | yarn-2877 passed |
| +1 | mvnsite | 2m 6s | yarn-2877 passed |
| +1 | mvneclipse | 0m 58s | yarn-2877 passed |
| -1 | findbugs | 0m 46s | hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-common in yarn-2877 has 3 extant Findbugs warnings. |
| -1 | findbugs | 1m 11s | hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager in yarn-2877 has 1 extant Findbugs warnings. |
| +1 | javadoc | 1m 51s | yarn-2877 passed with JDK v1.8.0_72 |
| +1 | javadoc | 4m 9s | yarn-2877 passed with JDK v1.7.0_95 |
| 0 | mvndep | 0m 11s | Maven dependency ordering for patch |
| +1 | mvninstall | 1m 39s | the patch passed |
| +1 | compile | 1m 50s | the patch passed with JDK v1.8.0_72 |
| +1 | cc | 1m 50s | the patch passed |
| +1 | javac | 1m 50s | the patch passed |
| +1 | compile | 2m 11s | the patch passed with JDK v1.7.0_95 |
| +1 | cc | 2m 11s | the patch passed |
| +1 | javac | 2m 11s | the patch passed |
| -1 | checkstyle | 0m 34s | hadoop-yarn-project/hadoop-yarn: patch generated 36 new + 232 unchanged - 2 fixed = 268 total (was 234) |
| +1 | mvnsite | 1m 49s | the patch passed |
| +1 | mvneclipse | 0m 44s | the patch passed |
| -1 | whitespace | 0m 0s | The patch has 33 line(s) that end in whitespace. Use git apply --whitespace=fix. |
| -1 | findbugs | 1m 5s | hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager generated 1 new + 0 unchanged - 0 fixed = 1 total (was 0) |
| +1 | javadoc | 1m 34s | the patch passed with JDK v1.8.0_72 |
| +1 | javadoc | 4m 5s | the patch passed with JDK v1.7.0_95 |
| +1 | unit | 0m 23s | hadoop-yarn-api in
[jira] [Updated] (YARN-4744) Too many signal to container failure in case of LCE
[ https://issues.apache.org/jira/browse/YARN-4744?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bibin A Chundatt updated YARN-4744: --- Description: Install HA cluster in secure mode Enable LCE with cgroups Start server with dsperf user Submit application with user yarn Too many signal to container failure {noformat} 2014-03-01 14:10:32,223 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.linux.runtime.DelegatingLinuxContainerRuntime: Using container runtime: DefaultLinuxContainerRuntime 2014-03-01 14:10:32,228 WARN org.apache.hadoop.yarn.server.nodemanager.containermanager.linux.privileged.PrivilegedOperationExecutor: Shell execution returned exit code: 9. Privileged Execution Operation Output: main : command provided 2 main : run as user is yarn main : requested yarn user is yarn Full command array for failed execution: [/opt/bibin/dsperf/HAINSTALL/install/hadoop/nodemanager/bin/container-executor, yarn, yarn, 2, 28575, 15] 2014-03-01 14:10:32,228 WARN org.apache.hadoop.yarn.server.nodemanager.containermanager.linux.runtime.DefaultLinuxContainerRuntime: Signal container failed. Exception: org.apache.hadoop.yarn.server.nodemanager.containermanager.linux.privileged.PrivilegedOperationException: ExitCodeException exitCode=9: at org.apache.hadoop.yarn.server.nodemanager.containermanager.linux.privileged.PrivilegedOperationExecutor.executePrivilegedOperation(PrivilegedOperationExecutor.java:173) at org.apache.hadoop.yarn.server.nodemanager.containermanager.linux.runtime.DefaultLinuxContainerRuntime.signalContainer(DefaultLinuxContainerRuntime.java:132) at org.apache.hadoop.yarn.server.nodemanager.containermanager.linux.runtime.DelegatingLinuxContainerRuntime.signalContainer(DelegatingLinuxContainerRuntime.java:109) at org.apache.hadoop.yarn.server.nodemanager.LinuxContainerExecutor.signalContainer(LinuxContainerExecutor.java:513) at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.cleanupContainer(ContainerLaunch.java:520) at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainersLauncher.handle(ContainersLauncher.java:139) at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainersLauncher.handle(ContainersLauncher.java:55) at org.apache.hadoop.yarn.event.AsyncDispatcher.dispatch(AsyncDispatcher.java:184) at org.apache.hadoop.yarn.event.AsyncDispatcher$1.run(AsyncDispatcher.java:110) at java.lang.Thread.run(Thread.java:745) Caused by: ExitCodeException exitCode=9: at org.apache.hadoop.util.Shell.runCommand(Shell.java:927) at org.apache.hadoop.util.Shell.run(Shell.java:838) at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:1117) at org.apache.hadoop.yarn.server.nodemanager.containermanager.linux.privileged.PrivilegedOperationExecutor.executePrivilegedOperation(PrivilegedOperationExecutor.java:150) ... 9 more {noformat} Checked the same scenario in 2.7.2 version (not available) was: Enable LCE with cgroups Start server with dsperf user Submit application with user yarn Too many signal to container failure {noformat} 2014-03-01 14:10:32,223 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.linux.runtime.DelegatingLinuxContainerRuntime: Using container runtime: DefaultLinuxContainerRuntime 2014-03-01 14:10:32,228 WARN org.apache.hadoop.yarn.server.nodemanager.containermanager.linux.privileged.PrivilegedOperationExecutor: Shell execution returned exit code: 9. 
Privileged Execution Operation Output: main : command provided 2 main : run as user is yarn main : requested yarn user is yarn Full command array for failed execution: [/opt/bibin/dsperf/HAINSTALL/install/hadoop/nodemanager/bin/container-executor, yarn, yarn, 2, 28575, 15] 2014-03-01 14:10:32,228 WARN org.apache.hadoop.yarn.server.nodemanager.containermanager.linux.runtime.DefaultLinuxContainerRuntime: Signal container failed. Exception: org.apache.hadoop.yarn.server.nodemanager.containermanager.linux.privileged.PrivilegedOperationException: ExitCodeException exitCode=9: at org.apache.hadoop.yarn.server.nodemanager.containermanager.linux.privileged.PrivilegedOperationExecutor.executePrivilegedOperation(PrivilegedOperationExecutor.java:173) at org.apache.hadoop.yarn.server.nodemanager.containermanager.linux.runtime.DefaultLinuxContainerRuntime.signalContainer(DefaultLinuxContainerRuntime.java:132) at org.apache.hadoop.yarn.server.nodemanager.containermanager.linux.runtime.DelegatingLinuxContainerRuntime.signalContainer(DelegatingLinuxContainerRuntime.java:109) at org.apache.hadoop.yarn.server.nodemanager.LinuxContainerExecutor.signalContainer(LinuxContainerExecutor.java:513) at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.c
[jira] [Updated] (YARN-4744) Too many signal to container failure in case of LCE
[ https://issues.apache.org/jira/browse/YARN-4744?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bibin A Chundatt updated YARN-4744: --- Description: Install HA cluster in secure mode Enable LCE with cgroups Start server with dsperf user Submit mapreduce application terasort/teragen with user yarn/dsperf Too many signal to container failure Submit with user the exception is thrown {noformat} 2014-03-01 14:10:32,223 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.linux.runtime.DelegatingLinuxContainerRuntime: Using container runtime: DefaultLinuxContainerRuntime 2014-03-01 14:10:32,228 WARN org.apache.hadoop.yarn.server.nodemanager.containermanager.linux.privileged.PrivilegedOperationExecutor: Shell execution returned exit code: 9. Privileged Execution Operation Output: main : command provided 2 main : run as user is yarn main : requested yarn user is yarn Full command array for failed execution: [/opt/bibin/dsperf/HAINSTALL/install/hadoop/nodemanager/bin/container-executor, yarn, yarn, 2, 28575, 15] 2014-03-01 14:10:32,228 WARN org.apache.hadoop.yarn.server.nodemanager.containermanager.linux.runtime.DefaultLinuxContainerRuntime: Signal container failed. Exception: org.apache.hadoop.yarn.server.nodemanager.containermanager.linux.privileged.PrivilegedOperationException: ExitCodeException exitCode=9: at org.apache.hadoop.yarn.server.nodemanager.containermanager.linux.privileged.PrivilegedOperationExecutor.executePrivilegedOperation(PrivilegedOperationExecutor.java:173) at org.apache.hadoop.yarn.server.nodemanager.containermanager.linux.runtime.DefaultLinuxContainerRuntime.signalContainer(DefaultLinuxContainerRuntime.java:132) at org.apache.hadoop.yarn.server.nodemanager.containermanager.linux.runtime.DelegatingLinuxContainerRuntime.signalContainer(DelegatingLinuxContainerRuntime.java:109) at org.apache.hadoop.yarn.server.nodemanager.LinuxContainerExecutor.signalContainer(LinuxContainerExecutor.java:513) at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.cleanupContainer(ContainerLaunch.java:520) at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainersLauncher.handle(ContainersLauncher.java:139) at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainersLauncher.handle(ContainersLauncher.java:55) at org.apache.hadoop.yarn.event.AsyncDispatcher.dispatch(AsyncDispatcher.java:184) at org.apache.hadoop.yarn.event.AsyncDispatcher$1.run(AsyncDispatcher.java:110) at java.lang.Thread.run(Thread.java:745) Caused by: ExitCodeException exitCode=9: at org.apache.hadoop.util.Shell.runCommand(Shell.java:927) at org.apache.hadoop.util.Shell.run(Shell.java:838) at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:1117) at org.apache.hadoop.yarn.server.nodemanager.containermanager.linux.privileged.PrivilegedOperationExecutor.executePrivilegedOperation(PrivilegedOperationExecutor.java:150) ... 
9 more {noformat} Checked the same scenario in 2.7.2 version (not available) was: Install HA cluster in secure mode Enable LCE with cgroups Start server with dsperf user Submit application with user yarn Too many signal to container failure {noformat} 2014-03-01 14:10:32,223 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.linux.runtime.DelegatingLinuxContainerRuntime: Using container runtime: DefaultLinuxContainerRuntime 2014-03-01 14:10:32,228 WARN org.apache.hadoop.yarn.server.nodemanager.containermanager.linux.privileged.PrivilegedOperationExecutor: Shell execution returned exit code: 9. Privileged Execution Operation Output: main : command provided 2 main : run as user is yarn main : requested yarn user is yarn Full command array for failed execution: [/opt/bibin/dsperf/HAINSTALL/install/hadoop/nodemanager/bin/container-executor, yarn, yarn, 2, 28575, 15] 2014-03-01 14:10:32,228 WARN org.apache.hadoop.yarn.server.nodemanager.containermanager.linux.runtime.DefaultLinuxContainerRuntime: Signal container failed. Exception: org.apache.hadoop.yarn.server.nodemanager.containermanager.linux.privileged.PrivilegedOperationException: ExitCodeException exitCode=9: at org.apache.hadoop.yarn.server.nodemanager.containermanager.linux.privileged.PrivilegedOperationExecutor.executePrivilegedOperation(PrivilegedOperationExecutor.java:173) at org.apache.hadoop.yarn.server.nodemanager.containermanager.linux.runtime.DefaultLinuxContainerRuntime.signalContainer(DefaultLinuxContainerRuntime.java:132) at org.apache.hadoop.yarn.server.nodemanager.containermanager.linux.runtime.DelegatingLinuxContainerRuntime.signalContainer(DelegatingLinuxContainerRuntime.java:109) at org.apache.hadoop.yarn.server.nodemanager.LinuxContainerExecutor.signalContainer(LinuxContainerExecu
[jira] [Commented] (YARN-4749) Generalize config file handling in container-executor
[ https://issues.apache.org/jira/browse/YARN-4749?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15173176#comment-15173176 ] Hadoop QA commented on YARN-4749: - | (/) *{color:green}+1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 20s {color} | {color:blue} Docker mode activated. {color} | | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s {color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s {color} | {color:green} The patch appears to include 1 new or modified test files. {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 6m 51s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 25s {color} | {color:green} trunk passed with JDK v1.8.0_72 {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 28s {color} | {color:green} trunk passed with JDK v1.7.0_95 {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 30s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 15s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 25s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 22s {color} | {color:green} the patch passed with JDK v1.8.0_72 {color} | | {color:green}+1{color} | {color:green} cc {color} | {color:green} 0m 22s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 22s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 25s {color} | {color:green} the patch passed with JDK v1.7.0_95 {color} | | {color:green}+1{color} | {color:green} cc {color} | {color:green} 0m 25s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 25s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 28s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 11s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s {color} | {color:green} Patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 9m 9s {color} | {color:green} hadoop-yarn-server-nodemanager in the patch passed with JDK v1.8.0_72. {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 9m 45s {color} | {color:green} hadoop-yarn-server-nodemanager in the patch passed with JDK v1.7.0_95. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 18s {color} | {color:green} Patch does not generate ASF License warnings. 
{color} | | {color:black}{color} | {color:black} {color} | {color:black} 30m 12s {color} | {color:black} {color} | \\ \\ || Subsystem || Report/Notes || | Docker | Image:yetus/hadoop:0ca8df7 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12790649/YARN-4749.001.patch | | JIRA Issue | YARN-4749 | | Optional Tests | asflicense compile cc mvnsite javac unit | | uname | Linux b793f64df541 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh | | git revision | trunk / d93c22e | | Default Java | 1.7.0_95 | | Multi-JDK versions | /usr/lib/jvm/java-8-oracle:1.8.0_72 /usr/lib/jvm/java-7-openjdk-amd64:1.7.0_95 | | JDK v1.7.0_95 Test Results | https://builds.apache.org/job/PreCommit-YARN-Build/10671/testReport/ | | modules | C: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager U: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager | | Console output | https://builds.apache.org/job/PreCommit-YARN-Build/10671/console | | Powered by | Apache Yetus 0.3.0-SNAPSHOT http://yetus.apache.org | This message was automatically generated. > Generalize config file handling in container-executor > - > > Key: YARN-4749 > URL: https://issues.apache.org/jira/browse/YARN-4749 > Project: Hadoop YARN > Issue Type: Sub-task > Components: nodemanager >
[jira] [Commented] (YARN-4743) ResourceManager crash because TimSort
[ https://issues.apache.org/jira/browse/YARN-4743?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15173162#comment-15173162 ] Karthik Kambatla commented on YARN-4743: [~gzh1992n] - thanks for reporting and working on this. I haven't had a chance to look at it closely enough. Will take me a couple of days to do so. On the surface, it seems benign to sort a snapshot of Schedulables. The other way would be to use ReadWriteLock in FSQueue: getters would all try to get a readLock while the sort holds the write lock? > ResourceManager crash because TimSort > - > > Key: YARN-4743 > URL: https://issues.apache.org/jira/browse/YARN-4743 > Project: Hadoop YARN > Issue Type: Bug > Components: fairscheduler >Affects Versions: 2.6.4 >Reporter: Zephyr Guo > > {code} > 2016-02-26 14:08:50,821 FATAL > org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Error in > handling event type NODE_UPDATE to the scheduler > java.lang.IllegalArgumentException: Comparison method violates its general > contract! > at java.util.TimSort.mergeHi(TimSort.java:868) > at java.util.TimSort.mergeAt(TimSort.java:485) > at java.util.TimSort.mergeCollapse(TimSort.java:410) > at java.util.TimSort.sort(TimSort.java:214) > at java.util.TimSort.sort(TimSort.java:173) > at java.util.Arrays.sort(Arrays.java:659) > at java.util.Collections.sort(Collections.java:217) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FSLeafQueue.assignContainer(FSLeafQueue.java:316) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FSParentQueue.assignContainer(FSParentQueue.java:240) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler.attemptScheduling(FairScheduler.java:1091) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler.nodeUpdate(FairScheduler.java:989) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler.handle(FairScheduler.java:1185) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler.handle(FairScheduler.java:112) > at > org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$SchedulerEventDispatcher$EventProcessor.run(ResourceManager.java:684) > at java.lang.Thread.run(Thread.java:745) > 2016-02-26 14:08:50,822 INFO > org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Exiting, bbye.. > {code} > Actually, this issue found in 2.6.0-cdh5.4.7. > I think the cause is that we modify {{Resouce}} while we are sorting > {{runnableApps}}. > {code:title=FSLeafQueue.java} > Comparator comparator = policy.getComparator(); > writeLock.lock(); > try { > Collections.sort(runnableApps, comparator); > } finally { > writeLock.unlock(); > } > readLock.lock(); > {code} > {code:title=FairShareComparator} > public int compare(Schedulable s1, Schedulable s2) { > .. > s1.getResourceUsage(), minShare1); > boolean s2Needy = Resources.lessThan(RESOURCE_CALCULATOR, null, > s2.getResourceUsage(), minShare2); > minShareRatio1 = (double) s1.getResourceUsage().getMemory() > / Resources.max(RESOURCE_CALCULATOR, null, minShare1, > ONE).getMemory(); > minShareRatio2 = (double) s2.getResourceUsage().getMemory() > / Resources.max(RESOURCE_CALCULATOR, null, minShare2, > ONE).getMemory(); > .. > {code} > {{getResourceUsage}} will return current Resource. The current Resource is > unstable. 
> {code:title=FSAppAttempt.java} > @Override > public Resource getResourceUsage() { > // Here the getPreemptedResources() always return zero, except in > // a preemption round > return Resources.subtract(getCurrentConsumption(), > getPreemptedResources()); > } > {code} > {code:title=SchedulerApplicationAttempt} > public Resource getCurrentConsumption() { > return currentConsumption; > } > // This method may modify current Resource. > public synchronized void recoverContainer(RMContainer rmContainer) { > .. > Resources.addTo(currentConsumption, rmContainer.getContainer() > .getResource()); > .. > } > {code} > I suggest that use stable Resource in comparator. > Is there something i think wrong? -- This message was sent by Atlassian JIRA (v6.3.4#6332)
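To make the locking alternative above concrete, here is a minimal sketch, assuming a single {{ReentrantReadWriteLock}} guards every read and mutation of the usage; the class, field and method names are illustrative, not FSQueue's actual API:

{code:title=ReadWriteLock sketch (illustrative; not the committed patch)}
import java.util.Collections;
import java.util.Comparator;
import java.util.List;
import java.util.concurrent.locks.ReentrantReadWriteLock;

import org.apache.hadoop.yarn.api.records.Resource;
import org.apache.hadoop.yarn.util.resource.Resources;

public class LockedUsage {
  private final ReentrantReadWriteLock usageLock = new ReentrantReadWriteLock();
  private final Resource currentConsumption = Resources.createResource(0);

  // Readers take the read lock. A thread already holding the write lock
  // (the sorting thread) may still acquire the read lock, so a comparator
  // calling this during the sort below does not deadlock.
  public Resource getResourceUsage() {
    usageLock.readLock().lock();
    try {
      return Resources.clone(currentConsumption);
    } finally {
      usageLock.readLock().unlock();
    }
  }

  // All mutations take the write lock, so they are excluded while the
  // sort is running and the comparator sees stable values throughout.
  public void addConsumption(Resource delta) {
    usageLock.writeLock().lock();
    try {
      Resources.addTo(currentConsumption, delta);
    } finally {
      usageLock.writeLock().unlock();
    }
  }

  public <T> void sortStably(List<T> apps, Comparator<T> comparator) {
    usageLock.writeLock().lock();
    try {
      Collections.sort(apps, comparator);
    } finally {
      usageLock.writeLock().unlock();
    }
  }
}
{code}

The trade-off is that every usage update then contends with scheduling-time sorts, which is why sorting a snapshot of the Schedulables is being weighed against it.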
[jira] [Commented] (YARN-4517) [YARN-3368] Add nodes page
[ https://issues.apache.org/jira/browse/YARN-4517?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15173158#comment-15173158 ] Hadoop QA commented on YARN-4517: - | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 12s {color} | {color:blue} Docker mode activated. {color} | | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s {color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s {color} | {color:green} Patch has no whitespace issues. {color} | | {color:red}-1{color} | {color:red} asflicense {color} | {color:red} 0m 34s {color} | {color:red} Patch generated 95 ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black} 1m 21s {color} | {color:black} {color} | \\ \\ || Subsystem || Report/Notes || | Docker | Image:yetus/hadoop:0ca8df7 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12790648/YARN-4517-YARN-3368.02.patch | | JIRA Issue | YARN-4517 | | Optional Tests | asflicense | | uname | Linux 4b7d3a22c9be 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh | | git revision | YARN-3368 / 37455e7 | | asflicense | https://builds.apache.org/job/PreCommit-YARN-Build/10672/artifact/patchprocess/patch-asflicense-problems.txt | | modules | C: hadoop-yarn-project/hadoop-yarn U: hadoop-yarn-project/hadoop-yarn | | Console output | https://builds.apache.org/job/PreCommit-YARN-Build/10672/console | | Powered by | Apache Yetus 0.3.0-SNAPSHOT http://yetus.apache.org | This message was automatically generated. > [YARN-3368] Add nodes page > -- > > Key: YARN-4517 > URL: https://issues.apache.org/jira/browse/YARN-4517 > Project: Hadoop YARN > Issue Type: Sub-task > Components: yarn >Reporter: Wangda Tan >Assignee: Varun Saxena > Labels: webui > Attachments: (21-Feb-2016)yarn-ui-screenshots.zip, > Screenshot_after_4709.png, Screenshot_after_4709_1.png, > YARN-4517-YARN-3368.01.patch, YARN-4517-YARN-3368.02.patch > > > We need nodes page added to next generation web UI, similar to existing > RM/nodes page. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-4743) ResourceManager crash because TimSort
[ https://issues.apache.org/jira/browse/YARN-4743?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15173146#comment-15173146 ] Zephyr Guo commented on YARN-4743: -- {quote} I think that DRF comparator is not transitive with my intuition. {quote} I think that's right, [~ozawa]. FairShareComparator uses {{getResourceUsage()}}, {{getDemand()}} and {{getMinShare()}} to implement {{compare(Schedulable s1, Schedulable s2)}}. All three methods must return the same Resource for the whole duration of the sort; otherwise transitivity is broken. How about adding a snapshot feature to Schedulable? We snapshot each Schedulable before sorting, then sort but use the snapshot Resource in the comparator. The result of the sort will be very close to the real situation, because sorting is very fast. > ResourceManager crash because TimSort > - > > Key: YARN-4743 > URL: https://issues.apache.org/jira/browse/YARN-4743 > Project: Hadoop YARN > Issue Type: Bug > Components: fairscheduler >Affects Versions: 2.6.4 >Reporter: Zephyr Guo > > {code} > 2016-02-26 14:08:50,821 FATAL > org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Error in > handling event type NODE_UPDATE to the scheduler > java.lang.IllegalArgumentException: Comparison method violates its general > contract! > at java.util.TimSort.mergeHi(TimSort.java:868) > at java.util.TimSort.mergeAt(TimSort.java:485) > at java.util.TimSort.mergeCollapse(TimSort.java:410) > at java.util.TimSort.sort(TimSort.java:214) > at java.util.TimSort.sort(TimSort.java:173) > at java.util.Arrays.sort(Arrays.java:659) > at java.util.Collections.sort(Collections.java:217) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FSLeafQueue.assignContainer(FSLeafQueue.java:316) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FSParentQueue.assignContainer(FSParentQueue.java:240) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler.attemptScheduling(FairScheduler.java:1091) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler.nodeUpdate(FairScheduler.java:989) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler.handle(FairScheduler.java:1185) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler.handle(FairScheduler.java:112) > at > org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$SchedulerEventDispatcher$EventProcessor.run(ResourceManager.java:684) > at java.lang.Thread.run(Thread.java:745) > 2016-02-26 14:08:50,822 INFO > org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Exiting, bbye.. > {code} > Actually, this issue found in 2.6.0-cdh5.4.7. > I think the cause is that we modify {{Resouce}} while we are sorting > {{runnableApps}}. > {code:title=FSLeafQueue.java} > Comparator comparator = policy.getComparator(); > writeLock.lock(); > try { > Collections.sort(runnableApps, comparator); > } finally { > writeLock.unlock(); > } > readLock.lock(); > {code} > {code:title=FairShareComparator} > public int compare(Schedulable s1, Schedulable s2) { > .. > s1.getResourceUsage(), minShare1); > boolean s2Needy = Resources.lessThan(RESOURCE_CALCULATOR, null, > s2.getResourceUsage(), minShare2); > minShareRatio1 = (double) s1.getResourceUsage().getMemory() > / Resources.max(RESOURCE_CALCULATOR, null, minShare1, > ONE).getMemory(); > minShareRatio2 = (double) s2.getResourceUsage().getMemory() > / Resources.max(RESOURCE_CALCULATOR, null, minShare2, > ONE).getMemory(); > ..
> {code} > {{getResourceUsage}} will return current Resource. The current Resource is > unstable. > {code:title=FSAppAttempt.java} > @Override > public Resource getResourceUsage() { > // Here the getPreemptedResources() always return zero, except in > // a preemption round > return Resources.subtract(getCurrentConsumption(), > getPreemptedResources()); > } > {code} > {code:title=SchedulerApplicationAttempt} > public Resource getCurrentConsumption() { > return currentConsumption; > } > // This method may modify current Resource. > public synchronized void recoverContainer(RMContainer rmContainer) { > .. > Resources.addTo(currentConsumption, rmContainer.getContainer() > .getResource()); > .. > } > {code} > I suggest that use stable Resource in comparator. > Is there something i think wrong? -- This message was sent by Atlassian JIRA (v6.3.4#6332)
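As a rough illustration of the snapshot idea above, a sketch only, assuming that freezing usage, demand and min share per Schedulable is enough for the comparator; {{SchedulableSnapshot}} and {{sortOnSnapshot}} are made-up names, not the patch:

{code:title=Snapshot-before-sort sketch (names are illustrative)}
import java.util.ArrayList;
import java.util.Collections;
import java.util.Comparator;
import java.util.List;

import org.apache.hadoop.yarn.api.records.Resource;
import org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.Schedulable;
import org.apache.hadoop.yarn.util.resource.Resources;

final class SchedulableSnapshot {
  final Schedulable app;
  final Resource usage;     // frozen copies: the comparator reads only
  final Resource demand;    // these, so its answers cannot change while
  final Resource minShare;  // TimSort is running

  SchedulableSnapshot(Schedulable app) {
    this.app = app;
    this.usage = Resources.clone(app.getResourceUsage());
    this.demand = Resources.clone(app.getDemand());
    this.minShare = Resources.clone(app.getMinShare());
  }

  static List<Schedulable> sortOnSnapshot(List<Schedulable> runnableApps,
      Comparator<SchedulableSnapshot> comparator) {
    List<SchedulableSnapshot> snapshots =
        new ArrayList<SchedulableSnapshot>(runnableApps.size());
    for (Schedulable s : runnableApps) {
      snapshots.add(new SchedulableSnapshot(s));
    }
    Collections.sort(snapshots, comparator);
    List<Schedulable> sorted = new ArrayList<Schedulable>(snapshots.size());
    for (SchedulableSnapshot snap : snapshots) {
      sorted.add(snap.app);
    }
    return sorted;
  }
}
{code}

The ordering is computed on slightly stale data, but as the comment notes, the sort itself is fast, so the snapshot stays close to the live state.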
[jira] [Commented] (YARN-4719) Add a helper library to maintain node state and allows common queries
[ https://issues.apache.org/jira/browse/YARN-4719?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15173137#comment-15173137 ] Karthik Kambatla commented on YARN-4719: bq. May also be useful to expose functionality in the ClusterNodeTracker to give list of nodes in a rack, nodes that match a label expression etc. (This can possibly be another JIRA too) Absolutely. I wanted to move all existing common functionality into this class in this JIRA, so we can add other helper functionality in the future. bq. I see that you are triggering the update thread on nodeRemoval too. I understand this might generally be useful (since the node removal might change the node ordering), but given this is a refactoring patch, maybe address that separately ? removeNode does a triggerUpdate today too. I just moved it a little. Will fix the import and the test failures here in the next iteration. > Add a helper library to maintain node state and allows common queries > - > > Key: YARN-4719 > URL: https://issues.apache.org/jira/browse/YARN-4719 > Project: Hadoop YARN > Issue Type: Improvement > Components: scheduler >Affects Versions: 2.8.0 >Reporter: Karthik Kambatla >Assignee: Karthik Kambatla > Attachments: yarn-4719-1.patch, yarn-4719-2.patch > > > The scheduler could use a helper library to maintain node state and allowing > matching/sorting queries. Several reasons for this: > # Today, a lot of the node state management is done separately in each > scheduler. Having a single library will take us that much closer to reducing > duplication among schedulers. > # Adding a filtering/matching API would simplify node labels and locality > significantly. > # An API that returns a sorted list for a custom comparator would help > YARN-1011 where we want to sort by allocation and utilization for > continuous/asynchronous and opportunistic scheduling respectively. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
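For readers following the thread, the shape under discussion is roughly the following; this is an API sketch only, with signatures guessed from the comments rather than taken from the attached patch, and {{java.util.function.Predicate}} standing in for whatever filter interface the patch defines:

{code:title=Node-tracking helper (API sketch; signatures are assumptions)}
import java.util.ArrayList;
import java.util.Collections;
import java.util.Comparator;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Predicate;

import org.apache.hadoop.yarn.api.records.NodeId;
import org.apache.hadoop.yarn.server.resourcemanager.scheduler.SchedulerNode;

public class NodeTrackerSketch<N extends SchedulerNode> {
  private final Map<NodeId, N> nodes = new ConcurrentHashMap<NodeId, N>();

  public void addNode(N node) {
    nodes.put(node.getNodeID(), node);
  }

  public N removeNode(NodeId nodeId) {
    return nodes.remove(nodeId);
  }

  // Filtering/matching query: rack membership, label expressions and
  // locality checks can all be expressed as caller-supplied predicates.
  public List<N> getNodes(Predicate<N> filter) {
    List<N> matching = new ArrayList<N>();
    for (N node : nodes.values()) {
      if (filter == null || filter.test(node)) {
        matching.add(node);
      }
    }
    return matching;
  }

  // Sorted view for a caller-supplied comparator, e.g. by allocation or
  // utilization as needed for YARN-1011.
  public List<N> sortedNodeList(Comparator<N> comparator) {
    List<N> sorted = new ArrayList<N>(nodes.values());
    Collections.sort(sorted, comparator);
    return sorted;
  }
}
{code}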
[jira] [Updated] (YARN-4749) Generalize config file handling in container-executor
[ https://issues.apache.org/jira/browse/YARN-4749?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sidharta Seethana updated YARN-4749: Attachment: YARN-4749.001.patch Uploaded a patch that makes config parsing reusable. > Generalize config file handling in container-executor > - > > Key: YARN-4749 > URL: https://issues.apache.org/jira/browse/YARN-4749 > Project: Hadoop YARN > Issue Type: Sub-task > Components: nodemanager >Reporter: Sidharta Seethana >Assignee: Sidharta Seethana > Attachments: YARN-4749.001.patch > > > The current implementation of container-executor already supports parsing of > key value pairs from a config file. However, it is currently restricted to > {{container-executor.cfg}} and cannot be reused for parsing additional > config/command files. Generalizing this is a required step for YARN-4245. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
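The change itself is in the C container-executor, but the generalization is easy to picture. Below is a minimal sketch, in Java for consistency with the other snippets in this digest; the class name and error handling are illustrative assumptions, not the attached C patch. The point is simply to take the file as a parameter instead of hard-coding {{container-executor.cfg}}:

{code:title=Generalized key=value parsing (illustrative sketch)}
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.LinkedHashMap;
import java.util.Map;

public final class KeyValueConfigParser {
  // Parse any "key=value" config/command file, not just
  // container-executor.cfg; '#' starts a comment, blank lines are skipped.
  public static Map<String, String> parse(Path file) throws IOException {
    Map<String, String> conf = new LinkedHashMap<String, String>();
    for (String line : Files.readAllLines(file, StandardCharsets.UTF_8)) {
      String trimmed = line.trim();
      if (trimmed.isEmpty() || trimmed.startsWith("#")) {
        continue;
      }
      int eq = trimmed.indexOf('=');
      if (eq <= 0) {
        throw new IOException("Malformed line in " + file + ": " + line);
      }
      conf.put(trimmed.substring(0, eq).trim(),
          trimmed.substring(eq + 1).trim());
    }
    return conf;
  }
}
{code}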
[jira] [Updated] (YARN-4517) [YARN-3368] Add nodes page
[ https://issues.apache.org/jira/browse/YARN-4517?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Varun Saxena updated YARN-4517: --- Attachment: (was: YARN-4517-YARN-3368.02.patch) > [YARN-3368] Add nodes page > -- > > Key: YARN-4517 > URL: https://issues.apache.org/jira/browse/YARN-4517 > Project: Hadoop YARN > Issue Type: Sub-task > Components: yarn >Reporter: Wangda Tan >Assignee: Varun Saxena > Labels: webui > Attachments: (21-Feb-2016)yarn-ui-screenshots.zip, > Screenshot_after_4709.png, Screenshot_after_4709_1.png, > YARN-4517-YARN-3368.01.patch, YARN-4517-YARN-3368.02.patch > > > We need nodes page added to next generation web UI, similar to existing > RM/nodes page. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-4517) [YARN-3368] Add nodes page
[ https://issues.apache.org/jira/browse/YARN-4517?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Varun Saxena updated YARN-4517: --- Attachment: YARN-4517-YARN-3368.02.patch Sorry I had left my local host and port configurations in the patch. Updating the patch after removing them. > [YARN-3368] Add nodes page > -- > > Key: YARN-4517 > URL: https://issues.apache.org/jira/browse/YARN-4517 > Project: Hadoop YARN > Issue Type: Sub-task > Components: yarn >Reporter: Wangda Tan >Assignee: Varun Saxena > Labels: webui > Attachments: (21-Feb-2016)yarn-ui-screenshots.zip, > Screenshot_after_4709.png, Screenshot_after_4709_1.png, > YARN-4517-YARN-3368.01.patch, YARN-4517-YARN-3368.02.patch > > > We need nodes page added to next generation web UI, similar to existing > RM/nodes page. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-4746) yarn web services should convert parse failures of appId to 400
[ https://issues.apache.org/jira/browse/YARN-4746?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15173124#comment-15173124 ] Varun Saxena commented on YARN-4746: [~bibinchundatt], although it's not mentioned as part of this JIRA, I think we can extend this check to app attempt IDs and container IDs as well. > yarn web services should convert parse failures of appId to 400 > --- > > Key: YARN-4746 > URL: https://issues.apache.org/jira/browse/YARN-4746 > Project: Hadoop YARN > Issue Type: Bug > Components: webapp >Affects Versions: 2.8.0 >Reporter: Steve Loughran >Priority: Minor > Attachments: 0001-YARN-4746.patch > > > I'm seeing somewhere in the WS API tests of mine an error with exception > conversion of a bad app ID sent in as an argument to a GET. I know it's in > ATS, but a scan of the core RM web services implies a same problem > {{WebServices.parseApplicationId()}} uses {{ConverterUtils.toApplicationId}} > to convert an argument; this throws IllegalArgumentException, which is then > handled somewhere by jetty as a 500 error. > In fact, it's a bad argument, which should be handled by returning a 400. > This can be done by catching the raised argument and explicitly converting it -- This message was sent by Atlassian JIRA (v6.3.4#6332)
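The fix the description asks for is mechanical; a sketch of the conversion is below. The wrapper shown is an assumption about shape, not the attached patch, but {{BadRequestException}} is YARN's existing webapp exception that maps to HTTP 400:

{code:title=Parse failure to HTTP 400 (sketch)}
import org.apache.hadoop.yarn.api.records.ApplicationId;
import org.apache.hadoop.yarn.util.ConverterUtils;
import org.apache.hadoop.yarn.webapp.BadRequestException;

public final class AppIdParsing {
  static ApplicationId parseApplicationId(String appId) {
    if (appId == null || appId.isEmpty()) {
      throw new BadRequestException("appId is empty or null");
    }
    try {
      return ConverterUtils.toApplicationId(appId);
    } catch (IllegalArgumentException e) {
      // Bad user input: surface it as a 400 Bad Request instead of
      // letting the IllegalArgumentException escape as a 500.
      throw new BadRequestException("Invalid application id: " + appId);
    }
  }
}
{code}

The same pattern extends naturally to app attempt IDs and container IDs, per the comment above.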
[jira] [Created] (YARN-4749) Generalize config file handling in container-executor
Sidharta Seethana created YARN-4749: --- Summary: Generalize config file handling in container-executor Key: YARN-4749 URL: https://issues.apache.org/jira/browse/YARN-4749 Project: Hadoop YARN Issue Type: Sub-task Components: nodemanager Reporter: Sidharta Seethana Assignee: Sidharta Seethana The current implementation of container-executor already supports parsing of key value pairs from a config file. However, it is currently restricted to {{container-executor.cfg}} and cannot be reused for parsing additional config/command files. Generalizing this is a required step for YARN-4245. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-4748) ApplicationHistoryManagerOnTimelineStore should not swallow exceptions on generateApplicationReport
[ https://issues.apache.org/jira/browse/YARN-4748?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15173114#comment-15173114 ] Hudson commented on YARN-4748: -- FAILURE: Integrated in Hadoop-trunk-Commit #9397 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/9397/]) YARN-4748. ApplicationHistoryManagerOnTimelineStore should not swallow (jianhe: rev d93c22ec274b1a0f29609217039b80732886fed7) * hadoop-yarn-project/CHANGES.txt * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-applicationhistoryservice/src/main/java/org/apache/hadoop/yarn/server/applicationhistoryservice/ApplicationHistoryManagerOnTimelineStore.java > ApplicationHistoryManagerOnTimelineStore should not swallow exceptions on > generateApplicationReport > --- > > Key: YARN-4748 > URL: https://issues.apache.org/jira/browse/YARN-4748 > Project: Hadoop YARN > Issue Type: Bug > Components: timelineserver >Reporter: Li Lu >Assignee: Li Lu > Fix For: 2.8.0 > > Attachments: YARN-4748-trunk.001.patch > > > We're directly swallowing AuthorizationExceptions and > ApplicationAttemptNotFoundExceptions when generating application reports. we > should at least mark down the exception before proceed with default values > (which will assign app attempt id to -1). -- This message was sent by Atlassian JIRA (v6.3.4#6332)
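For context, the change amounts to recording the exception before falling back to defaults; a hedged sketch follows, with the method and field names illustrative rather than the exact diff:

{code:title=Log before falling back (sketch; names are illustrative)}
import org.apache.commons.logging.Log;
import org.apache.commons.logging.LogFactory;
import org.apache.hadoop.security.authorize.AuthorizationException;
import org.apache.hadoop.yarn.api.records.ApplicationAttemptId;
import org.apache.hadoop.yarn.api.records.ApplicationAttemptReport;
import org.apache.hadoop.yarn.exceptions.ApplicationAttemptNotFoundException;

public abstract class AttemptReportSketch {
  private static final Log LOG = LogFactory.getLog(AttemptReportSketch.class);

  abstract ApplicationAttemptReport getApplicationAttempt(
      ApplicationAttemptId attemptId)
      throws AuthorizationException, ApplicationAttemptNotFoundException;

  ApplicationAttemptReport tryGetAttempt(ApplicationAttemptId attemptId) {
    try {
      return getApplicationAttempt(attemptId);
    } catch (AuthorizationException e) {
      // Mark down why we proceed with default values (attempt id -1)
      // instead of swallowing the exception silently.
      LOG.warn("Not authorized to fetch app attempt " + attemptId, e);
    } catch (ApplicationAttemptNotFoundException e) {
      LOG.warn("App attempt " + attemptId + " not found", e);
    }
    return null; // caller proceeds with the default report
  }
}
{code}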
[jira] [Commented] (YARN-3863) Support complex filters in TimelineReader
[ https://issues.apache.org/jira/browse/YARN-3863?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15173112#comment-15173112 ] Sangjin Lee commented on YARN-3863: --- Thanks for the detailed explanation of the changes [~varun_saxena]! It's tremendously helpful. I'll go over the latest patch, and get back to you with comments. {quote} I could not quite get below comment. I did not make any change on line 448. Sangjin, can you elaborate. Maybe you meant some other line. (HBaseTimelineWriterImpl.java) l.448: it should simply be a else if {quote} Sorry I had meant {{TimelineStorageUtils.java}}. I see it's significantly refactored in the latest version, so I'm pretty sure it doesn't apply. > Support complex filters in TimelineReader > - > > Key: YARN-3863 > URL: https://issues.apache.org/jira/browse/YARN-3863 > Project: Hadoop YARN > Issue Type: Sub-task >Affects Versions: YARN-2928 >Reporter: Varun Saxena >Assignee: Varun Saxena > Labels: yarn-2928-1st-milestone > Attachments: YARN-3863-YARN-2928.v2.01.patch, > YARN-3863-YARN-2928.v2.02.patch, YARN-3863-YARN-2928.v2.03.patch, > YARN-3863-feature-YARN-2928.wip.003.patch, > YARN-3863-feature-YARN-2928.wip.01.patch, > YARN-3863-feature-YARN-2928.wip.02.patch, > YARN-3863-feature-YARN-2928.wip.04.patch, > YARN-3863-feature-YARN-2928.wip.05.patch > > > Currently filters in timeline reader will return an entity only if all the > filter conditions hold true i.e. only AND operation is supported. We can > support OR operation for the filters as well. Additionally as primary backend > implementation is HBase, we can design our filters in a manner, where they > closely resemble HBase Filters. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
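To visualize the proposal, here is a sketch of an AND/OR composition in the spirit of HBase's {{FilterList}}. The {{TimelineFilter}} interface and its {{matches}} hook are assumptions for illustration; the actual design pushes the filters down into HBase rather than evaluating entities in memory:

{code:title=AND/OR filter composition (illustrative sketch)}
import java.util.Arrays;
import java.util.List;

import org.apache.hadoop.yarn.api.records.timelineservice.TimelineEntity;

interface TimelineFilter {
  boolean matches(TimelineEntity entity);
}

final class TimelineFilterList implements TimelineFilter {
  enum Operator { AND, OR }

  private final Operator operator;
  private final List<TimelineFilter> filters;

  TimelineFilterList(Operator operator, TimelineFilter... filters) {
    this.operator = operator;
    this.filters = Arrays.asList(filters);
  }

  @Override
  public boolean matches(TimelineEntity entity) {
    for (TimelineFilter f : filters) {
      boolean match = f.matches(entity);
      if (operator == Operator.AND && !match) {
        return false; // one miss rejects under AND
      }
      if (operator == Operator.OR && match) {
        return true;  // one hit accepts under OR
      }
    }
    return operator == Operator.AND; // AND: all matched; OR: none did
  }
}
{code}

Because the list is itself a filter, AND and OR expressions can nest arbitrarily, which is what moves the reader beyond the current all-conditions-must-hold behaviour.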
[jira] [Commented] (YARN-4748) ApplicationHistoryManagerOnTimelineStore should not swallow exceptions on generateApplicationReport
[ https://issues.apache.org/jira/browse/YARN-4748?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15173109#comment-15173109 ] Jian He commented on YARN-4748: --- +1 > ApplicationHistoryManagerOnTimelineStore should not swallow exceptions on > generateApplicationReport > --- > > Key: YARN-4748 > URL: https://issues.apache.org/jira/browse/YARN-4748 > Project: Hadoop YARN > Issue Type: Bug > Components: timelineserver >Reporter: Li Lu >Assignee: Li Lu > Fix For: 2.8.0 > > Attachments: YARN-4748-trunk.001.patch > > > We're directly swallowing AuthorizationExceptions and > ApplicationAttemptNotFoundExceptions when generating application reports. we > should at least mark down the exception before proceed with default values > (which will assign app attempt id to -1). -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-4746) yarn web services should convert parse failures of appId to 400
[ https://issues.apache.org/jira/browse/YARN-4746?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bibin A Chundatt updated YARN-4746: --- Attachment: 0001-YARN-4746.patch > yarn web services should convert parse failures of appId to 400 > --- > > Key: YARN-4746 > URL: https://issues.apache.org/jira/browse/YARN-4746 > Project: Hadoop YARN > Issue Type: Bug > Components: webapp >Affects Versions: 2.8.0 >Reporter: Steve Loughran >Priority: Minor > Attachments: 0001-YARN-4746.patch > > > I'm seeing somewhere in the WS API tests of mine an error with exception > conversion of a bad app ID sent in as an argument to a GET. I know it's in > ATS, but a scan of the core RM web services implies a same problem > {{WebServices.parseApplicationId()}} uses {{ConverterUtils.toApplicationId}} > to convert an argument; this throws IllegalArgumentException, which is then > handled somewhere by jetty as a 500 error. > In fact, it's a bad argument, which should be handled by returning a 400. > This can be done by catching the raised argument and explicitly converting it -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-4719) Add a helper library to maintain node state and allows common queries
[ https://issues.apache.org/jira/browse/YARN-4719?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15173101#comment-15173101 ] Arun Suresh commented on YARN-4719: --- Much needed patch [~kasha].. Took a quick look, some comments # You have an un-used import in the {{FairScheduler}} # I see that you are triggering the update thread on nodeRemoval too. I understand this might generally be useful (since the node removal might change the node ordering), but given this is a refactoring patch, maybe address that separately ? # May also be useful to expose functionality in the {{ClusterNodeTracker}} to give list of nodes in a rack, nodes that match a label expression etc. (This can possibly be another JIRA too) > Add a helper library to maintain node state and allows common queries > - > > Key: YARN-4719 > URL: https://issues.apache.org/jira/browse/YARN-4719 > Project: Hadoop YARN > Issue Type: Improvement > Components: scheduler >Affects Versions: 2.8.0 >Reporter: Karthik Kambatla >Assignee: Karthik Kambatla > Attachments: yarn-4719-1.patch, yarn-4719-2.patch > > > The scheduler could use a helper library to maintain node state and allowing > matching/sorting queries. Several reasons for this: > # Today, a lot of the node state management is done separately in each > scheduler. Having a single library will take us that much closer to reducing > duplication among schedulers. > # Adding a filtering/matching API would simplify node labels and locality > significantly. > # An API that returns a sorted list for a custom comparator would help > YARN-1011 where we want to sort by allocation and utilization for > continuous/asynchronous and opportunistic scheduling respectively. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-4517) [YARN-3368] Add nodes page
[ https://issues.apache.org/jira/browse/YARN-4517?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15173066#comment-15173066 ] Hadoop QA commented on YARN-4517: - | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 12s {color} | {color:blue} Docker mode activated. {color} | | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s {color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s {color} | {color:green} Patch has no whitespace issues. {color} | | {color:red}-1{color} | {color:red} asflicense {color} | {color:red} 0m 26s {color} | {color:red} Patch generated 95 ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black} 1m 12s {color} | {color:black} {color} | \\ \\ || Subsystem || Report/Notes || | Docker | Image:yetus/hadoop:0ca8df7 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12790627/YARN-4517-YARN-3368.02.patch | | JIRA Issue | YARN-4517 | | Optional Tests | asflicense | | uname | Linux 024acc4740d3 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh | | git revision | YARN-3368 / 37455e7 | | asflicense | https://builds.apache.org/job/PreCommit-YARN-Build/10669/artifact/patchprocess/patch-asflicense-problems.txt | | modules | C: hadoop-yarn-project/hadoop-yarn U: hadoop-yarn-project/hadoop-yarn | | Console output | https://builds.apache.org/job/PreCommit-YARN-Build/10669/console | | Powered by | Apache Yetus 0.3.0-SNAPSHOT http://yetus.apache.org | This message was automatically generated. > [YARN-3368] Add nodes page > -- > > Key: YARN-4517 > URL: https://issues.apache.org/jira/browse/YARN-4517 > Project: Hadoop YARN > Issue Type: Sub-task > Components: yarn >Reporter: Wangda Tan >Assignee: Varun Saxena > Labels: webui > Attachments: (21-Feb-2016)yarn-ui-screenshots.zip, > Screenshot_after_4709.png, Screenshot_after_4709_1.png, > YARN-4517-YARN-3368.01.patch, YARN-4517-YARN-3368.02.patch > > > We need nodes page added to next generation web UI, similar to existing > RM/nodes page. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-2883) Queuing of container requests in the NM
[ https://issues.apache.org/jira/browse/YARN-2883?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Konstantinos Karanasos updated YARN-2883: - Attachment: YARN-2883-yarn-2877.002.patch Thanks [~asuresh] for the feedback. I am attaching a new version of the patch in which I have applied all your comments. Most notably, among them: * Added the {{QueuingContainerManagerImpl}} and {{QueuingContainersMonitorImpl}} as sub-classes to {{ContainerManagerImpl}} and {{ContainersMonitorImpl}}, respectively. * Created a {{QueuingNMContext}} within the {{NMContext}} of the {{NodeManager}} to hold the queuedContainers and the killedQueuedContainers. * Used "allocated" instead of "logical" in all class/field/method names. * Also implemented the methods that send the number of queued containers from the NM to the RM through the {{NodeStatusUpdaterImpl}}. > Queuing of container requests in the NM > --- > > Key: YARN-2883 > URL: https://issues.apache.org/jira/browse/YARN-2883 > Project: Hadoop YARN > Issue Type: Sub-task > Components: nodemanager, resourcemanager >Reporter: Konstantinos Karanasos >Assignee: Konstantinos Karanasos > Attachments: YARN-2883-yarn-2877.001.patch, > YARN-2883-yarn-2877.002.patch > > > We propose to add a queue in each NM, where queueable container requests can > be held. > Based on the available resources in the node and the containers in the queue, > the NM will decide when to allow the execution of a queued container. > In order to ensure the instantaneous start of a guaranteed-start container, > the NM may decide to pre-empt/kill running queueable containers. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
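A rough sketch of the queuing decision described above; every name here ({{isGuaranteedStart}}, {{killQueueableContainersToFit}} and so on) is illustrative shorthand, not the patch's API:

{code:title=NM-side container queuing (decision sketch; names are illustrative)}
import java.util.ArrayDeque;
import java.util.Queue;

public abstract class QueuingSketch {

  interface QueuedContainer {
    boolean isGuaranteedStart();
  }

  private final Queue<QueuedContainer> queuedContainers =
      new ArrayDeque<QueuedContainer>();

  // Start immediately if the node has room; a guaranteed-start container
  // may kill queueable containers to make room; anything else waits.
  public synchronized void startOrQueue(QueuedContainer container) {
    if (hasRoomFor(container)) {
      start(container);
    } else if (container.isGuaranteedStart()) {
      killQueueableContainersToFit(container);
      start(container);
    } else {
      queuedContainers.add(container);
    }
  }

  // When a running container finishes, drain the queue while room lasts.
  public synchronized void onContainerFinished() {
    while (!queuedContainers.isEmpty()
        && hasRoomFor(queuedContainers.peek())) {
      start(queuedContainers.poll());
    }
  }

  abstract boolean hasRoomFor(QueuedContainer c);
  abstract void start(QueuedContainer c);
  abstract void killQueueableContainersToFit(QueuedContainer c);
}
{code}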
[jira] [Commented] (YARN-4517) [YARN-3368] Add nodes page
[ https://issues.apache.org/jira/browse/YARN-4517?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15173026#comment-15173026 ] Varun Saxena commented on YARN-4517: Screenshots have been attached too... > [YARN-3368] Add nodes page > -- > > Key: YARN-4517 > URL: https://issues.apache.org/jira/browse/YARN-4517 > Project: Hadoop YARN > Issue Type: Sub-task > Components: yarn >Reporter: Wangda Tan >Assignee: Varun Saxena > Labels: webui > Attachments: (21-Feb-2016)yarn-ui-screenshots.zip, > Screenshot_after_4709.png, Screenshot_after_4709_1.png, > YARN-4517-YARN-3368.01.patch, YARN-4517-YARN-3368.02.patch > > > We need nodes page added to next generation web UI, similar to existing > RM/nodes page. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-4517) [YARN-3368] Add nodes page
[ https://issues.apache.org/jira/browse/YARN-4517?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15173022#comment-15173022 ] Varun Saxena commented on YARN-4517: Attached a new patch. [~leftnoteasy], kindly review. The open points mentioned above have been fixed (except adding additional graphs, which can be done in another JIRA). As YARN-4709 has gone in, the changes have been made accordingly; this patch therefore applies on top of the YARN-4709 changes. The overflow of the left-hand-side menu bar on screen resizing has been fixed as well. Moreover, I have added multiple unit test cases. Refer to https://issues.apache.org/jira/browse/YARN-4517?focusedCommentId=15155987&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-15155987 to check what has been done in this JIRA. [~leftnoteasy], do you want me to raise multiple JIRAs and break this patch up? > [YARN-3368] Add nodes page > -- > > Key: YARN-4517 > URL: https://issues.apache.org/jira/browse/YARN-4517 > Project: Hadoop YARN > Issue Type: Sub-task > Components: yarn >Reporter: Wangda Tan >Assignee: Varun Saxena > Labels: webui > Attachments: (21-Feb-2016)yarn-ui-screenshots.zip, > Screenshot_after_4709.png, Screenshot_after_4709_1.png, > YARN-4517-YARN-3368.01.patch, YARN-4517-YARN-3368.02.patch > > > We need nodes page added to next generation web UI, similar to existing > RM/nodes page. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-4517) [YARN-3368] Add nodes page
[ https://issues.apache.org/jira/browse/YARN-4517?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Varun Saxena updated YARN-4517: --- Attachment: YARN-4517-YARN-3368.02.patch > [YARN-3368] Add nodes page > -- > > Key: YARN-4517 > URL: https://issues.apache.org/jira/browse/YARN-4517 > Project: Hadoop YARN > Issue Type: Sub-task > Components: yarn >Reporter: Wangda Tan >Assignee: Varun Saxena > Labels: webui > Attachments: (21-Feb-2016)yarn-ui-screenshots.zip, > Screenshot_after_4709.png, Screenshot_after_4709_1.png, > YARN-4517-YARN-3368.01.patch, YARN-4517-YARN-3368.02.patch > > > We need nodes page added to next generation web UI, similar to existing > RM/nodes page. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-4704) TestResourceManager#testResourceAllocation() fails when using FairScheduler
[ https://issues.apache.org/jira/browse/YARN-4704?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15173014#comment-15173014 ] Yufei Gu commented on YARN-4704: Hi [~kasha], thank you very much for the code review. > TestResourceManager#testResourceAllocation() fails when using FairScheduler > --- > > Key: YARN-4704 > URL: https://issues.apache.org/jira/browse/YARN-4704 > Project: Hadoop YARN > Issue Type: Test > Components: fairscheduler, test >Affects Versions: 2.7.2 >Reporter: Ray Chiang >Assignee: Yufei Gu > Fix For: 2.9.0 > > Attachments: YARN-4704.001.patch > > > When using FairScheduler, TestResourceManager#testResourceAllocation() fails > with the following error: > java.lang.IllegalStateException: Trying to stop a non-running task: 1 of > application application_1455833410011_0001 > at > org.apache.hadoop.yarn.server.resourcemanager.Task.stop(Task.java:117) > at > org.apache.hadoop.yarn.server.resourcemanager.Application.finishTask(Application.java:266) > at > org.apache.hadoop.yarn.server.resourcemanager.TestResourceManager.testResourceAllocation(TestResourceManager.java:167) -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-4517) [YARN-3368] Add nodes page
[ https://issues.apache.org/jira/browse/YARN-4517?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Varun Saxena updated YARN-4517: --- Attachment: Screenshot_after_4709_1.png Screenshot_after_4709.png > [YARN-3368] Add nodes page > -- > > Key: YARN-4517 > URL: https://issues.apache.org/jira/browse/YARN-4517 > Project: Hadoop YARN > Issue Type: Sub-task > Components: yarn >Reporter: Wangda Tan >Assignee: Varun Saxena > Labels: webui > Attachments: (21-Feb-2016)yarn-ui-screenshots.zip, > Screenshot_after_4709.png, Screenshot_after_4709_1.png, > YARN-4517-YARN-3368.01.patch > > > We need nodes page added to next generation web UI, similar to existing > RM/nodes page. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-4704) TestResourceManager#testResourceAllocation() fails when using FairScheduler
[ https://issues.apache.org/jira/browse/YARN-4704?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15172997#comment-15172997 ] Hudson commented on YARN-4704: -- FAILURE: Integrated in Hadoop-trunk-Commit #9396 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/9396/]) YARN-4704. TestResourceManager#testResourceAllocation() fails when using (kasha: rev 9dafaaaf0de68ce7f5e495ea4b8e0ce036dc35a2) * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/TestResourceManager.java * hadoop-yarn-project/CHANGES.txt > TestResourceManager#testResourceAllocation() fails when using FairScheduler > --- > > Key: YARN-4704 > URL: https://issues.apache.org/jira/browse/YARN-4704 > Project: Hadoop YARN > Issue Type: Test > Components: fairscheduler, test >Affects Versions: 2.7.2 >Reporter: Ray Chiang >Assignee: Yufei Gu > Fix For: 2.9.0 > > Attachments: YARN-4704.001.patch > > > When using FairScheduler, TestResourceManager#testResourceAllocation() fails > with the following error: > java.lang.IllegalStateException: Trying to stop a non-running task: 1 of > application application_1455833410011_0001 > at > org.apache.hadoop.yarn.server.resourcemanager.Task.stop(Task.java:117) > at > org.apache.hadoop.yarn.server.resourcemanager.Application.finishTask(Application.java:266) > at > org.apache.hadoop.yarn.server.resourcemanager.TestResourceManager.testResourceAllocation(TestResourceManager.java:167) -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-4704) TestResourceManager#testResourceAllocation() fails when using FairScheduler
[ https://issues.apache.org/jira/browse/YARN-4704?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15172933#comment-15172933 ] Karthik Kambatla commented on YARN-4704: Verified the patch fixes the test with FairScheduler. And, the test failures reported by Jenkins are known and unrelated. +1, checking this in. > TestResourceManager#testResourceAllocation() fails when using FairScheduler > --- > > Key: YARN-4704 > URL: https://issues.apache.org/jira/browse/YARN-4704 > Project: Hadoop YARN > Issue Type: Test > Components: fairscheduler, test >Affects Versions: 2.7.2 >Reporter: Ray Chiang >Assignee: Yufei Gu > Attachments: YARN-4704.001.patch > > > When using FairScheduler, TestResourceManager#testResourceAllocation() fails > with the following error: > java.lang.IllegalStateException: Trying to stop a non-running task: 1 of > application application_1455833410011_0001 > at > org.apache.hadoop.yarn.server.resourcemanager.Task.stop(Task.java:117) > at > org.apache.hadoop.yarn.server.resourcemanager.Application.finishTask(Application.java:266) > at > org.apache.hadoop.yarn.server.resourcemanager.TestResourceManager.testResourceAllocation(TestResourceManager.java:167) -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Assigned] (YARN-4747) AHS error 500 due to NPE when container start event is missing
[ https://issues.apache.org/jira/browse/YARN-4747?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Varun Saxena reassigned YARN-4747: -- Assignee: Varun Saxena > AHS error 500 due to NPE when container start event is missing > -- > > Key: YARN-4747 > URL: https://issues.apache.org/jira/browse/YARN-4747 > Project: Hadoop YARN > Issue Type: Bug > Components: timelineserver >Affects Versions: 2.7.2 >Reporter: Jason Lowe >Assignee: Varun Saxena > > Saw an error 500 due to a NullPointerException caused by a missing host for > an AM container. Stacktrace to follow. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-4748) ApplicationHistoryManagerOnTimelineStore should not swallow exceptions on generateApplicationReport
[ https://issues.apache.org/jira/browse/YARN-4748?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15172901#comment-15172901 ] Hadoop QA commented on YARN-4748: - | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 15s {color} | {color:blue} Docker mode activated. {color} | | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s {color} | {color:green} The patch does not contain any @author tags. {color} | | {color:red}-1{color} | {color:red} test4tests {color} | {color:red} 0m 0s {color} | {color:red} The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 7m 7s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 17s {color} | {color:green} trunk passed with JDK v1.8.0_72 {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 19s {color} | {color:green} trunk passed with JDK v1.7.0_95 {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 16s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 26s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 13s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 0m 39s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 16s {color} | {color:green} trunk passed with JDK v1.8.0_72 {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 19s {color} | {color:green} trunk passed with JDK v1.7.0_95 {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 18s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 14s {color} | {color:green} the patch passed with JDK v1.8.0_72 {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 14s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 15s {color} | {color:green} the patch passed with JDK v1.7.0_95 {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 15s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 12s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 23s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 12s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s {color} | {color:green} Patch has no whitespace issues. 
{color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 0m 47s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 14s {color} | {color:green} the patch passed with JDK v1.8.0_72 {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 17s {color} | {color:green} the patch passed with JDK v1.7.0_95 {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 3m 46s {color} | {color:green} hadoop-yarn-server-applicationhistoryservice in the patch passed with JDK v1.8.0_72. {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 3m 54s {color} | {color:green} hadoop-yarn-server-applicationhistoryservice in the patch passed with JDK v1.7.0_95. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 19s {color} | {color:green} Patch does not generate ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black} 21m 56s {color} | {color:black} {color} | \\ \\ || Subsystem || Report/Notes || | Docker | Image:yetus/hadoop:0ca8df7 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12790597/YARN-4748-trunk.001.patch | | JIRA Issue | YARN-4748 | | Optional Tests | asflicense compile javac javadoc mvninstall mvnsite unit findbugs checkstyle | | uname | Linux 47095768758f 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | |
[jira] [Commented] (YARN-4704) TestResourceManager#testResourceAllocation() fails when using FairScheduler
[ https://issues.apache.org/jira/browse/YARN-4704?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15172861#comment-15172861 ] Hadoop QA commented on YARN-4704: - | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 14s {color} | {color:blue} Docker mode activated. {color} | | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s {color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s {color} | {color:green} The patch appears to include 1 new or modified test files. {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 6m 29s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 25s {color} | {color:green} trunk passed with JDK v1.8.0_72 {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 29s {color} | {color:green} trunk passed with JDK v1.7.0_95 {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 17s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 35s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 14s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 4s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 20s {color} | {color:green} trunk passed with JDK v1.8.0_72 {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 25s {color} | {color:green} trunk passed with JDK v1.7.0_95 {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 29s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 23s {color} | {color:green} the patch passed with JDK v1.8.0_72 {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 23s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 26s {color} | {color:green} the patch passed with JDK v1.7.0_95 {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 26s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 15s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 31s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 12s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s {color} | {color:green} Patch has no whitespace issues. 
{color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 14s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 17s {color} | {color:green} the patch passed with JDK v1.8.0_72 {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 22s {color} | {color:green} the patch passed with JDK v1.7.0_95 {color} | | {color:red}-1{color} | {color:red} unit {color} | {color:red} 66m 2s {color} | {color:red} hadoop-yarn-server-resourcemanager in the patch failed with JDK v1.8.0_72. {color} | | {color:red}-1{color} | {color:red} unit {color} | {color:red} 67m 20s {color} | {color:red} hadoop-yarn-server-resourcemanager in the patch failed with JDK v1.7.0_95. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 18s {color} | {color:green} Patch does not generate ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black} 149m 19s {color} | {color:black} {color} | \\ \\ || Reason || Tests || | JDK v1.8.0_72 Failed junit tests | hadoop.yarn.server.resourcemanager.TestClientRMTokens | | | hadoop.yarn.server.resourcemanager.TestAMAuthorization | | JDK v1.7.0_95 Failed junit tests | hadoop.yarn.server.resourcemanager.TestClientRMTokens | | | hadoop.yarn.server.resourcemanager.TestAMAuthorization | \\ \\ || Subsystem || Report/Notes || | Docker | Image:yetus/hadoop:0ca8df7 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12790557/YARN-4704.001.patch | | JIRA Issue | YARN-4704 | | Optional Tests | asflicense compile javac javadoc mvninstall mvnsite unit findbugs c
[jira] [Updated] (YARN-4748) ApplicationHistoryManagerOnTimelineStore should not swallow exceptions on generateApplicationReport
[ https://issues.apache.org/jira/browse/YARN-4748?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Li Lu updated YARN-4748: Attachment: YARN-4748-trunk.001.patch Quick patch to log exceptions when generating application reports. > ApplicationHistoryManagerOnTimelineStore should not swallow exceptions on > generateApplicationReport > --- > > Key: YARN-4748 > URL: https://issues.apache.org/jira/browse/YARN-4748 > Project: Hadoop YARN > Issue Type: Bug > Components: timelineserver >Reporter: Li Lu >Assignee: Li Lu > Attachments: YARN-4748-trunk.001.patch > > > We're directly swallowing AuthorizationExceptions and > ApplicationAttemptNotFoundExceptions when generating application reports. we > should at least mark down the exception before proceed with default values > (which will assign app attempt id to -1). -- This message was sent by Atlassian JIRA (v6.3.4#6332)
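A minimal sketch of what that "log instead of swallow" pattern can look like; the class and helper below are illustrative stand-ins, not the code in YARN-4748-trunk.001.patch:
{code}
// Hypothetical stand-in for the swallow-then-default pattern under discussion.
import org.apache.commons.logging.Log;
import org.apache.commons.logging.LogFactory;
import org.apache.hadoop.security.authorize.AuthorizationException;

public class ReportFallbackSketch {
  private static final Log LOG = LogFactory.getLog(ReportFallbackSketch.class);

  interface AttemptLookup {
    String lookup() throws AuthorizationException;
  }

  // Instead of an empty catch block, record why we fall back to the
  // default report values (e.g. an app attempt id of -1).
  static String attemptIdOrDefault(AttemptLookup lookup, String appId) {
    try {
      return lookup.lookup();
    } catch (AuthorizationException e) {
      LOG.info("Could not fetch the app attempt for " + appId
          + "; proceeding with default values", e);
      return "-1";
    }
  }
}
{code}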
[jira] [Commented] (YARN-4696) EntityGroupFSTimelineStore to work in the absence of an RM
[ https://issues.apache.org/jira/browse/YARN-4696?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15172800#comment-15172800 ] Hadoop QA commented on YARN-4696: - | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 14s {color} | {color:blue} Docker mode activated. {color} | | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s {color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s {color} | {color:green} The patch appears to include 1 new or modified test files. {color} | | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 11s {color} | {color:blue} Maven dependency ordering for branch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 7m 22s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 2m 6s {color} | {color:green} trunk passed with JDK v1.8.0_72 {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 2m 14s {color} | {color:green} trunk passed with JDK v1.7.0_95 {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 35s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 17s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 40s {color} | {color:green} trunk passed {color} | | {color:red}-1{color} | {color:red} findbugs {color} | {color:red} 0m 27s {color} | {color:red} hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-timeline-pluginstorage in trunk has 1 extant Findbugs warnings. 
{color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 58s {color} | {color:green} trunk passed with JDK v1.8.0_72 {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 5s {color} | {color:green} trunk passed with JDK v1.7.0_95 {color} | | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 11s {color} | {color:blue} Maven dependency ordering for patch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 1m 3s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 59s {color} | {color:green} the patch passed with JDK v1.8.0_72 {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 1m 59s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 2m 18s {color} | {color:green} the patch passed with JDK v1.7.0_95 {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 2m 18s {color} | {color:green} the patch passed {color} | | {color:red}-1{color} | {color:red} checkstyle {color} | {color:red} 0m 34s {color} | {color:red} hadoop-yarn-project/hadoop-yarn: patch generated 3 new + 29 unchanged - 0 fixed = 32 total (was 29) {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 15s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 35s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s {color} | {color:green} Patch has no whitespace issues. {color} | | {color:red}-1{color} | {color:red} findbugs {color} | {color:red} 1m 25s {color} | {color:red} hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common generated 1 new + 0 unchanged - 0 fixed = 1 total (was 0) {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 53s {color} | {color:green} the patch passed with JDK v1.8.0_72 {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 1s {color} | {color:green} the patch passed with JDK v1.7.0_95 {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 2m 3s {color} | {color:green} hadoop-yarn-common in the patch passed with JDK v1.8.0_72. {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 3m 9s {color} | {color:green} hadoop-yarn-server-applicationhistoryservice in the patch passed with JDK v1.8.0_72. {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 0m 49s {color} | {color:green} hadoop-yarn-server-timeline-pluginstorage in the patch passed with JDK v1.8.0_72. {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 2m 17s {color} | {color:green} hadoop-yarn-common in the patch passed with JDK
[jira] [Commented] (YARN-4736) Issues with HBaseTimelineWriterImpl
[ https://issues.apache.org/jira/browse/YARN-4736?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15172795#comment-15172795 ] Ted Yu commented on YARN-4736: -- bq. so planning to test with hbase-1.0.3 tar. There have been more releases since 1.0.3; e.g., you can try out the 1.2.0 release. BufferedMutatorImpl#flush() appeared in the stack trace. However, if the HBase cluster was shut down, the flush wouldn't succeed. I haven't seen the above issue happen on a live 1.x cluster. > Issues with HBaseTimelineWriterImpl > --- > > Key: YARN-4736 > URL: https://issues.apache.org/jira/browse/YARN-4736 > Project: Hadoop YARN > Issue Type: Sub-task > Components: timelineserver >Affects Versions: YARN-2928 >Reporter: Naganarasimha G R >Assignee: Vrushali C >Priority: Critical > Labels: yarn-2928-1st-milestone > Attachments: hbaseException.log, threaddump.log > > > Faced some issues while running ATSv2 in single node Hadoop cluster and in > the same node had launched Hbase with embedded zookeeper. > # Due to some NPE issues i was able to see NM was trying to shutdown, but the > NM daemon process was not completed due to the locks. > # Got some exception related to Hbase after application finished execution > successfully. > will attach logs and the trace for the same -- This message was sent by Atlassian JIRA (v6.3.4#6332)
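For readers following along, a minimal sketch of the write/flush path being discussed, against the HBase 1.x client API; the table name and cell values are placeholders rather than the actual HBaseTimelineWriterImpl code:
{code}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.BufferedMutator;
import org.apache.hadoop.hbase.client.Connection;
import org.apache.hadoop.hbase.client.ConnectionFactory;
import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.util.Bytes;

public class BufferedMutatorFlushSketch {
  public static void main(String[] args) throws Exception {
    Configuration conf = HBaseConfiguration.create();
    try (Connection conn = ConnectionFactory.createConnection(conf);
        BufferedMutator mutator =
            conn.getBufferedMutator(TableName.valueOf("timeline_entity"))) {
      Put put = new Put(Bytes.toBytes("rowkey"));
      put.addColumn(Bytes.toBytes("i"), Bytes.toBytes("q"), Bytes.toBytes("v"));
      mutator.mutate(put); // buffered client-side, not yet on the cluster
      // flush() is synchronous: it retries against the region servers, so if
      // the cluster is down this is where a writer thread can appear stuck.
      mutator.flush();
    }
  }
}
{code}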
[jira] [Commented] (YARN-4736) Issues with HBaseTimelineWriterImpl
[ https://issues.apache.org/jira/browse/YARN-4736?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15172779#comment-15172779 ] Sangjin Lee commented on YARN-4736: --- {quote} Was checking with our hbase team from the logs and the trace they were informing that the issue might be due to bad connectivity with the zookeeper, but it strange to see that in the local node setup. So i suspect that there is some issue with my hbase setup, so planning to test with hbase-1.0.3 tar. {quote} Regardless of what happens on the cluster, the client (NM in this case) should not lock up. So in that sense, I think the "deadlock" we're seeing should be looked at from the client-side. [~te...@apache.org], do you recall seeing any issues like this? {quote} But also would like to know whether you guys are facing the same issue or its only me who is facing it ? {quote} I haven't had a chance to try to reproduce it yet. I'll try it as soon as feasible. Does this happen every time you run the timeline service performance test with the latest branch? > Issues with HBaseTimelineWriterImpl > --- > > Key: YARN-4736 > URL: https://issues.apache.org/jira/browse/YARN-4736 > Project: Hadoop YARN > Issue Type: Sub-task > Components: timelineserver >Affects Versions: YARN-2928 >Reporter: Naganarasimha G R >Assignee: Vrushali C >Priority: Critical > Labels: yarn-2928-1st-milestone > Attachments: hbaseException.log, threaddump.log > > > Faced some issues while running ATSv2 in single node Hadoop cluster and in > the same node had launched Hbase with embedded zookeeper. > # Due to some NPE issues i was able to see NM was trying to shutdown, but the > NM daemon process was not completed due to the locks. > # Got some exception related to Hbase after application finished execution > successfully. > will attach logs and the trace for the same -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (YARN-4748) ApplicationHistoryManagerOnTimelineStore should not swallow exceptions on generateApplicationReport
Li Lu created YARN-4748: --- Summary: ApplicationHistoryManagerOnTimelineStore should not swallow exceptions on generateApplicationReport Key: YARN-4748 URL: https://issues.apache.org/jira/browse/YARN-4748 Project: Hadoop YARN Issue Type: Bug Components: timelineserver Reporter: Li Lu Assignee: Li Lu We're directly swallowing AuthorizationExceptions and ApplicationAttemptNotFoundExceptions when generating application reports. We should at least record the exception before proceeding with default values (which will assign the app attempt id to -1). -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-4696) EntityGroupFSTimelineStore to work in the absence of an RM
[ https://issues.apache.org/jira/browse/YARN-4696?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Steve Loughran updated YARN-4696: - Attachment: YARN-4696-009.patch This is the 009 patch; the difference from 008 is that it correctly converts IllegalArgumentException to a BadRequestException with the nested stack trace. With this patch applied together with the current YARN-4545 patch, I now successfully have # all tests against completed jobs working with file:// # tests needing to track incomplete jobs working with an HDFS minicluster. LocalFS isn't going to work as a destination for incomplete jobs, as it doesn't flush(). Nor will things like S3. That'll need documenting. > EntityGroupFSTimelineStore to work in the absence of an RM > -- > > Key: YARN-4696 > URL: https://issues.apache.org/jira/browse/YARN-4696 > Project: Hadoop YARN > Issue Type: Sub-task > Components: timelineserver >Affects Versions: 2.8.0 >Reporter: Steve Loughran >Assignee: Steve Loughran > Attachments: YARN-4696-001.patch, YARN-4696-002.patch, > YARN-4696-003.patch, YARN-4696-005.patch, YARN-4696-006.patch, > YARN-4696-007.patch, YARN-4696-008.patch, YARN-4696-009.patch > > > {{EntityGroupFSTimelineStore}} now depends on an RM being up and running; the > configuration pointing to it. This is a new change, and impacts testing where > you have historically been able to test without an RM running. > The sole purpose of the probe is to automatically determine if an app is > running; it falls back to "unknown" if not. If the RM connection was > optional, the "unknown" codepath could be called directly, relying on age of > file as a metric of completion > Options > # add a flag to disable RM connect > # skip automatically if RM not defined/set to 0.0.0.0 > # disable retries on yarn client IPC; if it fails, tag app as unknown. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
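For illustration, a minimal sketch of that translation, assuming the cause-taking constructor of org.apache.hadoop.yarn.webapp.BadRequestException; the parse helper is hypothetical:
{code}
import org.apache.hadoop.yarn.webapp.BadRequestException;

public class BadRequestTranslationSketch {
  // Turn a malformed query parameter into an HTTP 400 instead of a 500,
  // keeping the original exception nested for server-side debugging.
  static long parseWindowStart(String raw) {
    try {
      // NumberFormatException is an IllegalArgumentException
      return Long.parseLong(raw);
    } catch (IllegalArgumentException e) {
      throw new BadRequestException(e);
    }
  }
}
{code}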
[jira] [Commented] (YARN-4747) AHS error 500 due to NPE when container start event is missing
[ https://issues.apache.org/jira/browse/YARN-4747?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15172671#comment-15172671 ] Jason Lowe commented on YARN-4747: -- I believe this was triggered by a missing container start event for a given container finish event. When an application runs for a long time there will be a corresponding long window between the container start event and container finish event for the AM container. The timelineserver performs retention based on entity timestamp, so there will be a long window where the container start event has been deleted but the container finish event is still present. The application history code is not prepared to handle that, as only the container start event has the node hostname and port number information. It blindly assumes that if a container entity is present in the database then we know both the host and the port. Minimally the application history server needs to be hardened to avoid the NPE, but we may want to add the host and port information to the finish event as well to allow the history page to continue to provide logs as long as there is either a container start or container finish event in the database. > AHS error 500 due to NPE when container start event is missing > -- > > Key: YARN-4747 > URL: https://issues.apache.org/jira/browse/YARN-4747 > Project: Hadoop YARN > Issue Type: Bug > Components: timelineserver >Affects Versions: 2.7.2 >Reporter: Jason Lowe > > Saw an error 500 due to a NullPointerException caused by a missing host for > an AM container. Stacktrace to follow. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
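A minimal sketch of the null-guard half of that hardening; the helper below is hypothetical (the real construction happens in ApplicationHistoryManagerOnTimelineStore#convertToContainerReport, per the stack trace in the next comment):
{code}
import org.apache.hadoop.yarn.api.records.NodeId;

public class MissingStartEventSketch {
  // NodeId.newInstance(null, port) throws NPE inside the generated protobuf
  // builder (setHost), which is the error 500 in this report. Guard first and
  // let the caller render an "unknown" node instead of crashing the page.
  static NodeId nodeIdOrNull(String host, int port) {
    if (host == null) {
      return null;
    }
    return NodeId.newInstance(host, port);
  }
}
{code}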
[jira] [Commented] (YARN-4747) AHS error 500 due to NPE when container start event is missing
[ https://issues.apache.org/jira/browse/YARN-4747?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15172665#comment-15172665 ] Jason Lowe commented on YARN-4747: -- Stacktrace: {noformat} 2016-02-29 16:50:19,465 [1866296659@qtp-46415544-16798] ERROR webapp.AppBlock: Failed to read the AM container of the application attempt appattempt_1455753632268_408876_01. java.lang.NullPointerException at org.apache.hadoop.yarn.proto.YarnProtos$NodeIdProto$Builder.setHost(YarnProtos.java:19772) at org.apache.hadoop.yarn.api.records.impl.pb.NodeIdPBImpl.setHost(NodeIdPBImpl.java:56) at org.apache.hadoop.yarn.api.records.NodeId.newInstance(NodeId.java:42) at org.apache.hadoop.yarn.server.applicationhistoryservice.ApplicationHistoryManagerOnTimelineStore.convertToContainerReport(ApplicationHistoryManagerOnTimelineStore.java:529) at org.apache.hadoop.yarn.server.applicationhistoryservice.ApplicationHistoryManagerOnTimelineStore.getContainer(ApplicationHistoryManagerOnTimelineStore.java:200) at org.apache.hadoop.yarn.server.applicationhistoryservice.ApplicationHistoryClientService.getContainerReport(ApplicationHistoryClientService.java:200) at org.apache.hadoop.yarn.server.webapp.AppBlock$3.run(AppBlock.java:249) at org.apache.hadoop.yarn.server.webapp.AppBlock$3.run(AppBlock.java:243) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:422) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1679) at org.apache.hadoop.yarn.server.webapp.AppBlock.generateApplicationTable(AppBlock.java:242) at org.apache.hadoop.yarn.server.webapp.AppBlock.render(AppBlock.java:217) at org.apache.hadoop.yarn.webapp.view.HtmlBlock.render(HtmlBlock.java:69) at org.apache.hadoop.yarn.webapp.view.HtmlBlock.renderPartial(HtmlBlock.java:79) at org.apache.hadoop.yarn.webapp.View.render(View.java:235) at org.apache.hadoop.yarn.webapp.view.HtmlPage$Page.subView(HtmlPage.java:49) at org.apache.hadoop.yarn.webapp.hamlet.HamletImpl$EImp._v(HamletImpl.java:117) at org.apache.hadoop.yarn.webapp.hamlet.Hamlet$TD._(Hamlet.java:845) at org.apache.hadoop.yarn.webapp.view.TwoColumnLayout.render(TwoColumnLayout.java:71) at org.apache.hadoop.yarn.webapp.view.HtmlPage.render(HtmlPage.java:82) at org.apache.hadoop.yarn.webapp.Controller.render(Controller.java:212) at org.apache.hadoop.yarn.server.applicationhistoryservice.webapp.AHSController.app(AHSController.java:38) at sun.reflect.GeneratedMethodAccessor90.invoke(Unknown Source) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:497) {noformat} > AHS error 500 due to NPE when container start event is missing > -- > > Key: YARN-4747 > URL: https://issues.apache.org/jira/browse/YARN-4747 > Project: Hadoop YARN > Issue Type: Bug > Components: timelineserver >Affects Versions: 2.7.2 >Reporter: Jason Lowe > > Saw an error 500 due to a NullPointerException caused by a missing host for > an AM container. Stacktrace to follow. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (YARN-4747) AHS error 500 due to NPE when container start event is missing
Jason Lowe created YARN-4747: Summary: AHS error 500 due to NPE when container start event is missing Key: YARN-4747 URL: https://issues.apache.org/jira/browse/YARN-4747 Project: Hadoop YARN Issue Type: Bug Components: timelineserver Affects Versions: 2.7.2 Reporter: Jason Lowe Saw an error 500 due to a NullPointerException caused by a missing host for an AM container. Stacktrace to follow. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-4593) Deadlock in AbstractService.getConfig()
[ https://issues.apache.org/jira/browse/YARN-4593?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15172652#comment-15172652 ] Steve Loughran commented on YARN-4593: -- No tests. I'd have to think of a way to recreate the deadlock and then make sure it was gone. Someone will need to look at the code instead. > Deadlock in AbstractService.getConfig() > --- > > Key: YARN-4593 > URL: https://issues.apache.org/jira/browse/YARN-4593 > Project: Hadoop YARN > Issue Type: Bug > Components: yarn >Affects Versions: 2.7.2 > Environment: AM restarting on kerberized cluster >Reporter: Steve Loughran >Assignee: Steve Loughran > Attachments: YARN-4593-001.patch > > > SLIDER-1052 has found a deadlock which can arise in it during AM restart. > Looking at the thread trace, one of the blockages is actually > {{AbstractService.getConfig()}} —this is synchronized and so blocked. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-4700) ATS storage has one extra record each time the RM got restarted
[ https://issues.apache.org/jira/browse/YARN-4700?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15172651#comment-15172651 ] Vrushali C commented on YARN-4700: -- Hi [~Naganarasimha], Thanks for the patch. I believe the constructor for FlowActivityRowKey should change to correctly calculate the top-of-the-day timestamp given the input timestamp. That is the reason the unit test is failing, I think, since the FlowActivityRowKey is constructed with FlowActivityRowKey.getRowKey(clusterStop, appCreatedTime, user, flow). Also, I think we can remove the function FlowActivityRowKey#getRowKey(String clusterId, String userId, String flowName) and only keep FlowActivityRowKey#getRowKey(String clusterId, long dayTs, String userId, String flowName). That way it's easier to clean up the unit tests as well. And I think you can change the unit test to use different timestamps (but keep the same semantics, i.e. the min start time should actually be the lowest one, etc.); that way it will be easier to refactor the unit test. Let me know if this helps. Right now the unit test checks in the flow activity table that one entry has been made for all of these 4 application entities, so you can use timestamps that belong to exactly the same day. Or, if you use timestamps belonging to different days, change the test to look for that many entries. Another thing: it looks like the event timestamp being used is timelineEvents.next().getTimestamp(). It might be more explicit to fetch the exact created (or finished) event from the TimelineEntity and use the timestamp that belongs to either ApplicationMetricsConstants.CREATED_EVENT_TYPE or ApplicationMetricsConstants.FINISHED_EVENT_TYPE. That way, we are using the accurate event time to make an entry into the flow activity table. You can use the TimelineStorageUtils#getApplicationFinishedTime() function for getting the timestamp of the FINISHED event. You would have to write a new function to do a similar thing for fetching the CREATED event timestamp (or refactor further and use the same function to get the right event's timestamp). Hope this helps. Let me know. Thanks, Vrushali > ATS storage has one extra record each time the RM got restarted > --- > > Key: YARN-4700 > URL: https://issues.apache.org/jira/browse/YARN-4700 > Project: Hadoop YARN > Issue Type: Sub-task > Components: timelineserver >Affects Versions: YARN-2928 >Reporter: Li Lu >Assignee: Naganarasimha G R > Labels: yarn-2928-1st-milestone > Attachments: YARN-4700-YARN-2928.v1.001.patch, > YARN-4700-YARN-2928.wip.patch > > > When testing the new web UI for ATS v2, I noticed that we're creating one > extra record for each finished application (but still hold in the RM state > store) each time the RM got restarted. It's quite possible that we add the > cluster start timestamp into the default cluster id, thus each time we're > creating a new record for one application (cluster id is a part of the row > key). We need to fix this behavior, probably by having a better default > cluster id. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
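A minimal sketch of the row-key construction being discussed, deriving the day cell from the event timestamp rather than the current time. The string join and "!" separator are simplifications (the real implementation works on byte arrays with encoded separators and, I believe, also inverts the timestamp so newer days sort first); only the top-of-the-day computation mirrors the intent of TimelineStorageUtils.getTopOfTheDayTimestamp():
{code}
public class FlowActivityRowKeySketch {
  private static final long MILLIS_PER_DAY = 24L * 60 * 60 * 1000;

  // Midnight (UTC) of the day the given timestamp falls in.
  static long topOfTheDayTimestamp(long ts) {
    return ts - (ts % MILLIS_PER_DAY);
  }

  // clusterId!dayTs!userId!flowName, where dayTs comes from the app's
  // created/finished event time, not System.currentTimeMillis().
  static String getRowKey(String clusterId, long eventTs, String userId,
      String flowName) {
    return clusterId + "!" + topOfTheDayTimestamp(eventTs) + "!" + userId
        + "!" + flowName;
  }
}
{code}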
[jira] [Commented] (YARN-4741) RM is flooded with RMNodeFinishedContainersPulledByAMEvents in the async dispatcher event queue
[ https://issues.apache.org/jira/browse/YARN-4741?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15172650#comment-15172650 ] Sangjin Lee commented on YARN-4741: --- I attached the node manager log. It's pretty much the entirety of the log from the start until after it's past the point of these events happening for this node in the RM. The only thing I removed is a section early in the log that lists all the localization service recovering files. Unfortunately I no longer have the RM log for this episode. We do not have YARN-3990 or YARN-3896 applied. Although we should get them in any case, I'm not sure if those are related to the issue we're seeing. > RM is flooded with RMNodeFinishedContainersPulledByAMEvents in the async > dispatcher event queue > --- > > Key: YARN-4741 > URL: https://issues.apache.org/jira/browse/YARN-4741 > Project: Hadoop YARN > Issue Type: Bug > Components: resourcemanager >Affects Versions: 2.6.0 >Reporter: Sangjin Lee >Priority: Critical > Attachments: nm.log > > > We had a pretty major incident with the RM where it was continually flooded > with RMNodeFinishedContainersPulledByAMEvents in the async dispatcher event > queue. > In our setup, we had the RM HA or stateful restart *disabled*, but NM > work-preserving restart *enabled*. Due to other issues, we did a cluster-wide > NM restart. > Some time during the restart (which took multiple hours), we started seeing > the async dispatcher event queue building. Normally it would log 1,000. In > this case, it climbed all the way up to tens of millions of events. > When we looked at the RM log, it was full of the following messages: > {noformat} > 2016-02-18 01:47:29,530 ERROR > org.apache.hadoop.yarn.server.resourcemanager.rmnode.RMNodeImpl: Invalid > event FINISHED_CONTAINERS_PULLED_BY_AM on Node worker-node-foo.bar.net:8041 > 2016-02-18 01:47:29,535 ERROR > org.apache.hadoop.yarn.server.resourcemanager.rmnode.RMNodeImpl: Can't handle > this event at current state > 2016-02-18 01:47:29,535 ERROR > org.apache.hadoop.yarn.server.resourcemanager.rmnode.RMNodeImpl: Invalid > event FINISHED_CONTAINERS_PULLED_BY_AM on Node worker-node-foo.bar.net:8041 > 2016-02-18 01:47:29,538 ERROR > org.apache.hadoop.yarn.server.resourcemanager.rmnode.RMNodeImpl: Can't handle > this event at current state > 2016-02-18 01:47:29,538 ERROR > org.apache.hadoop.yarn.server.resourcemanager.rmnode.RMNodeImpl: Invalid > event FINISHED_CONTAINERS_PULLED_BY_AM on Node worker-node-foo.bar.net:8041 > {noformat} > And that node in question was restarted a few minutes earlier. > When we inspected the RM heap, it was full of > RMNodeFinishedContainersPulledByAMEvents. > Suspecting the NM work-preserving restart, we disabled it and did another > cluster-wide rolling restart. Initially that seemed to have helped reduce the > queue size, but the queue built back up to several millions and continued for > an extended period. We had to restart the RM to resolve the problem. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-4741) RM is flooded with RMNodeFinishedContainersPulledByAMEvents in the async dispatcher event queue
[ https://issues.apache.org/jira/browse/YARN-4741?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sangjin Lee updated YARN-4741: -- Attachment: nm.log > RM is flooded with RMNodeFinishedContainersPulledByAMEvents in the async > dispatcher event queue > --- > > Key: YARN-4741 > URL: https://issues.apache.org/jira/browse/YARN-4741 > Project: Hadoop YARN > Issue Type: Bug > Components: resourcemanager >Affects Versions: 2.6.0 >Reporter: Sangjin Lee >Priority: Critical > Attachments: nm.log > > > We had a pretty major incident with the RM where it was continually flooded > with RMNodeFinishedContainersPulledByAMEvents in the async dispatcher event > queue. > In our setup, we had the RM HA or stateful restart *disabled*, but NM > work-preserving restart *enabled*. Due to other issues, we did a cluster-wide > NM restart. > Some time during the restart (which took multiple hours), we started seeing > the async dispatcher event queue building. Normally it would log 1,000. In > this case, it climbed all the way up to tens of millions of events. > When we looked at the RM log, it was full of the following messages: > {noformat} > 2016-02-18 01:47:29,530 ERROR > org.apache.hadoop.yarn.server.resourcemanager.rmnode.RMNodeImpl: Invalid > event FINISHED_CONTAINERS_PULLED_BY_AM on Node worker-node-foo.bar.net:8041 > 2016-02-18 01:47:29,535 ERROR > org.apache.hadoop.yarn.server.resourcemanager.rmnode.RMNodeImpl: Can't handle > this event at current state > 2016-02-18 01:47:29,535 ERROR > org.apache.hadoop.yarn.server.resourcemanager.rmnode.RMNodeImpl: Invalid > event FINISHED_CONTAINERS_PULLED_BY_AM on Node worker-node-foo.bar.net:8041 > 2016-02-18 01:47:29,538 ERROR > org.apache.hadoop.yarn.server.resourcemanager.rmnode.RMNodeImpl: Can't handle > this event at current state > 2016-02-18 01:47:29,538 ERROR > org.apache.hadoop.yarn.server.resourcemanager.rmnode.RMNodeImpl: Invalid > event FINISHED_CONTAINERS_PULLED_BY_AM on Node worker-node-foo.bar.net:8041 > {noformat} > And that node in question was restarted a few minutes earlier. > When we inspected the RM heap, it was full of > RMNodeFinishedContainersPulledByAMEvents. > Suspecting the NM work-preserving restart, we disabled it and did another > cluster-wide rolling restart. Initially that seemed to have helped reduce the > queue size, but the queue built back up to several millions and continued for > an extended period. We had to restart the RM to resolve the problem. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-4744) Too many signal to container failure in case of LCE
[ https://issues.apache.org/jira/browse/YARN-4744?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15172629#comment-15172629 ] Sidharta Seethana commented on YARN-4744: - Hi [~bibinchundatt], Could you provide some additional information here: Is security enabled? Is this problem reproducible with included apps, e.g. distributed shell? Is it possible the container exited before the signal was delivered (exit code 9 is possible in this scenario)? Thanks, -Sidharta > Too many signal to container failure in case of LCE > --- > > Key: YARN-4744 > URL: https://issues.apache.org/jira/browse/YARN-4744 > Project: Hadoop YARN > Issue Type: Bug >Affects Versions: 2.9.0 >Reporter: Bibin A Chundatt > > Enable LCE with cgroups > Start server with dsperf user > Submit application with user yarn > Too many signal to container failure > {noformat} > 2014-03-01 14:10:32,223 INFO > org.apache.hadoop.yarn.server.nodemanager.containermanager.linux.runtime.DelegatingLinuxContainerRuntime: > Using container runtime: DefaultLinuxContainerRuntime > 2014-03-01 14:10:32,228 WARN > org.apache.hadoop.yarn.server.nodemanager.containermanager.linux.privileged.PrivilegedOperationExecutor: > Shell execution returned exit code: 9. Privileged Execution Operation Output: > main : command provided 2 > main : run as user is yarn > main : requested yarn user is yarn > Full command array for failed execution: > [/opt/bibin/dsperf/HAINSTALL/install/hadoop/nodemanager/bin/container-executor, > yarn, yarn, 2, 28575, 15] > 2014-03-01 14:10:32,228 WARN > org.apache.hadoop.yarn.server.nodemanager.containermanager.linux.runtime.DefaultLinuxContainerRuntime: > Signal container failed. Exception: > org.apache.hadoop.yarn.server.nodemanager.containermanager.linux.privileged.PrivilegedOperationException: > ExitCodeException exitCode=9: > at > org.apache.hadoop.yarn.server.nodemanager.containermanager.linux.privileged.PrivilegedOperationExecutor.executePrivilegedOperation(PrivilegedOperationExecutor.java:173) > at > org.apache.hadoop.yarn.server.nodemanager.containermanager.linux.runtime.DefaultLinuxContainerRuntime.signalContainer(DefaultLinuxContainerRuntime.java:132) > at > org.apache.hadoop.yarn.server.nodemanager.containermanager.linux.runtime.DelegatingLinuxContainerRuntime.signalContainer(DelegatingLinuxContainerRuntime.java:109) > at > org.apache.hadoop.yarn.server.nodemanager.LinuxContainerExecutor.signalContainer(LinuxContainerExecutor.java:513) > at > org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.cleanupContainer(ContainerLaunch.java:520) > at > org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainersLauncher.handle(ContainersLauncher.java:139) > at > org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainersLauncher.handle(ContainersLauncher.java:55) > at > org.apache.hadoop.yarn.event.AsyncDispatcher.dispatch(AsyncDispatcher.java:184) > at > org.apache.hadoop.yarn.event.AsyncDispatcher$1.run(AsyncDispatcher.java:110) > at java.lang.Thread.run(Thread.java:745) > Caused by: ExitCodeException exitCode=9: > at org.apache.hadoop.util.Shell.runCommand(Shell.java:927) > at org.apache.hadoop.util.Shell.run(Shell.java:838) > at > org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:1117) > at > org.apache.hadoop.yarn.server.nodemanager.containermanager.linux.privileged.PrivilegedOperationExecutor.executePrivilegedOperation(PrivilegedOperationExecutor.java:150) > ... 
9 more > {noformat} > Checked the same scenario in 2.7.2 version (not available) -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-4700) ATS storage has one extra record each time the RM got restarted
[ https://issues.apache.org/jira/browse/YARN-4700?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15172617#comment-15172617 ] Hadoop QA commented on YARN-4700: - | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 33s {color} | {color:blue} Docker mode activated. {color} | | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s {color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s {color} | {color:green} The patch appears to include 2 new or modified test files. {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 10m 56s {color} | {color:green} YARN-2928 passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 16s {color} | {color:green} YARN-2928 passed with JDK v1.8.0_72 {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 20s {color} | {color:green} YARN-2928 passed with JDK v1.7.0_95 {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 18s {color} | {color:green} YARN-2928 passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 28s {color} | {color:green} YARN-2928 passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 21s {color} | {color:green} YARN-2928 passed {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 0m 38s {color} | {color:green} YARN-2928 passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 16s {color} | {color:green} YARN-2928 passed with JDK v1.8.0_72 {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 18s {color} | {color:green} YARN-2928 passed with JDK v1.7.0_95 {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 20s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 14s {color} | {color:green} the patch passed with JDK v1.8.0_72 {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 14s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 18s {color} | {color:green} the patch passed with JDK v1.7.0_95 {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 18s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 12s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 24s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 13s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s {color} | {color:green} Patch has no whitespace issues. 
{color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 0m 42s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 12s {color} | {color:green} the patch passed with JDK v1.8.0_72 {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 15s {color} | {color:green} the patch passed with JDK v1.7.0_95 {color} | | {color:red}-1{color} | {color:red} unit {color} | {color:red} 3m 4s {color} | {color:red} hadoop-yarn-server-timelineservice in the patch failed with JDK v1.8.0_72. {color} | | {color:red}-1{color} | {color:red} unit {color} | {color:red} 3m 2s {color} | {color:red} hadoop-yarn-server-timelineservice in the patch failed with JDK v1.7.0_95. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 18s {color} | {color:green} Patch does not generate ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black} 24m 34s {color} | {color:black} {color} | \\ \\ || Reason || Tests || | JDK v1.8.0_72 Failed junit tests | hadoop.yarn.server.timelineservice.reader.TestTimelineReaderWebServicesHBaseStorage | | | hadoop.yarn.server.timelineservice.storage.flow.TestHBaseStorageFlowActivity | | JDK v1.7.0_95 Failed junit tests | hadoop.yarn.server.timelineservice.reader.TestTimelineReaderWebServicesHBaseStorage | | | hadoop.yarn.server.timelineservice.storage.flow.TestHBaseStorageFlowActivity | \\ \\ || Subsystem || Report/Notes || | Docker | Image:yetus/hadoop:0ca8df7 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12790555/YAR
[jira] [Updated] (YARN-4704) TestResourceManager#testResourceAllocation() fails when using FairScheduler
[ https://issues.apache.org/jira/browse/YARN-4704?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yufei Gu updated YARN-4704: --- Attachment: YARN-4704.001.patch > TestResourceManager#testResourceAllocation() fails when using FairScheduler > --- > > Key: YARN-4704 > URL: https://issues.apache.org/jira/browse/YARN-4704 > Project: Hadoop YARN > Issue Type: Test > Components: fairscheduler, test >Affects Versions: 2.7.2 >Reporter: Ray Chiang >Assignee: Yufei Gu > Attachments: YARN-4704.001.patch > > > When using FairScheduler, TestResourceManager#testResourceAllocation() fails > with the following error: > java.lang.IllegalStateException: Trying to stop a non-running task: 1 of > application application_1455833410011_0001 > at > org.apache.hadoop.yarn.server.resourcemanager.Task.stop(Task.java:117) > at > org.apache.hadoop.yarn.server.resourcemanager.Application.finishTask(Application.java:266) > at > org.apache.hadoop.yarn.server.resourcemanager.TestResourceManager.testResourceAllocation(TestResourceManager.java:167) -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-4700) ATS storage has one extra record each time the RM got restarted
[ https://issues.apache.org/jira/browse/YARN-4700?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Naganarasimha G R updated YARN-4700: Attachment: YARN-4700-YARN-2928.v1.001.patch Hi [~vrushalic], [~sjlee0] & [~varun_saxena], Please find the attached patch, which updates the *FlowActivityTable* for both the app-created and the app-finished events. My only concern is that maybe I did not get the complete essence of the {{TestHBaseStorageFlowActivity.testWriteFlowRunMinMax}} test case, hence it's failing. IIUC there should be 4 entries in the FlowActivity table, as these are 4 different apps of the same flow, right? Correct me if I am wrong. > ATS storage has one extra record each time the RM got restarted > --- > > Key: YARN-4700 > URL: https://issues.apache.org/jira/browse/YARN-4700 > Project: Hadoop YARN > Issue Type: Sub-task > Components: timelineserver >Affects Versions: YARN-2928 >Reporter: Li Lu >Assignee: Naganarasimha G R > Labels: yarn-2928-1st-milestone > Attachments: YARN-4700-YARN-2928.v1.001.patch, > YARN-4700-YARN-2928.wip.patch > > > When testing the new web UI for ATS v2, I noticed that we're creating one > extra record for each finished application (but still hold in the RM state > store) each time the RM got restarted. It's quite possible that we add the > cluster start timestamp into the default cluster id, thus each time we're > creating a new record for one application (cluster id is a part of the row > key). We need to fix this behavior, probably by having a better default > cluster id. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-4593) Deadlock in AbstractService.getConfig()
[ https://issues.apache.org/jira/browse/YARN-4593?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15172472#comment-15172472 ] Hadoop QA commented on YARN-4593: - | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 14s {color} | {color:blue} Docker mode activated. {color} | | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s {color} | {color:green} The patch does not contain any @author tags. {color} | | {color:red}-1{color} | {color:red} test4tests {color} | {color:red} 0m 0s {color} | {color:red} The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 6m 48s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 6m 1s {color} | {color:green} trunk passed with JDK v1.8.0_72 {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 6m 55s {color} | {color:green} trunk passed with JDK v1.7.0_95 {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 21s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 6s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 14s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 36s {color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 57s {color} | {color:green} trunk passed with JDK v1.8.0_72 {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 6s {color} | {color:green} trunk passed with JDK v1.7.0_95 {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 43s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 5m 56s {color} | {color:green} the patch passed with JDK v1.8.0_72 {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 5m 56s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 6m 59s {color} | {color:green} the patch passed with JDK v1.7.0_95 {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 6m 59s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 21s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 6s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 15s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s {color} | {color:green} Patch has no whitespace issues. 
{color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 48s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 54s {color} | {color:green} the patch passed with JDK v1.8.0_72 {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 10s {color} | {color:green} the patch passed with JDK v1.7.0_95 {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 7m 10s {color} | {color:green} hadoop-common in the patch passed with JDK v1.8.0_72. {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 7m 35s {color} | {color:green} hadoop-common in the patch passed with JDK v1.7.0_95. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 22s {color} | {color:green} Patch does not generate ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black} 60m 53s {color} | {color:black} {color} | \\ \\ || Subsystem || Report/Notes || | Docker | Image:yetus/hadoop:0ca8df7 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12782366/YARN-4593-001.patch | | JIRA Issue | YARN-4593 | | Optional Tests | asflicense compile javac javadoc mvninstall mvnsite unit findbugs checkstyle | | uname | Linux be0b5e7d5f34 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | /testptch/hadoop/patchprocess/precommit/personality/provid
[jira] [Commented] (YARN-4700) ATS storage has one extra record each time the RM got restarted
[ https://issues.apache.org/jira/browse/YARN-4700?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15172453#comment-15172453 ] Naganarasimha G R commented on YARN-4700: - Maybe a small correction in the patch: as {{storeInFlowActivityTable}} is used for both app-created and app-finished events, I need to find the time from the first event of the entity? > ATS storage has one extra record each time the RM got restarted > --- > > Key: YARN-4700 > URL: https://issues.apache.org/jira/browse/YARN-4700 > Project: Hadoop YARN > Issue Type: Sub-task > Components: timelineserver >Affects Versions: YARN-2928 >Reporter: Li Lu >Assignee: Naganarasimha G R > Labels: yarn-2928-1st-milestone > Attachments: YARN-4700-YARN-2928.wip.patch > > > When testing the new web UI for ATS v2, I noticed that we're creating one > extra record for each finished application (but still hold in the RM state > store) each time the RM got restarted. It's quite possible that we add the > cluster start timestamp into the default cluster id, thus each time we're > creating a new record for one application (cluster id is a part of the row > key). We need to fix this behavior, probably by having a better default > cluster id. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-4700) ATS storage has one extra record each time the RM got restarted
[ https://issues.apache.org/jira/browse/YARN-4700?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Naganarasimha G R updated YARN-4700: Attachment: YARN-4700-YARN-2928.wip.patch [~sjlee0] & [~vrushalic], I meant to make the change as per the attached patch; hope it addresses all concerns. > ATS storage has one extra record each time the RM got restarted > --- > > Key: YARN-4700 > URL: https://issues.apache.org/jira/browse/YARN-4700 > Project: Hadoop YARN > Issue Type: Sub-task > Components: timelineserver >Affects Versions: YARN-2928 >Reporter: Li Lu >Assignee: Naganarasimha G R > Labels: yarn-2928-1st-milestone > Attachments: YARN-4700-YARN-2928.wip.patch > > > When testing the new web UI for ATS v2, I noticed that we're creating one > extra record for each finished application (but still hold in the RM state > store) each time the RM got restarted. It's quite possible that we add the > cluster start timestamp into the default cluster id, thus each time we're > creating a new record for one application (cluster id is a part of the row > key). We need to fix this behavior, probably by having a better default > cluster id. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-4634) Scheduler UI/Metrics need to consider cases like non-queue label mappings
[ https://issues.apache.org/jira/browse/YARN-4634?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15172403#comment-15172403 ] Wangda Tan commented on YARN-4634: -- Hi [~sunilg], Thanks for updating. However, I hesitate to add the new state {{labelToQueueMappingAvailable}} to RMNodeLabelsManager, with AbstractCSQueue needing to update that state as well. I suggest showing the labels hierarchy only if: - there's a label other than DEFAULT_LABEL that has >0 active NMs. And let's keep the queues page as simple as possible. You can check RMNodeLabelsManager#pullRMNodeLabelsInfo for details (and NodeLabelsPage as an example). Thoughts? > Scheduler UI/Metrics need to consider cases like non-queue label mappings > - > > Key: YARN-4634 > URL: https://issues.apache.org/jira/browse/YARN-4634 > Project: Hadoop YARN > Issue Type: Sub-task >Affects Versions: 2.7.1 >Reporter: Sunil G >Assignee: Sunil G > Attachments: 0001-YARN-4634.patch, 0002-YARN-4634.patch > > > Currently when label-queue mappings are not available, there are few > assumptions taken in UI and in metrics. > In above case where labels are enabled and available in cluster but without > any queue mappings, UI displays queues under labels. This is not correct. > Currently labels enabled check and availability of labels are considered to > render scheduler UI. Henceforth we also need to check whether > - queue-mappings are available > - nodes are mapped with labels with proper exclusivity flags on > This ticket also will try to see the default configurations in queue when > labels are not mapped. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Assigned] (YARN-4704) TestResourceManager#testResourceAllocation() fails when using FairScheduler
[ https://issues.apache.org/jira/browse/YARN-4704?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yufei Gu reassigned YARN-4704: -- Assignee: Yufei Gu (was: Ray Chiang) > TestResourceManager#testResourceAllocation() fails when using FairScheduler > --- > > Key: YARN-4704 > URL: https://issues.apache.org/jira/browse/YARN-4704 > Project: Hadoop YARN > Issue Type: Test > Components: fairscheduler, test >Affects Versions: 2.7.2 >Reporter: Ray Chiang >Assignee: Yufei Gu > > When using FairScheduler, TestResourceManager#testResourceAllocation() fails > with the following error: > java.lang.IllegalStateException: Trying to stop a non-running task: 1 of > application application_1455833410011_0001 > at > org.apache.hadoop.yarn.server.resourcemanager.Task.stop(Task.java:117) > at > org.apache.hadoop.yarn.server.resourcemanager.Application.finishTask(Application.java:266) > at > org.apache.hadoop.yarn.server.resourcemanager.TestResourceManager.testResourceAllocation(TestResourceManager.java:167) -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-4736) Issues with HBaseTimelineWriterImpl
[ https://issues.apache.org/jira/browse/YARN-4736?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15172390#comment-15172390 ] Naganarasimha G R commented on YARN-4736: - Hi [~vrushalic], I was checking with our HBase team; from the logs and the trace, they indicated that the issue might be due to bad connectivity with ZooKeeper, but it is strange to see that in a local-node setup. So I suspect that there is some issue with my HBase setup, and I am planning to test with the hbase-1.0.3 tar. I would also like to know whether you folks are facing the same issue, or whether it's only me. > Issues with HBaseTimelineWriterImpl > --- > > Key: YARN-4736 > URL: https://issues.apache.org/jira/browse/YARN-4736 > Project: Hadoop YARN > Issue Type: Sub-task > Components: timelineserver >Affects Versions: YARN-2928 >Reporter: Naganarasimha G R >Assignee: Vrushali C >Priority: Critical > Labels: yarn-2928-1st-milestone > Attachments: hbaseException.log, threaddump.log > > > Faced some issues while running ATSv2 in single node Hadoop cluster and in > the same node had launched Hbase with embedded zookeeper. > # Due to some NPE issues i was able to see NM was trying to shutdown, but the > NM daemon process was not completed due to the locks. > # Got some exception related to Hbase after application finished execution > successfully. > will attach logs and the trace for the same -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-4700) ATS storage has one extra record each time the RM got restarted
[ https://issues.apache.org/jira/browse/YARN-4700?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15172364#comment-15172364 ] Naganarasimha G R commented on YARN-4700: - Oops, *TimelineStorageUtils.getTopOfTheDayTimestamp()* is not called; we need to push that part into {{FlowActivityRowKey.getRowKey(clusterId, te.getCreatedTime(), userId, flowName)}}. > ATS storage has one extra record each time the RM got restarted > --- > > Key: YARN-4700 > URL: https://issues.apache.org/jira/browse/YARN-4700 > Project: Hadoop YARN > Issue Type: Sub-task > Components: timelineserver >Affects Versions: YARN-2928 >Reporter: Li Lu >Assignee: Naganarasimha G R > Labels: yarn-2928-1st-milestone > > When testing the new web UI for ATS v2, I noticed that we're creating one > extra record for each finished application (but still hold in the RM state > store) each time the RM got restarted. It's quite possible that we add the > cluster start timestamp into the default cluster id, thus each time we're > creating a new record for one application (cluster id is a part of the row > key). We need to fix this behavior, probably by having a better default > cluster id. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-4700) ATS storage has one extra record each time the RM got restarted
[ https://issues.apache.org/jira/browse/YARN-4700?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15172357#comment-15172357 ] Naganarasimha G R commented on YARN-4700: - Hi [~sjlee0], Based on the points from [~vrushalic] and [~varun_saxena], I was creating a patch such that {{HBaseTimelineWriterImpl.storeInFlowActivityTable}} uses {{FlowActivityRowKey.getRowKey(clusterId, te.getCreatedTime(), userId, flowName)}} instead of the other overloaded method, which doesn't take the timestamp. This would take care of calling {{TimelineStorageUtils.getTopOfTheDayTimestamp()}}, right? > ATS storage has one extra record each time the RM got restarted > --- > > Key: YARN-4700 > URL: https://issues.apache.org/jira/browse/YARN-4700 > Project: Hadoop YARN > Issue Type: Sub-task > Components: timelineserver >Affects Versions: YARN-2928 >Reporter: Li Lu >Assignee: Naganarasimha G R > Labels: yarn-2928-1st-milestone > > When testing the new web UI for ATS v2, I noticed that we're creating one > extra record for each finished application (but still hold in the RM state > store) each time the RM got restarted. It's quite possible that we add the > cluster start timestamp into the default cluster id, thus each time we're > creating a new record for one application (cluster id is a part of the row > key). We need to fix this behavior, probably by having a better default > cluster id. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-4465) SchedulerUtils#validateRequest for Label check should happen only when nodelabel enabled
[ https://issues.apache.org/jira/browse/YARN-4465?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15172342#comment-15172342 ] Wangda Tan commented on YARN-4465: -- Thanks [~bibinchundatt], +1 to the latest patch. Will commit tomorrow if there are no objections. > SchedulerUtils#validateRequest for Label check should happen only when > nodelabel enabled > > > Key: YARN-4465 > URL: https://issues.apache.org/jira/browse/YARN-4465 > Project: Hadoop YARN > Issue Type: Improvement >Reporter: Bibin A Chundatt >Assignee: Bibin A Chundatt >Priority: Minor > Attachments: 0001-YARN-4465.patch, 0002-YARN-4465.patch, > 0003-YARN-4465.patch, 0004-YARN-4465.patch, 0006-YARN-4465.patch, > 0007-YARN-4465.patch > > > Disable label from rm side yarn.nodelabel.enable=false > Capacity scheduler label configuration for queue is available as below > default label for queue = b1 as 3 and accessible labels as 1,3 > Submit application to queue A . > {noformat} > Caused by: > org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.yarn.exceptions.InvalidResourceRequestException): > Invalid resource request, queue=b1 doesn't have permission to access all > labels in resource request. labelExpression of resource request=3. Queue > labels=1,3 > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.SchedulerUtils.validateResourceRequest(SchedulerUtils.java:304) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.SchedulerUtils.normalizeAndValidateRequest(SchedulerUtils.java:234) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.SchedulerUtils.normalizeAndValidateRequest(SchedulerUtils.java:216) > at > org.apache.hadoop.yarn.server.resourcemanager.RMAppManager.validateAndCreateResourceRequest(RMAppManager.java:401) > at > org.apache.hadoop.yarn.server.resourcemanager.RMAppManager.createAndPopulateNewRMApp(RMAppManager.java:340) > at > org.apache.hadoop.yarn.server.resourcemanager.RMAppManager.submitApplication(RMAppManager.java:283) > at > org.apache.hadoop.yarn.server.resourcemanager.ClientRMService.submitApplication(ClientRMService.java:602) > at > org.apache.hadoop.yarn.api.impl.pb.service.ApplicationClientProtocolPBServiceImpl.submitApplication(ApplicationClientProtocolPBServiceImpl.java:247) > {noformat} > # Ignore default label expression when label is disabled *or* > # NormalizeResourceRequest we can set label expression to > when node label is not enabled *or* > # Improve message -- This message was sent by Atlassian JIRA (v6.3.4#6332)
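For readers following the patch, a minimal sketch of the gating idea (option 1 in the description), under the assumption that the check keys off YarnConfiguration.NODE_LABELS_ENABLED; the actual SchedulerUtils validation body is elided:
{code}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.yarn.conf.YarnConfiguration;
import org.apache.hadoop.yarn.exceptions.InvalidResourceRequestException;

public class LabelValidationGateSketch {
  static void validateLabelExpression(Configuration conf, String labelExp,
      String queueLabels) throws InvalidResourceRequestException {
    boolean nodeLabelsEnabled = conf.getBoolean(
        YarnConfiguration.NODE_LABELS_ENABLED,
        YarnConfiguration.DEFAULT_NODE_LABELS_ENABLED);
    if (!nodeLabelsEnabled) {
      // Node labels are off cluster-wide: skip the queue-accessibility
      // check entirely instead of rejecting the request.
      return;
    }
    // ... existing check that labelExp is within queueLabels goes here ...
  }
}
{code}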
[jira] [Commented] (YARN-4700) ATS storage has one extra record each time the RM got restarted
[ https://issues.apache.org/jira/browse/YARN-4700?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15172335#comment-15172335 ] Vrushali C commented on YARN-4700: -- Hi [~sjlee0], Yes, the flow activity table's row key always needs to use the top-of-the-day timestamp, but the event timestamp should be used to determine the top of that day. bq. If they meant that we would use the actual event timestamps as is for the row key, I'm not as sure. No, we can't use the event timestamp as is. It needs to be the top of the day for that timestamp, which is what I said in the previous comment: "the entry for that flow should go into THAT older day's row, hence we should use the event timestamp." You are right, the code in FlowActivityRowKey#getRowKey() needs to change to take the event timestamp, not the current time. I thought we were sending in null for the timestamp and hence using the current time, but it looks like it's using the current time directly here: {code} long dayTs = TimelineStorageUtils.getTopOfTheDayTimestamp(System.currentTimeMillis()); {code} > ATS storage has one extra record each time the RM got restarted > --- > > Key: YARN-4700 > URL: https://issues.apache.org/jira/browse/YARN-4700 > Project: Hadoop YARN > Issue Type: Sub-task > Components: timelineserver >Affects Versions: YARN-2928 >Reporter: Li Lu >Assignee: Naganarasimha G R > Labels: yarn-2928-1st-milestone > > When testing the new web UI for ATS v2, I noticed that we're creating one > extra record for each finished application (but still hold in the RM state > store) each time the RM got restarted. It's quite possible that we add the > cluster start timestamp into the default cluster id, thus each time we're > creating a new record for one application (cluster id is a part of the row > key). We need to fix this behavior, probably by having a better default > cluster id. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
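A sketch of the fix implied by the quoted snippet, assuming the row-key code gains access to a caller-supplied event timestamp (the method wrapper here is only for illustration):
{code}
// Sketch only: derive the day boundary from the event timestamp that the
// writer passes in, instead of from the wall clock at write time.
static long dayTimestampFor(long eventTs) {
  return TimelineStorageUtils.getTopOfTheDayTimestamp(eventTs);
}
{code}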
[jira] [Commented] (YARN-4700) ATS storage has one extra record each time the RM got restarted
[ https://issues.apache.org/jira/browse/YARN-4700?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15172332#comment-15172332 ] Naganarasimha G R commented on YARN-4700: - It might mostly be a case of asynchronous handling; or, it's possible that ATS is not running initially but starts up after the RM fails over. But in any case, wouldn't it be better to link this up with the asynchronous and synchronous events for V2? > ATS storage has one extra record each time the RM got restarted > --- > > Key: YARN-4700 > URL: https://issues.apache.org/jira/browse/YARN-4700 > Project: Hadoop YARN > Issue Type: Sub-task > Components: timelineserver >Affects Versions: YARN-2928 >Reporter: Li Lu >Assignee: Naganarasimha G R > Labels: yarn-2928-1st-milestone > > When testing the new web UI for ATS v2, I noticed that we're creating one > extra record for each finished application (but still hold in the RM state > store) each time the RM got restarted. It's quite possible that we add the > cluster start timestamp into the default cluster id, thus each time we're > creating a new record for one application (cluster id is a part of the row > key). We need to fix this behavior, probably by having a better default > cluster id. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-4700) ATS storage has one extra record each time the RM got restarted
[ https://issues.apache.org/jira/browse/YARN-4700?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15172304#comment-15172304 ] Sangjin Lee commented on YARN-4700: --- I may have misread the comments in haste last Friday. If the comments meant that we would use the event timestamps instead of the current time and calculate the top-of-the-day timestamps from them, then I concur. If they meant that we would use the actual event timestamps *as is* for the row key, I'm not as sure. My main concern there is that it might make some of the queries we want to do against this table in the future harder or make them perform more poorly. For example, we could do a query like "return all flow activities in the last 7 days". With top-of-the-day timestamps, it would be simple partial row key matching. With variable timestamps, it would become more of a range query. Are my concerns overblown? If the solution we're discussing is the former, then I think it's quite straightforward. We need a little bit of change in {{FlowActivityRowKey.getRowKey()}} where we should apply {{TimelineStorageUtils.getTopOfTheDayTimestamp()}} on the provided timestamp. > ATS storage has one extra record each time the RM got restarted > --- > > Key: YARN-4700 > URL: https://issues.apache.org/jira/browse/YARN-4700 > Project: Hadoop YARN > Issue Type: Sub-task > Components: timelineserver >Affects Versions: YARN-2928 >Reporter: Li Lu >Assignee: Naganarasimha G R > Labels: yarn-2928-1st-milestone > > When testing the new web UI for ATS v2, I noticed that we're creating one > extra record for each finished application (but still hold in the RM state > store) each time the RM got restarted. It's quite possible that we add the > cluster start timestamp into the default cluster id, thus each time we're > creating a new record for one application (cluster id is a part of the row > key). We need to fix this behavior, probably by having a better default > cluster id. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
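For reference, a self-contained sketch of what the top-of-the-day truncation buys. It assumes the utility truncates to a UTC day boundary (an assumption about its implementation, not verified against the branch): a "last 7 days" query then needs only 7 distinct row-key timestamps rather than an open-ended range scan.
{code}
public class DayTruncationSketch {
  private static final long MILLIS_IN_DAY = 24L * 60 * 60 * 1000;

  // Assumed behavior of TimelineStorageUtils.getTopOfTheDayTimestamp:
  // truncate a timestamp to the start of its UTC day.
  static long topOfTheDay(long ts) {
    return ts - (ts % MILLIS_IN_DAY);
  }

  public static void main(String[] args) {
    long now = System.currentTimeMillis();
    // The last 7 days collapse to exactly 7 candidate key timestamps,
    // which is why partial row key matching suffices.
    for (int i = 0; i < 7; i++) {
      System.out.println(topOfTheDay(now - i * MILLIS_IN_DAY));
    }
  }
}
{code}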
[jira] [Commented] (YARN-2883) Queuing of container requests in the NM
[ https://issues.apache.org/jira/browse/YARN-2883?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15172264#comment-15172264 ] Arun Suresh commented on YARN-2883: --- Going through the patch again, I feel a better (and safer) way to introduce the changes is: * to extend both the {{ContainerManagerImpl}} and the {{ContainersMonitorImpl}} classes into two new subclasses, {{QueuedContainerManager}} and {{QueuedContainersManager}} (sketched below) * Then we can move all the required data structures (all the new collections) as well as the event handlers into the new classes. * We can also get rid of all the {{if context.isDistributedSchedulingEnabled()}} checks, since we would need to do the check only once in the {{NodeManager}} when we create the instance of the {{ContainerManager}}. * We can also reason about the code flow better, since the changes would be isolated to the new classes. > Queuing of container requests in the NM > --- > > Key: YARN-2883 > URL: https://issues.apache.org/jira/browse/YARN-2883 > Project: Hadoop YARN > Issue Type: Sub-task > Components: nodemanager, resourcemanager >Reporter: Konstantinos Karanasos >Assignee: Konstantinos Karanasos > Attachments: YARN-2883-yarn-2877.001.patch > > > We propose to add a queue in each NM, where queueable container requests can > be held. > Based on the available resources in the node and the containers in the queue, > the NM will decide when to allow the execution of a queued container. > In order to ensure the instantaneous start of a guaranteed-start container, > the NM may decide to pre-empt/kill running queueable containers. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
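A self-contained structural sketch of the pattern proposed above; the types here are hypothetical stand-ins, not the YARN classes:
{code}
import java.util.ArrayDeque;
import java.util.Queue;

// Hypothetical stand-ins illustrating the proposal: the feature check
// happens once at construction, so per-call flag checks disappear and the
// queueing state lives only in the subclass.
class ContainerManager {
  void startContainer(String id) { /* launch immediately */ }
}

class QueuedContainerManager extends ContainerManager {
  private final Queue<String> queued = new ArrayDeque<>();

  @Override
  void startContainer(String id) {
    queued.add(id); // hold the request until resources free up
  }
}

class NodeManagerSketch {
  final ContainerManager containerManager;

  NodeManagerSketch(boolean distSchedulingEnabled) {
    // decided once here, instead of scattering
    // "if (context.isDistributedSchedulingEnabled())" through the manager
    containerManager = distSchedulingEnabled
        ? new QueuedContainerManager() : new ContainerManager();
  }
}
{code}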
[jira] [Commented] (YARN-2883) Queuing of container requests in the NM
[ https://issues.apache.org/jira/browse/YARN-2883?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15172226#comment-15172226 ] Arun Suresh commented on YARN-2883: --- *ContainersMonitorImpl* * In {{startPendingContainers()}}, it looks like the synchronized block itself can be refactored into another function, which you can then call first with {{queueGuarRequests}} and then with {{queueOpportRequests}} (sketched below) * W.r.t. the TODO before {{updateNMTokenIdentifier}}: yup, it needs to be there for all containers * W.r.t. the TODO inside CHANGE_MONITORING_CONTAINER_RESOURCE: yup, I think we might have to update the available resource * Looks like the {{setQueuedContainerStatus}} method is not being used Also, it looks like a lot of tests have failed. Can you please verify that these are not due to changes in this patch? > Queuing of container requests in the NM > --- > > Key: YARN-2883 > URL: https://issues.apache.org/jira/browse/YARN-2883 > Project: Hadoop YARN > Issue Type: Sub-task > Components: nodemanager, resourcemanager >Reporter: Konstantinos Karanasos >Assignee: Konstantinos Karanasos > Attachments: YARN-2883-yarn-2877.001.patch > > > We propose to add a queue in each NM, where queueable container requests can > be held. > Based on the available resources in the node and the containers in the queue, > the NM will decide when to allow the execution of a queued container. > In order to ensure the instantaneous start of a guaranteed-start container, > the NM may decide to pre-empt/kill running queueable containers. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
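A sketch of the refactoring suggested in the first bullet; the names and queue element type are placeholders, not the actual patch:
{code}
import java.util.ArrayDeque;
import java.util.Queue;

class StartPendingSketch {
  private final Queue<Runnable> queuedGuaranteed = new ArrayDeque<>();
  private final Queue<Runnable> queuedOpportunistic = new ArrayDeque<>();

  void startPendingContainers() {
    // the former synchronized block, now one helper called per queue
    startFromQueue(queuedGuaranteed);    // guaranteed-start requests first
    startFromQueue(queuedOpportunistic); // then opportunistic ones
  }

  private synchronized void startFromQueue(Queue<Runnable> queue) {
    Runnable next;
    while ((next = queue.poll()) != null) {
      next.run(); // stand-in for launching a queued container
    }
  }
}
{code}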
[jira] [Commented] (YARN-4745) TestResourceLocalizationService.testPublicResourceInitializesLocalDir failing in trunk
[ https://issues.apache.org/jira/browse/YARN-4745?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15172210#comment-15172210 ] Daniel Templeton commented on YARN-4745: In fact, I see it failing in branch-2.7 and branch-2.8 as well. > TestResourceLocalizationService.testPublicResourceInitializesLocalDir failing > in trunk > -- > > Key: YARN-4745 > URL: https://issues.apache.org/jira/browse/YARN-4745 > Project: Hadoop YARN > Issue Type: Sub-task > Components: nodemanager >Affects Versions: 2.9.0 >Reporter: Daniel Templeton > > I am consistently seeing this: > {noformat} > --- > T E S T S > --- > Running > org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.TestResourceLocalizationService > Tests run: 1, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 2.284 sec <<< > FAILURE! - in > org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.TestResourceLocalizationService > testPublicResourceInitializesLocalDir(org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.TestResourceLocalizationService) > Time elapsed: 1.842 sec <<< FAILURE! > org.mockito.exceptions.verification.WantedButNotInvoked: > Wanted but not invoked: > localFs.mkdir( > > /Users/daniel/NetBeansProjects/hadoop/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/target/org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.TestResourceLocalizationService/0/filecache, > rwxr-xr-x, > true > ); > -> at > org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.TestResourceLocalizationService.testPublicResourceInitializesLocalDir(TestResourceLocalizationService.java:1476) > However, there were other interactions with this mock: > -> at org.apache.hadoop.fs.FileContext.(FileContext.java:249) > -> at org.apache.hadoop.fs.FileContext.makeQualified(FileContext.java:611) > -> at org.apache.hadoop.fs.FileContext.makeQualified(FileContext.java:611) > -> at org.apache.hadoop.fs.FileContext.makeQualified(FileContext.java:611) > -> at org.apache.hadoop.fs.FileContext.makeQualified(FileContext.java:611) > -> at org.apache.hadoop.fs.FileContext.makeQualified(FileContext.java:611) > at > org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.TestResourceLocalizationService.testPublicResourceInitializesLocalDir(TestResourceLocalizationService.java:1476) > Results : > Failed tests: > TestResourceLocalizationService.testPublicResourceInitializesLocalDir:1476 > Wanted but not invoked: > localFs.mkdir( > > /Users/daniel/NetBeansProjects/hadoop/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/target/org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.TestResourceLocalizationService/0/filecache, > rwxr-xr-x, > true > ); > -> at > org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.TestResourceLocalizationService.testPublicResourceInitializesLocalDir(TestResourceLocalizationService.java:1476) > However, there were other interactions with this mock: > -> at org.apache.hadoop.fs.FileContext.(FileContext.java:249) > -> at org.apache.hadoop.fs.FileContext.makeQualified(FileContext.java:611) > -> at org.apache.hadoop.fs.FileContext.makeQualified(FileContext.java:611) > -> at org.apache.hadoop.fs.FileContext.makeQualified(FileContext.java:611) > -> at org.apache.hadoop.fs.FileContext.makeQualified(FileContext.java:611) > -> at org.apache.hadoop.fs.FileContext.makeQualified(FileContext.java:611) > Tests run: 1, Failures: 1, Errors: 0, Skipped: 
0 > {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (YARN-4746) yarn web services should convert parse failures of appId to 400
Steve Loughran created YARN-4746: Summary: yarn web services should convert parse failures of appId to 400 Key: YARN-4746 URL: https://issues.apache.org/jira/browse/YARN-4746 Project: Hadoop YARN Issue Type: Bug Components: webapp Affects Versions: 2.8.0 Reporter: Steve Loughran Priority: Minor Somewhere in my WS API tests I'm seeing an error with exception conversion of a bad app ID sent in as an argument to a GET. I know it's in ATS, but a scan of the core RM web services implies the same problem. {{WebServices.parseApplicationId()}} uses {{ConverterUtils.toApplicationId}} to convert an argument; this throws IllegalArgumentException, which is then handled somewhere by jetty as a 500 error. In fact, it's a bad argument, which should be handled by returning a 400. This can be done by catching the raised exception and explicitly converting it. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
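A minimal sketch of the proposed conversion. {{BadRequestException}} is YARN's webapp exception that maps to a 400; the surrounding wrapper class and the method body around the existing {{ConverterUtils.toApplicationId}} call are assumptions for illustration:
{code}
import org.apache.hadoop.yarn.api.records.ApplicationId;
import org.apache.hadoop.yarn.util.ConverterUtils;
import org.apache.hadoop.yarn.webapp.BadRequestException;

public class ParseAppIdSketch {
  static ApplicationId parseApplicationId(String appId) {
    try {
      // throws IllegalArgumentException on malformed input, which jetty
      // otherwise surfaces as a 500
      return ConverterUtils.toApplicationId(appId);
    } catch (IllegalArgumentException e) {
      // a bad argument is the client's fault: convert to a 400
      throw new BadRequestException("Invalid application id: " + appId);
    }
  }
}
{code}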
[jira] [Commented] (YARN-4731) container-executor should not follow symlinks in recursive_unlink_children
[ https://issues.apache.org/jira/browse/YARN-4731?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15172101#comment-15172101 ] Colin Patrick McCabe commented on YARN-4731: Thanks for the reviews, guys. > container-executor should not follow symlinks in recursive_unlink_children > -- > > Key: YARN-4731 > URL: https://issues.apache.org/jira/browse/YARN-4731 > Project: Hadoop YARN > Issue Type: Bug >Affects Versions: 2.9.0 >Reporter: Bibin A Chundatt >Assignee: Colin Patrick McCabe >Priority: Blocker > Fix For: 2.9.0 > > Attachments: YARN-4731.001.patch, YARN-4731.002.patch > > > Enable LCE and CGroups > Submit a mapreduce job > {noformat} > 2016-02-24 18:56:46,889 INFO > org.apache.hadoop.yarn.server.nodemanager.LinuxContainerExecutor: Deleting > absolute path : > /opt/bibin/dsperf/HAINSTALL/nmlocal/usercache/dsperf/appcache/application_1456319010019_0003/container_e02_1456319010019_0003_01_01 > 2016-02-24 18:56:46,894 WARN > org.apache.hadoop.yarn.server.nodemanager.containermanager.linux.privileged.PrivilegedOperationExecutor: > Shell execution returned exit code: 255. Privileged Execution Operation > Output: > main : command provided 3 > main : run as user is dsperf > main : requested yarn user is dsperf > failed to rmdir job.jar: Not a directory > Error while deleting > /opt/bibin/dsperf/HAINSTALL/nmlocal/usercache/dsperf/appcache/application_1456319010019_0003/container_e02_1456319010019_0003_01_01: > 20 (Not a directory) > Full command array for failed execution: > [/opt/bibin/dsperf/HAINSTALL/install/hadoop/nodemanager/bin/container-executor, > dsperf, dsperf, 3, > /opt/bibin/dsperf/HAINSTALL/nmlocal/usercache/dsperf/appcache/application_1456319010019_0003/container_e02_1456319010019_0003_01_01] > 2016-02-24 18:56:46,894 ERROR > org.apache.hadoop.yarn.server.nodemanager.LinuxContainerExecutor: > DeleteAsUser for > /opt/bibin/dsperf/HAINSTALL/nmlocal/usercache/dsperf/appcache/application_1456319010019_0003/container_e02_1456319010019_0003_01_01 > returned with exit code: 255 > org.apache.hadoop.yarn.server.nodemanager.containermanager.linux.privileged.PrivilegedOperationException: > ExitCodeException exitCode=255: > at > org.apache.hadoop.yarn.server.nodemanager.containermanager.linux.privileged.PrivilegedOperationExecutor.executePrivilegedOperation(PrivilegedOperationExecutor.java:173) > at > org.apache.hadoop.yarn.server.nodemanager.containermanager.linux.privileged.PrivilegedOperationExecutor.executePrivilegedOperation(PrivilegedOperationExecutor.java:199) > at > org.apache.hadoop.yarn.server.nodemanager.LinuxContainerExecutor.deleteAsUser(LinuxContainerExecutor.java:569) > at > org.apache.hadoop.yarn.server.nodemanager.DeletionService$FileDeletionTask.run(DeletionService.java:265) > at > java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) > at java.util.concurrent.FutureTask.run(FutureTask.java:266) > at > java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180) > at > java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) > at java.lang.Thread.run(Thread.java:745) > Caused by: ExitCodeException exitCode=255: > at org.apache.hadoop.util.Shell.runCommand(Shell.java:927) > at org.apache.hadoop.util.Shell.run(Shell.java:838) > at > 
org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:1117) > at > org.apache.hadoop.yarn.server.nodemanager.containermanager.linux.privileged.PrivilegedOperationExecutor.executePrivilegedOperation(PrivilegedOperationExecutor.java:150) > ... 10 more > {noformat} > As a result nodemanager-local directory are not getting deleted for each > application > {noformat} > total 36 > drwxr-s--- 4 hdfs hadoop 4096 Feb 25 08:25 ./ > drwxr-s--- 7 hdfs hadoop 4096 Feb 25 08:25 ../ > -rw--- 1 hdfs hadoop 340 Feb 25 08:25 container_tokens > lrwxrwxrwx 1 hdfs hadoop 111 Feb 25 08:25 job.jar -> > /opt/bibin/dsperf/HAINSTALL/nmlocal/usercache/hdfs/appcache/application_1456364845478_0004/filecache/11/job.jar/ > lrwxrwxrwx 1 hdfs hadoop 111 Feb 25 08:25 job.xml -> > /opt/bibin/dsperf/HAINSTALL/nmlocal/usercache/hdfs/appcache/application_1456364845478_0004/filecache/13/job.xml* > drwxr-s--- 2 hdfs hadoop 4096 Feb 25 08:25 jobSubmitDir/ > -rwx-- 1 hdfs hadoop 5348 Feb 25 08:25 launch_container.sh* > drwxr-s--- 2 hdfs hado
[jira] [Updated] (YARN-4745) TestResourceLocalizationService.testPublicResourceInitializesLocalDir failing in trunk
[ https://issues.apache.org/jira/browse/YARN-4745?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bibin A Chundatt updated YARN-4745: --- Issue Type: Sub-task (was: Bug) Parent: YARN-4478 > TestResourceLocalizationService.testPublicResourceInitializesLocalDir failing > in trunk > -- > > Key: YARN-4745 > URL: https://issues.apache.org/jira/browse/YARN-4745 > Project: Hadoop YARN > Issue Type: Sub-task > Components: nodemanager >Affects Versions: 2.9.0 >Reporter: Daniel Templeton > > I am consistently seeing this: > {noformat} > --- > T E S T S > --- > Running > org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.TestResourceLocalizationService > Tests run: 1, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 2.284 sec <<< > FAILURE! - in > org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.TestResourceLocalizationService > testPublicResourceInitializesLocalDir(org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.TestResourceLocalizationService) > Time elapsed: 1.842 sec <<< FAILURE! > org.mockito.exceptions.verification.WantedButNotInvoked: > Wanted but not invoked: > localFs.mkdir( > > /Users/daniel/NetBeansProjects/hadoop/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/target/org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.TestResourceLocalizationService/0/filecache, > rwxr-xr-x, > true > ); > -> at > org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.TestResourceLocalizationService.testPublicResourceInitializesLocalDir(TestResourceLocalizationService.java:1476) > However, there were other interactions with this mock: > -> at org.apache.hadoop.fs.FileContext.(FileContext.java:249) > -> at org.apache.hadoop.fs.FileContext.makeQualified(FileContext.java:611) > -> at org.apache.hadoop.fs.FileContext.makeQualified(FileContext.java:611) > -> at org.apache.hadoop.fs.FileContext.makeQualified(FileContext.java:611) > -> at org.apache.hadoop.fs.FileContext.makeQualified(FileContext.java:611) > -> at org.apache.hadoop.fs.FileContext.makeQualified(FileContext.java:611) > at > org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.TestResourceLocalizationService.testPublicResourceInitializesLocalDir(TestResourceLocalizationService.java:1476) > Results : > Failed tests: > TestResourceLocalizationService.testPublicResourceInitializesLocalDir:1476 > Wanted but not invoked: > localFs.mkdir( > > /Users/daniel/NetBeansProjects/hadoop/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/target/org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.TestResourceLocalizationService/0/filecache, > rwxr-xr-x, > true > ); > -> at > org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.TestResourceLocalizationService.testPublicResourceInitializesLocalDir(TestResourceLocalizationService.java:1476) > However, there were other interactions with this mock: > -> at org.apache.hadoop.fs.FileContext.(FileContext.java:249) > -> at org.apache.hadoop.fs.FileContext.makeQualified(FileContext.java:611) > -> at org.apache.hadoop.fs.FileContext.makeQualified(FileContext.java:611) > -> at org.apache.hadoop.fs.FileContext.makeQualified(FileContext.java:611) > -> at org.apache.hadoop.fs.FileContext.makeQualified(FileContext.java:611) > -> at org.apache.hadoop.fs.FileContext.makeQualified(FileContext.java:611) > Tests run: 1, Failures: 1, Errors: 0, Skipped: 0 > {noformat} -- This message was sent by Atlassian JIRA 
(v6.3.4#6332)
[jira] [Commented] (YARN-4737) Use CSRF Filter in YARN
[ https://issues.apache.org/jira/browse/YARN-4737?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15172067#comment-15172067 ] Varun Vasudev commented on YARN-4737: - [~jmaron] - to my knowledge the only web UI that uses the web services call via javascript is the Tez UI. However there is a branch to change the RM UI to use javascript and web services as well. > Use CSRF Filter in YARN > --- > > Key: YARN-4737 > URL: https://issues.apache.org/jira/browse/YARN-4737 > Project: Hadoop YARN > Issue Type: Bug > Components: nodemanager, resourcemanager, webapp >Reporter: Jonathan Maron >Assignee: Jonathan Maron > Attachments: YARN-4737.patch.001 > > > A CSRF filter was added to hadoop common > (https://issues.apache.org/jira/browse/HADOOP-12691). The aim of this JIRA > is to come up with a mechanism to integrate this filter into the webapps for > which it is applicable (web apps that may establish an authenticated > identity). That includes the RM, NM, and mapreduce jobhistory web app. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Comment Edited] (YARN-4737) Use CSRF Filter in YARN
[ https://issues.apache.org/jira/browse/YARN-4737?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15172067#comment-15172067 ] Varun Vasudev edited comment on YARN-4737 at 2/29/16 4:11 PM: -- [~jmaron] - to my knowledge the only web UI that uses the web services call via javascript is the Tez UI. There is a branch to change the RM UI to use javascript and web services as well. However, all of these should be using GET calls only so I suspect they won't be affected by this change. was (Author: vvasudev): [~jmaron] - to my knowledge the only web UI that uses the web services call via javascript is the Tez UI. However there is a branch to change the RM UI to use javascript and web services as well. > Use CSRF Filter in YARN > --- > > Key: YARN-4737 > URL: https://issues.apache.org/jira/browse/YARN-4737 > Project: Hadoop YARN > Issue Type: Bug > Components: nodemanager, resourcemanager, webapp >Reporter: Jonathan Maron >Assignee: Jonathan Maron > Attachments: YARN-4737.patch.001 > > > A CSRF filter was added to hadoop common > (https://issues.apache.org/jira/browse/HADOOP-12691). The aim of this JIRA > is to come up with a mechanism to integrate this filter into the webapps for > which it is applicable (web apps that may establish an authenticated > identity). That includes the RM, NM, and mapreduce jobhistory web app. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3863) Support complex filters in TimelineReader
[ https://issues.apache.org/jira/browse/YARN-3863?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15172011#comment-15172011 ] Varun Saxena commented on YARN-3863: Checkstyle issues are related to imports made due to javadoc. > Support complex filters in TimelineReader > - > > Key: YARN-3863 > URL: https://issues.apache.org/jira/browse/YARN-3863 > Project: Hadoop YARN > Issue Type: Sub-task >Affects Versions: YARN-2928 >Reporter: Varun Saxena >Assignee: Varun Saxena > Labels: yarn-2928-1st-milestone > Attachments: YARN-3863-YARN-2928.v2.01.patch, > YARN-3863-YARN-2928.v2.02.patch, YARN-3863-YARN-2928.v2.03.patch, > YARN-3863-feature-YARN-2928.wip.003.patch, > YARN-3863-feature-YARN-2928.wip.01.patch, > YARN-3863-feature-YARN-2928.wip.02.patch, > YARN-3863-feature-YARN-2928.wip.04.patch, > YARN-3863-feature-YARN-2928.wip.05.patch > > > Currently filters in timeline reader will return an entity only if all the > filter conditions hold true i.e. only AND operation is supported. We can > support OR operation for the filters as well. Additionally as primary backend > implementation is HBase, we can design our filters in a manner, where they > closely resemble HBase Filters. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-4731) container-executor should not follow symlinks in recursive_unlink_children
[ https://issues.apache.org/jira/browse/YARN-4731?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15172016#comment-15172016 ] Hudson commented on YARN-4731: -- SUCCESS: Integrated in Hadoop-trunk-Commit #9392 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/9392/]) YARN-4731. container-executor should not follow symlinks in (jlowe: rev c58a6d53c58209a8f78ff64e04e9112933489fb5) * hadoop-yarn-project/CHANGES.txt * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/native/container-executor/test/test-container-executor.c * hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/native/container-executor/impl/container-executor.c > container-executor should not follow symlinks in recursive_unlink_children > -- > > Key: YARN-4731 > URL: https://issues.apache.org/jira/browse/YARN-4731 > Project: Hadoop YARN > Issue Type: Bug >Affects Versions: 2.9.0 >Reporter: Bibin A Chundatt >Assignee: Colin Patrick McCabe >Priority: Blocker > Fix For: 2.9.0 > > Attachments: YARN-4731.001.patch, YARN-4731.002.patch > > > Enable LCE and CGroups > Submit a mapreduce job > {noformat} > 2016-02-24 18:56:46,889 INFO > org.apache.hadoop.yarn.server.nodemanager.LinuxContainerExecutor: Deleting > absolute path : > /opt/bibin/dsperf/HAINSTALL/nmlocal/usercache/dsperf/appcache/application_1456319010019_0003/container_e02_1456319010019_0003_01_01 > 2016-02-24 18:56:46,894 WARN > org.apache.hadoop.yarn.server.nodemanager.containermanager.linux.privileged.PrivilegedOperationExecutor: > Shell execution returned exit code: 255. Privileged Execution Operation > Output: > main : command provided 3 > main : run as user is dsperf > main : requested yarn user is dsperf > failed to rmdir job.jar: Not a directory > Error while deleting > /opt/bibin/dsperf/HAINSTALL/nmlocal/usercache/dsperf/appcache/application_1456319010019_0003/container_e02_1456319010019_0003_01_01: > 20 (Not a directory) > Full command array for failed execution: > [/opt/bibin/dsperf/HAINSTALL/install/hadoop/nodemanager/bin/container-executor, > dsperf, dsperf, 3, > /opt/bibin/dsperf/HAINSTALL/nmlocal/usercache/dsperf/appcache/application_1456319010019_0003/container_e02_1456319010019_0003_01_01] > 2016-02-24 18:56:46,894 ERROR > org.apache.hadoop.yarn.server.nodemanager.LinuxContainerExecutor: > DeleteAsUser for > /opt/bibin/dsperf/HAINSTALL/nmlocal/usercache/dsperf/appcache/application_1456319010019_0003/container_e02_1456319010019_0003_01_01 > returned with exit code: 255 > org.apache.hadoop.yarn.server.nodemanager.containermanager.linux.privileged.PrivilegedOperationException: > ExitCodeException exitCode=255: > at > org.apache.hadoop.yarn.server.nodemanager.containermanager.linux.privileged.PrivilegedOperationExecutor.executePrivilegedOperation(PrivilegedOperationExecutor.java:173) > at > org.apache.hadoop.yarn.server.nodemanager.containermanager.linux.privileged.PrivilegedOperationExecutor.executePrivilegedOperation(PrivilegedOperationExecutor.java:199) > at > org.apache.hadoop.yarn.server.nodemanager.LinuxContainerExecutor.deleteAsUser(LinuxContainerExecutor.java:569) > at > org.apache.hadoop.yarn.server.nodemanager.DeletionService$FileDeletionTask.run(DeletionService.java:265) > at > java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) > at java.util.concurrent.FutureTask.run(FutureTask.java:266) > at > 
java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180) > at > java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) > at java.lang.Thread.run(Thread.java:745) > Caused by: ExitCodeException exitCode=255: > at org.apache.hadoop.util.Shell.runCommand(Shell.java:927) > at org.apache.hadoop.util.Shell.run(Shell.java:838) > at > org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:1117) > at > org.apache.hadoop.yarn.server.nodemanager.containermanager.linux.privileged.PrivilegedOperationExecutor.executePrivilegedOperation(PrivilegedOperationExecutor.java:150) > ... 10 more > {noformat} > As a result nodemanager-local directory are not getting deleted for each > application > {noformat} > total 36 > drwxr-s--- 4 hdfs hadoop 4096 Feb 25 08:25 ./ > drwxr-s--- 7 hdfs hadoop 4096 Feb 25 08:25 ../ > -rw--- 1 hdfs hadoop 340 Feb 25 08:25 contain
[jira] [Commented] (YARN-3863) Support complex filters in TimelineReader
[ https://issues.apache.org/jira/browse/YARN-3863?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15172005#comment-15172005 ] Varun Saxena commented on YARN-3863: The latest patch addresses the comments given by Sangjin. # I have split TestHBaseTimelineStorage into TestHBaseTimelineStorageApps and TestHBaseTimelineStorageEntities. Also moved the part related to loading apps and entities to a separate class. # Methods to match filters in TimelineStorageUtils have been refactored for better readability and to club common code together. I have passed an enum to decide which filter we are trying to match. Any suggestions for a better name for this enum? # For separation of logic between the Generic and Application entity readers, I have refactored methods which were exclusively in GenericEntityReader but were being used by ApplicationEntityReader. Now the relevant logic will exist in both classes (with EntityColumnPrefix used in GenericEntityReader and ApplicationColumnPrefix used in ApplicationEntityReader). I have tried to move some of the common code used in these methods to utils classes. # I have also moved previously written methods in GenericEntityReader to read relations, events, etc. to TimelineStorageUtils. These methods were being used by ApplicationEntityReader as well, in addition to GenericEntityReader. # Refactored createSingleColValueFiltersByRange() and createHBaseSingleColValueFilter() so that createSingleColValueFiltersByRange can call createHBaseSingleColValueFilter(). # Fixed javadoc-related comments and made members final in the classes pointed out. # Removed the preconditions check for filters not being null. Now if filters are null, I create a TimelineEntityFilters object with default values in augmentParams. # Used == instead of equals to match enums. # Changed the names of TimelineEqualityFilter and TimelineMultiValEqualityFilter to TimelineKeyValueFilter and TimelineKeyValuesFilter respectively. # There was a comment on why getCompoundColQualBytes is being used. I had missed using it in the previous patch. It is to be used for events. If, say, the event to be fetched is UPDATE_APP with info1 as the associated info key, ts as the timestamp, and val1 as the value, then the column is of the form {{e!UPDATE_APP=ts=info1}}. This kind of column is referred to within the code as a compound column. The part after the prefix {{e!}}, i.e. the part with {{=}} as separator, is what we want to construct with getCompoundColQualBytes. So, if we have to match event filters, the qualifier filter will have to be matched against the prefix {{e!UPDATE_APP=}}. That is where the getCompoundColQualBytes bit for event filters comes in (illustrated below). # For the comments on TimelineReaderWebServicesUtils.java, kindly refer to the explanation in the comment where I have detailed what I have done. The changes here are required. I could not quite get the below comment. I did not make any change on line 448. Sangjin, can you elaborate? Maybe you meant some other line. {quote} (HBaseTimelineWriterImpl.java) l.448: it should simply be a else if {quote} cc [~sjlee0], [~djp] As the patch is quite big, refer to comment https://issues.apache.org/jira/browse/YARN-3863?focusedCommentId=15169833&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-15169833 to get details of what has been implemented. 
> Support complex filters in TimelineReader > - > > Key: YARN-3863 > URL: https://issues.apache.org/jira/browse/YARN-3863 > Project: Hadoop YARN > Issue Type: Sub-task >Affects Versions: YARN-2928 >Reporter: Varun Saxena >Assignee: Varun Saxena > Labels: yarn-2928-1st-milestone > Attachments: YARN-3863-YARN-2928.v2.01.patch, > YARN-3863-YARN-2928.v2.02.patch, YARN-3863-YARN-2928.v2.03.patch, > YARN-3863-feature-YARN-2928.wip.003.patch, > YARN-3863-feature-YARN-2928.wip.01.patch, > YARN-3863-feature-YARN-2928.wip.02.patch, > YARN-3863-feature-YARN-2928.wip.04.patch, > YARN-3863-feature-YARN-2928.wip.05.patch > > > Currently filters in timeline reader will return an entity only if all the > filter conditions hold true i.e. only AND operation is supported. We can > support OR operation for the filters as well. Additionally as primary backend > implementation is HBase, we can design our filters in a manner, where they > closely resemble HBase Filters. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
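A self-contained illustration of the compound column qualifier described in the comment above; the {{e!}} prefix and {{=}} separator come from the comment, while the concrete values are made up for the example:
{code}
public class CompoundQualifierSketch {
  public static void main(String[] args) {
    long ts = 1456790400000L; // example event timestamp
    // column qualifier for event UPDATE_APP with info key info1:
    String eventColumn = "e!" + "UPDATE_APP" + "=" + ts + "=" + "info1";
    // an event filter for UPDATE_APP matches the qualifier prefix
    String filterPrefix = "e!" + "UPDATE_APP" + "=";
    System.out.println(eventColumn.startsWith(filterPrefix)); // true
  }
}
{code}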
[jira] [Updated] (YARN-4737) Use CSRF Filter in YARN
[ https://issues.apache.org/jira/browse/YARN-4737?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jonathan Maron updated YARN-4737: - Attachment: YARN-4737.patch.001 The key elements of the uploaded patch: - Provides a CSRF-enabling call to WebApps.Builder, taking the configuration prefix as an argument. - Adds the call to web apps currently capable of an SPNEGO authentication (and thus susceptible to CSRF): the RM, NM, and Job History - Defines the properties associated with configuration of the filter for these given web apps - Tests added based on TestRMWebServices (used the test as an example of client invocations of the RM web endpoint) NOTE: Could use some assistance in ascertaining whether web apps currently have javascript invocations of the exposed REST services. Those calls will fail if CSRF is enabled. > Use CSRF Filter in YARN > --- > > Key: YARN-4737 > URL: https://issues.apache.org/jira/browse/YARN-4737 > Project: Hadoop YARN > Issue Type: Bug > Components: nodemanager, resourcemanager, webapp >Reporter: Jonathan Maron >Assignee: Jonathan Maron > Attachments: YARN-4737.patch.001 > > > A CSRF filter was added to hadoop common > (https://issues.apache.org/jira/browse/HADOOP-12691). The aim of this JIRA > is to come up with a mechanism to integrate this filter into the webapps for > which it is applicable (web apps that may establish an authenticated > identity). That includes the RM, NM, and mapreduce jobhistory web app. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
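For illustration, what an enabled CSRF filter means for a REST client: the header name below is the hadoop-common filter's default, while the RM URL and path are assumptions for the example, not taken from the patch:
{code}
import java.net.HttpURLConnection;
import java.net.URL;

public class CsrfHeaderSketch {
  public static void main(String[] args) throws Exception {
    URL url = new URL(
        "http://rm-host:8088/ws/v1/cluster/apps/application_1_0001/state");
    HttpURLConnection conn = (HttpURLConnection) url.openConnection();
    conn.setRequestMethod("PUT"); // state-changing verbs are the ones checked
    // without this header, the filter rejects the request as a likely CSRF
    conn.setRequestProperty("X-XSRF-HEADER", "csrf");
    System.out.println(conn.getResponseCode());
  }
}
{code}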
[jira] [Commented] (YARN-4731) container-executor should not follow symlinks in recursive_unlink_children
[ https://issues.apache.org/jira/browse/YARN-4731?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15171975#comment-15171975 ] Jason Lowe commented on YARN-4731: -- Thanks for catching the vulnerability, Colin! +1 lgtm. Committing this. > container-executor should not follow symlinks in recursive_unlink_children > -- > > Key: YARN-4731 > URL: https://issues.apache.org/jira/browse/YARN-4731 > Project: Hadoop YARN > Issue Type: Bug >Affects Versions: 2.9.0 >Reporter: Bibin A Chundatt >Assignee: Colin Patrick McCabe >Priority: Blocker > Attachments: YARN-4731.001.patch, YARN-4731.002.patch > > > Enable LCE and CGroups > Submit a mapreduce job > {noformat} > 2016-02-24 18:56:46,889 INFO > org.apache.hadoop.yarn.server.nodemanager.LinuxContainerExecutor: Deleting > absolute path : > /opt/bibin/dsperf/HAINSTALL/nmlocal/usercache/dsperf/appcache/application_1456319010019_0003/container_e02_1456319010019_0003_01_01 > 2016-02-24 18:56:46,894 WARN > org.apache.hadoop.yarn.server.nodemanager.containermanager.linux.privileged.PrivilegedOperationExecutor: > Shell execution returned exit code: 255. Privileged Execution Operation > Output: > main : command provided 3 > main : run as user is dsperf > main : requested yarn user is dsperf > failed to rmdir job.jar: Not a directory > Error while deleting > /opt/bibin/dsperf/HAINSTALL/nmlocal/usercache/dsperf/appcache/application_1456319010019_0003/container_e02_1456319010019_0003_01_01: > 20 (Not a directory) > Full command array for failed execution: > [/opt/bibin/dsperf/HAINSTALL/install/hadoop/nodemanager/bin/container-executor, > dsperf, dsperf, 3, > /opt/bibin/dsperf/HAINSTALL/nmlocal/usercache/dsperf/appcache/application_1456319010019_0003/container_e02_1456319010019_0003_01_01] > 2016-02-24 18:56:46,894 ERROR > org.apache.hadoop.yarn.server.nodemanager.LinuxContainerExecutor: > DeleteAsUser for > /opt/bibin/dsperf/HAINSTALL/nmlocal/usercache/dsperf/appcache/application_1456319010019_0003/container_e02_1456319010019_0003_01_01 > returned with exit code: 255 > org.apache.hadoop.yarn.server.nodemanager.containermanager.linux.privileged.PrivilegedOperationException: > ExitCodeException exitCode=255: > at > org.apache.hadoop.yarn.server.nodemanager.containermanager.linux.privileged.PrivilegedOperationExecutor.executePrivilegedOperation(PrivilegedOperationExecutor.java:173) > at > org.apache.hadoop.yarn.server.nodemanager.containermanager.linux.privileged.PrivilegedOperationExecutor.executePrivilegedOperation(PrivilegedOperationExecutor.java:199) > at > org.apache.hadoop.yarn.server.nodemanager.LinuxContainerExecutor.deleteAsUser(LinuxContainerExecutor.java:569) > at > org.apache.hadoop.yarn.server.nodemanager.DeletionService$FileDeletionTask.run(DeletionService.java:265) > at > java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) > at java.util.concurrent.FutureTask.run(FutureTask.java:266) > at > java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180) > at > java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) > at java.lang.Thread.run(Thread.java:745) > Caused by: ExitCodeException exitCode=255: > at org.apache.hadoop.util.Shell.runCommand(Shell.java:927) > at org.apache.hadoop.util.Shell.run(Shell.java:838) > 
at > org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:1117) > at > org.apache.hadoop.yarn.server.nodemanager.containermanager.linux.privileged.PrivilegedOperationExecutor.executePrivilegedOperation(PrivilegedOperationExecutor.java:150) > ... 10 more > {noformat} > As a result nodemanager-local directory are not getting deleted for each > application > {noformat} > total 36 > drwxr-s--- 4 hdfs hadoop 4096 Feb 25 08:25 ./ > drwxr-s--- 7 hdfs hadoop 4096 Feb 25 08:25 ../ > -rw--- 1 hdfs hadoop 340 Feb 25 08:25 container_tokens > lrwxrwxrwx 1 hdfs hadoop 111 Feb 25 08:25 job.jar -> > /opt/bibin/dsperf/HAINSTALL/nmlocal/usercache/hdfs/appcache/application_1456364845478_0004/filecache/11/job.jar/ > lrwxrwxrwx 1 hdfs hadoop 111 Feb 25 08:25 job.xml -> > /opt/bibin/dsperf/HAINSTALL/nmlocal/usercache/hdfs/appcache/application_1456364845478_0004/filecache/13/job.xml* > drwxr-s--- 2 hdfs hadoop 4096 Feb 25 08:25 jobSubmitDir/ > -rwx-- 1 hdfs hadoop 5348 Feb 25 08:25 launch_container.sh* > drwxr-s--- 2 hdfs hadoop 409
[jira] [Commented] (YARN-4745) TestResourceLocalizationService.testPublicResourceInitializesLocalDir failing in trunk
[ https://issues.apache.org/jira/browse/YARN-4745?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15171951#comment-15171951 ] Daniel Templeton commented on YARN-4745: Also failing in branch-2. > TestResourceLocalizationService.testPublicResourceInitializesLocalDir failing > in trunk > -- > > Key: YARN-4745 > URL: https://issues.apache.org/jira/browse/YARN-4745 > Project: Hadoop YARN > Issue Type: Bug > Components: nodemanager >Affects Versions: 2.9.0 >Reporter: Daniel Templeton > > I am consistently seeing this: > {noformat} > --- > T E S T S > --- > Running > org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.TestResourceLocalizationService > Tests run: 1, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 2.284 sec <<< > FAILURE! - in > org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.TestResourceLocalizationService > testPublicResourceInitializesLocalDir(org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.TestResourceLocalizationService) > Time elapsed: 1.842 sec <<< FAILURE! > org.mockito.exceptions.verification.WantedButNotInvoked: > Wanted but not invoked: > localFs.mkdir( > > /Users/daniel/NetBeansProjects/hadoop/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/target/org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.TestResourceLocalizationService/0/filecache, > rwxr-xr-x, > true > ); > -> at > org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.TestResourceLocalizationService.testPublicResourceInitializesLocalDir(TestResourceLocalizationService.java:1476) > However, there were other interactions with this mock: > -> at org.apache.hadoop.fs.FileContext.(FileContext.java:249) > -> at org.apache.hadoop.fs.FileContext.makeQualified(FileContext.java:611) > -> at org.apache.hadoop.fs.FileContext.makeQualified(FileContext.java:611) > -> at org.apache.hadoop.fs.FileContext.makeQualified(FileContext.java:611) > -> at org.apache.hadoop.fs.FileContext.makeQualified(FileContext.java:611) > -> at org.apache.hadoop.fs.FileContext.makeQualified(FileContext.java:611) > at > org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.TestResourceLocalizationService.testPublicResourceInitializesLocalDir(TestResourceLocalizationService.java:1476) > Results : > Failed tests: > TestResourceLocalizationService.testPublicResourceInitializesLocalDir:1476 > Wanted but not invoked: > localFs.mkdir( > > /Users/daniel/NetBeansProjects/hadoop/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/target/org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.TestResourceLocalizationService/0/filecache, > rwxr-xr-x, > true > ); > -> at > org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.TestResourceLocalizationService.testPublicResourceInitializesLocalDir(TestResourceLocalizationService.java:1476) > However, there were other interactions with this mock: > -> at org.apache.hadoop.fs.FileContext.(FileContext.java:249) > -> at org.apache.hadoop.fs.FileContext.makeQualified(FileContext.java:611) > -> at org.apache.hadoop.fs.FileContext.makeQualified(FileContext.java:611) > -> at org.apache.hadoop.fs.FileContext.makeQualified(FileContext.java:611) > -> at org.apache.hadoop.fs.FileContext.makeQualified(FileContext.java:611) > -> at org.apache.hadoop.fs.FileContext.makeQualified(FileContext.java:611) > Tests run: 1, Failures: 1, Errors: 0, Skipped: 0 > {noformat} -- This message was sent by 
Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (YARN-4745) TestResourceLocalizationService.testPublicResourceInitializesLocalDir failing in trunk
Daniel Templeton created YARN-4745: -- Summary: TestResourceLocalizationService.testPublicResourceInitializesLocalDir failing in trunk Key: YARN-4745 URL: https://issues.apache.org/jira/browse/YARN-4745 Project: Hadoop YARN Issue Type: Bug Components: nodemanager Affects Versions: 2.9.0 Reporter: Daniel Templeton I am consistently seeing this: {noformat} --- T E S T S --- Running org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.TestResourceLocalizationService Tests run: 1, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 2.284 sec <<< FAILURE! - in org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.TestResourceLocalizationService testPublicResourceInitializesLocalDir(org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.TestResourceLocalizationService) Time elapsed: 1.842 sec <<< FAILURE! org.mockito.exceptions.verification.WantedButNotInvoked: Wanted but not invoked: localFs.mkdir( /Users/daniel/NetBeansProjects/hadoop/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/target/org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.TestResourceLocalizationService/0/filecache, rwxr-xr-x, true ); -> at org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.TestResourceLocalizationService.testPublicResourceInitializesLocalDir(TestResourceLocalizationService.java:1476) However, there were other interactions with this mock: -> at org.apache.hadoop.fs.FileContext.(FileContext.java:249) -> at org.apache.hadoop.fs.FileContext.makeQualified(FileContext.java:611) -> at org.apache.hadoop.fs.FileContext.makeQualified(FileContext.java:611) -> at org.apache.hadoop.fs.FileContext.makeQualified(FileContext.java:611) -> at org.apache.hadoop.fs.FileContext.makeQualified(FileContext.java:611) -> at org.apache.hadoop.fs.FileContext.makeQualified(FileContext.java:611) at org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.TestResourceLocalizationService.testPublicResourceInitializesLocalDir(TestResourceLocalizationService.java:1476) Results : Failed tests: TestResourceLocalizationService.testPublicResourceInitializesLocalDir:1476 Wanted but not invoked: localFs.mkdir( /Users/daniel/NetBeansProjects/hadoop/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/target/org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.TestResourceLocalizationService/0/filecache, rwxr-xr-x, true ); -> at org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.TestResourceLocalizationService.testPublicResourceInitializesLocalDir(TestResourceLocalizationService.java:1476) However, there were other interactions with this mock: -> at org.apache.hadoop.fs.FileContext.(FileContext.java:249) -> at org.apache.hadoop.fs.FileContext.makeQualified(FileContext.java:611) -> at org.apache.hadoop.fs.FileContext.makeQualified(FileContext.java:611) -> at org.apache.hadoop.fs.FileContext.makeQualified(FileContext.java:611) -> at org.apache.hadoop.fs.FileContext.makeQualified(FileContext.java:611) -> at org.apache.hadoop.fs.FileContext.makeQualified(FileContext.java:611) Tests run: 1, Failures: 1, Errors: 0, Skipped: 0 {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (YARN-3863) Support complex filters in TimelineReader
[ https://issues.apache.org/jira/browse/YARN-3863?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Varun Saxena updated YARN-3863: --- Attachment: (was: YARN-3863-YARN-2928.v2.03.patch) > Support complex filters in TimelineReader > - > > Key: YARN-3863 > URL: https://issues.apache.org/jira/browse/YARN-3863 > Project: Hadoop YARN > Issue Type: Sub-task >Affects Versions: YARN-2928 >Reporter: Varun Saxena >Assignee: Varun Saxena > Labels: yarn-2928-1st-milestone > Attachments: YARN-3863-YARN-2928.v2.01.patch, > YARN-3863-YARN-2928.v2.02.patch, YARN-3863-YARN-2928.v2.03.patch, > YARN-3863-feature-YARN-2928.wip.003.patch, > YARN-3863-feature-YARN-2928.wip.01.patch, > YARN-3863-feature-YARN-2928.wip.02.patch, > YARN-3863-feature-YARN-2928.wip.04.patch, > YARN-3863-feature-YARN-2928.wip.05.patch > > > Currently filters in timeline reader will return an entity only if all the > filter conditions hold true i.e. only AND operation is supported. We can > support OR operation for the filters as well. Additionally as primary backend > implementation is HBase, we can design our filters in a manner, where they > closely resemble HBase Filters. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-3863) Support complex filters in TimelineReader
[ https://issues.apache.org/jira/browse/YARN-3863?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15171910#comment-15171910 ] Hadoop QA commented on YARN-3863: - | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 20s {color} | {color:blue} Docker mode activated. {color} | | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s {color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s {color} | {color:green} The patch appears to include 8 new or modified test files. {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 8m 52s {color} | {color:green} YARN-2928 passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 18s {color} | {color:green} YARN-2928 passed with JDK v1.8.0_72 {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 19s {color} | {color:green} YARN-2928 passed with JDK v1.7.0_95 {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 16s {color} | {color:green} YARN-2928 passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 26s {color} | {color:green} YARN-2928 passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 16s {color} | {color:green} YARN-2928 passed {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 0m 35s {color} | {color:green} YARN-2928 passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 16s {color} | {color:green} YARN-2928 passed with JDK v1.8.0_72 {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 19s {color} | {color:green} YARN-2928 passed with JDK v1.7.0_95 {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 21s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 15s {color} | {color:green} the patch passed with JDK v1.8.0_72 {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 15s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 17s {color} | {color:green} the patch passed with JDK v1.7.0_95 {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 17s {color} | {color:green} the patch passed {color} | | {color:red}-1{color} | {color:red} checkstyle {color} | {color:red} 0m 13s {color} | {color:red} hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-timelineservice: patch generated 5 new + 2 unchanged - 1 fixed = 7 total (was 3) {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 24s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 12s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s {color} | {color:green} Patch has no whitespace issues. 
{color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 0m 42s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 12s {color} | {color:green} the patch passed with JDK v1.8.0_72 {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 16s {color} | {color:green} the patch passed with JDK v1.7.0_95 {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 3m 53s {color} | {color:green} hadoop-yarn-server-timelineservice in the patch passed with JDK v1.8.0_72. {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 3m 57s {color} | {color:green} hadoop-yarn-server-timelineservice in the patch passed with JDK v1.7.0_95. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 19s {color} | {color:green} Patch does not generate ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black} 24m 8s {color} | {color:black} {color} | \\ \\ || Subsystem || Report/Notes || | Docker | Image:yetus/hadoop:0ca8df7 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12790480/YARN-3863-YARN-2928.v2.03.patch | | JIRA Issue | YARN-3863 | | Optional Tests | asflicense compile javac javadoc mvninstall mvnsite unit findbugs checkstyle | | uname | Linux 8341224cdf9f 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux | | B
[jira] [Commented] (YARN-4715) Add support to read resource types from a config file
[ https://issues.apache.org/jira/browse/YARN-4715?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15171907#comment-15171907 ] Hadoop QA commented on YARN-4715: - | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 15s {color} | {color:blue} Docker mode activated. {color} | | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s {color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s {color} | {color:green} The patch appears to include 9 new or modified test files. {color} | | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 29s {color} | {color:blue} Maven dependency ordering for branch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 6m 31s {color} | {color:green} YARN-3926 passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 46s {color} | {color:green} YARN-3926 passed with JDK v1.8.0_72 {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 2m 7s {color} | {color:green} YARN-3926 passed with JDK v1.7.0_95 {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 35s {color} | {color:green} YARN-3926 passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 0s {color} | {color:green} YARN-3926 passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 25s {color} | {color:green} YARN-3926 passed {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 32s {color} | {color:green} YARN-3926 passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 4s {color} | {color:green} YARN-3926 passed with JDK v1.8.0_72 {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 3m 17s {color} | {color:green} YARN-3926 passed with JDK v1.7.0_95 {color} | | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 10s {color} | {color:blue} Maven dependency ordering for patch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 50s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 49s {color} | {color:green} the patch passed with JDK v1.8.0_72 {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 1m 49s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 2m 9s {color} | {color:green} the patch passed with JDK v1.7.0_95 {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 2m 9s {color} | {color:green} the patch passed {color} | | {color:red}-1{color} | {color:red} checkstyle {color} | {color:red} 0m 33s {color} | {color:red} hadoop-yarn-project/hadoop-yarn: patch generated 3 new + 228 unchanged - 3 fixed = 231 total (was 231) {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 56s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 22s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 
0s {color} | {color:green} Patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} xml {color} | {color:green} 0m 4s {color} | {color:green} The patch has no ill-formed XML file. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 3m 2s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 2s {color} | {color:green} the patch passed with JDK v1.8.0_72 {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 3m 17s {color} | {color:green} the patch passed with JDK v1.7.0_95 {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 0m 22s {color} | {color:green} hadoop-yarn-api in the patch passed with JDK v1.8.0_72. {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 2m 0s {color} | {color:green} hadoop-yarn-common in the patch passed with JDK v1.8.0_72. {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 0m 25s {color} | {color:green} hadoop-yarn-api in the patch passed with JDK v1.7.0_95. {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 2m 13s {color} | {color:green} hadoop-yarn-common in the patch passed with JDK v1.7.0_95. {color} | | {color:green}+1{color} | {color:green} asflicense
[jira] [Updated] (YARN-3863) Support complex filters in TimelineReader
[ https://issues.apache.org/jira/browse/YARN-3863?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Varun Saxena updated YARN-3863: --- Attachment: YARN-3863-YARN-2928.v2.03.patch > Support complex filters in TimelineReader > - > > Key: YARN-3863 > URL: https://issues.apache.org/jira/browse/YARN-3863 > Project: Hadoop YARN > Issue Type: Sub-task >Affects Versions: YARN-2928 >Reporter: Varun Saxena >Assignee: Varun Saxena > Labels: yarn-2928-1st-milestone > Attachments: YARN-3863-YARN-2928.v2.01.patch, > YARN-3863-YARN-2928.v2.02.patch, YARN-3863-YARN-2928.v2.03.patch, > YARN-3863-YARN-2928.v2.03.patch, YARN-3863-feature-YARN-2928.wip.003.patch, > YARN-3863-feature-YARN-2928.wip.01.patch, > YARN-3863-feature-YARN-2928.wip.02.patch, > YARN-3863-feature-YARN-2928.wip.04.patch, > YARN-3863-feature-YARN-2928.wip.05.patch > > > Currently, filters in the timeline reader return an entity only if all the > filter conditions hold true, i.e., only the AND operation is supported. We can > support the OR operation for the filters as well. Additionally, as the primary backend > implementation is HBase, we can design our filters so that they > closely resemble HBase Filters. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
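To make the AND/OR idea above concrete: HBase's FilterList already distinguishes MUST_PASS_ALL (AND) from MUST_PASS_ONE (OR), so complex reader filters can be modelled as nested filter lists. The following is a minimal Java sketch of that mapping only; the column family "i" and qualifiers "status"/"user" are illustrative stand-ins, not the timeline service schema, and this is not the YARN-3863 patch itself.

    import org.apache.hadoop.hbase.client.Scan;
    import org.apache.hadoop.hbase.filter.CompareFilter.CompareOp;
    import org.apache.hadoop.hbase.filter.FilterList;
    import org.apache.hadoop.hbase.filter.SingleColumnValueFilter;
    import org.apache.hadoop.hbase.util.Bytes;

    public final class ComplexFilterSketch {
      public static Scan buildScan() {
        // OR branch: a row matches if either status value is present.
        FilterList orBranch = new FilterList(FilterList.Operator.MUST_PASS_ONE);
        orBranch.addFilter(new SingleColumnValueFilter(
            Bytes.toBytes("i"), Bytes.toBytes("status"),
            CompareOp.EQUAL, Bytes.toBytes("RUNNING")));
        orBranch.addFilter(new SingleColumnValueFilter(
            Bytes.toBytes("i"), Bytes.toBytes("status"),
            CompareOp.EQUAL, Bytes.toBytes("FINISHED")));

        // AND root: both the OR branch and the user condition must hold.
        FilterList andRoot = new FilterList(FilterList.Operator.MUST_PASS_ALL);
        andRoot.addFilter(orBranch);
        andRoot.addFilter(new SingleColumnValueFilter(
            Bytes.toBytes("i"), Bytes.toBytes("user"),
            CompareOp.EQUAL, Bytes.toBytes("varun")));

        Scan scan = new Scan();
        scan.setFilter(andRoot);
        return scan;
      }
    }

Because the lists nest arbitrarily, a reader-side filter tree of ANDs and ORs can be translated node-for-node into FilterList objects and pushed down to the region servers, which is the appeal of keeping the reader filters close to HBase's own filter model.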
[jira] [Updated] (YARN-4715) Add support to read resource types from a config file
[ https://issues.apache.org/jira/browse/YARN-4715?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Varun Vasudev updated YARN-4715: --- Attachment: YARN-4715-YARN-3926.004.patch After some further thought, I think it's just easier to disallow specifying memory and vcores as resource types. Uploaded a new patch with the fix. > Add support to read resource types from a config file > - > > Key: YARN-4715 > URL: https://issues.apache.org/jira/browse/YARN-4715 > Project: Hadoop YARN > Issue Type: Sub-task > Components: nodemanager, resourcemanager >Reporter: Varun Vasudev >Assignee: Varun Vasudev > Attachments: YARN-4715-YARN-3926.001.patch, > YARN-4715-YARN-3926.002.patch, YARN-4715-YARN-3926.003.patch, > YARN-4715-YARN-3926.004.patch > > > This ticket is to add support to allow the RM to read the resource types to > be used for scheduling from a config file. I'll file follow-up tickets to add > similar support in the NM as well as to handle the RM-NM handshake protocol > issues. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
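As a rough illustration of the "disallow memory and vcores" check described above, a config-load-time validation could look like the Java sketch below. This is not the YARN-4715 patch: the property key "yarn.resource-types" and the reserved-name set are assumptions made for the example.

    import java.util.Arrays;
    import java.util.HashSet;
    import java.util.Set;
    import org.apache.hadoop.conf.Configuration;

    public final class ResourceTypesLoader {
      // Names YARN already manages itself; user-defined types may not shadow them.
      // The exact reserved set here is an assumption for illustration.
      private static final Set<String> DISALLOWED =
          new HashSet<>(Arrays.asList("memory", "memory-mb", "vcores"));

      /** Reads comma-separated resource type names and rejects reserved ones. */
      public static Set<String> loadResourceTypes(Configuration conf) {
        Set<String> types = new HashSet<>();
        // "yarn.resource-types" is an assumed property name for this sketch.
        for (String name : conf.getTrimmedStrings("yarn.resource-types")) {
          if (DISALLOWED.contains(name.toLowerCase())) {
            throw new IllegalArgumentException("'" + name
                + "' is a reserved resource type and cannot be redefined");
          }
          types.add(name);
        }
        return types;
      }
    }

Failing fast when the configuration is read keeps an operator from silently redefining the two resources the scheduler already handles, which is presumably why disallowing them is simpler than special-casing them everywhere downstream.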
[jira] [Commented] (YARN-4566) TestMiniYarnClusterNodeUtilization sometimes fails on trunk
[ https://issues.apache.org/jira/browse/YARN-4566?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15171828#comment-15171828 ] Takashi Ohnishi commented on YARN-4566: --- Thank you [~rohithsharma] for reviewing and committing!! > TestMiniYarnClusterNodeUtilization sometimes fails on trunk > --- > > Key: YARN-4566 > URL: https://issues.apache.org/jira/browse/YARN-4566 > Project: Hadoop YARN > Issue Type: Sub-task > Components: test >Reporter: Takashi Ohnishi >Assignee: Takashi Ohnishi > Fix For: 2.9.0 > > Attachments: YARN-4566.1.patch > > > TestMiniYarnClusterNodeUtilization often fails with an NPE. > {code} > testUpdateNodeUtilization(org.apache.hadoop.yarn.server.TestMiniYarnClusterNodeUtilization) > Time elapsed: 3.752 sec <<< ERROR! > java.lang.NullPointerException: null > at > org.apache.hadoop.yarn.server.TestMiniYarnClusterNodeUtilization.verifySimulatedUtilization(TestMiniYarnClusterNodeUtilization.java:217) > at > org.apache.hadoop.yarn.server.TestMiniYarnClusterNodeUtilization.testUpdateNodeUtilization(TestMiniYarnClusterNodeUtilization.java:116) > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
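The stack trace points at an assertion dereferencing a utilization report before the NodeManager's first heartbeat has delivered one. One plausible hardening pattern for such races, shown as a hedged Java sketch (not necessarily what YARN-4566.1.patch does), is to poll until the value is non-null before asserting on it:

    import java.util.concurrent.Callable;
    import java.util.concurrent.TimeoutException;

    public final class WaitForNonNull {
      /**
       * Polls {@code source} until it yields a non-null value or the timeout
       * elapses. Returning the value lets the caller assert on it safely
       * instead of hitting an NPE when the first heartbeat has not landed.
       */
      public static <T> T poll(Callable<T> source, long timeoutMs)
          throws Exception {
        long deadline = System.currentTimeMillis() + timeoutMs;
        while (true) {
          T value = source.call();
          if (value != null) {
            return value;
          }
          if (System.currentTimeMillis() > deadline) {
            throw new TimeoutException("No value within " + timeoutMs + " ms");
          }
          Thread.sleep(100); // give the NM heartbeat time to arrive
        }
      }
    }

A test would wrap whatever accessor currently returns null in a Callable and assert on the polled result, turning a flaky NPE into either a pass or a clear timeout.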
[jira] [Updated] (YARN-4744) Too many signal to container failure in case of LCE
[ https://issues.apache.org/jira/browse/YARN-4744?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bibin A Chundatt updated YARN-4744: --- Affects Version/s: 2.9.0 > Too many signal to container failure in case of LCE > --- > > Key: YARN-4744 > URL: https://issues.apache.org/jira/browse/YARN-4744 > Project: Hadoop YARN > Issue Type: Bug >Affects Versions: 2.9.0 >Reporter: Bibin A Chundatt > > Steps to reproduce: > Enable LCE with cgroups. > Start the server as the dsperf user. > Submit an application as the yarn user. > Too many "signal to container" failures are then logged: > {noformat} > 2014-03-01 14:10:32,223 INFO > org.apache.hadoop.yarn.server.nodemanager.containermanager.linux.runtime.DelegatingLinuxContainerRuntime: > Using container runtime: DefaultLinuxContainerRuntime > 2014-03-01 14:10:32,228 WARN > org.apache.hadoop.yarn.server.nodemanager.containermanager.linux.privileged.PrivilegedOperationExecutor: > Shell execution returned exit code: 9. Privileged Execution Operation Output: > main : command provided 2 > main : run as user is yarn > main : requested yarn user is yarn > Full command array for failed execution: > [/opt/bibin/dsperf/HAINSTALL/install/hadoop/nodemanager/bin/container-executor, > yarn, yarn, 2, 28575, 15] > 2014-03-01 14:10:32,228 WARN > org.apache.hadoop.yarn.server.nodemanager.containermanager.linux.runtime.DefaultLinuxContainerRuntime: > Signal container failed. Exception: > org.apache.hadoop.yarn.server.nodemanager.containermanager.linux.privileged.PrivilegedOperationException: > ExitCodeException exitCode=9: > at > org.apache.hadoop.yarn.server.nodemanager.containermanager.linux.privileged.PrivilegedOperationExecutor.executePrivilegedOperation(PrivilegedOperationExecutor.java:173) > at > org.apache.hadoop.yarn.server.nodemanager.containermanager.linux.runtime.DefaultLinuxContainerRuntime.signalContainer(DefaultLinuxContainerRuntime.java:132) > at > org.apache.hadoop.yarn.server.nodemanager.containermanager.linux.runtime.DelegatingLinuxContainerRuntime.signalContainer(DelegatingLinuxContainerRuntime.java:109) > at > org.apache.hadoop.yarn.server.nodemanager.LinuxContainerExecutor.signalContainer(LinuxContainerExecutor.java:513) > at > org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.cleanupContainer(ContainerLaunch.java:520) > at > org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainersLauncher.handle(ContainersLauncher.java:139) > at > org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainersLauncher.handle(ContainersLauncher.java:55) > at > org.apache.hadoop.yarn.event.AsyncDispatcher.dispatch(AsyncDispatcher.java:184) > at > org.apache.hadoop.yarn.event.AsyncDispatcher$1.run(AsyncDispatcher.java:110) > at java.lang.Thread.run(Thread.java:745) > Caused by: ExitCodeException exitCode=9: > at org.apache.hadoop.util.Shell.runCommand(Shell.java:927) > at org.apache.hadoop.util.Shell.run(Shell.java:838) > at > org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:1117) > at > org.apache.hadoop.yarn.server.nodemanager.containermanager.linux.privileged.PrivilegedOperationExecutor.executePrivilegedOperation(PrivilegedOperationExecutor.java:150) > ... 9 more > {noformat} > Checked the same scenario in version 2.7.2 (not available there). -- This message was sent by Atlassian JIRA (v6.3.4#6332)
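Reading the command array in the trace, container-executor was asked to run operation 2 (signal) delivering signal 15 (SIGTERM) to pid 28575 and exited with code 9, which looks consistent with the signal racing against a container that had already exited during cleanup. Purely as an illustrative Java sketch (not the eventual YARN-4744 fix, and the helper name is hypothetical), a runtime could downgrade such failures when the target pid is already gone:

    import java.io.File;

    public final class SignalHelper {
      /**
       * Returns true if a process with the given pid appears to be alive.
       * Probing /proc/<pid> is a Linux-specific approximation of kill(pid, 0).
       */
      static boolean isAlive(int pid) {
        return new File("/proc/" + pid).isDirectory();
      }

      /** Decide how loudly to report a failed signal attempt. */
      static String classifyFailure(int pid, int exitCode) {
        if (!isAlive(pid)) {
          // The container already exited; the failure is expected and benign.
          return "DEBUG: signal raced with container exit (pid=" + pid + ")";
        }
        return "WARN: signalling live pid " + pid + " failed, exit=" + exitCode;
      }

      public static void main(String[] args) {
        // Demo with the pid and exit code from the log above.
        System.out.println(classifyFailure(28575, 9));
      }
    }

Distinguishing "process already gone" from a genuine privileged-operation failure would keep the NM logs from filling with WARNs for what is, in this race, a harmless outcome.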