[jira] [Commented] (MAPREDUCE-5044) Have AM trigger jstack on task attempts that timeout before killing them
[ https://issues.apache.org/jira/browse/MAPREDUCE-5044?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15317398#comment-15317398 ]

Hudson commented on MAPREDUCE-5044:
---

SUCCESS: Integrated in Hadoop-trunk-Commit #9915 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/9915/])
MAPREDUCE-5044. Have AM trigger jstack on task attempts that timeout (mingma: rev 4a1cedc010d3fa1d8ef3f2773ca12acadfee5ba5)
* hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/TestSignalContainer.java
* hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client/src/main/java/org/apache/hadoop/yarn/client/api/YarnClient.java
* hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/proto/applicationclient_protocol.proto
* hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/test/java/org/apache/hadoop/yarn/TestRPC.java
* hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/NodeManager.java
* hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient/src/main/java/org/apache/hadoop/mapred/ResourceMgrDelegate.java
* hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/main/java/org/apache/hadoop/mapreduce/v2/app/launcher/ContainerLauncherImpl.java
* hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/java/org/apache/hadoop/yarn/api/protocolrecords/SignalContainerResponse.java
* hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client/src/main/java/org/apache/hadoop/yarn/client/cli/ApplicationCLI.java
* hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/test/java/org/apache/hadoop/yarn/TestContainerResourceIncreaseRPC.java
* hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/amrmproxy/MockResourceManagerFacade.java
* hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/ContainerManagerImpl.java
* hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/ClientRMService.java
* hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/api/impl/pb/service/ContainerManagementProtocolPBServiceImpl.java
* hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/TestApplicationMasterLauncher.java
* hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/java/org/apache/hadoop/yarn/api/ApplicationClientProtocol.java
* hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/api/impl/pb/client/ApplicationClientProtocolPBClientImpl.java
* hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client/src/test/java/org/apache/hadoop/yarn/client/api/impl/TestYarnClient.java
* hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/api/ContainerManagementProtocolPB.java
* hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/TestAMAuthorization.java
* hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/MockRM.java
* hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/proto/containermanagement_protocol.proto
* hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client/src/main/java/org/apache/hadoop/yarn/client/api/impl/YarnClientImpl.java
* hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/main/java/org/apache/hadoop/mapreduce/v2/app/launcher/ContainerLauncherEvent.java
* hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/main/java/org/apache/hadoop/mapreduce/v2/app/job/impl/TaskAttemptImpl.java
* hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/java/org/apache/hadoop/yarn/api/ContainerManagementProtocol.java
* hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app/src/main/java/org/apache/hadoop/mapred/LocalContainerLauncher.java
* hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/test/java/org/apache/hadoop/yarn/TestContainerLaunchRPC.java
* hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient/src/test/java/org/apache/hadoop/mapreduce/v2/TestMRJobs.java
* hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient/src/test/java/org/apache/hadoop/mapred/TestClientRedirect.java
* hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/api/impl/pb/service/ApplicationClientProtocolPBServiceImpl.java
*
[jira] [Commented] (MAPREDUCE-5044) Have AM trigger jstack on task attempts that timeout before killing them
[ https://issues.apache.org/jira/browse/MAPREDUCE-5044?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15317343#comment-15317343 ]

Ming Ma commented on MAPREDUCE-5044:

I have committed the patch to trunk, branch-2 and branch-2.8. Thank you [~eepayne] and [~jira.shegalov] for the contribution and [~vinodkv] [~jlowe] and [~aw] for the review.

> Have AM trigger jstack on task attempts that timeout before killing them
>
> Key: MAPREDUCE-5044
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-5044
> Project: Hadoop Map/Reduce
> Issue Type: Improvement
> Components: mr-am
> Affects Versions: 2.1.0-beta
> Reporter: Jason Lowe
> Assignee: Eric Payne
> Attachments: MAPREDUCE-5044.008.patch, MAPREDUCE-5044.009.patch, MAPREDUCE-5044.010.patch, MAPREDUCE-5044.011.patch, MAPREDUCE-5044.012.patch, MAPREDUCE-5044.013.patch, MAPREDUCE-5044.v01.patch, MAPREDUCE-5044.v02.patch, MAPREDUCE-5044.v03.patch, MAPREDUCE-5044.v04.patch, MAPREDUCE-5044.v05.patch, MAPREDUCE-5044.v06.patch, MAPREDUCE-5044.v07.local.patch, Screen Shot 2013-11-12 at 1.05.32 PM.png, Screen Shot 2013-11-12 at 1.06.04 PM.png
>
> When an AM expires a task attempt it would be nice if it triggered a jstack output via SIGQUIT before killing the task attempt. This would be invaluable for helping users debug their hung tasks, especially if they do not have shell access to the nodes.

--
This message was sent by Atlassian JIRA (v6.3.4#6332)

-
To unsubscribe, e-mail: mapreduce-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: mapreduce-issues-h...@hadoop.apache.org
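The ordering requested in the issue description (capture a thread dump first, kill the hung attempt second) can be illustrated with a small self-contained Java sketch. This is not the code from the patch: the patch signals a separate task JVM through YARN, whereas the sketch below stays inside one JVM and uses the JDK's ThreadMXBean as a stand-in for the SIGQUIT dump. The class name DumpBeforeKill, the helper names and the thread name "hung-task-attempt" are all hypothetical.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;

class DumpBeforeKill {

    /** Returns a jstack-like listing of every live thread in this JVM. */
    static String threadDump() {
        ThreadMXBean bean = ManagementFactory.getThreadMXBean();
        StringBuilder sb = new StringBuilder();
        for (ThreadInfo info : bean.dumpAllThreads(false, false)) {
            sb.append('"').append(info.getThreadName()).append("\" ")
              .append(info.getThreadState()).append('\n');
            for (StackTraceElement frame : info.getStackTrace()) {
                sb.append("\tat ").append(frame).append('\n');
            }
            sb.append('\n');
        }
        return sb.toString();
    }

    /** Hypothetical stand-in for a hung task: blocks until interrupted. */
    static Thread startHungWorker(String name) {
        Thread t = new Thread(() -> {
            try {
                Thread.sleep(Long.MAX_VALUE);
            } catch (InterruptedException e) {
                // Interrupted: treat this as the task being killed.
            }
        }, name);
        t.setDaemon(true);
        t.start();
        return t;
    }

    /**
     * Hypothetical expiry handler. The dump is taken BEFORE the worker is
     * stopped, so the stacks still show where the worker was stuck.
     */
    static String expire(Thread worker) {
        String dump = threadDump();
        worker.interrupt(); // stand-in for killing the task attempt
        try {
            worker.join(5000);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return dump;
    }

    public static void main(String[] args) {
        Thread worker = startHungWorker("hung-task-attempt");
        System.out.print(expire(worker));
    }
}
```

Reversing the two steps in expire() would lose the useful information, because a thread that has already exited no longer appears in the dump.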
[jira] [Commented] (MAPREDUCE-5044) Have AM trigger jstack on task attempts that timeout before killing them
[ https://issues.apache.org/jira/browse/MAPREDUCE-5044?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15317290#comment-15317290 ]

Eric Payne commented on MAPREDUCE-5044:
---

> The patch doesn't resolve automatically for branch-2 and 2.8. It is straightforward and I will resolve it for those two branches.

[~mingma], I did see that, but I was hoping it was straightforward enough that it didn't need a separate patch. Thanks for doing the extra work for the cherry-pick.
[jira] [Commented] (MAPREDUCE-5044) Have AM trigger jstack on task attempts that timeout before killing them
[ https://issues.apache.org/jira/browse/MAPREDUCE-5044?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15317271#comment-15317271 ]

Ming Ma commented on MAPREDUCE-5044:

+1 on the latest patch. Thanks [~eepayne]. The patch doesn't resolve automatically for branch-2 and 2.8. It is straightforward and I will resolve it for those two branches.
[jira] [Commented] (MAPREDUCE-5044) Have AM trigger jstack on task attempts that timeout before killing them
[ https://issues.apache.org/jira/browse/MAPREDUCE-5044?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15316754#comment-15316754 ]

Eric Payne commented on MAPREDUCE-5044:
---

- The FindBugs warning is not related. It pertains to {{org.apache.hadoop.yarn.api.records.ResourceRequest}} / {{ResourceRequest.java:[line 361]}}, which was not changed by this patch.
- The Checkstyle warnings are as I expected (see my comment above).
- The failed unit tests all pass in my local environment except for {{TestLogsCLI}}, which fails intermittently both with and without this patch, and {{TestYarnClient}}, which fails consistently both with and without the patch.
[jira] [Commented] (MAPREDUCE-5044) Have AM trigger jstack on task attempts that timeout before killing them
[ https://issues.apache.org/jira/browse/MAPREDUCE-5044?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15315159#comment-15315159 ]

Hadoop QA commented on MAPREDUCE-5044:
--

| (x) *-1 overall* |

|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 0m 25s | Docker mode activated. |
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
| +1 | test4tests | 0m 0s | The patch appears to include 14 new or modified test files. |
| 0 | mvndep | 0m 12s | Maven dependency ordering for branch |
| +1 | mvninstall | 6m 42s | trunk passed |
| +1 | compile | 6m 56s | trunk passed |
| +1 | checkstyle | 1m 32s | trunk passed |
| +1 | mvnsite | 3m 6s | trunk passed |
| +1 | mvneclipse | 1m 21s | trunk passed |
| -1 | findbugs | 0m 59s | hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api in trunk has 1 extant Findbugs warnings. |
| +1 | javadoc | 2m 0s | trunk passed |
| 0 | mvndep | 0m 12s | Maven dependency ordering for patch |
| +1 | mvninstall | 2m 28s | the patch passed |
| +1 | compile | 6m 11s | the patch passed |
| +1 | cc | 6m 11s | the patch passed |
| +1 | javac | 6m 11s | the patch passed |
| -1 | checkstyle | 1m 31s | root: The patch generated 4 new + 881 unchanged - 3 fixed = 885 total (was 884) |
| +1 | mvnsite | 2m 59s | the patch passed |
| +1 | mvneclipse | 1m 21s | the patch passed |
| +1 | whitespace | 0m 0s | The patch has no whitespace issues. |
| +1 | findbugs | 5m 32s | the patch passed |
| +1 | javadoc | 1m 59s | the patch passed |
| +1 | unit | 0m 23s | hadoop-yarn-api in the patch passed. |
| +1 | unit | 2m 7s | hadoop-yarn-common in the patch passed. |
| +1 | unit | 10m 48s | hadoop-yarn-server-nodemanager in the patch passed. |
| -1 | unit | 35m 27s | hadoop-yarn-server-resourcemanager in the patch failed. |
| -1 | unit | 67m 45s | hadoop-yarn-client in the patch failed. |
| -1 | unit | 0m 21s | hadoop-mapreduce-client-app in the patch failed. |
| -1 | unit | 0m 12s | hadoop-mapreduce-client-jobclient in the patch failed. |
| +1 | asflicense | 0m 20s | The patch does not generate ASF License warnings. |
| | | 168m 29s | |

|| Reason || Tests ||
| Failed junit tests | hadoop.yarn.server.resourcemanager.TestClientRMTokens |
| | hadoop.yarn.server.resourcemanager.TestAMAuthorization |
|
[jira] [Commented] (MAPREDUCE-5044) Have AM trigger jstack on task attempts that timeout before killing them
[ https://issues.apache.org/jira/browse/MAPREDUCE-5044?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15313187#comment-15313187 ]

Eric Payne commented on MAPREDUCE-5044:
---

Thanks [~aw]. I will look into those warnings.
[jira] [Commented] (MAPREDUCE-5044) Have AM trigger jstack on task attempts that timeout before killing them
[ https://issues.apache.org/jira/browse/MAPREDUCE-5044?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15312896#comment-15312896 ]

Allen Wittenauer commented on MAPREDUCE-5044:
-

Some (more) of those checkstyle errors should be fixed.
[jira] [Commented] (MAPREDUCE-5044) Have AM trigger jstack on task attempts that timeout before killing them
[ https://issues.apache.org/jira/browse/MAPREDUCE-5044?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15312847#comment-15312847 ]

Eric Payne commented on MAPREDUCE-5044:
---

I looked at the unit test failures from the pre-commit build. They all succeed in my local build environment except for TestYarnClient, which fails intermittently in trunk, both with and without this patch.

[~mingma], when you have some time, please have a look at the latest patch.
[jira] [Commented] (MAPREDUCE-5044) Have AM trigger jstack on task attempts that timeout before killing them
[ https://issues.apache.org/jira/browse/MAPREDUCE-5044?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15311670#comment-15311670 ]

Hadoop QA commented on MAPREDUCE-5044:
--

| (x) *-1 overall* |

|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 0m 25s | Docker mode activated. |
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
| +1 | test4tests | 0m 0s | The patch appears to include 14 new or modified test files. |
| 0 | mvndep | 0m 12s | Maven dependency ordering for branch |
| +1 | mvninstall | 7m 44s | trunk passed |
| +1 | compile | 8m 15s | trunk passed |
| +1 | checkstyle | 1m 45s | trunk passed |
| +1 | mvnsite | 3m 15s | trunk passed |
| +1 | mvneclipse | 1m 20s | trunk passed |
| +1 | findbugs | 5m 0s | trunk passed |
| +1 | javadoc | 2m 3s | trunk passed |
| 0 | mvndep | 0m 12s | Maven dependency ordering for patch |
| +1 | mvninstall | 3m 14s | the patch passed |
| +1 | compile | 7m 39s | the patch passed |
| +1 | cc | 7m 39s | the patch passed |
| +1 | javac | 7m 39s | the patch passed |
| -1 | checkstyle | 1m 45s | root: The patch generated 25 new + 879 unchanged - 3 fixed = 904 total (was 882) |
| +1 | mvnsite | 3m 27s | the patch passed |
| +1 | mvneclipse | 1m 20s | the patch passed |
| -1 | whitespace | 0m 0s | The patch 2 line(s) with tabs. |
| +1 | findbugs | 6m 35s | the patch passed |
| +1 | javadoc | 2m 12s | the patch passed |
| +1 | unit | 0m 26s | hadoop-yarn-api in the patch passed. |
| +1 | unit | 2m 8s | hadoop-yarn-common in the patch passed. |
| +1 | unit | 10m 46s | hadoop-yarn-server-nodemanager in the patch passed. |
| -1 | unit | 32m 36s | hadoop-yarn-server-resourcemanager in the patch failed. |
| -1 | unit | 68m 8s | hadoop-yarn-client in the patch failed. |
| -1 | unit | 9m 20s | hadoop-mapreduce-client-app in the patch failed. |
| -1 | unit | 115m 30s | hadoop-mapreduce-client-jobclient in the patch failed. |
| +1 | asflicense | 0m 30s | The patch does not generate ASF License warnings. |
| | | 297m 21s | |

|| Reason || Tests ||
| Failed junit tests | hadoop.yarn.server.resourcemanager.TestAMAuthorization |
| | hadoop.yarn.server.resourcemanager.TestClientRMTokens |
| | hadoop.yarn.client.cli.TestLogsCLI |
| | hadoop.yarn.client.TestGetGroups
[jira] [Commented] (MAPREDUCE-5044) Have AM trigger jstack on task attempts that timeout before killing them
[ https://issues.apache.org/jira/browse/MAPREDUCE-5044?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15304512#comment-15304512 ]

Eric Payne commented on MAPREDUCE-5044:
---

Of the unit tests that failed in the precommit build, all pass for me in my local build environment except for two:
- {{TestMiniMRChildTask}} fails in trunk with or without {{MAPREDUCE-5044.011.patch}}.
- {{TestUberAM}} succeeds in trunk and fails with {{MAPREDUCE-5044.011.patch}}. This is because {{TestUberAM}} extends {{TestMRJobs}}, to which I added the test {{testThreadDumpOnTaskTimeout}}.

{{TestMRJobs#testThreadDumpOnTaskTimeout}} is having issues. I will fix and upload a new patch.
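The reason a test added to a base class also shows up (and can fail) in every subclass can be sketched with plain reflection. This is only an illustration under stated assumptions: BaseJobsTest and UberTest are hypothetical stand-ins for TestMRJobs and TestUberAM, the method names other than testThreadDumpOnTaskTimeout are made up, and the name-prefix lookup is a simplification of how a test runner collects test methods.

```java
import java.lang.reflect.Method;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

class InheritedTests {

    /** Hypothetical stand-in for a base test class such as TestMRJobs. */
    public static class BaseJobsTest {
        public void testSleepJob() { }
        public void testThreadDumpOnTaskTimeout() { } // the newly added test
    }

    /** Hypothetical stand-in for a subclass such as TestUberAM. */
    public static class UberTest extends BaseJobsTest {
        public void testUberSpecific() { }
    }

    /** Lists public methods named test*, including inherited ones, sorted. */
    static List<String> discover(Class<?> testClass) {
        List<String> names = new ArrayList<>();
        // getMethods() returns inherited public methods as well as declared ones.
        for (Method m : testClass.getMethods()) {
            if (m.getName().startsWith("test")) {
                names.add(m.getName());
            }
        }
        Collections.sort(names);
        return names;
    }

    public static void main(String[] args) {
        // The subclass picks up the new base-class test without any change
        // to the subclass itself.
        System.out.println(discover(UberTest.class));
    }
}
```

Because the subclass inherits the new method, a problem in that one test surfaces in both classes, which matches the observation that the subclass passes in trunk and fails only with the patch applied.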
[jira] [Commented] (MAPREDUCE-5044) Have AM trigger jstack on task attempts that timeout before killing them
[ https://issues.apache.org/jira/browse/MAPREDUCE-5044?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15303070#comment-15303070 ]

Gera Shegalov commented on MAPREDUCE-5044:
--

Thanks for picking up this JIRA [~eepayne], assigning it to you!
[jira] [Commented] (MAPREDUCE-5044) Have AM trigger jstack on task attempts that timeout before killing them
[ https://issues.apache.org/jira/browse/MAPREDUCE-5044?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15300910#comment-15300910 ]

Hadoop QA commented on MAPREDUCE-5044:
--

| (x) *-1 overall* |

|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 0m 10s | Docker mode activated. |
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
| +1 | test4tests | 0m 0s | The patch appears to include 14 new or modified test files. |
| 0 | mvndep | 0m 12s | Maven dependency ordering for branch |
| +1 | mvninstall | 6m 25s | trunk passed |
| +1 | compile | 6m 42s | trunk passed |
| +1 | checkstyle | 1m 38s | trunk passed |
| +1 | mvnsite | 3m 20s | trunk passed |
| +1 | mvneclipse | 1m 24s | trunk passed |
| +1 | findbugs | 5m 17s | trunk passed |
| +1 | javadoc | 2m 41s | trunk passed |
| 0 | mvndep | 0m 12s | Maven dependency ordering for patch |
| +1 | mvninstall | 2m 47s | the patch passed |
| +1 | compile | 7m 12s | the patch passed |
| +1 | cc | 7m 12s | the patch passed |
| +1 | javac | 7m 12s | the patch passed |
| -1 | checkstyle | 1m 34s | root: patch generated 22 new + 873 unchanged - 3 fixed = 895 total (was 876) |
| +1 | mvnsite | 3m 20s | the patch passed |
| +1 | mvneclipse | 1m 24s | the patch passed |
| -1 | whitespace | 0m 0s | The patch has 2 line(s) with tabs. |
| +1 | findbugs | 6m 8s | the patch passed |
| -1 | javadoc | 3m 25s | hadoop-yarn-project_hadoop-yarn_hadoop-yarn-api generated 6 new + 5406 unchanged - 0 fixed = 5412 total (was 5406) |
| +1 | javadoc | 2m 37s | the patch passed |
| +1 | unit | 0m 27s | hadoop-yarn-api in the patch passed. |
| +1 | unit | 2m 13s | hadoop-yarn-common in the patch passed. |
| +1 | unit | 11m 14s | hadoop-yarn-server-nodemanager in the patch passed. |
| -1 | unit | 30m 9s | hadoop-yarn-server-resourcemanager in the patch failed. |
| -1 | unit | 68m 7s | hadoop-yarn-client in the patch failed. |
| +1 | unit | 8m 33s | hadoop-mapreduce-client-app in the patch passed. |
| -1 | unit | 127m 42s | hadoop-mapreduce-client-jobclient in the patch failed. |
| +1 | asflicense | 0m 29s | Patch does not generate ASF License warnings. |
| | | 303m 27s | |

|| Reason ||
[jira] [Commented] (MAPREDUCE-5044) Have AM trigger jstack on task attempts that timeout before killing them
[ https://issues.apache.org/jira/browse/MAPREDUCE-5044?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15296554#comment-15296554 ]

Ming Ma commented on MAPREDUCE-5044:

Thanks [~eepayne]. Besides the checkstyle, whitespace and javadoc issues,

* There is some commented-out code left after the function is moved to {{internalSignalToContainer}}.
* Given {{signalContainer}} is renamed to {{signalToContainer}} for ContainerManagementProtocol, maybe better to fix that for ApplicationClientProtocol as well, as long as we agree to include this patch in 2.8.

Otherwise, it looks good overall.

> Have AM trigger jstack on task attempts that timeout before killing them
>
> Key: MAPREDUCE-5044
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-5044
> Project: Hadoop Map/Reduce
> Issue Type: Improvement
> Components: mr-am
> Affects Versions: 2.1.0-beta
> Reporter: Jason Lowe
> Assignee: Gera Shegalov
> Attachments: MAPREDUCE-5044.008.patch, MAPREDUCE-5044.009.patch,
> MAPREDUCE-5044.010.patch, MAPREDUCE-5044.v01.patch, MAPREDUCE-5044.v02.patch,
> MAPREDUCE-5044.v03.patch, MAPREDUCE-5044.v04.patch, MAPREDUCE-5044.v05.patch,
> MAPREDUCE-5044.v06.patch, MAPREDUCE-5044.v07.local.patch, Screen Shot
> 2013-11-12 at 1.05.32 PM.png, Screen Shot 2013-11-12 at 1.06.04 PM.png
>
> When an AM expires a task attempt it would be nice if it triggered a jstack
> output via SIGQUIT before killing the task attempt. This would be invaluable
> for helping users debug their hung tasks, especially if they do not have
> shell access to the nodes.

--
This message was sent by Atlassian JIRA (v6.3.4#6332)

To unsubscribe, e-mail: mapreduce-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: mapreduce-issues-h...@hadoop.apache.org
[jira] [Commented] (MAPREDUCE-5044) Have AM trigger jstack on task attempts that timeout before killing them
[ https://issues.apache.org/jira/browse/MAPREDUCE-5044?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15295703#comment-15295703 ]

Hadoop QA commented on MAPREDUCE-5044:

| (x) *-1 overall* |

|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 0m 16s | Docker mode activated. |
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
| +1 | test4tests | 0m 0s | The patch appears to include 9 new or modified test files. |
| 0 | mvndep | 0m 31s | Maven dependency ordering for branch |
| +1 | mvninstall | 6m 31s | trunk passed |
| +1 | compile | 6m 36s | trunk passed |
| +1 | checkstyle | 1m 26s | trunk passed |
| +1 | mvnsite | 2m 45s | trunk passed |
| +1 | mvneclipse | 1m 8s | trunk passed |
| +1 | findbugs | 4m 26s | trunk passed |
| +1 | javadoc | 2m 16s | trunk passed |
| 0 | mvndep | 0m 11s | Maven dependency ordering for patch |
| +1 | mvninstall | 2m 18s | the patch passed |
| +1 | compile | 6m 32s | the patch passed |
| +1 | cc | 6m 32s | the patch passed |
| +1 | javac | 6m 32s | the patch passed |
| -1 | checkstyle | 1m 26s | root: patch generated 21 new + 537 unchanged - 0 fixed = 558 total (was 537) |
| +1 | mvnsite | 2m 42s | the patch passed |
| +1 | mvneclipse | 1m 8s | the patch passed |
| -1 | whitespace | 0m 0s | The patch has 3 line(s) with tabs. |
| +1 | findbugs | 5m 13s | the patch passed |
| -1 | javadoc | 3m 0s | hadoop-yarn-project_hadoop-yarn_hadoop-yarn-api generated 6 new + 5406 unchanged - 0 fixed = 5412 total (was 5406) |
| +1 | javadoc | 2m 15s | the patch passed |
| +1 | unit | 0m 23s | hadoop-yarn-api in the patch passed. |
| +1 | unit | 2m 8s | hadoop-yarn-common in the patch passed. |
| +1 | unit | 11m 13s | hadoop-yarn-server-nodemanager in the patch passed. |
| -1 | unit | 29m 29s | hadoop-yarn-server-resourcemanager in the patch failed. |
| +1 | unit | 8m 16s | hadoop-mapreduce-client-app in the patch passed. |
| -1 | unit | 100m 11s | hadoop-mapreduce-client-jobclient in the patch failed. |
| +1 | asflicense | 0m 34s | Patch does not generate ASF License warnings. |
| | | 201m 15s | |

|| Reason || Tests ||
| Failed junit tests | hadoop.yarn.server.resourcemanager.TestAMAuthorization |
| | hadoop.yarn.server.resourcemanager.TestClientRMTokens |
[jira] [Commented] (MAPREDUCE-5044) Have AM trigger jstack on task attempts that timeout before killing them
[ https://issues.apache.org/jira/browse/MAPREDUCE-5044?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15295174#comment-15295174 ]

Hadoop QA commented on MAPREDUCE-5044:

| (x) *-1 overall* |

|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 0m 0s | Docker mode activated. |
| -1 | docker | 3m 16s | Docker failed to build yetus/hadoop:2c91fd8. |

|| Subsystem || Report/Notes ||
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12805487/MAPREDUCE-5044.010.patch |
| JIRA Issue | MAPREDUCE-5044 |
| Console output | https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/6523/console |
| Powered by | Apache Yetus 0.2.0 http://yetus.apache.org |

This message was automatically generated.
[jira] [Commented] (MAPREDUCE-5044) Have AM trigger jstack on task attempts that timeout before killing them
[ https://issues.apache.org/jira/browse/MAPREDUCE-5044?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15292038#comment-15292038 ]

Ming Ma commented on MAPREDUCE-5044:

bq. In that case, do we want to call it something like signalsToContainers?

Sounds good. signalsToContainers can take an array of {{SignalContainerRequest}}, each of which has a list of commands belonging to the same container. When we decide to add signalsToContainers later, we can deprecate signalToContainer, and the NM will still support signalToContainer until the next major upgrade. In that way, we don't need to fix the {{required}} issue, given that only the new signalsToContainers method will use the list-based {{SignalContainerRequest}}.
[jira] [Commented] (MAPREDUCE-5044) Have AM trigger jstack on task attempts that timeout before killing them
[ https://issues.apache.org/jira/browse/MAPREDUCE-5044?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15291977#comment-15291977 ]

Eric Payne commented on MAPREDUCE-5044:

[~mingma], thank you for your reply and explanation.

{quote}
- signalContainers was initially suggested as an ordered list of signalContainer. So it could include requests from the same container or requests from different containers. It is true that the only use case we know of so far is to include requests from the same container.
{quote}
In that case, do we want to call it something like {{signalsToContainers}}? I'm open to ideas.

{quote}
- Will the required in the protocol buffer definition create any issue if we do rolling upgrade from 2.8 to 2.9 and the 2.9 MR AM might send a list of SignalContainerCommandProto to 2.8 NM? Maybe 2.8 NM just discards the message, not a big deal. Regardless, that is a separate issue that we don't need to address it here.
{quote}
Yes, this is a concern and something we need to look into more deeply and keep in mind.
[jira] [Commented] (MAPREDUCE-5044) Have AM trigger jstack on task attempts that timeout before killing them
[ https://issues.apache.org/jira/browse/MAPREDUCE-5044?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15290193#comment-15290193 ]

Ming Ma commented on MAPREDUCE-5044:

[~eepayne] I agree with your suggestion. Let us postpone it to a later time.

* {{signalContainers}} was initially suggested as an ordered list of {{signalContainer}}. So it could include requests from the same container or requests from different containers. It is true that the only use case we know of so far is to include requests from the same container.
* We also discussed introducing other commands besides Linux signals, for example a sleep command used to pause between signals. In that way, the new API could be just like
{noformat}
public static SignalContainerRequest newInstance(ContainerId containerId,
    Iterable signals) {
  ...
}
{noformat}
* Will the {{required}} in the protocol buffer definition create any issue if we do a rolling upgrade from 2.8 to 2.9 and the 2.9 MR AM sends a list of SignalContainerCommandProto to a 2.8 NM? Maybe the 2.8 NM just discards the message, which is not a big deal. Regardless, that is a separate issue that we don't need to address here.
{noformat}
message SignalContainerRequestProto {
  required SignalContainerCommandProto command = 2;
}
{noformat}
[jira] [Commented] (MAPREDUCE-5044) Have AM trigger jstack on task attempts that timeout before killing them
[ https://issues.apache.org/jira/browse/MAPREDUCE-5044?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15289975#comment-15289975 ]

Eric Payne commented on MAPREDUCE-5044:

[~mingma], thank you very much for the comments. I have one question:

{quote}
- ... it might be useful to rename signalContainer to signalContainers so that we don't need to modify the API later, which means some new structure like SignalContainersRequest. What is your take?
{quote}
I would rather not rename {{signalContainer}} to {{signalContainers}} because {{signalContainers}} sounds to me like the purpose is to send one signal to multiple containers rather than to send multiple signals to one container. Calling it {{signalsContainer}} (plural {{signals}}) also sounds awkward. So, I think {{signalContainer}} is the best option.

Regarding {{SignalContainerRequest}}, if we want the {{signalContainer}} API to be fully compatible with sending multiple signals, I think {{SignalContainerRequest}} would need to add an interface for {{SignalContainerRequest#newInstance}} that included both pause and a list of signals. Maybe something like this:
{code}
public static SignalContainerRequest newInstance(ContainerId containerId,
    int pause, Iterable signals) {
  ...
}
{code}
I think it would be best to add that interface to {{SignalContainerRequest}} in the future when we are ready to implement the rest of the "sending multiple signals" feature.

Thoughts?
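As a rough illustration of the interface proposed in this comment, here is a minimal, self-contained Java sketch. It is not the YARN API: {{MultiSignalRequestSketch}}, {{SketchSignal}}, the {{String}} container id, and the millisecond unit for {{pause}} are all assumptions made for the example, and the element type of the {{Iterable}} is a guess because the comment leaves it unspecified.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical stand-in for the signals a request could carry (not the YARN enum).
enum SketchSignal { SIGQUIT, SIGTERM, SIGKILL }

// Hypothetical request type: one container, one pause, an ordered list of signals.
final class MultiSignalRequestSketch {
  private final String containerId;          // stand-in for ContainerId
  private final int pauseMillis;             // assumed unit: milliseconds between signals
  private final List<SketchSignal> signals;  // delivery order is list order

  private MultiSignalRequestSketch(String containerId, int pauseMillis,
      List<SketchSignal> signals) {
    this.containerId = containerId;
    this.pauseMillis = pauseMillis;
    this.signals = signals;
  }

  // Mirrors the shape newInstance(containerId, pause, signals) from the comment.
  static MultiSignalRequestSketch newInstance(String containerId, int pause,
      Iterable<SketchSignal> signals) {
    if (pause < 0) {
      throw new IllegalArgumentException("pause must be non-negative");
    }
    // Copy into a list so the request keeps its own ordered, immutable view.
    List<SketchSignal> ordered = new ArrayList<>();
    for (SketchSignal s : signals) {
      ordered.add(s);
    }
    return new MultiSignalRequestSketch(containerId, pause,
        Collections.unmodifiableList(ordered));
  }

  String getContainerId() { return containerId; }
  int getPauseMillis() { return pauseMillis; }
  List<SketchSignal> getSignals() { return signals; }
}
```

Under these assumptions, a thread dump followed by a kill would be requested with {{newInstance("container_01", 500, Arrays.asList(SketchSignal.SIGQUIT, SketchSignal.SIGKILL))}}. A single pause value applies between every pair of signals, which is the main difference from a design that carries sleep as its own command in the list.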
[jira] [Commented] (MAPREDUCE-5044) Have AM trigger jstack on task attempts that timeout before killing them
[ https://issues.apache.org/jira/browse/MAPREDUCE-5044?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15285894#comment-15285894 ]

Ming Ma commented on MAPREDUCE-5044:

[~eepayne], my apologies for the delay.

* There was some discussion about combining signalContainer and stopContainers so that stopContainer is just a special case for signalContainer. And to support the "SIGTERM + delay + SIGKILL" used in stopContainers, we then need an ordered list of commands, thus the need for signalContainers. We don't need to deal with that at this point. But it might be useful to rename signalContainer to signalContainers so that we don't need to modify the API later, which means some new structure like {{SignalContainersRequest}}. What is your take?
* ContainerManagerImpl. It might be cleaner to abstract the common signal container code to a function used for both {{AM -> NM}} and {{RM -> NM}} cases.
* TaskAttemptImpl#PreemptedTransition. Given it is called only when the attempt is preempted, {{event.getType() == TaskAttemptEventType.TA_TIMED_OUT}} can be replaced by {{false}}.
* It will be useful to add an end-to-end new unit test, which can be found in Gera's original patch.
* Nit: ContainerLauncherImpl. Return value of {{getContainerManagementProtocol().signalContainer}} isn't used and can be removed.
* Nit: ContainerLauncherEvent has indent format issue.
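The idea in the first bullet, treating a stop as an ordered list of commands, can be sketched roughly as follows. This is only an illustration under stated assumptions: {{ContainerCommandSketch}} and {{OrderedCommandsSketch}} are hypothetical names, signals are plain strings, and the code builds and describes the sequence without sending any signal or sleeping.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical command type: either "send this signal" or "wait this long".
final class ContainerCommandSketch {
  enum Kind { SIGNAL, SLEEP }

  private final Kind kind;
  private final String signalName;  // null for SLEEP commands
  private final long sleepMillis;   // 0 for SIGNAL commands

  private ContainerCommandSketch(Kind kind, String signalName, long sleepMillis) {
    this.kind = kind;
    this.signalName = signalName;
    this.sleepMillis = sleepMillis;
  }

  static ContainerCommandSketch signal(String signalName) {
    return new ContainerCommandSketch(Kind.SIGNAL, signalName, 0L);
  }

  static ContainerCommandSketch sleep(long millis) {
    return new ContainerCommandSketch(Kind.SLEEP, null, millis);
  }

  // Human-readable form of the command, used here instead of executing it.
  String describe() {
    return kind == Kind.SIGNAL
        ? "signal " + signalName
        : "sleep " + sleepMillis + "ms";
  }
}

final class OrderedCommandsSketch {
  // "SIGTERM + delay + SIGKILL" expressed as one ordered command list.
  static List<ContainerCommandSketch> stopSequence(long delayMillis) {
    return Arrays.asList(
        ContainerCommandSketch.signal("SIGTERM"),
        ContainerCommandSketch.sleep(delayMillis),
        ContainerCommandSketch.signal("SIGKILL"));
  }

  // Lists the steps in the order a node-side executor would apply them.
  static List<String> plan(List<ContainerCommandSketch> commands) {
    List<String> steps = new ArrayList<>();
    for (ContainerCommandSketch command : commands) {
      steps.add(command.describe());
    }
    return steps;
  }
}
```

With this shape, the timeout case from the issue description would simply be a different list, for example a {{SIGQUIT}} command placed ahead of the stop sequence, which is why an ordered list makes stop a special case of signal.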
[jira] [Commented] (MAPREDUCE-5044) Have AM trigger jstack on task attempts that timeout before killing them
[ https://issues.apache.org/jira/browse/MAPREDUCE-5044?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15285610#comment-15285610 ]

Hadoop QA commented on MAPREDUCE-5044:

| (x) *-1 overall* |

|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 0m 12s | Docker mode activated. |
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
| +1 | test4tests | 0m 0s | The patch appears to include 8 new or modified test files. |
| 0 | mvndep | 0m 43s | Maven dependency ordering for branch |
| +1 | mvninstall | 7m 40s | trunk passed |
| +1 | compile | 7m 10s | trunk passed with JDK v1.8.0_91 |
| +1 | compile | 7m 20s | trunk passed with JDK v1.7.0_95 |
| +1 | checkstyle | 1m 38s | trunk passed |
| +1 | mvnsite | 2m 44s | trunk passed |
| +1 | mvneclipse | 1m 12s | trunk passed |
| -1 | findbugs | 1m 14s | hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common in trunk has 1 extant Findbugs warnings. |
| +1 | javadoc | 2m 16s | trunk passed with JDK v1.8.0_91 |
| +1 | javadoc | 4m 48s | trunk passed with JDK v1.7.0_95 |
| 0 | mvndep | 0m 16s | Maven dependency ordering for patch |
| +1 | mvninstall | 2m 20s | the patch passed |
| +1 | compile | 7m 9s | the patch passed with JDK v1.8.0_91 |
| -1 | cc | 9m 46s | root-jdk1.8.0_91 with JDK v1.8.0_91 generated 1 new + 10 unchanged - 1 fixed = 11 total (was 11) |
| +1 | cc | 7m 9s | the patch passed |
| +1 | javac | 7m 9s | the patch passed |
| +1 | compile | 7m 45s | the patch passed with JDK v1.7.0_95 |
| +1 | cc | 7m 45s | the patch passed |
| +1 | javac | 7m 45s | the patch passed |
| -1 | checkstyle | 1m 35s | root: patch generated 2 new + 496 unchanged - 0 fixed = 498 total (was 496) |
| +1 | mvnsite | 2m 43s | the patch passed |
| +1 | mvneclipse | 1m 11s | the patch passed |
| +1 | whitespace | 0m 0s | Patch has no whitespace issues. |
| +1 | findbugs | 6m 2s | the patch passed |
| -1 | javadoc | 3m 42s | hadoop-yarn-project_hadoop-yarn_hadoop-yarn-api-jdk1.8.0_91 with JDK v1.8.0_91 generated 6 new + 5406 unchanged - 0 fixed = 5412 total (was 5406) |
| +1 | javadoc | 2m 31s | the patch passed with JDK v1.8.0_91 |
| +1 | javadoc | 4m 29s | the patch passed with JDK v1.7.0_95 |
| +1 | unit | 0m 25s | hadoop-yarn-api in the patch passed with JDK v1.8.0_91. |
[jira] [Commented] (MAPREDUCE-5044) Have AM trigger jstack on task attempts that timeout before killing them
[ https://issues.apache.org/jira/browse/MAPREDUCE-5044?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15246947#comment-15246947 ]

Hadoop QA commented on MAPREDUCE-5044:

| (x) *-1 overall* |

|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 13m 50s | Docker mode activated. |
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
| +1 | test4tests | 0m 0s | The patch appears to include 8 new or modified test files. |
| 0 | mvndep | 0m 49s | Maven dependency ordering for branch |
| +1 | mvninstall | 8m 36s | trunk passed |
| +1 | compile | 9m 13s | trunk passed with JDK v1.8.0_77 |
| +1 | compile | 8m 49s | trunk passed with JDK v1.7.0_95 |
| +1 | checkstyle | 1m 23s | trunk passed |
| +1 | mvnsite | 3m 1s | trunk passed |
| +1 | mvneclipse | 1m 21s | trunk passed |
| +1 | findbugs | 6m 10s | trunk passed |
| +1 | javadoc | 2m 46s | trunk passed with JDK v1.8.0_77 |
| +1 | javadoc | 5m 26s | trunk passed with JDK v1.7.0_95 |
| 0 | mvndep | 0m 17s | Maven dependency ordering for patch |
| +1 | mvninstall | 2m 41s | the patch passed |
| +1 | compile | 9m 20s | the patch passed with JDK v1.8.0_77 |
| +1 | cc | 9m 20s | the patch passed |
| +1 | javac | 9m 20s | the patch passed |
| +1 | compile | 8m 52s | the patch passed with JDK v1.7.0_95 |
| +1 | cc | 8m 52s | the patch passed |
| +1 | javac | 8m 52s | the patch passed |
| -1 | checkstyle | 1m 22s | root: patch generated 1 new + 324 unchanged - 0 fixed = 325 total (was 324) |
| +1 | mvnsite | 3m 4s | the patch passed |
| +1 | mvneclipse | 1m 19s | the patch passed |
| +1 | whitespace | 0m 0s | Patch has no whitespace issues. |
| +1 | findbugs | 7m 25s | the patch passed |
| +1 | javadoc | 2m 47s | the patch passed with JDK v1.8.0_77 |
| +1 | javadoc | 5m 27s | the patch passed with JDK v1.7.0_95 |
| +1 | unit | 0m 31s | hadoop-yarn-api in the patch passed with JDK v1.8.0_77. |
| +1 | unit | 2m 33s | hadoop-yarn-common in the patch passed with JDK v1.8.0_77. |
| +1 | unit | 10m 23s | hadoop-yarn-server-nodemanager in the patch passed with JDK v1.8.0_77. |
| -1 | unit | 81m 11s | hadoop-yarn-server-resourcemanager in the patch failed with
[jira] [Commented] (MAPREDUCE-5044) Have AM trigger jstack on task attempts that timeout before killing them
[ https://issues.apache.org/jira/browse/MAPREDUCE-5044?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15198447#comment-15198447 ]

Eric Payne commented on MAPREDUCE-5044:

[~mingma], [~xgong], [~jlowe], [~jira.shegalov], did you have a chance to look at this patch? I would really appreciate some feedback.

> Have AM trigger jstack on task attempts that timeout before killing them
>
> Key: MAPREDUCE-5044
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-5044
> Project: Hadoop Map/Reduce
> Issue Type: Improvement
> Components: mr-am
> Affects Versions: 2.1.0-beta
> Reporter: Jason Lowe
> Assignee: Gera Shegalov
> Attachments: MAPREDUCE-5044.v01.patch, MAPREDUCE-5044.v02.patch, MAPREDUCE-5044.v03.patch, MAPREDUCE-5044.v04.patch, MAPREDUCE-5044.v05.patch, MAPREDUCE-5044.v06.patch, MAPREDUCE-5044.v07.local.patch, Screen Shot 2013-11-12 at 1.05.32 PM.png, Screen Shot 2013-11-12 at 1.06.04 PM.png
>
> When an AM expires a task attempt it would be nice if it triggered a jstack output via SIGQUIT before killing the task attempt. This would be invaluable for helping users debug their hung tasks, especially if they do not have shell access to the nodes.

--
This message was sent by Atlassian JIRA (v6.3.4#6332)
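For readers unfamiliar with the output the issue description refers to: a HotSpot JVM that receives SIGQUIT prints a stack trace for every live thread to its standard output. The sketch below is not part of any patch attached to this issue. It builds a comparable per-thread listing in-process with the standard `java.lang.management` API, only to show the kind of information a hung task's dump would carry. The class name `ThreadDumpSketch` is hypothetical.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;

// Hypothetical illustration; not code from MAPREDUCE-5044.
class ThreadDumpSketch {
  /** Builds a jstack-like text listing of all live threads in this JVM. */
  static String dumpAllThreads() {
    ThreadMXBean mx = ManagementFactory.getThreadMXBean();
    StringBuilder sb = new StringBuilder();
    // false, false: skip monitor and synchronizer details, which not every JVM supports.
    for (ThreadInfo info : mx.dumpAllThreads(false, false)) {
      if (info == null) {
        continue; // defensive: skip a thread that ended while the dump was taken
      }
      sb.append('"').append(info.getThreadName()).append("\" ")
        .append(info.getThreadState()).append('\n');
      for (StackTraceElement frame : info.getStackTrace()) {
        sb.append("    at ").append(frame).append('\n');
      }
      sb.append('\n');
    }
    return sb.toString();
  }

  public static void main(String[] args) {
    System.out.print(dumpAllThreads());
  }
}
```

In a dump of a hung task, the frames of the task's main thread usually point at the blocking call, which is why the dump is useful even without shell access to the node.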
[jira] [Commented] (MAPREDUCE-5044) Have AM trigger jstack on task attempts that timeout before killing them
[ https://issues.apache.org/jira/browse/MAPREDUCE-5044?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15156240#comment-15156240 ]

Hadoop QA commented on MAPREDUCE-5044:

| (x) *-1 overall* |

|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 0m 13s | Docker mode activated. |
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
| +1 | test4tests | 0m 0s | The patch appears to include 8 new or modified test files. |
| 0 | mvndep | 1m 35s | Maven dependency ordering for branch |
| +1 | mvninstall | 7m 7s | trunk passed |
| +1 | compile | 6m 46s | trunk passed with JDK v1.8.0_72 |
| +1 | compile | 6m 54s | trunk passed with JDK v1.7.0_95 |
| +1 | checkstyle | 1m 7s | trunk passed |
| +1 | mvnsite | 2m 35s | trunk passed |
| +1 | mvneclipse | 1m 8s | trunk passed |
| +1 | findbugs | 5m 8s | trunk passed |
| +1 | javadoc | 2m 5s | trunk passed with JDK v1.8.0_72 |
| +1 | javadoc | 4m 31s | trunk passed with JDK v1.7.0_95 |
| 0 | mvndep | 0m 14s | Maven dependency ordering for patch |
| +1 | mvninstall | 2m 9s | the patch passed |
| +1 | compile | 6m 5s | the patch passed with JDK v1.8.0_72 |
| +1 | cc | 6m 5s | the patch passed |
| +1 | javac | 6m 5s | the patch passed |
| +1 | compile | 6m 44s | the patch passed with JDK v1.7.0_95 |
| +1 | cc | 6m 44s | the patch passed |
| +1 | javac | 6m 44s | the patch passed |
| -1 | checkstyle | 1m 9s | root: patch generated 1 new + 330 unchanged - 0 fixed = 331 total (was 330) |
| +1 | mvnsite | 2m 29s | the patch passed |
| +1 | mvneclipse | 1m 7s | the patch passed |
| +1 | whitespace | 0m 0s | Patch has no whitespace issues. |
| +1 | findbugs | 6m 1s | the patch passed |
| +1 | javadoc | 2m 0s | the patch passed with JDK v1.8.0_72 |
| +1 | javadoc | 4m 33s | the patch passed with JDK v1.7.0_95 |
| +1 | unit | 0m 22s | hadoop-yarn-api in the patch passed with JDK v1.8.0_72. |
| +1 | unit | 1m 55s | hadoop-yarn-common in the patch passed with JDK v1.8.0_72. |
| +1 | unit | 8m 48s | hadoop-yarn-server-nodemanager in the patch passed with JDK v1.8.0_72. |
| -1 | unit | 70m 52s | hadoop-yarn-server-resourcemanager in the patch failed with JDK
[jira] [Commented] (MAPREDUCE-5044) Have AM trigger jstack on task attempts that timeout before killing them
[ https://issues.apache.org/jira/browse/MAPREDUCE-5044?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15154866#comment-15154866 ] Eric Payne commented on MAPREDUCE-5044: --- Thanks, [~jira.shegalov]. Would it be okay if I upmerged {{MAPREDUCE-5044.v06.patch}} and integrated it with the {{SignalContainerRequest}} that was added as part of YARN-445 and its children?
[jira] [Commented] (MAPREDUCE-5044) Have AM trigger jstack on task attempts that timeout before killing them
[ https://issues.apache.org/jira/browse/MAPREDUCE-5044?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15144582#comment-15144582 ] Eric Payne commented on MAPREDUCE-5044: --- Hi [~jira.shegalov]. I would like to see this functionality implemented. We occasionally see containers time out, and it would be good if users could have direct feedback in the form of a jstack to help them debug their applications. I have been coming up to speed on the work that's already been committed in this area under YARN-445 and its children. IIUC, YARN-445 and its children put in place the infrastructure for a {{Client -> RM -> NM -> Container}} signal path. On the other hand, this JIRA (along with YARN-1515) implements an {{AM -> NM -> Container}} signal path and the ability to send multiple signals per call. It seems that these pieces could possibly be split into separate JIRAs. Either way, I think that a lot of what has been done in this JIRA could be used to add the interface to {{ContainerManagementProtocol}} that would allow the AM to prompt the NM to signal the container to dump its stack prior to killing the container on a timeout. Is there a possibility that this JIRA will move forward? Ideally, we would like it all ported back to 2.7. Please let me know if there's anything I can do. 
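The behaviour requested over the {{AM -> NM -> Container}} path can be sketched as a two-step sequence: ask for a thread dump, then stop the container. The following is a minimal, self-contained model of that ordering, not the YARN API. `NodeManagerChannel`, `TimeoutHandler`, and `RecordingNodeManager` are hypothetical names, and container IDs are plain strings here for brevity.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for whatever channel the AM uses to reach the NM.
interface NodeManagerChannel {
  void requestThreadDump(String containerId); // assumed to end in a SIGQUIT to the task JVM
  void stopContainer(String containerId);
}

// Hypothetical AM-side handler for an expired task attempt.
class TimeoutHandler {
  /** On timeout: optionally request a thread dump first, then always stop the container. */
  static void onTaskTimeout(NodeManagerChannel nm, String containerId, boolean dumpThreads) {
    if (dumpThreads) {
      nm.requestThreadDump(containerId);
    }
    nm.stopContainer(containerId);
  }
}

// Test double that records the order of calls instead of contacting a real NM.
class RecordingNodeManager implements NodeManagerChannel {
  final List<String> calls = new ArrayList<>();

  @Override
  public void requestThreadDump(String containerId) {
    calls.add("dump:" + containerId);
  }

  @Override
  public void stopContainer(String containerId) {
    calls.add("stop:" + containerId);
  }
}

class DumpThenKillDemo {
  public static void main(String[] args) {
    RecordingNodeManager nm = new RecordingNodeManager();
    TimeoutHandler.onTaskTimeout(nm, "container_1", true);
    System.out.println(nm.calls);
  }
}
```

The ordering is the point of the sketch: the dump request has to reach the container before the stop does, otherwise there is no live JVM left to print its stacks.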
[jira] [Commented] (MAPREDUCE-5044) Have AM trigger jstack on task attempts that timeout before killing them
[ https://issues.apache.org/jira/browse/MAPREDUCE-5044?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15145291#comment-15145291 ] Gera Shegalov commented on MAPREDUCE-5044: -- Hi [~eepayne], I am glad that we are finally picking it up. These thread dumps, along with xprof, have made debugging at Twitter so easy. Often you see what's wrong even without looking at the user code. Unfortunately, organizationally, Hadoop is not my current focus (it's been 2+ years since I posted my patch). I am sure somebody from @TwitterHadoop will help move it along.
[jira] [Commented] (MAPREDUCE-5044) Have AM trigger jstack on task attempts that timeout before killing them
[ https://issues.apache.org/jira/browse/MAPREDUCE-5044?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14307791#comment-14307791 ] Hadoop QA commented on MAPREDUCE-5044: -- {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12645521/MAPREDUCE-5044.v06.patch against trunk revision c4980a2. {color:red}-1 patch{color}. The patch command could not apply the patch. Console output: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/5164//console This message is automatically generated.
[jira] [Commented] (MAPREDUCE-5044) Have AM trigger jstack on task attempts that timeout before killing them
[ https://issues.apache.org/jira/browse/MAPREDUCE-5044?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14051097#comment-14051097 ] Hadoop QA commented on MAPREDUCE-5044: -- {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12645521/MAPREDUCE-5044.v06.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 1 new or modified test files. {color:red}-1 javac{color:red}. The patch appears to cause the build to fail. Console output: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/4706//console This message is automatically generated.
[jira] [Commented] (MAPREDUCE-5044) Have AM trigger jstack on task attempts that timeout before killing them
[ https://issues.apache.org/jira/browse/MAPREDUCE-5044?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13993851#comment-13993851 ] Jason Lowe commented on MAPREDUCE-5044: --- Patch looks good, I just have a minor comment. Rather than add dumpThreads to ContainerLauncherEvent there should be a ContainerRemoteCleanupEvent to hold the fields specific to the cleanup event, just like there's a ContainerRemoteLaunchEvent to hold the fields specific to the remote launch event.
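The review suggestion above can be sketched as a small class hierarchy in which the cleanup-specific flag lives on a dedicated subclass rather than on the shared base event. The classes below are simplified, hypothetical stand-ins that only borrow the names from the comment; the real MR AM classes carry more fields and are not reproduced here. `CleanupEventDemo` and its `describe` helper are invented for illustration.

```java
// Simplified, hypothetical stand-in for the shared base event.
class ContainerLauncherEvent {
  enum EventType { CONTAINER_REMOTE_LAUNCH, CONTAINER_REMOTE_CLEANUP }

  private final String containerId;
  private final EventType type;

  ContainerLauncherEvent(String containerId, EventType type) {
    this.containerId = containerId;
    this.type = type;
  }

  String getContainerId() { return containerId; }

  EventType getType() { return type; }
}

// Holds the field that only matters for cleanup, as the review comment proposes.
class ContainerRemoteCleanupEvent extends ContainerLauncherEvent {
  private final boolean dumpThreads;

  ContainerRemoteCleanupEvent(String containerId, boolean dumpThreads) {
    super(containerId, EventType.CONTAINER_REMOTE_CLEANUP);
    this.dumpThreads = dumpThreads;
  }

  boolean getDumpThreads() { return dumpThreads; }
}

class CleanupEventDemo {
  /** Shows how a handler could read the flag only when the event is a cleanup event. */
  static String describe(ContainerLauncherEvent event) {
    if (event instanceof ContainerRemoteCleanupEvent) {
      boolean dump = ((ContainerRemoteCleanupEvent) event).getDumpThreads();
      return event.getContainerId() + (dump ? ": dump threads, then clean up" : ": clean up");
    }
    return event.getContainerId() + ": " + event.getType();
  }

  public static void main(String[] args) {
    System.out.println(describe(new ContainerRemoteCleanupEvent("container_1", true)));
  }
}
```

Keeping `dumpThreads` off the base class means launch events never carry a flag that has no meaning for them, which mirrors the existing split the comment points to for the remote launch event.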
[jira] [Commented] (MAPREDUCE-5044) Have AM trigger jstack on task attempts that timeout before killing them
[ https://issues.apache.org/jira/browse/MAPREDUCE-5044?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13994400#comment-13994400 ] Gera Shegalov commented on MAPREDUCE-5044: -- Thanks for reviewing, Jason. ContainerRemoteCleanupEvent remark belongs to YARN-1515 as well, and is addressed there.
[jira] [Commented] (MAPREDUCE-5044) Have AM trigger jstack on task attempts that timeout before killing them
[ https://issues.apache.org/jira/browse/MAPREDUCE-5044?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13955652#comment-13955652 ] Ming Ma commented on MAPREDUCE-5044: This is quite useful. Can we get this and YARN-1515 in 2.4.0 release?
[jira] [Commented] (MAPREDUCE-5044) Have AM trigger jstack on task attempts that timeout before killing them
[ https://issues.apache.org/jira/browse/MAPREDUCE-5044?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13910771#comment-13910771 ] Hadoop QA commented on MAPREDUCE-5044: -- {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12630788/MAPREDUCE-5044.v04.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 1 new or modified test files. {color:red}-1 javac{color:red}. The patch appears to cause the build to fail. Console output: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/4367//console This message is automatically generated.
[jira] [Commented] (MAPREDUCE-5044) Have AM trigger jstack on task attempts that timeout before killing them
[ https://issues.apache.org/jira/browse/MAPREDUCE-5044?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13852666#comment-13852666 ] Gera Shegalov commented on MAPREDUCE-5044: -- Hi [~vinodkv], thanks for chiming in. Please review MAPREDUCE-5044 and YARN-1515.
[jira] [Commented] (MAPREDUCE-5044) Have AM trigger jstack on task attempts that timeout before killing them
[ https://issues.apache.org/jira/browse/MAPREDUCE-5044?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13852669#comment-13852669 ] Hadoop QA commented on MAPREDUCE-5044: -- {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12619497/MAPREDUCE-5044.v03.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:red}-1 tests included{color}. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color:red}-1 javac{color:red}. The patch appears to cause the build to fail. Console output: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/4269//console This message is automatically generated.
[jira] [Commented] (MAPREDUCE-5044) Have AM trigger jstack on task attempts that timeout before killing them
[ https://issues.apache.org/jira/browse/MAPREDUCE-5044?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13850969#comment-13850969 ] Gera Shegalov commented on MAPREDUCE-5044: -- Our patch does not depend on YARN-445. In the specific scenario of a task timeout there is no need for an extra RPC.
[jira] [Commented] (MAPREDUCE-5044) Have AM trigger jstack on task attempts that timeout before killing them
[ https://issues.apache.org/jira/browse/MAPREDUCE-5044?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13826880#comment-13826880 ] Hadoop QA commented on MAPREDUCE-5044: -- {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12614676/Screen%20Shot%202013-11-12%20at%201.06.04%20PM.png against trunk revision . {color:red}-1 patch{color}. The patch command could not apply the patch. Console output: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/4215//console This message is automatically generated.
[jira] [Commented] (MAPREDUCE-5044) Have AM trigger jstack on task attempts that timeout before killing them
[ https://issues.apache.org/jira/browse/MAPREDUCE-5044?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13826967#comment-13826967 ] Hadoop QA commented on MAPREDUCE-5044: -- {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12614688/MAPREDUCE-5044.v01.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 9 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. The javadoc tool did not generate any warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:red}-1 core tests{color}. The patch failed these unit tests in hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-app hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager: org.apache.hadoop.mapreduce.v2.app.TestRMContainerAllocator {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/4216//testReport/ Console output: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/4216//console This message is automatically generated. 
[jira] [Commented] (MAPREDUCE-5044) Have AM trigger jstack on task attempts that timeout before killing them
[ https://issues.apache.org/jira/browse/MAPREDUCE-5044?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13827179#comment-13827179 ] Hadoop QA commented on MAPREDUCE-5044: -- {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12614748/Screen%20Shot%202013-11-12%20at%201.06.04%20PM.png against trunk revision . {color:red}-1 patch{color}. The patch command could not apply the patch. Console output: https://builds.apache.org/job/PreCommit-MAPREDUCE-Build/4218//console This message is automatically generated.