[jira] [Updated] (YARN-330) Flakey test: TestNodeManagerShutdown#testKillContainersOnShutdown
[ https://issues.apache.org/jira/browse/YARN-330?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sandy Ryza updated YARN-330:
    Attachment: YARN-330-1.patch

Flakey test: TestNodeManagerShutdown#testKillContainersOnShutdown

    Key: YARN-330
    URL: https://issues.apache.org/jira/browse/YARN-330
    Project: Hadoop YARN
    Issue Type: Bug
    Components: nodemanager
    Affects Versions: 3.0.0
    Reporter: Hitesh Shah
    Assignee: Sandy Ryza
    Attachments: org.apache.hadoop.yarn.server.nodemanager.TestNodeManagerShutdown-output.txt, YARN-330-1.patch, YARN-330.patch

Seems to be timing related as the container status RUNNING as returned by the ContainerManager does not really indicate that the container task has been launched. Sleep of 5 seconds is not reliable.

Running org.apache.hadoop.yarn.server.nodemanager.TestNodeManagerShutdown
Tests run: 1, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 9.353 sec FAILURE!
testKillContainersOnShutdown(org.apache.hadoop.yarn.server.nodemanager.TestNodeManagerShutdown) Time elapsed: 9.283 sec FAILURE!
junit.framework.AssertionFailedError: Did not find sigterm message
    at junit.framework.Assert.fail(Assert.java:47)
    at junit.framework.Assert.assertTrue(Assert.java:20)
    at org.apache.hadoop.yarn.server.nodemanager.TestNodeManagerShutdown.testKillContainersOnShutdown(TestNodeManagerShutdown.java:162)

Logs:

2013-01-09 14:13:08,401 INFO [AsyncDispatcher event handler] container.Container (ContainerImpl.java:handle(835)) - Container container_0__01_00 transitioned from NEW to LOCALIZING
2013-01-09 14:13:08,412 INFO [AsyncDispatcher event handler] localizer.LocalizedResource (LocalizedResource.java:handle(194)) - Resource file:hadoop-common/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/target/org.apache.hadoop.yarn.server.nodemanager.TestNodeManagerShutdown/tmpDir/scriptFile.sh transitioned from INIT to DOWNLOADING
2013-01-09 14:13:08,412 INFO [AsyncDispatcher event handler] localizer.ResourceLocalizationService (ResourceLocalizationService.java:handle(521)) - Created localizer for container_0__01_00
2013-01-09 14:13:08,589 INFO [LocalizerRunner for container_0__01_00] localizer.ResourceLocalizationService (ResourceLocalizationService.java:writeCredentials(895)) - Writing credentials to the nmPrivate file hadoop-common/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/target/org.apache.hadoop.yarn.server.nodemanager.TestNodeManagerShutdown/nm0/nmPrivate/container_0__01_00.tokens. Credentials list:
2013-01-09 14:13:08,628 INFO [LocalizerRunner for container_0__01_00] nodemanager.DefaultContainerExecutor (DefaultContainerExecutor.java:createUserCacheDirs(373)) - Initializing user nobody
2013-01-09 14:13:08,709 INFO [main] containermanager.ContainerManagerImpl (ContainerManagerImpl.java:getContainerStatus(538)) - Returning container_id {, app_attempt_id {, application_id {, id: 0, cluster_timestamp: 0, }, attemptId: 1, }, }, state: C_RUNNING, diagnostics: , exit_status: -1000,
2013-01-09 14:13:08,781 INFO [LocalizerRunner for container_0__01_00] nodemanager.DefaultContainerExecutor (DefaultContainerExecutor.java:startLocalizer(99)) - Copying from hadoop-common/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/target/org.apache.hadoop.yarn.server.nodemanager.TestNodeManagerShutdown/nm0/nmPrivate/container_0__01_00.tokens to hadoop-common/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/target/org.apache.hadoop.yarn.server.nodemanager.TestNodeManagerShutdown/nm0/usercache/nobody/appcache/application_0_/container_0__01_00.tokens

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
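
The report says a fixed 5-second sleep is unreliable, and the log excerpt is consistent with that: getContainerStatus returns C_RUNNING at 14:13:08,709 while the localizer is still copying tokens at 14:13:08,781. One common alternative to a fixed sleep is to poll for a directly observable condition with a bounded timeout. The sketch below is only an illustration of that idea and is not taken from the attached patches; the class PollingWait, the method waitFor, and the marker file are hypothetical names chosen for this example.

```java
import java.io.File;
import java.util.concurrent.TimeoutException;
import java.util.function.BooleanSupplier;

/** Hypothetical helper; not part of Hadoop or of the YARN-330 patches. */
final class PollingWait {
    private PollingWait() {}

    /**
     * Re-checks the condition every intervalMs milliseconds until it holds.
     * Throws TimeoutException if timeoutMs elapses first.
     */
    static void waitFor(BooleanSupplier condition, long intervalMs, long timeoutMs)
            throws InterruptedException, TimeoutException {
        long deadline = System.nanoTime() + timeoutMs * 1_000_000L;
        while (!condition.getAsBoolean()) {
            if (System.nanoTime() - deadline >= 0) {
                throw new TimeoutException(
                        "Condition not met within " + timeoutMs + " ms");
            }
            Thread.sleep(intervalMs);
        }
    }

    public static void main(String[] args) throws Exception {
        // Assumption for the demo: a marker file stands in for evidence
        // that the container script has actually started running.
        File marker = File.createTempFile("started", ".marker");
        marker.delete();

        // Background thread plays the role of the slow container launch.
        Thread launcher = new Thread(() -> {
            try {
                Thread.sleep(200);
                marker.createNewFile();
            } catch (Exception e) {
                throw new RuntimeException(e);
            }
        });
        launcher.start();

        // Wait for the observable effect rather than sleeping a fixed time.
        waitFor(marker::exists, 20, 5000);
        System.out.println("marker present: " + marker.exists());
        marker.delete();
    }
}
```

In a test like this one, the condition would check for something the launched script itself produces, so the test proceeds as soon as the launch is observable and fails with a clear timeout otherwise.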
[jira] [Updated] (YARN-330) Flakey test: TestNodeManagerShutdown#testKillContainersOnShutdown
[ https://issues.apache.org/jira/browse/YARN-330?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Hitesh Shah updated YARN-330:
    Fix Version/s: (was: 3.0.0)
    Affects Version/s: (was: 3.0.0)

Flakey test: TestNodeManagerShutdown#testKillContainersOnShutdown

    Key: YARN-330
    URL: https://issues.apache.org/jira/browse/YARN-330
    Project: Hadoop YARN
    Issue Type: Bug
    Components: nodemanager
    Reporter: Hitesh Shah
    Assignee: Sandy Ryza
    Fix For: 2.0.3-alpha
    Attachments: org.apache.hadoop.yarn.server.nodemanager.TestNodeManagerShutdown-output.txt, YARN-330-1.patch, YARN-330.patch
[jira] [Updated] (YARN-330) Flakey test: TestNodeManagerShutdown#testKillContainersOnShutdown
[ https://issues.apache.org/jira/browse/YARN-330?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sandy Ryza updated YARN-330:
    Attachment: YARN-330.patch

Flakey test: TestNodeManagerShutdown#testKillContainersOnShutdown

    Key: YARN-330
    URL: https://issues.apache.org/jira/browse/YARN-330
    Project: Hadoop YARN
    Issue Type: Bug
    Components: nodemanager
    Affects Versions: 3.0.0
    Reporter: Hitesh Shah
    Assignee: Sandy Ryza
    Attachments: YARN-330.patch
[jira] [Updated] (YARN-330) Flakey test: TestNodeManagerShutdown#testKillContainersOnShutdown
[ https://issues.apache.org/jira/browse/YARN-330?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Chris Nauroth updated YARN-330:
    Attachment: org.apache.hadoop.yarn.server.nodemanager.TestNodeManagerShutdown-output.txt

Yes, I got "Did not find sigterm message". The stack trace is basically the same, but with different line numbers. I'm attaching the logs.

Flakey test: TestNodeManagerShutdown#testKillContainersOnShutdown

    Key: YARN-330
    URL: https://issues.apache.org/jira/browse/YARN-330
    Project: Hadoop YARN
    Issue Type: Bug
    Components: nodemanager
    Affects Versions: 3.0.0
    Reporter: Hitesh Shah
    Assignee: Sandy Ryza
    Attachments: org.apache.hadoop.yarn.server.nodemanager.TestNodeManagerShutdown-output.txt, YARN-330.patch