[jira] [Updated] (YARN-330) Flakey test: TestNodeManagerShutdown#testKillContainersOnShutdown

2013-01-14 Thread Sandy Ryza (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-330?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sandy Ryza updated YARN-330:


Attachment: YARN-330-1.patch

 Flakey test: TestNodeManagerShutdown#testKillContainersOnShutdown
 -

 Key: YARN-330
 URL: https://issues.apache.org/jira/browse/YARN-330
 Project: Hadoop YARN
  Issue Type: Bug
  Components: nodemanager
Affects Versions: 3.0.0
Reporter: Hitesh Shah
Assignee: Sandy Ryza
 Attachments: 
 org.apache.hadoop.yarn.server.nodemanager.TestNodeManagerShutdown-output.txt, 
 YARN-330-1.patch, YARN-330.patch


 This appears to be timing-related: the RUNNING container status returned by 
 the ContainerManager does not actually indicate that the container task has 
 been launched, so the fixed 5-second sleep is not reliable. 
 Running org.apache.hadoop.yarn.server.nodemanager.TestNodeManagerShutdown
 Tests run: 1, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 9.353 sec  
 FAILURE!
 testKillContainersOnShutdown(org.apache.hadoop.yarn.server.nodemanager.TestNodeManagerShutdown)
   Time elapsed: 9.283 sec   FAILURE!
 junit.framework.AssertionFailedError: Did not find sigterm message
   at junit.framework.Assert.fail(Assert.java:47)
   at junit.framework.Assert.assertTrue(Assert.java:20)
   at 
 org.apache.hadoop.yarn.server.nodemanager.TestNodeManagerShutdown.testKillContainersOnShutdown(TestNodeManagerShutdown.java:162)
 Logs:
 2013-01-09 14:13:08,401 INFO  [AsyncDispatcher event handler] 
 container.Container (ContainerImpl.java:handle(835)) - Container 
 container_0__01_00 transitioned from NEW to LOCALIZING
 2013-01-09 14:13:08,412 INFO  [AsyncDispatcher event handler] 
 localizer.LocalizedResource (LocalizedResource.java:handle(194)) - Resource 
 file:hadoop-common/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/target/org.apache.hadoop.yarn.server.nodemanager.TestNodeManagerShutdown/tmpDir/scriptFile.sh
  transitioned from INIT to DOWNLOADING
 2013-01-09 14:13:08,412 INFO  [AsyncDispatcher event handler] 
 localizer.ResourceLocalizationService 
 (ResourceLocalizationService.java:handle(521)) - Created localizer for 
 container_0__01_00
 2013-01-09 14:13:08,589 INFO  [LocalizerRunner for 
 container_0__01_00] localizer.ResourceLocalizationService 
 (ResourceLocalizationService.java:writeCredentials(895)) - Writing 
 credentials to the nmPrivate file 
 hadoop-common/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/target/org.apache.hadoop.yarn.server.nodemanager.TestNodeManagerShutdown/nm0/nmPrivate/container_0__01_00.tokens.
  Credentials list:
 2013-01-09 14:13:08,628 INFO  [LocalizerRunner for 
 container_0__01_00] nodemanager.DefaultContainerExecutor 
 (DefaultContainerExecutor.java:createUserCacheDirs(373)) - Initializing user 
 nobody
 2013-01-09 14:13:08,709 INFO  [main] containermanager.ContainerManagerImpl 
 (ContainerManagerImpl.java:getContainerStatus(538)) - Returning container_id 
 {, app_attempt_id {, application_id {, id: 0, cluster_timestamp: 0, }, 
 attemptId: 1, }, }, state: C_RUNNING, diagnostics: , exit_status: -1000,
 2013-01-09 14:13:08,781 INFO  [LocalizerRunner for 
 container_0__01_00] nodemanager.DefaultContainerExecutor 
 (DefaultContainerExecutor.java:startLocalizer(99)) - Copying from 
 hadoop-common/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/target/org.apache.hadoop.yarn.server.nodemanager.TestNodeManagerShutdown/nm0/nmPrivate/container_0__01_00.tokens
  to 
 hadoop-common/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/target/org.apache.hadoop.yarn.server.nodemanager.TestNodeManagerShutdown/nm0/usercache/nobody/appcache/application_0_/container_0__01_00.tokens
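
The description above argues that a fixed 5-second sleep is unreliable because a RUNNING status does not guarantee the container task has started. One common alternative is to poll for the expected side effect with a timeout. The sketch below only illustrates that idea; it is not the content of YARN-330.patch, and the class name PollingWait, the method waitFor, and its parameters are hypothetical. It assumes Java 8 or later.

```java
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.function.BooleanSupplier;

// Hypothetical helper, not part of Hadoop: polls a condition until it
// holds or a deadline passes, instead of sleeping for a fixed interval.
final class PollingWait {
    private PollingWait() {}

    // Returns true as soon as the condition holds, false if timeoutMs
    // elapses first or the calling thread is interrupted.
    static boolean waitFor(BooleanSupplier condition, long intervalMs, long timeoutMs) {
        long deadline = System.nanoTime() + timeoutMs * 1_000_000L;
        while (true) {
            if (condition.getAsBoolean()) {
                return true;
            }
            if (System.nanoTime() - deadline >= 0) {
                return false;
            }
            try {
                Thread.sleep(intervalMs);
            } catch (InterruptedException e) {
                // Preserve the interrupt flag for the caller and give up.
                Thread.currentThread().interrupt();
                return false;
            }
        }
    }

    public static void main(String[] args) {
        // Stand-in for "the container script has started"; in a real test
        // this could be a check that the script's output file exists.
        AtomicBoolean launched = new AtomicBoolean(false);
        Thread starter = new Thread(() -> {
            try {
                Thread.sleep(200);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
            launched.set(true);
        });
        starter.start();

        boolean ok = waitFor(launched::get, 20, 5000);
        System.out.println("launched before timeout: " + ok);
    }
}
```

With a helper like this, a test waits only as long as the launch actually takes, and a slow machine produces a clear timeout rather than a misleading "Did not find sigterm message" assertion.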

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (YARN-330) Flakey test: TestNodeManagerShutdown#testKillContainersOnShutdown

2013-01-14 Thread Hitesh Shah (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-330?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hitesh Shah updated YARN-330:
-

Fix Version/s: (was: 3.0.0)
Affects Version/s: (was: 3.0.0)

 Flakey test: TestNodeManagerShutdown#testKillContainersOnShutdown
 -

 Key: YARN-330
 URL: https://issues.apache.org/jira/browse/YARN-330
 Project: Hadoop YARN
  Issue Type: Bug
  Components: nodemanager
Reporter: Hitesh Shah
Assignee: Sandy Ryza
 Fix For: 2.0.3-alpha

 Attachments: 
 org.apache.hadoop.yarn.server.nodemanager.TestNodeManagerShutdown-output.txt, 
 YARN-330-1.patch, YARN-330.patch




[jira] [Updated] (YARN-330) Flakey test: TestNodeManagerShutdown#testKillContainersOnShutdown

2013-01-10 Thread Sandy Ryza (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-330?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sandy Ryza updated YARN-330:


Attachment: YARN-330.patch

 Flakey test: TestNodeManagerShutdown#testKillContainersOnShutdown
 -

 Key: YARN-330
 URL: https://issues.apache.org/jira/browse/YARN-330
 Project: Hadoop YARN
  Issue Type: Bug
  Components: nodemanager
Affects Versions: 3.0.0
Reporter: Hitesh Shah
Assignee: Sandy Ryza
 Attachments: YARN-330.patch




[jira] [Updated] (YARN-330) Flakey test: TestNodeManagerShutdown#testKillContainersOnShutdown

2013-01-10 Thread Chris Nauroth (JIRA)

 [ 
https://issues.apache.org/jira/browse/YARN-330?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chris Nauroth updated YARN-330:
---

Attachment: 
org.apache.hadoop.yarn.server.nodemanager.TestNodeManagerShutdown-output.txt

Yes, I got "Did not find sigterm message". The stack trace is basically the 
same, but with different line numbers. I'm attaching the logs.


 Flakey test: TestNodeManagerShutdown#testKillContainersOnShutdown
 -

 Key: YARN-330
 URL: https://issues.apache.org/jira/browse/YARN-330
 Project: Hadoop YARN
  Issue Type: Bug
  Components: nodemanager
Affects Versions: 3.0.0
Reporter: Hitesh Shah
Assignee: Sandy Ryza
 Attachments: 
 org.apache.hadoop.yarn.server.nodemanager.TestNodeManagerShutdown-output.txt, 
 YARN-330.patch

