[ 
https://issues.apache.org/jira/browse/SLIDER-646?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14209717#comment-14209717
 ] 

Steve Loughran commented on SLIDER-646:
---------------------------------------

The fix is to tell this launch not to wait for any state change, but instead
to just save the app report and return; the test case can then contain all
the startup logic and checks.

While I'm at it: set the attempt count to 1, so there's no risk of a double restart.
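
A minimal sketch of that approach, written in Python rather than the Groovy of the real test. Every name in it (`FakeCluster`, `launch`, `wait_for_terminal_state`) is a hypothetical stand-in, not a Slider or YARN API. The launch step only submits and hands back the first report; the test side then polls for a terminal state, so an AM that fails very quickly is an expected outcome rather than a launch error.

```python
TERMINAL_STATES = {"FINISHED", "FAILED", "KILLED"}


class FakeCluster:
    """Hypothetical cluster that replays a fixed sequence of application states."""

    def __init__(self, states, max_attempts=1):
        self._states = list(states)
        self._index = 0
        # Mirrors "set the attempt count to 1"; stored only for illustration.
        self.max_attempts = max_attempts

    def report(self):
        """Return the current state, then advance (staying on the last state)."""
        state = self._states[self._index]
        if self._index < len(self._states) - 1:
            self._index += 1
        return state


def launch(cluster):
    """Submit and return the first report without waiting for a state change."""
    return cluster.report()


def wait_for_terminal_state(cluster, max_polls=10):
    """Test-side logic: poll until the application reaches a terminal state."""
    for _ in range(max_polls):
        state = cluster.report()
        if state in TERMINAL_STATES:
            return state
    raise TimeoutError("application never reached a terminal state")


# Normal case: the app is still starting when launch returns.
cluster = FakeCluster(["ACCEPTED", "RUNNING", "FAILED"], max_attempts=1)
first = launch(cluster)
final = wait_for_terminal_state(cluster)

# Race case: the AM has already failed by the time launch returns.
fast_cluster = FakeCluster(["FAILED"], max_attempts=1)
fast_first = launch(fast_cluster)
fast_final = wait_for_terminal_state(fast_cluster)
```

In both cases the test ends up observing `FAILED`, which is what a launch-failure test wants to assert; the launch step itself no longer cares how fast the failure arrives.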

> AgentLaunchFailureIT test failing at times
> ------------------------------------------
>
>                 Key: SLIDER-646
>                 URL: https://issues.apache.org/jira/browse/SLIDER-646
>             Project: Slider
>          Issue Type: Bug
>            Reporter: Gour Saha
>            Assignee: Steve Loughran
>
> The Chaos Monkey initial delay should be deterministic. It is currently set to
> 60 seconds, and the subsequent interval is also set to 60 seconds. However,
> AgentLaunchFailureIT fails at times because the AM does not get sufficient
> time to start up. In one failure scenario it was seen to fail within 300 ms
> of Chaos Monkey setup. This test fails about once in every 10 attempts.
> Here is the test output -
> {code}
> ------------------------------------------------------------------------------
> Test set: org.apache.slider.funtest.lifecycle.AgentLaunchFailureIT
> -------------------------------------------------------------------------------
> Tests run: 1, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 43.503 sec <<< FAILURE! - in org.apache.slider.funtest.lifecycle.AgentLaunchFailureIT
> testAgentLaunchFailure(org.apache.slider.funtest.lifecycle.AgentLaunchFailureIT)  Time elapsed: 40.903 sec  <<< FAILURE!
> java.lang.AssertionError: Application Launch Failure, exit code  68
> Chaos monkey triggered launch failure
>         at org.junit.Assert.fail(Assert.java:88)
>         at org.apache.slider.funtest.framework.CommandTestBase.createTemplatedSliderApplication(CommandTestBase.groovy:676)
>         at org.apache.slider.funtest.lifecycle.AgentLaunchFailureIT.testAgentLaunchFailure(AgentLaunchFailureIT.groovy:71)
>         at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>         at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
>         at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>         at java.lang.reflect.Method.invoke(Method.java:606)
>         at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:47)
>         at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
>         at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:44)
>         at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
>         at org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
>         at org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
>         at org.junit.rules.ExternalResource$1.evaluate(ExternalResource.java:48)
>         at org.junit.internal.runners.statements.FailOnTimeout$StatementThread.run(FailOnTimeout.java:74)
> {code}
> Here is the AM log snippet -
> {code}
> 2014-11-12 09:29:34,989 [main] INFO  appmaster.SliderAppMaster 
> (SliderAppMaster.java:createAndRunCluster(764)) - Token YARN_AM_RM_TOKEN
> 2014-11-12 09:29:34,990 [main] INFO  agent.AgentUtils 
> (AgentUtils.java:getApplicationMetainfo(43)) - Reading metainfo at 
> .slider/package/CMD_LOGGER/apache-slider-command-logger.zip
> 2014-11-12 09:29:35,014 [main] INFO  tools.SliderUtils 
> (SliderUtils.java:getApplicationResourceInputStream(1692)) - Reading 
> metainfo.xml of size 1995
> 2014-11-12 09:29:35,096 [main] INFO  agent.AgentUtils 
> (AgentUtils.java:getDefaultConfig(64)) - Reading default config file 
> configuration/cl-site.xml at 
> .slider/package/CMD_LOGGER/apache-slider-command-logger.zip
> 2014-11-12 09:29:35,102 [main] INFO  tools.SliderUtils 
> (SliderUtils.java:getApplicationResourceInputStream(1692)) - Reading 
> configuration/cl-site.xml of size 1270
> 2014-11-12 09:29:35,106 [main] INFO  agent.HeartbeatMonitor 
> (HeartbeatMonitor.java:start(46)) - Starting heartbeat monitor with interval 
> 60000
> 2014-11-12 09:29:35,107 [Thread-36] DEBUG agent.HeartbeatMonitor 
> (HeartbeatMonitor.java:run(65)) - Putting monitor to sleep for 60000 
> milliseconds
> 2014-11-12 09:29:35,181 [main] INFO  state.AppState 
> (AppState.java:buildInstance(502)) - Adding role COMMAND_LOGGER
> 2014-11-12 09:29:35,181 [main] INFO  state.AppState 
> (AppState.java:createDynamicProviderRole(585)) - Role COMMAND_LOGGER assigned 
> priority 1
> 2014-11-12 09:29:35,181 [main] INFO  state.AppState 
> (AppState.java:buildRoleRequirementsFromResources(687)) - Role COMMAND_LOGGER 
> has 0 instances specified
> 2014-11-12 09:29:35,253 [main] DEBUG state.RoleHistory 
> (RoleHistory.java:onBootstrap(370)) - Role history bootstrapped
> 2014-11-12 09:29:35,268 [main] INFO  appmaster.SliderAppMaster 
> (SliderAppMaster.java:maybeStartMonkey(2183)) - Adding Chaos Monkey scheduled 
> every 60 seconds (0 hours -delay 60
> 2014-11-12 09:29:35,269 [main] INFO  appmaster.SliderAppMaster 
> (SliderAppMaster.java:maybeStartMonkey(2195)) - Chaos Monkey has triggered AM 
> Launch failure
> 2014-11-12 09:29:35,269 [main] DEBUG actions.QueueService 
> (QueueService.java:put(85)) - Queueing stop:  exit code = -1, FAILED: Chaos 
> monkey triggered launch failure;
> 2014-11-12 09:29:35,270 [main] DEBUG monkey.ChaosMonkeyService 
> (ChaosMonkeyService.java:addTarget(66)) - Action AM killer not enabled
> 2014-11-12 09:29:35,270 [main] DEBUG monkey.ChaosMonkeyService 
> (ChaosMonkeyService.java:addTarget(66)) - Action Container killer not enabled
> 2014-11-12 09:29:35,270 [main] INFO  appmaster.SliderAppMaster 
> (SliderAppMaster.java:maybeStartMonkey(2222)) - Chaos monkey not started
> 2014-11-12 09:29:35,271 [main] INFO  appmaster.SliderAppMaster 
> (SliderAppMaster.java:createAndRunCluster(879)) - HADOOP_USER_NAME='yarn'
> 2014-11-12 09:29:35,287 [main] INFO  appmaster.SliderAppMaster 
> (SliderAppMaster.java:createAndRunCluster(882)) - Registry service username 
> =yarn
> 2014-11-12 09:29:35,308 [main] DEBUG tools.ConfigHelper 
> (ConfigHelper.java:loadFromResource(511)) - loaded resources from 
> file:/etc/hadoop/conf/yarn-site.xml
> 2014-11-12 09:29:35,342 [main] DEBUG tools.ConfigHelper 
> (ConfigHelper.java:loadFromResource(511)) - loaded resources from 
> file:/etc/hadoop/conf/core-site.xml
> 2014-11-12 09:29:35,375 [main] DEBUG tools.ConfigHelper 
> (ConfigHelper.java:loadFromResource(511)) - loaded resources from 
> file:/etc/hadoop/conf/hdfs-site.xml
> 2014-11-12 09:29:35,437 [main] INFO  appmaster.SliderAppMaster 
> (SliderAppMaster.java:registerServiceInstance(1083)) - Service Record 
> ServiceRecord{description='Slider Application Master'; external endpoints: {{
>   "api" : "http://",
>   "addressType" : "uri",
>   "protocolType" : "webui",
>   "addresses" : [ {
>     "uri" : "http://172.31.11.97:37091"
>   } ]
> }; {
>   "api" : "classpath:org.apache.slider.management",
>   "addressType" : "uri",
>   "protocolType" : "REST",
>   "addresses" : [ {
>     "uri" : "http://172.31.11.97:37091/ws/v1/slider/mgmt"
>   } ]
> }; {
>   "api" : "classpath:org.apache.slider.publisher",
>   "addressType" : "uri",
>   "protocolType" : "REST",
>   "addresses" : [ {
>     "uri" : "http://172.31.11.97:37091/ws/v1/slider/publisher"
>   } ]
> }; {
>   "api" : "classpath:org.apache.slider.registry",
>   "addressType" : "uri",
>   "protocolType" : "REST",
>   "addresses" : [ {
>     "uri" : "http://172.31.11.97:37091/ws/v1/slider/registry"
>   } ]
> }; {
>   "api" : "classpath:org.apache.slider.publisher.configurations",
>   "addressType" : "uri",
>   "protocolType" : "REST",
>   "addresses" : [ {
>     "uri" : "http://172.31.11.97:37091/ws/v1/slider/publisher/slider"
>   } ]
> }; {
>   "api" : "classpath:org.apache.slider.publisher.exports",
>   "addressType" : "uri",
>   "protocolType" : "REST",
>   "addresses" : [ {
>     "uri" : "http://172.31.11.97:37091/ws/v1/slider/publisher/exports"
>   } ]
> }; }; internal endpoints: {{
>   "api" : "classpath:org.apache.slider.agents.secure",
>   "addressType" : "uri",
>   "protocolType" : "REST",
>   "addresses" : [ {
>     "uri" : "https://172.31.11.97:50007/ws/v1/slider/agents"
>   } ]
> }; {
>   "api" : "classpath:org.apache.slider.agents.oneway",
>   "addressType" : "uri",
>   "protocolType" : "REST",
>   "addresses" : [ {
>     "uri" : "https://172.31.11.97:52395/ws/v1/slider/agents"
>   } ]
> }; }, attributes: {"yarn:persistence"="application" 
> "yarn:id"="application_1415782602687_0008" }}
> 2014-11-12 09:29:35,473 [main] INFO  zk.RegistryOperationsService 
> (RegistryOperationsService.java:bind(110)) - Bound at 
> /users/yarn/services/org-apache-slider/test-agent-launchfail : 
> ServiceRecord{description='Slider Application Master'; external endpoints: {{
>   "api" : "http://",
>   "addressType" : "uri",
>   "protocolType" : "webui",
>   "addresses" : [ {
>     "uri" : "http://172.31.11.97:37091"
>   } ]
> }; {
>   "api" : "classpath:org.apache.slider.management",
>   "addressType" : "uri",
>   "protocolType" : "REST",
>   "addresses" : [ {
>     "uri" : "http://172.31.11.97:37091/ws/v1/slider/mgmt"
>   } ]
> }; {
>   "api" : "classpath:org.apache.slider.publisher",
>   "addressType" : "uri",
>   "protocolType" : "REST",
>   "addresses" : [ {
>     "uri" : "http://172.31.11.97:37091/ws/v1/slider/publisher"
>   } ]
> }; {
>   "api" : "classpath:org.apache.slider.registry",
>   "addressType" : "uri",
>   "protocolType" : "REST",
>   "addresses" : [ {
>     "uri" : "http://172.31.11.97:37091/ws/v1/slider/registry"
>   } ]
> }; {
>   "api" : "classpath:org.apache.slider.publisher.configurations",
>   "addressType" : "uri",
>   "protocolType" : "REST",
>   "addresses" : [ {
>     "uri" : "http://172.31.11.97:37091/ws/v1/slider/publisher/slider"
>   } ]
> }; {
>   "api" : "classpath:org.apache.slider.publisher.exports",
>   "addressType" : "uri",
>   "protocolType" : "REST",
>   "addresses" : [ {
>     "uri" : "http://172.31.11.97:37091/ws/v1/slider/publisher/exports"
>   } ]
> }; }; internal endpoints: {{
>   "api" : "classpath:org.apache.slider.agents.secure",
>   "addressType" : "uri",
>   "protocolType" : "REST",
>   "addresses" : [ {
>     "uri" : "https://172.31.11.97:50007/ws/v1/slider/agents"
>   } ]
> }; {
>   "api" : "classpath:org.apache.slider.agents.oneway",
>   "addressType" : "uri",
>   "protocolType" : "REST",
>   "addresses" : [ {
>     "uri" : "https://172.31.11.97:52395/ws/v1/slider/agents"
>   } ]
> }; }, attributes: {"yarn:persistence"="application" 
> "yarn:id"="application_1415782602687_0008" }}
> 2014-11-12 09:29:35,535 [main] INFO  appmaster.SliderAppMaster 
> (SliderAppMaster.java:registerServiceInstance(1085)) - Registered service 
> under /users/yarn/services/org-apache-slider/test-agent-launchfail; absolute 
> path /registry/users/yarn/services/org-apache-slider/test-agent-launchfail
> 2014-11-12 09:29:35,543 [main] INFO  appmaster.SliderAppMaster 
> (SliderAppMaster.java:createAndRunCluster(888)) - RM Webapp address 
> 172.31.11.99:8088
> 2014-11-12 09:29:35,543 [main] INFO  appmaster.SliderAppMaster 
> (SliderAppMaster.java:createAndRunCluster(889)) - slider Webapp address 
> http://172.31.11.97:37091
> 2014-11-12 09:29:35,543 [main] INFO  appmaster.SliderAppMaster 
> (SliderAppMaster.java:createAndRunCluster(892)) - Application Master 
> Initialization Completed
> 2014-11-12 09:29:35,543 [main] INFO  appmaster.SliderAppMaster 
> (SliderAppMaster.java:startQueueProcessing(463)) - Queue Processing started
> 2014-11-12 09:29:35,544 [AmExecutor-005] INFO  actions.QueueService 
> (QueueService.java:run(171)) - QueueService processor started
> 2014-11-12 09:29:35,545 [AmExecutor-006] INFO  actions.QueueExecutor 
> (QueueExecutor.java:run(68)) - Queue Executor run() started
> 2014-11-12 09:29:35,545 [AmExecutor-006] DEBUG actions.QueueExecutor 
> (QueueExecutor.java:run(71)) - Executing stop:  exit code = -1, FAILED: Chaos 
> monkey triggered launch failure;
> 2014-11-12 09:29:35,545 [AmExecutor-006] INFO  appmaster.SliderAppMaster 
> (ActionStopSlider.java:execute(118)) - SliderAppMasterApi.stopCluster: Chaos 
> monkey triggered launch failure
> 2014-11-12 09:29:35,619 [main] DEBUG agent.AgentClientProvider 
> (AgentClientProvider.java:validateInstanceDefinition(117)) - Validating conf 
> {,
> "internal": {
>   "schema" : "http://example.org/specification/v2.0.0",
>   "metadata" : {
>     "create.hadoop.deployed.info" : "(no branch) 
> @ae493e14a5f8a78bd6227e6d377bbef6",
>     "create.application.build.info" : "Slider Core-0.60.0.2.2.0.0-1947 Built 
> against commit# ${buildNumber} on Java 1.7.0_67 by yarn",
>     "create.hadoop.build.info" : "2.6.0.2.2.0.0-1947",
>     "create.time.millis" : "1415784562671",
>     "create.time" : "12 Nov 2014 09:29:22 GMT"
>   },
>   "global" : {
>     "internal.generated.conf.path" : 
> "hdfs://172.31.11.99:8020/user/yarn/.slider/cluster/test-agent-launchfail/generated",
>     "application.name" : "test-agent-launchfail",
>     "slider.cluster.directory.permissions" : "0770",
>     "internal.provider.name" : "agent",
>     "internal.data.dir.path" : 
> "hdfs://172.31.11.99:8020/user/yarn/.slider/cluster/test-agent-launchfail/database",
>     "internal.tmp.dir" : 
> "hdfs://172.31.11.99:8020/user/yarn/.slider/cluster/test-agent-launchfail/tmp",
>     "internal.chaos.monkey.probability.amlaunchfailure" : "10000",
>     "internal.snapshot.conf.path" : 
> "hdfs://172.31.11.99:8020/user/yarn/.slider/cluster/test-agent-launchfail/snapshot",
>     "internal.chaos.monkey.interval.seconds" : "60",
>     "slider.data.directory.permissions" : "0770",
>     "internal.container.failure.shortlife" : "60000",
>     "internal.chaos.monkey.enabled" : "true",
>     "internal.am.tmp.dir" : 
> "hdfs://172.31.11.99:8020/user/yarn/.slider/cluster/test-agent-launchfail/tmp/appmaster",
>     "internal.container.failure.threshold" : "5"
>   },
>   "credentials" : { },
>   "components" : { }
> },
> "resources": {
>   "schema" : "http://example.org/specification/v2.0.0",
>   "metadata" : { },
>   "global" : { },
>   "credentials" : { },
>   "components" : {
>     "slider-appmaster" : {
>       "yarn.memory" : "256",
>       "yarn.vcores" : "1",
>       "yarn.component.instances" : "1"
>     },
>     "COMMAND_LOGGER" : {
>       "yarn.memory" : "128",
>       "yarn.role.priority" : "1",
>       "yarn.component.instances" : "0"
>     }
>   }
> },
> "appConf" :{
>   "schema" : "http://example.org/specification/v2.0.0",
>   "metadata" : { },
>   "global" : {
>     "site.dfs.namenode.kerberos.principal" : "nn/_h...@example.com",
>     "site.fs.default.name" : "hdfs://172.31.11.99:8020",
>     "site.cl-site.pattern.for.test.to.verify" : "verify this pattern",
>     "site.cl-site.logfile.location" : "${AGENT_LOG_ROOT}/operations.log",
>     "zookeeper.hosts" : "172.31.11.100,172.31.11.96,172.31.11.97",
>     "java_home" : "/usr/jdk64/jdk1.7.0_67",
>     "site.global.application_id" : "CommandLogger",
>     "internal.chaos.monkey.probability.amlaunchfailure" : "10000",
>     "site.fs.defaultFS" : "hdfs://172.31.11.99:8020",
>     "env.MALLOC_ARENA_MAX" : "4",
>     "zookeeper.path" : "/services/slider/users/yarn/test-agent-launchfail",
>     "internal.chaos.monkey.interval.seconds" : "60",
>     "internal.chaos.monkey.enabled" : "true",
>     "zookeeper.quorum" : "172.31.11.100,172.31.11.96,172.31.11.97",
>     "site.global.app_root" : "${AGENT_WORK_ROOT}/app/install/command-logger",
>     "application.def" : 
> ".slider/package/CMD_LOGGER/apache-slider-command-logger.zip",
>     "site.cl-site.datetime.format" : "%A, %d. %B %Y %I:%M%p",
>     "site.global.security_enabled" : "false"
>   },
>   "credentials" : { },
>   "components" : {
>     "slider-appmaster" : {
>       "jvm.heapsize" : "256M",
>       "site.dfs.namenode.kerberos.principal" : "nn/_h...@example.com",
>       "site.fs.default.name" : "hdfs://172.31.11.99:8020",
>       "site.cl-site.pattern.for.test.to.verify" : "verify this pattern",
>       "site.cl-site.logfile.location" : "${AGENT_LOG_ROOT}/operations.log",
>       "zookeeper.hosts" : "172.31.11.100,172.31.11.96,172.31.11.97",
>       "java_home" : "/usr/jdk64/jdk1.7.0_67",
>       "site.global.application_id" : "CommandLogger",
>       "internal.chaos.monkey.probability.amlaunchfailure" : "10000",
>       "site.fs.defaultFS" : "hdfs://172.31.11.99:8020",
>       "env.MALLOC_ARENA_MAX" : "4",
>       "zookeeper.path" : "/services/slider/users/yarn/test-agent-launchfail",
>       "internal.chaos.monkey.interval.seconds" : "60",
>       "internal.chaos.monkey.enabled" : "true",
>       "zookeeper.quorum" : "172.31.11.100,172.31.11.96,172.31.11.97",
>       "site.global.app_root" : 
> "${AGENT_WORK_ROOT}/app/install/command-logger",
>       "application.def" : 
> ".slider/package/CMD_LOGGER/apache-slider-command-logger.zip",
>       "site.cl-site.datetime.format" : "%A, %d. %B %Y %I:%M%p",
>       "site.global.security_enabled" : "false"
>     },
>     "COMMAND_LOGGER" : {
>       "site.dfs.namenode.kerberos.principal" : "nn/_h...@example.com",
>       "site.fs.default.name" : "hdfs://172.31.11.99:8020",
>       "site.cl-site.pattern.for.test.to.verify" : "verify this pattern",
>       "site.cl-site.logfile.location" : "${AGENT_LOG_ROOT}/operations.log",
>       "zookeeper.hosts" : "172.31.11.100,172.31.11.96,172.31.11.97",
>       "java_home" : "/usr/jdk64/jdk1.7.0_67",
>       "site.global.application_id" : "CommandLogger",
>       "internal.chaos.monkey.probability.amlaunchfailure" : "10000",
>       "site.fs.defaultFS" : "hdfs://172.31.11.99:8020",
>       "env.MALLOC_ARENA_MAX" : "4",
>       "zookeeper.path" : "/services/slider/users/yarn/test-agent-launchfail",
>       "internal.chaos.monkey.interval.seconds" : "60",
>       "internal.chaos.monkey.enabled" : "true",
>       "zookeeper.quorum" : "172.31.11.100,172.31.11.96,172.31.11.97",
>       "site.global.app_root" : 
> "${AGENT_WORK_ROOT}/app/install/command-logger",
>       "application.def" : 
> ".slider/package/CMD_LOGGER/apache-slider-command-logger.zip",
>       "site.cl-site.datetime.format" : "%A, %d. %B %Y %I:%M%p",
>       "site.global.security_enabled" : "false"
>     }
>   }
> }}
> 2014-11-12 09:29:35,619 [main] INFO  agent.AgentClientProvider 
> (AgentClientProvider.java:validateInstanceDefinition(133)) - Validating app 
> definition .slider/package/CMD_LOGGER/apache-slider-command-logger.zip
> 2014-11-12 09:29:35,620 [main] DEBUG state.AppState 
> (AppState.java:updateResourceDefinitions(649)) - Updating resources to {
>   "schema" : "http://example.org/specification/v2.0.0",
>   "metadata" : { },
>   "global" : { },
>   "credentials" : { },
>   "components" : {
>     "slider-appmaster" : {
>       "yarn.memory" : "256",
>       "yarn.vcores" : "1",
>       "yarn.component.instances" : "1"
>     },
>     "COMMAND_LOGGER" : {
>       "yarn.memory" : "128",
>       "yarn.role.priority" : "1",
>       "yarn.component.instances" : "0"
>     }
>   }
> }
> 2014-11-12 09:29:35,704 [main] INFO  state.AppState 
> (AppState.java:buildRoleRequirementsFromResources(687)) - Role COMMAND_LOGGER 
> has 0 instances specified
> 2014-11-12 09:29:35,704 [main] DEBUG state.AppState 
> (AppState.java:resetFailureCounts(1657)) - Resetting failure count of 
> slider-appmaster; was 0
> 2014-11-12 09:29:35,704 [main] DEBUG state.AppState 
> (AppState.java:resetFailureCounts(1657)) - Resetting failure count of 
> COMMAND_LOGGER; was 0
> 2014-11-12 09:29:35,704 [main] DEBUG appmaster.SliderAppMaster 
> (SliderAppMaster.java:reviewRequestAndReleaseNodes(1524)) - 
> reviewRequestAndReleaseNodes(flexCluster)
> 2014-11-12 09:29:35,705 [main] DEBUG actions.QueueService 
> (QueueService.java:put(85)) - Queueing 
> org.apache.slider.server.appmaster.actions.ReviewAndFlexApplicationSize@27f143e9
>  name='flexCluster', delay=0, attrs=4, sequenceNumber=2}
> 2014-11-12 09:29:35,705 [main] DEBUG appmaster.SliderAppMaster 
> (SliderAppMaster.java:waitForAMCompletionSignal(1243)) - blocking until 
> signalled to terminate
> 2014-11-12 09:29:35,705 [AmExecutor-006] DEBUG actions.QueueExecutor 
> (QueueExecutor.java:run(74)) - Completed stop:  exit code = -1, FAILED: Chaos 
> monkey triggered launch failure;
> 2014-11-12 09:29:35,705 [main] INFO  appmaster.SliderAppMaster 
> (SliderAppMaster.java:finish(1288)) - Triggering shutdown of the AM: stop:  
> exit code = -1, FAILED: Chaos monkey triggered launch failure;
> 2014-11-12 09:29:35,706 [AmExecutor-006] DEBUG actions.QueueExecutor 
> (QueueExecutor.java:run(71)) - Executing 
> org.apache.slider.server.appmaster.actions.ReviewAndFlexApplicationSize@27f143e9
>  name='flexCluster', delay=0, attrs=4, sequenceNumber=2}
> 2014-11-12 09:29:35,706 [main] INFO  appmaster.SliderAppMaster 
> (SliderAppMaster.java:stateChanged(1963)) - Process has exited with exit code 
> 0 mapped to 0 -ignoring
> 2014-11-12 09:29:35,706 [main] DEBUG appmaster.SliderAppMaster 
> (SliderAppMaster.java:finish(1299)) - Stopped forked process: exit code=0
> 2014-11-12 09:29:35,707 [main] INFO  workflow.WorkflowCompositeService 
> (WorkflowCompositeService.java:stateChanged(123)) - Child service completed 
> Service RoleLaunchService in state RoleLaunchService: STOPPED
> 2014-11-12 09:29:35,707 [main] INFO  state.AppState 
> (AppState.java:releaseAllContainers(1843)) - Releasing 0 containers
> 2014-11-12 09:29:35,707 [main] INFO  appmaster.SliderAppMaster 
> (SliderAppMaster.java:finish(1317)) - Application completed. Signalling 
> finish to RM
> 2014-11-12 09:29:35,707 [main] INFO  appmaster.SliderAppMaster 
> (SliderAppMaster.java:finish(1320)) - Unregistering AM status=FAILED 
> message=Chaos monkey triggered launch failure
> 2014-11-12 09:29:35,716 [main] INFO  impl.AMRMClientImpl 
> (AMRMClientImpl.java:unregisterApplicationMaster(383)) - Waiting for 
> application to be successfully unregistered.
> 2014-11-12 09:29:35,818 [main] INFO  appmaster.SliderAppMaster 
> (SliderAppMaster.java:runService(529)) - Exiting AM; final exit code = 0
> 2014-11-12 09:29:35,818 [AmExecutor-006] DEBUG appmaster.SliderAppMaster 
> (SliderAppMaster.java:executeNodeReview(1559)) - in 
> executeNodeReview(flexCluster)
> 2014-11-12 09:29:35,818 [main] DEBUG main.ServiceLauncher 
> (ServiceLauncher.java:launchService(189)) - Service exited with exit code 0
> 2014-11-12 09:29:35,818 [AmExecutor-006] INFO  appmaster.SliderAppMaster 
> (SliderAppMaster.java:executeNodeReview(1561)) - Ignoring node review 
> operation: shutdown in progress
> 2014-11-12 09:29:35,818 [AmExecutor-006] DEBUG state.AppState 
> (AppState.java:reviewRequestAndReleaseNodes(1599)) - in 
> reviewRequestAndReleaseNodes()
> 2014-11-12 09:29:35,818 [AmExecutor-006] INFO  state.AppState 
> (AppState.java:reviewOneRole(1684)) - Reviewing 
> RoleStatus{name='COMMAND_LOGGER', key=1, minimum=0, maximum=1, desired=0, 
> actual=0, requested=0, releasing=0, failed=0, started=0, startFailed=0, 
> completed=0, failureMessage=''} : expected 0
> 2014-11-12 09:29:35,819 [AmExecutor-006] DEBUG state.AppState 
> (AppState.java:checkFailureThreshold(1620)) - Failure count of component: 
> COMMAND_LOGGER: 0, threshold=5
> 2014-11-12 09:29:35,819 [AmExecutor-006] DEBUG actions.QueueExecutor 
> (QueueExecutor.java:run(74)) - Completed 
> org.apache.slider.server.appmaster.actions.ReviewAndFlexApplicationSize@27f143e9
>  name='flexCluster', delay=0, attrs=4, sequenceNumber=2}
> 2014-11-12 09:29:35,821 [main] INFO  util.ExitUtil 
> (ExitUtil.java:terminate(124)) - Exiting with status 0
> 2014-11-12 09:29:35,821 [Shutdown] INFO  mortbay.log (Slf4jLog.java:info(67)) 
> - Shutdown hook executing
> 2014-11-12 09:29:35,822 [Shutdown] INFO  mortbay.log (Slf4jLog.java:info(67)) 
> - Stopped SslSelectChannelConnector@0.0.0.0:50007
> 2014-11-12 09:29:35,825 [Shutdown] INFO  mortbay.log (Slf4jLog.java:info(67)) 
> - Stopped SslSelectChannelConnector@0.0.0.0:52395
> 2014-11-12 09:29:35,829 [Shutdown] INFO  mortbay.log (Slf4jLog.java:info(67)) 
> - Shutdown hook complete
> 2014-11-12 09:29:35,834 [Thread-1] INFO  mortbay.log (Slf4jLog.java:info(67)) 
> - Stopped HttpServer2$SelectChannelConnectorWithSafeStartup@0.0.0.0:0
> 2014-11-12 09:29:36,037 [Thread-1] INFO  ipc.Server (Server.java:stop(2437)) 
> - Stopping server on 40684
> 2014-11-12 09:29:36,039 [IPC Server listener on 40684] INFO  ipc.Server 
> (Server.java:run(706)) - Stopping IPC Server listener on 40684
> 2014-11-12 09:29:36,039 [IPC Server Responder] INFO  ipc.Server 
> (Server.java:run(832)) - Stopping IPC Server Responder
> 2014-11-12 09:29:36,040 [Thread-1] DEBUG actions.QueueService 
> (QueueService.java:schedule(91)) - Scheduling 
> org.apache.slider.server.appmaster.actions.ActionStopQueue@9731632 
> name='serviceStop: Service Action Queue in state Action Queue: STOPPED', 
> delay=0, attrs=0, sequenceNumber=3}
> 2014-11-12 09:29:36,040 [AMRM Callback Handler Thread] INFO  
> impl.AMRMClientAsyncImpl (AMRMClientAsyncImpl.java:run(276)) - Interrupted 
> while waiting for queue
> java.lang.InterruptedException
>       at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.reportInterruptAfterWait(AbstractQueuedSynchronizer.java:2017)
>       at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:2052)
>       at java.util.concurrent.LinkedBlockingQueue.take(LinkedBlockingQueue.java:442)
>       at org.apache.hadoop.yarn.client.api.async.impl.AMRMClientAsyncImpl$CallbackHandlerThread.run(AMRMClientAsyncImpl.java:274)
> 2014-11-12 09:29:36,042 [AmExecutor-005] DEBUG actions.QueueService 
> (QueueService.java:run(176)) - Propagating 
> org.apache.slider.server.appmaster.actions.ActionStopQueue@9731632 
> name='serviceStop: Service Action Queue in state Action Queue: STOPPED', 
> delay=0, attrs=0, sequenceNumber=3}
> 2014-11-12 09:29:36,042 [AmExecutor-005] INFO  actions.QueueService 
> (QueueService.java:run(179)) - QueueService processor terminated
> {code}
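
The "within 300 ms" timing can be checked against the AM log above. A small sketch, assuming Python; the two timestamps are copied from the log entries for `maybeStartMonkey` ("Adding Chaos Monkey scheduled every 60 seconds") and for the `QueueExecutor` "Executing stop" line, and the variable names are only illustrative:

```python
from datetime import datetime

# Timestamp layout used by the AM log: date, time, comma, milliseconds.
FMT = "%Y-%m-%d %H:%M:%S,%f"

monkey_added = datetime.strptime("2014-11-12 09:29:35,268", FMT)
stop_executed = datetime.strptime("2014-11-12 09:29:35,545", FMT)

elapsed_ms = (stop_executed - monkey_added).total_seconds() * 1000
print(f"{elapsed_ms:.0f} ms between Chaos Monkey setup and the stop action")
```

This gives roughly 277 ms, far short of the configured 60-second interval, which is consistent with the launch failure being queued immediately rather than after the delay.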



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
