Yesha Vora created YARN-8231:
--------------------------------

             Summary: Dshell application fails when one of the docker container gets killed
                 Key: YARN-8231
                 URL: https://issues.apache.org/jira/browse/YARN-8231
             Project: Hadoop YARN
          Issue Type: Bug
          Components: yarn-native-services
            Reporter: Yesha Vora

1) Launch dshell application
{code}
yarn jar hadoop-yarn-applications-distributedshell-*.jar \
  -shell_command "sleep 300" \
  -num_containers 2 \
  -shell_env YARN_CONTAINER_RUNTIME_TYPE=docker \
  -shell_env YARN_CONTAINER_RUNTIME_DOCKER_IMAGE=centos/httpd-24-centos7:latest \
  -keep_containers_across_application_attempts \
  -jar /usr/hdp/current/hadoop-yarn-client/hadoop-yarn-applications-distributedshell-*.jar
{code}
2) Kill container_1524681858728_0012_01_000002

Expected behavior:
Application should start new instance and finish successfully

Actual behavior:
Application Failed as soon as container was killed

{code:title=AM log}
18/04/27 23:05:12 INFO distributedshell.ApplicationMaster: Got response from RM for container ask, completedCnt=1
18/04/27 23:05:12 INFO distributedshell.ApplicationMaster: appattempt_1524681858728_0012_000001 got container status for containerID=container_1524681858728_0012_01_000002, state=COMPLETE, exitStatus=137, diagnostics=[2018-04-27 23:05:09.310]Container killed on request. Exit code is 137
[2018-04-27 23:05:09.331]Container exited with a non-zero exit code 137.
[2018-04-27 23:05:09.332]Killed by external signal
18/04/27 23:08:46 INFO distributedshell.ApplicationMaster: Got response from RM for container ask, completedCnt=1
18/04/27 23:08:46 INFO distributedshell.ApplicationMaster: appattempt_1524681858728_0012_000001 got container status for containerID=container_1524681858728_0012_01_000003, state=COMPLETE, exitStatus=0, diagnostics=
18/04/27 23:08:46 INFO distributedshell.ApplicationMaster: Container completed successfully., containerId=container_1524681858728_0012_01_000003
18/04/27 23:08:46 INFO distributedshell.ApplicationMaster: Application completed. Stopping running containers
18/04/27 23:08:46 INFO distributedshell.ApplicationMaster: Application completed. Signalling finish to RM
18/04/27 23:08:46 INFO distributedshell.ApplicationMaster: Diagnostics., total=2, completed=2, allocated=2, failed=1
18/04/27 23:08:46 INFO impl.AMRMClientImpl: Waiting for application to be successfully unregistered.
{code}