Yesha Vora created YARN-8901:
--------------------------------

Summary: Restart "NEVER" policy does not work with component dependency
Key: YARN-8901
URL: https://issues.apache.org/jira/browse/YARN-8901
Project: Hadoop YARN
Issue Type: Bug
Reporter: Yesha Vora

Scenario:
1) Launch an application with two components, master and worker, where worker depends on master. (Worker should be launched only after master is launched.)
2) Set restart_policy = NEVER for both master and worker.

{code:title=sample launch.json}
{
  "name": "mawo-hadoop-ut",
  "artifact": {
    "type": "DOCKER",
    "id": "xxx"
  },
  "configuration": {
    "env": {
      "YARN_CONTAINER_RUNTIME_DOCKER_CONTAINER_NETWORK": "hadoop"
    },
    "properties": {
      "docker.network": "hadoop"
    }
  },
  "components": [{
    "dependencies": [],
    "resource": {
      "memory": "2048",
      "cpus": "1"
    },
    "name": "master",
    "run_privileged_container": true,
    "number_of_containers": 1,
    "launch_command": "start master",
    "restart_policy": "NEVER"
  }, {
    "dependencies": ["master"],
    "resource": {
      "memory": "8072",
      "cpus": "1"
    },
    "name": "worker",
    "run_privileged_container": true,
    "number_of_containers": 10,
    "launch_command": "start worker",
    "restart_policy": "NEVER"
  }],
  "lifetime": -1,
  "version": 1.0
}
{code}

When the restart policy is set to NEVER, the AM never launches the worker component. It gets stuck with the message below.

{code}
2018-10-17 15:11:58,560 [Component dispatcher] INFO component.Component - [COMPONENT master] Transitioned from FLEXING to STABLE on CHECK_STABLE event.
2018-10-17 15:11:58,560 [pool-7-thread-1] INFO instance.ComponentInstance - [COMPINSTANCE master-0 : container_e41_1539027682947_0020_01_000002] Transitioned from STARTED to READY on BECOME_READY event
2018-10-17 15:11:58,560 [pool-7-thread-1] INFO component.Component - [COMPONENT worker]: Dependency master not satisfied, only 1 of 1 instances are ready or the dependent component has not completed
2018-10-17 15:12:28,556 [pool-7-thread-1] INFO component.Component - [COMPONENT worker]: Dependency master not satisfied, only 1 of 1 instances are ready or the dependent component has not completed
2018-10-17 15:12:58,556 [pool-7-thread-1] INFO component.Component - [COMPONENT worker]: Dependency master not satisfied, only 1 of 1 instances are ready or the dependent component has not completed
2018-10-17 15:13:28,556 [pool-7-thread-1] INFO component.Component - [COMPONENT worker]: Dependency master not satisfied, only 1 of 1 instances are ready or the dependent component has not completed
2018-10-17 15:13:58,556 [pool-7-thread-1] INFO component.Component - [COMPONENT worker]: Dependency master not satisfied, only 1 of 1 instances are ready or the dependent component has not completed
2018-10-17 15:14:28,556 [pool-7-thread-1] INFO component.Component - [COMPONENT worker]: Dependency master not satisfied, only 1 of 1 instances are ready or the dependent component has not completed
{code}

The 'NEVER' restart policy expects the master component to finish before the workers are started. However, the master component cannot finish its job without workers, so the application deadlocks: the master instance is READY (1 of 1), yet the dependency is still reported as not satisfied because the master has not completed.

The logic for the 'NEVER' restart policy should be fixed so that worker components are launched as soon as the master component is in the READY state.
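
To make the requested change concrete, here is a minimal sketch of the two dependency checks. It is not the actual YARN service AM code: the class, enum, and method names (DependencyCheckSketch, InstanceState, satisfiedAsReported, satisfiedAsProposed) are hypothetical and only model the behaviour described in this report, under the assumption that the current check waits for completion when the policy is NEVER.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical model of the dependency check; not the real YARN classes.
public class DependencyCheckSketch {
    enum RestartPolicy { ALWAYS, ON_FAILURE, NEVER }
    enum InstanceState { INIT, STARTED, READY, SUCCEEDED, FAILED }

    // Behaviour as described in this report (assumed): with NEVER, the
    // dependency counts as satisfied only once every instance has completed.
    static boolean satisfiedAsReported(RestartPolicy policy, List<InstanceState> instances) {
        if (policy == RestartPolicy.NEVER) {
            return instances.stream().allMatch(s -> s == InstanceState.SUCCEEDED);
        }
        return instances.stream().allMatch(s -> s == InstanceState.READY);
    }

    // Behaviour requested in this report: READY is enough for every policy,
    // and an instance that has already succeeded also counts. The policy
    // argument is kept only to mirror the signature above.
    static boolean satisfiedAsProposed(RestartPolicy policy, List<InstanceState> instances) {
        return instances.stream()
                .allMatch(s -> s == InstanceState.READY || s == InstanceState.SUCCEEDED);
    }

    public static void main(String[] args) {
        // One master instance in READY state, as in the log above.
        List<InstanceState> master = Arrays.asList(InstanceState.READY);
        System.out.println(satisfiedAsReported(RestartPolicy.NEVER, master)); // prints false
        System.out.println(satisfiedAsProposed(RestartPolicy.NEVER, master)); // prints true
    }
}
```

With the first check, the worker in the scenario above waits forever because the master cannot succeed without workers. With the second, the worker is unblocked as soon as master-0 reaches READY.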