[ https://issues.apache.org/jira/browse/YARN-8901?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16751437#comment-16751437 ]
Suma Shivaprasad commented on YARN-8901:
----------------------------------------

Added UT

> Restart "NEVER" policy does not work with component dependency
> --------------------------------------------------------------
>
>                 Key: YARN-8901
>                 URL: https://issues.apache.org/jira/browse/YARN-8901
>             Project: Hadoop YARN
>          Issue Type: Bug
>    Affects Versions: 3.1.1
>            Reporter: Yesha Vora
>            Assignee: Suma Shivaprasad
>            Priority: Critical
>         Attachments: YARN-8901.1.patch, YARN-8901.2.patch
>
>
> Scenario:
> 1) Launch an application with two components, master and worker. Here, worker
> is dependent on master. (Worker should be launched only after master is
> launched.)
> 2) Set restart_policy = NEVER for both master and worker.
> {code:title=sample launch.json}
> {
>   "name": "mawo-hadoop-ut",
>   "artifact": {
>     "type": "DOCKER",
>     "id": "xxx"
>   },
>   "configuration": {
>     "env": {
>       "YARN_CONTAINER_RUNTIME_DOCKER_CONTAINER_NETWORK": "hadoop"
>     },
>     "properties": {
>       "docker.network": "hadoop"
>     }
>   },
>   "components": [{
>     "dependencies": [],
>     "resource": {
>       "memory": "2048",
>       "cpus": "1"
>     },
>     "name": "master",
>     "run_privileged_container": true,
>     "number_of_containers": 1,
>     "launch_command": "start master",
>     "restart_policy": "NEVER"
>   }, {
>     "dependencies": ["master"],
>     "resource": {
>       "memory": "8072",
>       "cpus": "1"
>     },
>     "name": "worker",
>     "run_privileged_container": true,
>     "number_of_containers": 10,
>     "launch_command": "start worker",
>     "restart_policy": "NEVER"
>   }],
>   "lifetime": -1,
>   "version": 1.0
> }{code}
> When the restart policy is set to NEVER, the AM never launches the worker
> component. It gets stuck with the message below.
> {code}
> 2018-10-17 15:11:58,560 [Component dispatcher] INFO component.Component - [COMPONENT master] Transitioned from FLEXING to STABLE on CHECK_STABLE event.
> 2018-10-17 15:11:58,560 [pool-7-thread-1] INFO instance.ComponentInstance - [COMPINSTANCE master-0 : container_e41_1539027682947_0020_01_000002] Transitioned from STARTED to READY on BECOME_READY event
> 2018-10-17 15:11:58,560 [pool-7-thread-1] INFO component.Component - [COMPONENT worker]: Dependency master not satisfied, only 1 of 1 instances are ready or the dependent component has not completed
> 2018-10-17 15:12:28,556 [pool-7-thread-1] INFO component.Component - [COMPONENT worker]: Dependency master not satisfied, only 1 of 1 instances are ready or the dependent component has not completed
> 2018-10-17 15:12:58,556 [pool-7-thread-1] INFO component.Component - [COMPONENT worker]: Dependency master not satisfied, only 1 of 1 instances are ready or the dependent component has not completed
> 2018-10-17 15:13:28,556 [pool-7-thread-1] INFO component.Component - [COMPONENT worker]: Dependency master not satisfied, only 1 of 1 instances are ready or the dependent component has not completed
> 2018-10-17 15:13:58,556 [pool-7-thread-1] INFO component.Component - [COMPONENT worker]: Dependency master not satisfied, only 1 of 1 instances are ready or the dependent component has not completed
> 2018-10-17 15:14:28,556 [pool-7-thread-1] INFO component.Component - [COMPONENT worker]: Dependency master not satisfied, only 1 of 1 instances are ready or the dependent component has not completed
> {code}
> The 'NEVER' restart policy expects the master component to finish before
> workers are started. The master component cannot finish the job without
> workers, so this creates a deadlock.
> The logic for the 'NEVER' restart policy should be fixed to allow worker
> components to be launched as soon as the master component is in READY state.
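
For readers following the thread, the deadlock described above can be reproduced with a toy model. This is only a sketch of the two dependency checks as the report describes them, not the YARN service AM code: the class `Component`, the enum `InstanceState`, and the functions `satisfied_requiring_completion`, `satisfied_on_ready`, and `can_launch` are hypothetical names invented for illustration, and the SUCCEEDED state is an assumed stand-in for "the dependent component has completed".

```python
from dataclasses import dataclass, field
from enum import Enum


class InstanceState(Enum):
    # Hypothetical subset of instance states, for illustration only.
    STARTED = "STARTED"
    READY = "READY"
    SUCCEEDED = "SUCCEEDED"


@dataclass
class Component:
    name: str
    restart_policy: str
    number_of_containers: int
    dependencies: list = field(default_factory=list)
    instance_states: list = field(default_factory=list)

    def count(self, state):
        """Number of instances currently in the given state."""
        return sum(1 for s in self.instance_states if s is state)


def satisfied_requiring_completion(dep):
    """Behavior as reported: a NEVER dependency must have finished."""
    if dep.restart_policy == "NEVER":
        return dep.count(InstanceState.SUCCEEDED) >= dep.number_of_containers
    return dep.count(InstanceState.READY) >= dep.number_of_containers


def satisfied_on_ready(dep):
    """Behavior requested in the report: READY (or finished) instances count."""
    ready_or_done = (dep.count(InstanceState.READY)
                     + dep.count(InstanceState.SUCCEEDED))
    return ready_or_done >= dep.number_of_containers


def can_launch(component, components_by_name, check):
    """A component may launch once every dependency passes the check."""
    return all(check(components_by_name[d]) for d in component.dependencies)


# Same shape as the sample launch.json: one READY master, ten workers waiting.
master = Component("master", "NEVER", 1, [], [InstanceState.READY])
worker = Component("worker", "NEVER", 10, ["master"])
by_name = {c.name: c for c in (master, worker)}

print(can_launch(worker, by_name, satisfied_requiring_completion))  # False
print(can_launch(worker, by_name, satisfied_on_ready))              # True
```

With the first check the worker waits for a master that can only finish once workers exist, which matches the repeated "Dependency master not satisfied" log lines. With the second check the worker becomes launchable as soon as master-0 reaches READY.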