Yesha Vora created YARN-8901:
--------------------------------

             Summary: Restart "NEVER" policy does not work with component dependency
                 Key: YARN-8901
                 URL: https://issues.apache.org/jira/browse/YARN-8901
             Project: Hadoop YARN
          Issue Type: Bug
            Reporter: Yesha Vora


Scenario:

1) Launch an application with two components, master and worker, where worker 
depends on master (worker should be launched only after master is launched).
2) Set restart_policy = NEVER for both master and worker.

{code:title=sample launch.json}
{
        "name": "mawo-hadoop-ut",
        "artifact": {
                "type": "DOCKER",
                "id": "xxx"
        },
        "configuration": {
                "env": {
                       "YARN_CONTAINER_RUNTIME_DOCKER_CONTAINER_NETWORK": "hadoop"
                },
                "properties": {
                       "docker.network": "hadoop"
                }
        },
        "components": [{
                "dependencies": [],
                "resource": {
                        "memory": "2048",
                        "cpus": "1"
                },
                "name": "master",
                "run_privileged_container": true,
                "number_of_containers": 1,
                "launch_command": "start master",
                "restart_policy": "NEVER"
        }, {
                "dependencies": ["master"],
                "resource": {
                        "memory": "8072",
                        "cpus": "1"
                },
                "name": "worker",
                "run_privileged_container": true,
                "number_of_containers": 10,
                "launch_command": "start worker",
                "restart_policy": "NEVER"
        }],
        "lifetime": -1,
        "version": 1.0
}{code}
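
To check whether other service specs hit the same pattern, a small script along these lines might help. It is only a sketch: the helper name never_policy_dependencies is made up for illustration, the spec is trimmed to the fields that matter here, and treating a missing restart_policy as ALWAYS is an assumption.

```python
import json

def never_policy_dependencies(spec):
    """Return (component, dependency) pairs where the dependency uses NEVER.

    Hypothetical helper, not part of YARN. A missing restart_policy is
    assumed to mean ALWAYS.
    """
    policy = {c["name"]: c.get("restart_policy", "ALWAYS")
              for c in spec["components"]}
    hits = []
    for comp in spec["components"]:
        for dep in comp.get("dependencies", []):
            if policy.get(dep) == "NEVER":
                hits.append((comp["name"], dep))
    return hits

# Trimmed version of the sample launch.json from this report.
spec = json.loads("""
{
  "name": "mawo-hadoop-ut",
  "components": [
    {"name": "master", "dependencies": [], "restart_policy": "NEVER"},
    {"name": "worker", "dependencies": ["master"], "restart_policy": "NEVER"}
  ]
}
""")

result = never_policy_dependencies(spec)
print(result)
```

For the spec in this report, the only pair reported is worker depending on master.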

When the restart policy is set to NEVER, the AM never launches the worker 
component. It gets stuck, repeating the message below. 
{code}
2018-10-17 15:11:58,560 [Component  dispatcher] INFO  component.Component - [COMPONENT master] Transitioned from FLEXING to STABLE on CHECK_STABLE event.
2018-10-17 15:11:58,560 [pool-7-thread-1] INFO  instance.ComponentInstance - [COMPINSTANCE master-0 : container_e41_1539027682947_0020_01_000002] Transitioned from STARTED to READY on BECOME_READY event
2018-10-17 15:11:58,560 [pool-7-thread-1] INFO  component.Component - [COMPONENT worker]: Dependency master not satisfied, only 1 of 1 instances are ready or the dependent component has not completed
2018-10-17 15:12:28,556 [pool-7-thread-1] INFO  component.Component - [COMPONENT worker]: Dependency master not satisfied, only 1 of 1 instances are ready or the dependent component has not completed
2018-10-17 15:12:58,556 [pool-7-thread-1] INFO  component.Component - [COMPONENT worker]: Dependency master not satisfied, only 1 of 1 instances are ready or the dependent component has not completed
2018-10-17 15:13:28,556 [pool-7-thread-1] INFO  component.Component - [COMPONENT worker]: Dependency master not satisfied, only 1 of 1 instances are ready or the dependent component has not completed
2018-10-17 15:13:58,556 [pool-7-thread-1] INFO  component.Component - [COMPONENT worker]: Dependency master not satisfied, only 1 of 1 instances are ready or the dependent component has not completed
2018-10-17 15:14:28,556 [pool-7-thread-1] INFO  component.Component - [COMPONENT worker]: Dependency master not satisfied, only 1 of 1 instances are ready or the dependent component has not completed
{code}

The 'NEVER' restart policy expects the master component to finish before the 
workers are started, but the master component cannot finish the job without 
workers. This creates a deadlock.

The logic for the 'NEVER' restart policy should be fixed so that worker 
components are launched as soon as the master component is in READY state. 
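
To make the deadlock and the requested change concrete, here is a toy model of the dependency check. It is a sketch inferred from the log message above, not the actual YARN code: ComponentState, dependency_satisfied_observed and dependency_satisfied_proposed are hypothetical names, and counting already-succeeded instances toward readiness in the proposed check is an assumption.

```python
from dataclasses import dataclass

@dataclass
class ComponentState:
    """Toy stand-in for a service component (hypothetical, not a YARN class)."""
    name: str
    restart_policy: str
    desired: int
    ready: int = 0
    succeeded: int = 0

def dependency_satisfied_observed(dep):
    """Behaviour consistent with the log: a NEVER dependency must have completed."""
    if dep.restart_policy == "NEVER":
        return dep.succeeded == dep.desired
    return dep.ready >= dep.desired

def dependency_satisfied_proposed(dep):
    """Requested behaviour: READY instances are enough, whatever the policy.

    Counting succeeded instances as well is an assumption of this sketch.
    """
    return dep.ready + dep.succeeded >= dep.desired

# State from the log: master-0 is READY, nothing has completed yet.
master = ComponentState("master", "NEVER", desired=1, ready=1)

observed = dependency_satisfied_observed(master)
proposed = dependency_satisfied_proposed(master)
print(observed, proposed)
```

With master READY but not finished, the observed-style check keeps refusing to start worker, while the proposed-style check lets worker launch.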


