rohithsharma created YARN-929:
---------------------------------

             Summary: 2 MRAppMaster spawned for same Job Id
                 Key: YARN-929
                 URL: https://issues.apache.org/jira/browse/YARN-929
             Project: Hadoop YARN
          Issue Type: Bug
          Components: resourcemanager
    Affects Versions: 2.0.5-alpha
            Reporter: rohithsharma


Configuration:
    yarn.resourcemanager.am.max-retries = 3

Scenario:
    The NodeManager is killed forcefully, i.e. using kill -9 NM_PID.
    After the node expires, the RM kills all the containers that were running on this NodeManager.
    However, the MRAppMaster JVM is still running.
    The RM spawns a second MRAppMaster attempt, since AM max-retries is configured as 3.

The problem with running two MRAppMasters is that the first-attempt AppMaster deletes
the job information from HDFS, which causes a FileNotFoundException in the second-attempt
MRAppMaster.
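
The sequence above can be sketched roughly as follows. This is only a minimal illustration, not the MRAppMaster code: a local temporary directory stands in for HDFS, and the names job_0001, job.xml, first_attempt_cleanup and second_attempt_start are hypothetical placeholders chosen for the example.

```python
import os
import shutil
import tempfile


def first_attempt_cleanup(job_dir):
    # Stand-in for the first attempt, which is still alive after the
    # RM has given up on it, deleting the job information when it ends.
    shutil.rmtree(job_dir)


def second_attempt_start(job_dir):
    # Stand-in for the second attempt reading the job information
    # at startup. "job.xml" is a placeholder file name.
    with open(os.path.join(job_dir, "job.xml")) as f:
        return f.read()


def simulate_two_attempts():
    """Return True if the second attempt fails because the files are gone."""
    root = tempfile.mkdtemp()  # local directory standing in for HDFS
    job_dir = os.path.join(root, "job_0001")  # hypothetical job directory
    os.makedirs(job_dir)
    with open(os.path.join(job_dir, "job.xml"), "w") as f:
        f.write("<configuration/>")
    try:
        # Attempt 1 survived the NodeManager kill and cleans up.
        first_attempt_cleanup(job_dir)
        # Attempt 2, spawned by the RM, now looks for the same files.
        try:
            second_attempt_start(job_dir)
            return False
        except FileNotFoundError:
            return True
    finally:
        shutil.rmtree(root, ignore_errors=True)


if __name__ == "__main__":
    print("second attempt failed:", simulate_two_attempts())
```

The point of the sketch is only the ordering: both attempts share one job directory, so whichever attempt cleans up first breaks the other.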
     

