[ https://issues.apache.org/jira/browse/SAMZA-1181?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Prateek Maheshwari updated SAMZA-1181:
--------------------------------------
    Fix Version/s: 0.13.0

> Fix AppMaster hang after submitting jobs to Yarn
> ------------------------------------------------
>
>                 Key: SAMZA-1181
>                 URL: https://issues.apache.org/jira/browse/SAMZA-1181
>             Project: Samza
>          Issue Type: Bug
>    Affects Versions: 0.13.0
>            Reporter: Xinyu Liu
>            Assignee: Shanthoosh Venkataraman
>            Priority: Blocker
>             Fix For: 0.13.0
>
>
> Currently, when a job is submitted to Yarn, it hangs after the AppMaster is
> created. The log shows that it hangs while bootstrapping from the
> coordinator stream. Further debugging shows that the job hangs during the
> second bootstrap, while reading locality data from LocalityManager. The
> sequence is the following:
> 1. JobModelManager creates CoordinatorStreamConsumer and bootstraps it
> 2. LocalityManager writes locality info into the coordinator stream
> 3. JobModelManager closes CoordinatorStreamConsumer (*)
> 4. Later, LocalityManager bootstraps CoordinatorStreamConsumer again
> Step 3 is the problem here. Since CoordinatorStreamConsumer is still held by
> LocalityManager, it must not be closed at that point. Step 3 was introduced
> in SAMZA-1154 as part of a refactoring of JobModelManager for the task REST
> endpoint. To fix this issue, we will revert the step 3 change.

--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
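
The four-step sequence in the description can be modeled with a small Python sketch. This is a simplified illustration, not Samza's code: the class names are borrowed from the issue, but every method (`bootstrap`, `close`, `write_locality`, `read_locality`) and the `run_sequence` helper are hypothetical stand-ins. The real job blocks indefinitely on the second bootstrap; the sketch raises an exception instead so that it terminates.

```python
class CoordinatorStreamConsumer:
    """Hypothetical stand-in for a consumer shared by two components."""

    def __init__(self):
        self.closed = False
        self.bootstrap_count = 0

    def bootstrap(self):
        if self.closed:
            # The real job hangs at this point; the sketch raises so it ends.
            raise RuntimeError("bootstrap called on a closed consumer")
        self.bootstrap_count += 1

    def close(self):
        self.closed = True


class LocalityManager:
    """Hypothetical stand-in that keeps a reference to the shared consumer."""

    def __init__(self, consumer):
        self.consumer = consumer
        self.written = []

    def write_locality(self, info):
        self.written.append(info)

    def read_locality(self):
        # Step 4: bootstrap the shared consumer a second time.
        self.consumer.bootstrap()
        return list(self.written)


def run_sequence(close_early):
    """Replay steps 1-4; close_early toggles the problematic step 3."""
    consumer = CoordinatorStreamConsumer()
    consumer.bootstrap()                          # step 1
    locality = LocalityManager(consumer)
    locality.write_locality("container-0:host-a")  # step 2
    if close_early:
        consumer.close()                          # step 3 (*)
    try:
        locality.read_locality()                  # step 4
        return "ok"
    except RuntimeError:
        return "stuck"


if __name__ == "__main__":
    print("with step 3:   ", run_sequence(close_early=True))
    print("without step 3:", run_sequence(close_early=False))
```

With step 3 in place, the second bootstrap fails because the component that closed the consumer was not its only holder. Skipping step 3, which corresponds to the revert proposed in the issue, lets the second bootstrap succeed.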