[ 
https://issues.apache.org/jira/browse/SAMZA-1181?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prateek Maheshwari updated SAMZA-1181:
--------------------------------------
    Fix Version/s: 0.13.0

> Fix AppMaster hang after submitting jobs to Yarn
> ------------------------------------------------
>
>                 Key: SAMZA-1181
>                 URL: https://issues.apache.org/jira/browse/SAMZA-1181
>             Project: Samza
>          Issue Type: Bug
>    Affects Versions: 0.13.0
>            Reporter: Xinyu Liu
>            Assignee: Shanthoosh Venkataraman
>            Priority: Blocker
>             Fix For: 0.13.0
>
>
> Currently, when a job is submitted to Yarn, it hangs after the AppMaster is
> created. The log shows that it hangs while bootstrapping from the
> coordinator stream. Further debugging shows that the job hangs during the
> second bootstrap, while reading locality data from LocalityManager. The
> sequence is as follows:
> 1. JobModelManager creates a CoordinatorStreamConsumer and bootstraps it.
> 2. LocalityManager writes locality info into the coordinator stream.
> 3. JobModelManager closes the CoordinatorStreamConsumer (*).
> 4. Later, LocalityManager bootstraps the CoordinatorStreamConsumer again.
> Step 3 is the problem. Because the CoordinatorStreamConsumer is still held
> by LocalityManager, it must not be closed at that point. Step 3 was
> introduced in SAMZA-1154 as part of a refactoring of JobModelManager for the
> task REST endpoint. To fix this issue, we will revert the step 3 change.
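
The four-step sequence above can be sketched as a small, self-contained Java model. This is only an illustration, not Samza code: SharedConsumer, CoordinatorConsumerSketch, and runSequence are hypothetical names, the coordinator stream is modeled as an in-memory list, and the sketch throws an exception where the real job reportedly hangs.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a coordinator stream consumer shared by two owners.
class SharedConsumer {
    private final List<String> stream;
    private boolean closed = false;
    private int position = 0;

    SharedConsumer(List<String> stream) {
        this.stream = stream;
    }

    // Reads every message written since the last bootstrap and returns the count.
    int bootstrap() {
        if (closed) {
            // Assumption: the real job hangs here; the sketch fails fast instead.
            throw new IllegalStateException("bootstrap on a closed consumer");
        }
        int read = stream.size() - position;
        position = stream.size();
        return read;
    }

    void close() {
        closed = true;
    }
}

class CoordinatorConsumerSketch {
    // Replays steps 1-4; returns true if the second bootstrap succeeds.
    static boolean runSequence(boolean closeInStep3) {
        List<String> stream = new ArrayList<>();
        stream.add("config");
        SharedConsumer consumer = new SharedConsumer(stream);

        consumer.bootstrap();        // step 1: first owner bootstraps
        stream.add("locality");      // step 2: locality info is written
        if (closeInStep3) {
            consumer.close();        // step 3: first owner closes the shared consumer
        }
        try {
            consumer.bootstrap();    // step 4: second owner bootstraps again
            return true;
        } catch (IllegalStateException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println("with step 3: " + runSequence(true));
        System.out.println("without step 3: " + runSequence(false));
    }
}
```

In this model, dropping step 3 (the proposed revert) lets the second bootstrap complete, because the consumer stays open for as long as any owner still holds it.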



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
