[ 
https://issues.apache.org/jira/browse/TEZ-3803?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16107962#comment-16107962
 ] 

Jason Lowe commented on TEZ-3803:
---------------------------------

Thanks for updating the patch!

I'm still confused about why we need a background thread for reporting 
progress; a separate thread seems like overkill for this.  Why isn't it 
sufficient to ping progress from the thread that is already waiting for 
fetchers to complete or for new inputs to arrive, rather than creating a new 
thread just so the old one can wait forever?
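
For illustration only, here is a minimal sketch of the single-thread approach 
asked about above: the waiting thread uses a timed wait and pings progress 
each time it wakes, so no extra reporter thread is needed.  This is not code 
from the patch or from Tez.  The class name, the notifyProgress callback, and 
the ping interval are all hypothetical stand-ins.

```java
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.Condition;
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical sketch: not the Tez shuffle code, only the general pattern.
final class ProgressPingingWaiter {
    private final ReentrantLock lock = new ReentrantLock();
    private final Condition inputsChanged = lock.newCondition();
    // Assumed stand-in for whatever hook tells the framework the task is alive.
    private final Runnable notifyProgress;
    private boolean done = false;

    ProgressPingingWaiter(Runnable notifyProgress) {
        this.notifyProgress = notifyProgress;
    }

    // Called by a fetcher (or similar) when all inputs have completed.
    void markDone() {
        lock.lock();
        try {
            done = true;
            inputsChanged.signalAll();
        } finally {
            lock.unlock();
        }
    }

    // Blocks until markDone() is called, waking at most every pingIntervalMs
    // to report progress from the waiting thread itself.
    void waitUntilDone(long pingIntervalMs) throws InterruptedException {
        lock.lock();
        try {
            while (!done) {
                // Timed wait instead of an unbounded one; returns early if signalled.
                inputsChanged.await(pingIntervalMs, TimeUnit.MILLISECONDS);
                notifyProgress.run();
            }
        } finally {
            lock.unlock();
        }
    }
}
```

The trade-off in this sketch is that the ping interval must stay well below 
the task timeout, and the waiter wakes periodically even when nothing has 
changed.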

> Tasks can get killed due to insufficient progress while waiting for shuffle 
> inputs to complete
> ----------------------------------------------------------------------------------------------
>
>                 Key: TEZ-3803
>                 URL: https://issues.apache.org/jira/browse/TEZ-3803
>             Project: Apache Tez
>          Issue Type: Bug
>            Reporter: Kuhu Shukla
>            Assignee: Kuhu Shukla
>            Priority: Critical
>         Attachments: TEZ-3803.001.patch, TEZ-3803.002.patch, 
> TEZ-3803.003.patch
>
>
> In a scenario where a downstream task has no slow start and is started 
> before all its shuffle inputs are done, the task can time out because the 
> wait does not notify progress (set the "progress is being made" bit) as it 
> does in MapReduce.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
