[ 
https://issues.apache.org/jira/browse/TEZ-394?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16052224#comment-16052224
 ] 

Jason Lowe commented on TEZ-394:
--------------------------------

bq. But he was running into lot more pre-emption of tasks in his case which was 
unnecessary and wasteful.

I'm guessing this is caused by root vertices getting lower priority than they 
used to (e.g.: V2 in the example).  Changing the priority doesn't change when 
the vertex becomes runnable and starts asking for tasks.  What could happen is 
that V2, being a root vertex, is runnable right from the start and asks for 
tasks, then later V3 becomes runnable and asks.  V2 is lower priority than V3 
in the new algorithm, so I could see cases where the scheduler thinks it needs 
to preempt V2 tasks to make room for V3 tasks.  The scheduler is not DAG-aware 
when it comes to preemption.  I have a new task scheduler in the works that 
fixes that (among other things), and I hope to have it posted soon.
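
To make the scenario concrete, here is a minimal sketch of priority-only 
victim selection. It is not the actual Tez scheduler code: the Task class, 
the pick_victims_by_priority function, and the priority numbers are all 
hypothetical, and it assumes a lower number means a more important task.

```python
from dataclasses import dataclass


@dataclass
class Task:
    vertex: str
    priority: int  # assumption: lower number = more important


def pick_victims_by_priority(running, request_priority, needed):
    """Pick up to `needed` running tasks that are less important than the
    pending request, least important first. No DAG structure is consulted."""
    candidates = [t for t in running if t.priority > request_priority]
    candidates.sort(key=lambda t: t.priority, reverse=True)
    return candidates[:needed]


# Hypothetical state: V2 (root, now lower priority) is already running when
# V3 becomes runnable and asks for two containers at priority 4.
running = [Task("V2", 6), Task("V2", 6), Task("V1", 3)]
victims = pick_victims_by_priority(running, request_priority=4, needed=2)
print([t.vertex for t in victims])  # ['V2', 'V2']
```

Both V2 tasks are selected purely because of their priority value, whether or 
not killing them helps V3 make progress.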

If that is indeed the issue, then I'm not sure we can fix this JIRA without 
first making the preemption smarter (e.g.: DAG-aware so it avoids shooting 
tasks that aren't descendants of the tasks trying to get allocated).
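
A rough sketch of what a descendant check could look like, under stated 
assumptions: the DAG is a plain dict of vertex name to downstream vertex 
names, and the descendants and preemptable_vertices helpers are hypothetical 
names, not part of Tez. The DAG below is made up for illustration and is not 
the example from the JIRA.

```python
from collections import deque


def descendants(edges, vertex):
    """Return every vertex reachable downstream of `vertex` (breadth-first)."""
    seen = set()
    queue = deque([vertex])
    while queue:
        current = queue.popleft()
        for child in edges.get(current, ()):
            if child not in seen:
                seen.add(child)
                queue.append(child)
    return seen


def preemptable_vertices(edges, requester, running_vertices):
    """Keep only running vertices that are descendants of the requester."""
    allowed = descendants(edges, requester)
    return [v for v in running_vertices if v in allowed]


# Hypothetical DAG: V1 -> V3 -> V4 and V2 -> V4. V2 is a root vertex that is
# unrelated to V3 except through their shared consumer V4.
edges = {"V1": ["V3"], "V3": ["V4"], "V2": ["V4"]}
print(preemptable_vertices(edges, "V3", ["V2", "V4"]))  # ['V4']
```

With this filter, a V3 request can only displace V4 tasks, which depend on 
V3's output anyway. V2 tasks are left alone even though they have lower 
priority.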

> Better scheduling for uneven DAGs
> ---------------------------------
>
>                 Key: TEZ-394
>                 URL: https://issues.apache.org/jira/browse/TEZ-394
>             Project: Apache Tez
>          Issue Type: Sub-task
>            Reporter: Rohini Palaniswamy
>            Assignee: Jason Lowe
>         Attachments: TEZ-394.001.patch, TEZ-394.002.patch, TEZ-394.003.patch
>
>
>   Consider a series of joins or group-bys on dataset A with a few datasets that 
> takes 10 hours, followed by a final join with a dataset X. The vertex that 
> loads dataset X will be one of the top vertexes and initialized early even 
> though its output is not consumed till the end after 10 hours. 
> 1) Could either use delayed start logic for better resource allocation
> 2) Else if they are started upfront, need to handle failure/recovery cases 
> where the nodes which executed the MapTask might have gone down when the 
> final join happens. 



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
