[ 
https://issues.apache.org/jira/browse/TEZ-1522?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14119672#comment-14119672
 ] 

Rajesh Balamohan commented on TEZ-1522:
---------------------------------------

Update: The patch for TEZ-1494 would not completely solve the issue listed
here. With the patch applied, I was still able to simulate the out-of-order
execution scenario by connecting R3->M5 via Scatter_Gather and M7->M5 via
broadcast (instead of the 2 broadcast edges listed in this JIRA).

> Scheduling can result in out of order execution and slowdown of upstream work
> -----------------------------------------------------------------------------
>
>                 Key: TEZ-1522
>                 URL: https://issues.apache.org/jira/browse/TEZ-1522
>             Project: Apache Tez
>          Issue Type: Bug
>            Reporter: Rajesh Balamohan
>            Assignee: Rajesh Balamohan
>            Priority: Critical
>              Labels: performance
>         Attachments: TEZ-1522.am.log.gz, task_runtime.svg
>
>
> M2                M7
>   \              /
>    \ (sg)       /
>     R3         / (b)
>       \       /
>    (b) \     /
>         \   /
>          \ /
>           M5
>           |
>           R6
> Please refer to the attachment (task runtime SVG). In this case, M5 was
> scheduled much earlier than R3 (green in the diagram) and retained a large
> number of containers, so R3 got fewer containers to work with.
> Below is the output from the status monitor while the job ran; Map_5 took up
> almost all of the cluster's resources, whereas Reducer_3 got only a fraction
> of the capacity.
> Map_2: 1/1  Map_5: 0(+373)/1000  Map_7: 1/1  Reducer_3: 0/8000  Reducer_6: 0/1
> Map_2: 1/1  Map_5: 0(+374)/1000  Map_7: 1/1  Reducer_3: 0/8000  Reducer_6: 0/1
> Map_2: 1/1  Map_5: 0(+374)/1000  Map_7: 1/1  Reducer_3: 0(+1)/8000  Reducer_6: 0/1
> ....
> Map_2: 1/1  Map_5: 0(+374)/1000  Map_7: 1/1  Reducer_3: 14(+7)/8000  Reducer_6: 0/1
> Map_2: 1/1  Map_5: 0(+374)/1000  Map_7: 1/1  Reducer_3: 63(+14)/8000  Reducer_6: 0/1
> Map_2: 1/1  Map_5: 0(+374)/1000  Map_7: 1/1  Reducer_3: 159(+22)/8000  Reducer_6: 0/1
> Map_2: 1/1  Map_5: 0(+374)/1000  Map_7: 1/1  Reducer_3: 308(+29)/8000  Reducer_6: 0/1
> ...
> Creating this JIRA as a placeholder for a scheduler enhancement. One
> possibility could be to schedule fewer tasks in downstream vertices, based
> on the information available for the upstream vertex.
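
The idea of holding back downstream tasks based on upstream progress could
be sketched roughly as follows. This is only an illustration: the function
name downstream_task_cap, its parameters, and the proportional rule itself
are assumptions made for this sketch. It is not Tez scheduler code, and the
JIRA does not specify which policy would be used.

```python
def downstream_task_cap(upstream_done, upstream_total, downstream_total,
                        min_tasks=1):
    """Hypothetical policy: cap the number of downstream tasks allowed to
    hold containers in proportion to the upstream vertex's completed tasks.

    All names here are illustrative assumptions, not Tez APIs.
    """
    if upstream_total <= 0:
        # No upstream information available: do not throttle.
        return downstream_total
    # Integer arithmetic avoids floating-point rounding surprises.
    proportional = downstream_total * upstream_done // upstream_total
    # Always allow a small minimum so the downstream vertex can start,
    # and never exceed the vertex's total task count.
    return max(min_tasks, min(downstream_total, proportional))


# Numbers taken from the status monitor output above:
# Reducer_3 has completed 308 of 8000 tasks; Map_5 has 1000 tasks.
cap = downstream_task_cap(308, 8000, 1000)
print(cap)
```

Under this assumed rule, Map_5 would be limited to a few dozen running
tasks at that point instead of the 374 it held, leaving the remaining
containers for Reducer_3.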



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
