[ https://issues.apache.org/jira/browse/TEZ-1522?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Bikas Saha updated TEZ-1522:
----------------------------
    Priority: Critical  (was: Blocker)

> ShuffleVertexManager scheduling can result in out of order execution and
> slowdown of upstream work
> --------------------------------------------------------------------------------------------------
>
>                 Key: TEZ-1522
>                 URL: https://issues.apache.org/jira/browse/TEZ-1522
>             Project: Apache Tez
>          Issue Type: Bug
>            Reporter: Rajesh Balamohan
>            Priority: Critical
>              Labels: performance
>         Attachments: task_runtime.svg
>
>
>     M2              M7
>       \            /
>   (sg) \          /
>         R3       / (b)
>           \     /
>        (b) \   /
>             \ /
>              M5
>              |
>              R6
>
> Please refer to the attachment (task runtime SVG). In this case, M5 got
> scheduled much earlier than R3 (green color in the diagram) and retained lots
> of containers.
> R3 got fewer containers to work with.
> Attaching the output from the status monitor when the job ran; Map_5 has
> taken up almost all of the cluster's resources, whereas Reducer_3 got a
> fraction of the capacity.
>
> Map_2: 1/1  Map_5: 0(+373)/1000  Map_7: 1/1  Reducer_3: 0/8000          Reducer_6: 0/1
> Map_2: 1/1  Map_5: 0(+374)/1000  Map_7: 1/1  Reducer_3: 0/8000          Reducer_6: 0/1
> Map_2: 1/1  Map_5: 0(+374)/1000  Map_7: 1/1  Reducer_3: 0(+1)/8000      Reducer_6: 0/1
> ....
> Map_2: 1/1  Map_5: 0(+374)/1000  Map_7: 1/1  Reducer_3: 14(+7)/8000     Reducer_6: 0/1
> Map_2: 1/1  Map_5: 0(+374)/1000  Map_7: 1/1  Reducer_3: 63(+14)/8000    Reducer_6: 0/1
> Map_2: 1/1  Map_5: 0(+374)/1000  Map_7: 1/1  Reducer_3: 159(+22)/8000   Reducer_6: 0/1
> Map_2: 1/1  Map_5: 0(+374)/1000  Map_7: 1/1  Reducer_3: 308(+29)/8000   Reducer_6: 0/1
> ...
>
> Creating this JIRA as a placeholder for scheduler enhancement. One
> possibility could be to schedule fewer tasks in downstream vertices, based
> on the information available for the upstream vertex.

--
This message was sent by Atlassian JIRA
(v6.2#6252)
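
The enhancement proposed in the description (schedule fewer downstream tasks based on upstream information) could be sketched roughly as below. This is only an illustration in Python, not Tez code and not the behavior of ShuffleVertexManager: the function name `downstream_task_cap`, its parameters, and the proportional policy itself are assumptions made for the example. The idea is to cap the number of downstream tasks that may be scheduled at the completed fraction of the slowest upstream vertex.

```python
def downstream_task_cap(upstreams, downstream_total, min_tasks=1):
    """Hypothetical throttle, not a Tez API.

    upstreams: list of (completed_tasks, total_tasks), one per upstream vertex.
    Returns how many downstream tasks may be scheduled right now.
    """
    cap = downstream_total
    for done, total in upstreams:
        if total > 0:
            # Allow downstream tasks only in proportion to this upstream
            # vertex's progress (integer arithmetic, rounded down).
            cap = min(cap, (done * downstream_total) // total)
    # Always allow a small floor so the downstream vertex is never starved,
    # but never exceed the number of tasks the vertex actually has.
    return min(downstream_total, max(min_tasks, cap))


# Numbers from the status monitor output: Reducer_3 at 308/8000, Map_7 at 1/1,
# and Map_5 with 1000 tasks in total.
print(downstream_task_cap([(308, 8000), (1, 1)], 1000))
```

With the last status line quoted in the description, this policy would have allowed 38 Map_5 tasks rather than the 374 that were actually running, leaving the remaining containers for Reducer_3. A real scheduler would likely also need to account for task priorities and container reuse, which this sketch ignores.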