[ https://issues.apache.org/jira/browse/TEZ-1494?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14110104#comment-14110104 ]
Rajesh Balamohan commented on TEZ-1494:
---------------------------------------

Issue happens when auto parallelism is enabled.
- Reducer 3 starts with 2 tasks
- Map 5 (has 1 task and has dependency on Reducer 3) starts before Reducer 3
- Reducer 3 alters parallelism from 2 to 1
- Map 5 keeps waiting for inputs from 2 tasks of Reducer 3.

> DAG hangs waiting for ShuffleManager.getNextInput()
> ---------------------------------------------------
>
>                 Key: TEZ-1494
>                 URL: https://issues.apache.org/jira/browse/TEZ-1494
>             Project: Apache Tez
>          Issue Type: Bug
>            Reporter: Rajesh Balamohan
>            Assignee: Rajesh Balamohan
>              Labels: performance
>         Attachments: TEZ-1494-DAG.dot
>
>
> Attaching the DAG and the stack trace of the hung process.
> Thread 30071: (state = BLOCKED)
>  - sun.misc.Unsafe.park(boolean, long) @bci=0 (Interpreted frame)
>  - java.util.concurrent.locks.LockSupport.park(java.lang.Object) @bci=14, line=186 (Interpreted frame)
>  - java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await() @bci=42, line=2043 (Interpreted frame)
>  - java.util.concurrent.LinkedBlockingQueue.take() @bci=29, line=442 (Interpreted frame)
>  - org.apache.tez.runtime.library.shuffle.common.impl.ShuffleManager.getNextInput() @bci=67, line=610 (Interpreted frame)
>  - org.apache.tez.runtime.library.common.readers.UnorderedKVReader.moveToNextInput() @bci=26, line=176 (Interpreted frame)
>  - org.apache.tez.runtime.library.common.readers.UnorderedKVReader.next() @bci=30, line=117 (Interpreted frame)



--
This message was sent by Atlassian JIRA
(v6.2#6252)
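
For illustration only, the sequence described in the comment above can be reproduced in miniature with a plain LinkedBlockingQueue. This is a sketch under stated assumptions, not Tez code: the class StaleParallelismSketch, the method consume, and the queue name completedInputs are hypothetical, and poll() with a timeout stands in for the take() call seen in the stack trace so that the sketch terminates instead of hanging.

```java
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;

// Hypothetical stand-in for a consumer that expects one input per upstream task.
final class StaleParallelismSketch {

    /**
     * Waits for expectedInputs items while only actualProducers items are ever
     * published. Returns how many inputs arrived before a poll timed out.
     */
    static int consume(int expectedInputs, int actualProducers, long timeoutMillis) {
        LinkedBlockingQueue<String> completedInputs = new LinkedBlockingQueue<>();

        // Each upstream task that really runs publishes exactly one input.
        for (int i = 0; i < actualProducers; i++) {
            completedInputs.offer("input-" + i);
        }

        int received = 0;
        while (received < expectedInputs) {
            String input;
            try {
                // The stack trace in this issue blocks in take(); a timed poll()
                // is used here only so the sketch can report the shortfall.
                input = completedInputs.poll(timeoutMillis, TimeUnit.MILLISECONDS);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                break;
            }
            if (input == null) {
                break; // with take(), the consumer would wait here indefinitely
            }
            received++;
        }
        return received;
    }

    public static void main(String[] args) {
        // Consumer and producer agree on the task count: all inputs arrive.
        System.out.println("consistent view: " + consume(2, 2, 100) + " of 2");
        // Consumer still expects 2 inputs after the upstream shrank to 1 task.
        System.out.println("stale view: " + consume(2, 1, 100) + " of 2");
    }
}
```

In the second call the consumer keeps the original count of 2 while only 1 input is ever produced, which mirrors Map 5 waiting on 2 tasks of Reducer 3 after the parallelism change.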