[ https://issues.apache.org/jira/browse/TEZ-1494?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14126451#comment-14126451 ]
Bikas Saha commented on TEZ-1494:
---------------------------------

From what I see, both of these can be replaced by a single boolean, right? We are only interested in 1 completion.
{code}
+ int numFinishedTasks;
+ Boolean taskIsFinished[];
+
{code}

Not sure how we are preventing multiple schedulings of the tasks because scheduleTasks() is now being called on every onSourceTaskCompleted().

This code should probably be removed since we are trying to test the behavior and not the exact internal impl. The impl could change but the behavior should not. Right? This would also allow us to make this method private.
{code}
+ assertTrue(((ImmediateStartVertexManager)m5.getVertexManager().getPlugin()).canScheduleTasks() == false);
{code}

Looks like the test is only covering the ImmediateStartManager case. Adding a custom edge between M7 and a new vertex (with the new vertex having a RootInputVertexManager) would cover the remaining cases. If that gets hard to write then we should at least add M7 to a new vertex with custom edge (no RootInputManager).

> DAG hangs waiting for ShuffleManager.getNextInput()
> ---------------------------------------------------
>
> Key: TEZ-1494
> URL: https://issues.apache.org/jira/browse/TEZ-1494
> Project: Apache Tez
> Issue Type: Bug
> Reporter: Rajesh Balamohan
> Assignee: Rajesh Balamohan
> Labels: performance
> Attachments: TEZ-1494-DAG.dot, TEZ-1494.1.patch, TEZ-1494.2.patch, TEZ-1494.3.patch
>
>
> Attaching the DAG and the stack trace of the hung process.
> Thread 30071: (state = BLOCKED)
> - sun.misc.Unsafe.park(boolean, long) @bci=0 (Interpreted frame)
> - java.util.concurrent.locks.LockSupport.park(java.lang.Object) @bci=14, line=186 (Interpreted frame)
> - java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await() @bci=42, line=2043 (Interpreted frame)
> - java.util.concurrent.LinkedBlockingQueue.take() @bci=29, line=442 (Interpreted frame)
> - org.apache.tez.runtime.library.shuffle.common.impl.ShuffleManager.getNextInput() @bci=67, line=610 (Interpreted frame)
> - org.apache.tez.runtime.library.common.readers.UnorderedKVReader.moveToNextInput() @bci=26, line=176 (Interpreted frame)
> - org.apache.tez.runtime.library.common.readers.UnorderedKVReader.next() @bci=30, line=117 (Interpreted frame)

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
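
For illustration only, here is a minimal sketch of the single-boolean idea raised in the review comment, together with a guard against repeated scheduling when scheduleTasks() runs on every onSourceTaskCompleted(). It is not code from any TEZ-1494 patch: the class OneShotScheduler, its fields, and getScheduledTasks() are hypothetical names, and it assumes a single source whose first task completion is the only trigger of interest.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in for a vertex manager plugin; not the Tez API. */
class OneShotScheduler {
    private final List<Integer> pendingTasks = new ArrayList<>();
    private final List<Integer> scheduledTasks = new ArrayList<>();

    // One flag replaces the counter plus per-task Boolean array:
    // only the first completion matters (assumption of this sketch).
    private boolean sourceTaskCompleted = false;

    // Guard so that repeated calls do not schedule the same tasks again.
    private boolean tasksScheduled = false;

    OneShotScheduler(int numTasks) {
        for (int i = 0; i < numTasks; i++) {
            pendingTasks.add(i);
        }
    }

    /** Called once per completed source task; safe to call many times. */
    void onSourceTaskCompleted() {
        sourceTaskCompleted = true;
        scheduleTasks();
    }

    /** Private: tests observe behavior through getScheduledTasks() instead. */
    private void scheduleTasks() {
        if (!sourceTaskCompleted || tasksScheduled) {
            return;
        }
        scheduledTasks.addAll(pendingTasks);
        pendingTasks.clear();
        tasksScheduled = true;
    }

    /** Returns a copy of the task ids scheduled so far. */
    List<Integer> getScheduledTasks() {
        return new ArrayList<>(scheduledTasks);
    }
}
```

A test written against this sketch would check only the observable result (which tasks were scheduled, and that each was scheduled once) rather than an internal predicate such as canScheduleTasks(), which is what would allow the scheduling method to stay private.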