[ 
https://issues.apache.org/jira/browse/TEZ-1494?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14112543#comment-14112543
 ] 

Siddharth Seth commented on TEZ-1494:
-------------------------------------

We initialize a vertex once all of the following are met: 1) the initializer 
is complete, 2) the edges are set up, and 3) parallelism is not -1. All three 
conditions would hold for Reducer3, so it would allow Map5 (the dependent 
vertex) to start.
We currently have no way of knowing whether a vertex will change its 
parallelism, and therefore whether we should block for such an operation. 
Alternatively, we would have to update the downstream tasks with the new 
parallelism information, which may be a better way to deal with this, since 
parallelism could potentially change multiple times at a later point.
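
To make the three conditions concrete, here is a minimal sketch of the check 
as described above. It is not the actual Tez code: the class 
VertexInitSketch, the VertexState holder, and the canInitialize method are 
hypothetical names chosen for illustration. It only shows why a vertex such 
as Reducer3 passes the check even if its parallelism might change later.

```java
// Illustrative sketch only; these names are hypothetical and not part of Tez.
class VertexInitSketch {

    /** Hypothetical snapshot of the vertex state the check looks at. */
    static final class VertexState {
        final boolean initializerComplete;
        final boolean edgesSetUp;
        final int parallelism; // -1 means "not yet determined"

        VertexState(boolean initializerComplete, boolean edgesSetUp, int parallelism) {
            this.initializerComplete = initializerComplete;
            this.edgesSetUp = edgesSetUp;
            this.parallelism = parallelism;
        }
    }

    /** True once all three conditions from the comment hold. */
    static boolean canInitialize(VertexState v) {
        return v.initializerComplete && v.edgesSetUp && v.parallelism != -1;
    }

    public static void main(String[] args) {
        // A vertex with a known (but possibly still changeable) parallelism
        // passes the check, so its dependent vertex would be allowed to start.
        VertexState known = new VertexState(true, true, 10);
        // A vertex whose parallelism is still -1 is held back.
        VertexState unknown = new VertexState(true, true, -1);

        System.out.println(canInitialize(known));
        System.out.println(canInitialize(unknown));
    }
}
```

Nothing in this check records whether the parallelism is final, which is the 
gap described above: a value other than -1 is treated as good enough.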

> DAG hangs waiting for ShuffleManager.getNextInput()
> ---------------------------------------------------
>
>                 Key: TEZ-1494
>                 URL: https://issues.apache.org/jira/browse/TEZ-1494
>             Project: Apache Tez
>          Issue Type: Bug
>            Reporter: Rajesh Balamohan
>            Assignee: Rajesh Balamohan
>              Labels: performance
>         Attachments: TEZ-1494-DAG.dot
>
>
> Attaching the DAG and the stack trace of the hung process.  
> Thread 30071: (state = BLOCKED)
>  - sun.misc.Unsafe.park(boolean, long) @bci=0 (Interpreted frame)
>  - java.util.concurrent.locks.LockSupport.park(java.lang.Object) @bci=14, line=186 (Interpreted frame)
>  - java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await() @bci=42, line=2043 (Interpreted frame)
>  - java.util.concurrent.LinkedBlockingQueue.take() @bci=29, line=442 (Interpreted frame)
>  - org.apache.tez.runtime.library.shuffle.common.impl.ShuffleManager.getNextInput() @bci=67, line=610 (Interpreted frame)
>  - org.apache.tez.runtime.library.common.readers.UnorderedKVReader.moveToNextInput() @bci=26, line=176 (Interpreted frame)
>  - org.apache.tez.runtime.library.common.readers.UnorderedKVReader.next() @bci=30, line=117 (Interpreted frame)



--
This message was sent by Atlassian JIRA
(v6.2#6252)
