[jira] [Commented] (TEZ-2421) Deadlock in AM because attempt and vertex locking each other out

2015-05-10 Thread Jeff Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/TEZ-2421?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14537617#comment-14537617
 ] 

Jeff Zhang commented on TEZ-2421:
-

Thanks [~bikassaha]. Committed to master and branch-0.7.

 Deadlock in AM because attempt and vertex locking each other out
 

 Key: TEZ-2421
 URL: https://issues.apache.org/jira/browse/TEZ-2421
 Project: Apache Tez
  Issue Type: Bug
Reporter: Bikas Saha
Assignee: Bikas Saha
Priority: Blocker
 Attachments: TEZ-2421.1.patch, TEZ-2421.2.patch, TEZ-2421.3.patch, 
 TEZ-2421.4.patch


 Ideally locks should be taken one way - either going down or up. Preferably 
 not going up because most such data can be passed in during object 
 construction.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (TEZ-2421) Deadlock in AM because attempt and vertex locking each other out

2015-05-10 Thread Jeff Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/TEZ-2421?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14537609#comment-14537609
 ] 

Jeff Zhang commented on TEZ-2421:
-

Although I couldn't reproduce the deadlock in TestAMRecovery, the approach of 
passing taskSpec and taskLocation through TaskEventScheduleTask LGTM. +1, 
committing soon.



[jira] [Commented] (TEZ-2421) Deadlock in AM because attempt and vertex locking each other out

2015-05-10 Thread Bikas Saha (JIRA)

[ 
https://issues.apache.org/jira/browse/TEZ-2421?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14537054#comment-14537054
 ] 

Bikas Saha commented on TEZ-2421:
-

I think T3 cannot proceed because T1's pending writelock request is going to 
prevent other readlocks from being acquired (otherwise the writelock would 
starve in the presence of a continuous stream of overlapping readlocks).

I think recovery will be fine since during recovery everything is running on 
the central dispatcher and vertex managers are not running (since we don't 
support vertex manager recovery). I have run TestDAGRecovery and TestAMRecovery 
many times and there were no further issues. Before the workaround, both tests 
failed all the time. Yes, TEZ-1019 would provide a better fix.
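
The writer-preference behaviour described above can be checked in isolation. The following is only a minimal sketch, not Tez code: the class and method names are made up for illustration, and it relies on my understanding of the non-fair ReentrantReadWriteLock implementation in OpenJDK (the jstack trace shows NonfairSync), where a new reader backs off if a writer appears to be first in the wait queue. A timed tryLock is used so the demo terminates instead of hanging.

```java
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Hypothetical demo class, not part of Tez.
final class QueuedWriterBlocksReaders {

  /** Returns true if a late reader got the read lock while a writer was queued. */
  static boolean readerAcquiredWhileWriterQueued() {
    final ReentrantReadWriteLock lock = new ReentrantReadWriteLock(); // non-fair by default
    final AtomicBoolean acquired = new AtomicBoolean(false);

    Thread writer = new Thread(() -> {
      lock.writeLock().lock();   // parks behind the reader that already holds the lock
      lock.writeLock().unlock();
    }, "writer");
    Thread lateReader = new Thread(() -> {
      try {
        // The timed tryLock queues behind the writer; the untimed tryLock() would barge.
        if (lock.readLock().tryLock(200, TimeUnit.MILLISECONDS)) {
          acquired.set(true);
          lock.readLock().unlock();
        }
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt();
      }
    }, "late-reader");
    writer.setDaemon(true);
    lateReader.setDaemon(true);

    try {
      lock.readLock().lock();    // an existing reader (this thread) holds the lock
      try {
        writer.start();
        // Wait until the writer is parked in the lock's wait queue.
        while (writer.getState() != Thread.State.WAITING) {
          Thread.sleep(10);
        }
        lateReader.start();
        lateReader.join();
      } finally {
        lock.readLock().unlock(); // lets the queued writer proceed
      }
      writer.join();
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      throw new IllegalStateException(e);
    }
    return acquired.get();
  }

  public static void main(String[] args) {
    System.out.println("late reader acquired: " + readerAcquiredWhileWriterQueued());
  }
}
```

If the assumption holds, the late reader times out even though only a read lock is actually held, which is the situation T3 is in.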



[jira] [Commented] (TEZ-2421) Deadlock in AM because attempt and vertex locking each other out

2015-05-09 Thread TezQA (JIRA)

[ 
https://issues.apache.org/jira/browse/TEZ-2421?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14536935#comment-14536935
 ] 

TezQA commented on TEZ-2421:


{color:green}+1 overall{color}.  Here are the results of testing the latest 
attachment
  http://issues.apache.org/jira/secure/attachment/12731754/TEZ-2421.3.patch
  against master revision ce69aa1.

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 4 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 2.0.3) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in .

Test results: https://builds.apache.org/job/PreCommit-TEZ-Build/658//testReport/
Console output: https://builds.apache.org/job/PreCommit-TEZ-Build/658//console

This message is automatically generated.



[jira] [Commented] (TEZ-2421) Deadlock in AM because attempt and vertex locking each other out

2015-05-09 Thread TezQA (JIRA)

[ 
https://issues.apache.org/jira/browse/TEZ-2421?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14536940#comment-14536940
 ] 

TezQA commented on TEZ-2421:


{color:green}+1 overall{color}.  Here are the results of testing the latest 
attachment
  http://issues.apache.org/jira/secure/attachment/12731755/TEZ-2421.4.patch
  against master revision ce69aa1.

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 4 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 2.0.3) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in .

Test results: https://builds.apache.org/job/PreCommit-TEZ-Build/659//testReport/
Console output: https://builds.apache.org/job/PreCommit-TEZ-Build/659//console

This message is automatically generated.



[jira] [Commented] (TEZ-2421) Deadlock in AM because attempt and vertex locking each other out

2015-05-09 Thread Bikas Saha (JIRA)

[ 
https://issues.apache.org/jira/browse/TEZ-2421?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14536915#comment-14536915
 ] 

Bikas Saha commented on TEZ-2421:
-

bq. I look at the jstack trace, not sure where's the deadlock. App Shared Pool - #1 try to acquire VertexImpl's writelock and no other thread has the writelock except some thread also try to acquire the readlock

Thread 1 holds the V1 readlock and tries to acquire the readlock on V2. Thread 2 
wants the writelock on V1 and is blocked because Thread 1 holds the readlock. 
Thread 3 holds the writelock on V2 and is trying to acquire the readlock on V1, 
which is blocked by Thread 2's pending writelock request. Thus the 3 threads 
have locked each other out. This will repro when TestAMRecovery is run in a 
loop, or by running a large job (especially one with 1-1 edges) in a cluster in 
a loop.

Attaching a patch that fixes the locking issues. Verified by running 
TestAMRecovery etc. in a loop and a large job in the cluster in a loop.
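
The three-thread cycle described above can be reproduced outside Tez with two plain locks. This is a sketch under assumptions, not the AM code: v1 and v2 are stand-ins for the two locks, the class and helper names are hypothetical, and timed tryLock calls replace the blocking lock() calls so that the demo reports the blocked acquisitions and then exits instead of deadlocking. It also assumes the non-fair ReentrantReadWriteLock behaviour in OpenJDK, where a reader queues behind a waiting writer.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Hypothetical demo class, not part of Tez.
final class ThreeThreadLockCycle {

  /** Returns {thread1GotV2Read, thread3GotV1Read}; both false means the cycle formed. */
  static boolean[] run() {
    final ReentrantReadWriteLock v1 = new ReentrantReadWriteLock();
    final ReentrantReadWriteLock v2 = new ReentrantReadWriteLock();
    final CountDownLatch t3HoldsV2 = new CountDownLatch(1);
    final CountDownLatch writerQueued = new CountDownLatch(1);
    final boolean[] result = new boolean[2];

    Thread t3 = new Thread(() -> {
      v2.writeLock().lock();               // Thread 3 holds the V2 writelock
      try {
        t3HoldsV2.countDown();
        writerQueued.await();
        result[1] = tryRead(v1);           // wants V1 readlock, behind Thread 2's request
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt();
      } finally {
        v2.writeLock().unlock();
      }
    }, "T3");
    Thread t2 = new Thread(() -> {
      v1.writeLock().lock();               // Thread 2 waits for the V1 writelock
      v1.writeLock().unlock();
    }, "T2");
    t2.setDaemon(true);
    t3.setDaemon(true);

    try {
      v1.readLock().lock();                // Thread 1 (the caller) holds the V1 readlock
      try {
        t3.start();
        t3HoldsV2.await();
        result[0] = tryRead(v2);           // blocked by Thread 3's writelock on V2
        t2.start();
        while (t2.getState() != Thread.State.WAITING) {
          Thread.sleep(10);                // wait until Thread 2 is parked in V1's queue
        }
        writerQueued.countDown();
        t3.join();
      } finally {
        v1.readLock().unlock();            // break the cycle so the demo can exit
      }
      t2.join();
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      throw new IllegalStateException(e);
    }
    return result;
  }

  /** Timed read-lock attempt that releases immediately on success. */
  static boolean tryRead(ReentrantReadWriteLock lock) throws InterruptedException {
    if (lock.readLock().tryLock(200, TimeUnit.MILLISECONDS)) {
      lock.readLock().unlock();
      return true;
    }
    return false;
  }

  public static void main(String[] args) {
    boolean[] r = run();
    System.out.println("T1 got V2 readlock: " + r[0] + ", T3 got V1 readlock: " + r[1]);
  }
}
```

With blocking lock() calls in place of the timed attempts, none of the three threads would ever make progress.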



[jira] [Commented] (TEZ-2421) Deadlock in AM because attempt and vertex locking each other out

2015-05-09 Thread Jeff Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/TEZ-2421?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14537013#comment-14537013
 ] 

Jeff Zhang commented on TEZ-2421:
-


[~bikassaha] I guess you mean the following scenario:

bq. Thread 1 has V1 readlock acquired and tries to acquire readlock on V2. 
Thread 2 wants to acquire writelock on V1 and is blocked because thread 1 has 
the readlock. Thread 3 has writelock on V2 and is trying to acquire readlock on 
V1 which is blocked due to the pending writelock on Thread 2. 

|| Thread || Owned || Try to acquire ||
| App Shared Pool - #1 (T1) | | Writelock of Vertex |
| TaskSchedulerAppCaller - #0 (T2) | Readlock of Vertex/Task | Readlock of TaskAttempt |
| Dispatcher thread:Central (T3) | Writelock of TaskAttempt | Readlock of Vertex |

I'm still not sure why T3 can't continue: T1 hasn't acquired the Vertex 
writelock yet, so it shouldn't block T3, right?


BTW, the patch may still cause issues in recovery. During recovery, the 
following code in TaskAttempt will still try to acquire the Vertex readlock 
and produce the scenario above. That should be fixable after TEZ-1019. 
{code}
  TaskSpec createRemoteTaskSpec() throws AMUserCodeException {
    TaskSpec baseTaskSpec = task.getBaseTaskSpec();
    if (baseTaskSpec == null) {
      // since recovery does not follow normal transitions, TaskEventScheduleTask
      // is not being honored by the recovery code path. Using this to workaround
      // until recovery is fixed. Calling the non-locking internal method of the vertex
      // to get the taskSpec directly. Since everything happens on the central dispatcher
      // during recovery this is deadlock free for now. TEZ-1019 should remove the need for this.
      baseTaskSpec = ((VertexImpl) vertex).createRemoteTaskSpec(getID().getTaskID().getId());
    }
    return new TaskSpec(getID(),
        baseTaskSpec.getDAGName(), baseTaskSpec.getVertexName(),
        baseTaskSpec.getVertexParallelism(), baseTaskSpec.getProcessorDescriptor(),
        baseTaskSpec.getInputs(), baseTaskSpec.getOutputs(), baseTaskSpec.getGroupInputs());
  }
{code}





[jira] [Commented] (TEZ-2421) Deadlock in AM because attempt and vertex locking each other out

2015-05-08 Thread Bikas Saha (JIRA)

[ 
https://issues.apache.org/jira/browse/TEZ-2421?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14535192#comment-14535192
 ] 

Bikas Saha commented on TEZ-2421:
-

This is happening because the recovery code path directly manipulates the state 
of the vertex/task/attempt instead of following the normal state 
transitions. TEZ-1019 is tracking this but has not yet been committed. I will 
try to fix this.



[jira] [Commented] (TEZ-2421) Deadlock in AM because attempt and vertex locking each other out

2015-05-08 Thread Bikas Saha (JIRA)

[ 
https://issues.apache.org/jira/browse/TEZ-2421?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14535229#comment-14535229
 ] 

Bikas Saha commented on TEZ-2421:
-

At this point, I will need to investigate the recovery logic further for a 
workaround/fix. Since this issue does not always happen, I suggest removing it 
as a blocker for 0.7.0 so that other projects can consume the new APIs. We can 
follow up immediately with 0.7.1 containing a specific fix for this 
issue.



[jira] [Commented] (TEZ-2421) Deadlock in AM because attempt and vertex locking each other out

2015-05-07 Thread Bikas Saha (JIRA)

[ 
https://issues.apache.org/jira/browse/TEZ-2421?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14533734#comment-14533734
 ] 

Bikas Saha commented on TEZ-2421:
-

The main issue is that the attempt takes a lock upwards into the vertex while 
the vertex takes locks downwards into the attempt. One direction has to be 
broken to prevent deadlock. The key culprits are getting the remoteTaskSpec and 
getting the taskLocation.
Instead of the attempt up-calling into the vertex to get these after getting 
scheduled, the vertex now sends these to the task when it schedules the 
task. [~zjffdu] [~sseth] [~hitesh] Please review.
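
The idea of pushing the data down at schedule time can be sketched as follows. This is an illustrative outline only, not the patch: Vertex, Attempt and ScheduleInfo are hypothetical stand-ins (the real change goes through TaskEventScheduleTask on the dispatcher), and strings stand in for the task spec and location. The point it shows is that the attempt never takes the vertex lock, so locks are only acquired going down.

```java
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Hypothetical demo class, not part of Tez.
final class PassDownOnSchedule {

  /** Stand-in for the data the attempt previously fetched from the vertex. */
  static final class ScheduleInfo {
    final String taskSpec;
    final String taskLocation;

    ScheduleInfo(String taskSpec, String taskLocation) {
      this.taskSpec = taskSpec;
      this.taskLocation = taskLocation;
    }
  }

  static final class Attempt {
    private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();
    private ScheduleInfo info;

    /** Called by the vertex; stores what was passed in, no up-call into the vertex. */
    void schedule(ScheduleInfo scheduleInfo) {
      lock.writeLock().lock();
      try {
        this.info = scheduleInfo;
      } finally {
        lock.writeLock().unlock();
      }
    }

    /** Uses only the attempt's own lock and the data handed down earlier. */
    String describe() {
      lock.readLock().lock();
      try {
        return info.taskSpec + "@" + info.taskLocation;
      } finally {
        lock.readLock().unlock();
      }
    }
  }

  static final class Vertex {
    private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();
    private final Attempt attempt = new Attempt();

    /** Lock order is always vertex first, then attempt. */
    Attempt scheduleTask(int taskIndex) {
      lock.writeLock().lock();
      try {
        attempt.schedule(new ScheduleInfo("spec-" + taskIndex, "host-" + taskIndex));
        return attempt;
      } finally {
        lock.writeLock().unlock();
      }
    }
  }

  public static void main(String[] args) {
    System.out.println(new Vertex().scheduleTask(3).describe()); // prints spec-3@host-3
  }
}
```

Because no code path acquires the attempt lock first and the vertex lock second, the circular wait needed for the deadlock cannot form between these two objects.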



[jira] [Commented] (TEZ-2421) Deadlock in AM because attempt and vertex locking each other out

2015-05-07 Thread TezQA (JIRA)

[ 
https://issues.apache.org/jira/browse/TEZ-2421?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14533857#comment-14533857
 ] 

TezQA commented on TEZ-2421:


{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment
  http://issues.apache.org/jira/secure/attachment/12731352/TEZ-2421.3.patch
  against master revision 05f77fe.

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 4 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 2.0.3) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:red}-1 core tests{color}.  The patch failed these unit tests in :
   org.apache.tez.test.TestAMRecovery
  org.apache.tez.test.TestDAGRecovery

Test results: https://builds.apache.org/job/PreCommit-TEZ-Build/655//testReport/
Console output: https://builds.apache.org/job/PreCommit-TEZ-Build/655//console

This message is automatically generated.



[jira] [Commented] (TEZ-2421) Deadlock in AM because attempt and vertex locking each other out

2015-05-07 Thread TezQA (JIRA)

[ 
https://issues.apache.org/jira/browse/TEZ-2421?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14533771#comment-14533771
 ] 

TezQA commented on TEZ-2421:


{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment
  http://issues.apache.org/jira/secure/attachment/12731341/TEZ-2421.2.patch
  against master revision 05f77fe.

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 3 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 2.0.3) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:red}-1 core tests{color}.  The patch failed these unit tests in :
   org.apache.tez.dag.app.dag.impl.TestDAGImpl

Test results: https://builds.apache.org/job/PreCommit-TEZ-Build/653//testReport/
Console output: https://builds.apache.org/job/PreCommit-TEZ-Build/653//console

This message is automatically generated.



[jira] [Commented] (TEZ-2421) Deadlock in AM because attempt and vertex locking each other out

2015-05-07 Thread Jeff Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/TEZ-2421?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14533904#comment-14533904
 ] 

Jeff Zhang commented on TEZ-2421:
-

It causes TestAMRecovery to fail.

{code}
2015-05-08 13:35:25,672 INFO [Dispatcher thread: Central] impl.VertexImpl: Source task attempt completed for vertex: vertex_1431063298340_0001_1_01 [v2] attempt: attempt_1431063298340_0001_1_00_00_0 with state: SUCCEEDED vertexState: RUNNING
2015-05-08 13:35:25,672 ERROR [Dispatcher thread: Central] common.AsyncDispatcher: Error in dispatcher thread
java.lang.NullPointerException
    at org.apache.tez.dag.app.dag.impl.TaskAttemptImpl.createRemoteTaskSpec(TaskAttemptImpl.java:461)
    at org.apache.tez.dag.app.dag.impl.TaskAttemptImpl$ScheduleTaskattemptTransition.transition(TaskAttemptImpl.java:1012)
    at org.apache.tez.dag.app.dag.impl.TaskAttemptImpl$ScheduleTaskattemptTransition.transition(TaskAttemptImpl.java:1)
    at org.apache.hadoop.yarn.state.StateMachineFactory$MultipleInternalArc.doTransition(StateMachineFactory.java:385)
    at org.apache.hadoop.yarn.state.StateMachineFactory.doTransition(StateMachineFactory.java:302)
    at org.apache.hadoop.yarn.state.StateMachineFactory.access$300(StateMachineFactory.java:46)
    at org.apache.hadoop.yarn.state.StateMachineFactory$InternalStateMachine.doTransition(StateMachineFactory.java:448)
    at org.apache.tez.dag.app.dag.impl.TaskAttemptImpl.handle(TaskAttemptImpl.java:673)
    at org.apache.tez.dag.app.dag.impl.TaskAttemptImpl.handle(TaskAttemptImpl.java:1)
    at org.apache.tez.dag.app.DAGAppMaster$TaskAttemptEventDispatcher.handle(DAGAppMaster.java:1920)
    at org.apache.tez.dag.app.DAGAppMaster$TaskAttemptEventDispatcher.handle(DAGAppMaster.java:1)
    at org.apache.tez.common.AsyncDispatcher.dispatch(AsyncDispatcher.java:183)
    at org.apache.tez.common.AsyncDispatcher$1.run(AsyncDispatcher.java:114)
    at java.lang.Thread.run(Thread.java:745)
{code}



[jira] [Commented] (TEZ-2421) Deadlock in AM because attempt and vertex locking each other out

2015-05-05 Thread Bikas Saha (JIRA)

[ 
https://issues.apache.org/jira/browse/TEZ-2421?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14529679#comment-14529679
 ] 

Bikas Saha commented on TEZ-2421:
-

App Shared Pool - #1 #102 daemon prio=5 os_prio=0 tid=0x02426000 nid=0x8bd waiting on condition [0x7fa2a841d000]
   java.lang.Thread.State: WAITING (parking)
    at sun.misc.Unsafe.park(Native Method)
    - parking to wait for 0x0006f58b09c0 (a java.util.concurrent.locks.ReentrantReadWriteLock$NonfairSync)
    at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
    at java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:836)
    at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireQueued(AbstractQueuedSynchronizer.java:870)
    at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquire(AbstractQueuedSynchronizer.java:1199)
    at java.util.concurrent.locks.ReentrantReadWriteLock$WriteLock.lock(ReentrantReadWriteLock.java:943)
    at org.apache.tez.dag.app.dag.impl.VertexImpl.scheduleTasks(VertexImpl.java:1389)
    at org.apache.tez.dag.app.dag.impl.VertexManager$VertexManagerPluginContextImpl.scheduleVertexTasks(VertexManager.java:206)
    - locked 0x0006f58d2c08 (a org.apache.tez.dag.app.dag.impl.VertexManager$VertexManagerPluginContextImpl)
    at org.apache.tez.dag.library.vertexmanager.InputReadyVertexManager.handleSourceTaskFinished(InputReadyVertexManager.java:277)
    at org.apache.tez.dag.library.vertexmanager.InputReadyVertexManager.onSourceTaskCompleted(InputReadyVertexManager.java:198)
    - locked 0x0006f58d2d90 (a org.apache.tez.dag.library.vertexmanager.InputReadyVertexManager)
    at org.apache.tez.dag.app.dag.impl.VertexManager$VertexManagerEventSourceTaskCompleted.invoke(VertexManager.java:601)
    at org.apache.tez.dag.app.dag.impl.VertexManager$VertexManagerEvent$1.run(VertexManager.java:656)
    - locked 0x0006f58d2d30 (a org.apache.tez.dag.app.dag.impl.VertexManager)
    at org.apache.tez.dag.app.dag.impl.VertexManager$VertexManagerEvent$1.run(VertexManager.java:651)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:422)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
    at org.apache.tez.dag.app.dag.impl.VertexManager$VertexManagerEvent.call(VertexManager.java:651)
    at org.apache.tez.dag.app.dag.impl.VertexManager$VertexManagerEvent.call(VertexManager.java:640)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:745)

TaskSchedulerAppCaller #0 #92 daemon prio=5 os_prio=0 tid=0x01884800 nid=0x8af waiting on condition [0x7fa2a9127000]
   java.lang.Thread.State: WAITING (parking)
    at sun.misc.Unsafe.park(Native Method)
    - parking to wait for 0x0007aa165038 (a java.util.concurrent.locks.ReentrantReadWriteLock$NonfairSync)
    at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
    at java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:836)
    at java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireShared(AbstractQueuedSynchronizer.java:967)
    at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireShared(AbstractQueuedSynchronizer.java:1283)
    at java.util.concurrent.locks.ReentrantReadWriteLock$ReadLock.lock(ReentrantReadWriteLock.java:727)
    at org.apache.tez.dag.app.dag.impl.TaskAttemptImpl.getState(TaskAttemptImpl.java:547)
    at org.apache.tez.dag.app.dag.impl.TaskImpl.selectBestAttempt(TaskImpl.java:715)
    at org.apache.tez.dag.app.dag.impl.TaskImpl.getProgress(TaskImpl.java:473)
    at org.apache.tez.dag.app.dag.impl.VertexImpl.computeProgress(VertexImpl.java:1179)
    at org.apache.tez.dag.app.dag.impl.VertexImpl.getProgress(VertexImpl.java:1117)
    at org.apache.tez.dag.app.dag.impl.DAGImpl.getProgress(DAGImpl.java:767)
    at org.apache.tez.dag.app.DAGAppMaster.getProgress(DAGAppMaster.java:1134)
    at org.apache.tez.dag.app.rm.TaskSchedulerEventHandler.getProgress(TaskSchedulerEventHandler.java:556)
    at org.apache.tez.dag.app.rm.TaskSchedulerAppCallbackWrapper$GetProgressCallable.call(TaskSchedulerAppCallbackWrapper.java:291)
    at org.apache.tez.dag.app.rm.TaskSchedulerAppCallbackWrapper$GetProgressCallable.call(TaskSchedulerAppCallbackWrapper.java:282)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at