[jira] [Commented] (TEZ-1661) LocalTaskScheduler hangs when shutdown

2020-07-31 Thread Steve Loughran (Jira)


[ 
https://issues.apache.org/jira/browse/TEZ-1661?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17168587#comment-17168587
 ] 

Steve Loughran commented on TEZ-1661:
-

Just hit this problem in a hadoop-aws test run inside log4j. Funny that on the 
first page of google results, up come my colleagues and other ASF people.

Did anyone ever come up with a root cause for the hang?

> LocalTaskScheduler hangs when shutdown
> --
>
> Key: TEZ-1661
> URL: https://issues.apache.org/jira/browse/TEZ-1661
> Project: Apache Tez
> Issue Type: Bug
> Affects Versions: 0.5.0
> Environment: Local Mode
> Reporter: Oleg Zhurakousky
> Assignee: Jeff Zhang
> Priority: Major
> Fix For: 0.7.0, 0.6.1
>
> Attachments: TEZ-1661-1.patch, TEZ-1661-2.patch
>
>
> LocalTaskScheduler hangs on 'take' from the 'taskRequestQueue ' when 
> TezClient shuts down (e.g., TezClient.stop).
> Below is jstack output observed when running in Tez local mode:
> {code}
> "Thread-53" prio=5 tid=0x7fc876d8f800 nid=0xac07 runnable [0x00011df9]
>    java.lang.Thread.State: RUNNABLE
>     at java.lang.Throwable.fillInStackTrace(Native Method)
>     at java.lang.Throwable.fillInStackTrace(Throwable.java:783)
>     - locked <0x0007b6ce60a0> (a java.lang.InterruptedException)
>     at java.lang.Throwable.<init>(Throwable.java:250)
>     at java.lang.Exception.<init>(Exception.java:54)
>     at java.lang.InterruptedException.<init>(InterruptedException.java:57)
>     at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireInterruptibly(AbstractQueuedSynchronizer.java:1219)
>     at java.util.concurrent.locks.ReentrantLock.lockInterruptibly(ReentrantLock.java:340)
>     at java.util.concurrent.PriorityBlockingQueue.take(PriorityBlockingQueue.java:535)
>     at org.apache.tez.dag.app.rm.LocalTaskSchedulerService$AsyncDelegateRequestHandler.processRequest(LocalTaskSchedulerService.java:310)
>     at org.apache.tez.dag.app.rm.LocalTaskSchedulerService$AsyncDelegateRequestHandler.run(LocalTaskSchedulerService.java:304)
>     at java.lang.Thread.run(Thread.java:745)
> {code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (TEZ-1661) LocalTaskScheduler hangs when shutdown

2015-01-21 Thread Oleg Zhurakousky (JIRA)

[ 
https://issues.apache.org/jira/browse/TEZ-1661?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14286196#comment-14286196
 ] 

Oleg Zhurakousky commented on TEZ-1661:
---

No, I have not tried it with the patch, but if you say you tested it based on 
that example then I am fine.
Thanks



[jira] [Commented] (TEZ-1661) LocalTaskScheduler hangs when shutdown

2015-01-18 Thread Jeff Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/TEZ-1661?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14282055#comment-14282055
 ] 

Jeff Zhang commented on TEZ-1661:
-

[~ozhurakousky] I ran it as an application (not JUnit) and saw the same jstack 
output as you. I have verified that this patch addresses the issue. Do you 
still see the issue with the patch applied?

[~sseth] It should be reproducible. Did you remove the System.exit call in 
WordCount?
{code}
int res = ToolRunner.run(new Configuration(), new WordCount(), args);
System.exit(res);  // remove it
{code}



[jira] [Commented] (TEZ-1661) LocalTaskScheduler hangs when shutdown

2015-01-16 Thread Jeff Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/TEZ-1661?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14280117#comment-14280117
 ] 

Jeff Zhang commented on TEZ-1661:
-

committed to master



[jira] [Commented] (TEZ-1661) LocalTaskScheduler hangs when shutdown

2015-01-16 Thread Oleg Zhurakousky (JIRA)

[ 
https://issues.apache.org/jira/browse/TEZ-1661?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14280507#comment-14280507
 ] 

Oleg Zhurakousky commented on TEZ-1661:
---

Here is the code to reproduce it:
{code}
public static void main(String[] args) throws Exception {
    // The string quotes were lost in the archived copy; "foo" is the client name.
    TezClient client = TezClient.create("foo", new TezConfiguration());
    client.start();
    client.stop();
    System.out.println("Done");
}
{code}

Make sure you run it as a Java application (main) and not as a JUnit test, 
since the JUnit runner will essentially call System.exit.



[jira] [Commented] (TEZ-1661) LocalTaskScheduler hangs when shutdown

2015-01-15 Thread Siddharth Seth (JIRA)

[ 
https://issues.apache.org/jira/browse/TEZ-1661?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14279497#comment-14279497
 ] 

Siddharth Seth commented on TEZ-1661:
-

[~zjffdu] - the patch is required; however, I don't think this thread blocks 
JVM shutdown, since it's a daemon. Is there a way to reproduce this?
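
Whether a thread can block JVM shutdown depends on its daemon flag, which can also be checked from inside the process rather than with jstack. The following is a minimal sketch using only the standard java.lang.Thread API; the class and method names (NonDaemonThreads, liveNonDaemonThreadNames) are hypothetical and not part of Tez.

```java
import java.util.ArrayList;
import java.util.List;

public class NonDaemonThreads {

    /** Returns the names of all live threads that are NOT daemon threads. */
    static List<String> liveNonDaemonThreadNames() {
        List<String> names = new ArrayList<>();
        // getAllStackTraces() returns a snapshot keyed by every live thread.
        for (Thread t : Thread.getAllStackTraces().keySet()) {
            if (t.isAlive() && !t.isDaemon()) {
                names.add(t.getName());
            }
        }
        return names;
    }

    public static void main(String[] args) {
        // Any name printed here (other than "main") keeps the JVM alive
        // after main() returns.
        for (String name : liveNonDaemonThreadNames()) {
            System.out.println("non-daemon: " + name);
        }
    }
}
```

Calling this just before main() returns in a reproduction program would show whether the scheduler's handler thread is among the threads holding the JVM open.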



[jira] [Commented] (TEZ-1661) LocalTaskScheduler hangs when shutdown

2015-01-15 Thread Jeff Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/TEZ-1661?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14279593#comment-14279593
 ] 

Jeff Zhang commented on TEZ-1661:
-

[~sseth] It is not a daemon thread. I also verified this through jstack.
{code}
"Thread-33" prio=5 tid=0x7fb553266800 nid=0x6307 runnable [0x0001153e2000]
   java.lang.Thread.State: RUNNABLE
    at java.lang.Throwable.fillInStackTrace(Native Method)
    at java.lang.Throwable.fillInStackTrace(Throwable.java:783)
    - locked <0x0007b05c6b40> (a java.lang.InterruptedException)
    at java.lang.Throwable.<init>(Throwable.java:250)
    at java.lang.Exception.<init>(Exception.java:54)
    at java.lang.InterruptedException.<init>(InterruptedException.java:57)
    at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireInterruptibly(AbstractQueuedSynchronizer.java:1219)
    at java.util.concurrent.locks.ReentrantLock.lockInterruptibly(ReentrantLock.java:340)
    at java.util.concurrent.PriorityBlockingQueue.take(PriorityBlockingQueue.java:535)
    at org.apache.tez.dag.app.rm.LocalTaskSchedulerService$AsyncDelegateRequestHandler.processRequest(LocalTaskSchedulerService.java:322)
    at org.apache.tez.dag.app.rm.LocalTaskSchedulerService$AsyncDelegateRequestHandler.run(LocalTaskSchedulerService.java:316)
    at java.lang.Thread.run(Thread.java:745)
{code}

bq. Is there a way to reproduce this ?
Adding the following to TezExampleBase.createTezClient and removing the 
System.exit call from WordCount.java reproduces it.
{code}
tezConf.setBoolean(TezConfiguration.TEZ_LOCAL_MODE, true);
tezConf.set("fs.defaultFS", "file:///");
tezConf.setBoolean(
    TezRuntimeConfiguration.TEZ_RUNTIME_OPTIMIZE_LOCAL_FETCH, true);
{code}

Attached a new patch that changes the thread to a daemon thread.
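
For readers unfamiliar with the daemon flag, here is a minimal sketch of the general technique of marking a worker thread as a daemon so that it cannot keep the JVM alive. It is not the content of TEZ-1661-2.patch; the class name DaemonThreadDemo, the helper newHandlerThread, and the thread name are hypothetical.

```java
public class DaemonThreadDemo {

    /** Creates (but does not start) a handler thread with the given daemon flag. */
    static Thread newHandlerThread(Runnable body, boolean daemon) {
        Thread t = new Thread(body, "async-delegate-request-handler");
        // setDaemon must be called before start(); afterwards it throws
        // IllegalThreadStateException.
        t.setDaemon(daemon);
        return t;
    }

    public static void main(String[] args) {
        Runnable blockForever = () -> {
            try {
                Thread.sleep(Long.MAX_VALUE); // stand-in for a blocking take()
            } catch (InterruptedException ignored) {
                // exit quietly on interrupt
            }
        };
        Thread handler = newHandlerThread(blockForever, true);
        handler.start();
        System.out.println("daemon=" + handler.isDaemon());
        // main() returns here. Because the only remaining thread is a daemon,
        // the JVM exits; with daemon=false it would stay alive.
    }
}
```

A daemon thread is abandoned at JVM exit without running cleanup, so this only suits workers that hold no state needing an orderly shutdown.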





[jira] [Commented] (TEZ-1661) LocalTaskScheduler hangs when shutdown

2015-01-15 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/TEZ-1661?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14279646#comment-14279646
 ] 

Hadoop QA commented on TEZ-1661:


{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment
  http://issues.apache.org/jira/secure/attachment/12692646/TEZ-1661-2.patch
  against master revision 2544b05.

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:red}-1 tests included{color}.  The patch doesn't appear to include 
any new or modified tests.
Please justify why no new tests are needed for this patch.
Also please list what manual steps were performed to verify this patch.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:red}-1 findbugs{color}.  The patch appears to introduce 68 new 
Findbugs (version 2.0.3) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in .

Test results: https://builds.apache.org/job/PreCommit-TEZ-Build/42//testReport/
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-TEZ-Build/42//artifact/patchprocess/newPatchFindbugsWarningstez-examples.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-TEZ-Build/42//artifact/patchprocess/newPatchFindbugsWarningstez-dag.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-TEZ-Build/42//artifact/patchprocess/newPatchFindbugsWarningstez-mapreduce.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-TEZ-Build/42//artifact/patchprocess/newPatchFindbugsWarningstez-runtime-internals.html
Console output: https://builds.apache.org/job/PreCommit-TEZ-Build/42//console

This message is automatically generated.



[jira] [Commented] (TEZ-1661) LocalTaskScheduler hangs when shutdown

2015-01-15 Thread Siddharth Seth (JIRA)

[ 
https://issues.apache.org/jira/browse/TEZ-1661?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14279637#comment-14279637
 ] 

Siddharth Seth commented on TEZ-1661:
-

I can't reproduce this locally, but the patch looks good. +1. 



[jira] [Commented] (TEZ-1661) LocalTaskScheduler hangs when shutdown

2015-01-15 Thread Jeff Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/TEZ-1661?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14278574#comment-14278574
 ] 

Jeff Zhang commented on TEZ-1661:
-

asyncDelegateRequestThread in LocalTaskSchedulerService is not stopped when 
DAGAppMaster is shut down in local mode. (It also happens in non-local mode, 
but there we call System.exit when shutting down the Tez AM, so it does not 
hang.) The tez-examples don't hang in local mode because we always call 
System.exit when the job is done, as follows. But it doesn't make sense to 
require users to always do that. Attached a patch addressing this issue. 
[~sseth], [~jeagles] please help review. 
{code}
int res = ToolRunner.run(new Configuration(), new WordCount(), args);
System.exit(res);
{code}
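
The failure mode described above can be sketched in isolation: a non-daemon thread blocked in PriorityBlockingQueue.take() keeps the JVM alive after main() returns unless something interrupts it. This is an illustrative sketch only, not the Tez code or the attached patch; BlockingWorkerDemo, startWorker, and stopWorker are hypothetical names.

```java
import java.util.concurrent.PriorityBlockingQueue;

public class BlockingWorkerDemo {
    // Hypothetical stand-in for the scheduler's request queue.
    static final PriorityBlockingQueue<Integer> queue = new PriorityBlockingQueue<>();

    /** Starts a non-daemon worker that blocks in take() until interrupted. */
    static Thread startWorker() {
        Thread worker = new Thread(() -> {
            try {
                while (!Thread.currentThread().isInterrupted()) {
                    queue.take(); // blocks while the queue is empty
                }
            } catch (InterruptedException e) {
                // Interrupt is the shutdown signal: fall through and exit.
            }
        }, "request-handler");
        // Explicit for clarity: threads otherwise inherit the creator's flag.
        worker.setDaemon(false);
        worker.start();
        return worker;
    }

    /** Interrupts the worker and waits; returns true if it terminated in time. */
    static boolean stopWorker(Thread worker, long timeoutMillis)
            throws InterruptedException {
        worker.interrupt();
        worker.join(timeoutMillis);
        return !worker.isAlive();
    }

    public static void main(String[] args) throws Exception {
        Thread worker = startWorker();
        // Without this call, main() returns but the process never exits,
        // unless the caller falls back on System.exit.
        boolean stopped = stopWorker(worker, 5000);
        System.out.println("worker stopped: " + stopped);
    }
}
```

Interrupting and joining the worker during service shutdown removes the need for callers to end their programs with System.exit.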



[jira] [Commented] (TEZ-1661) LocalTaskScheduler hangs when shutdown

2015-01-15 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/TEZ-1661?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14278503#comment-14278503
 ] 

Hadoop QA commented on TEZ-1661:


{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment
  http://issues.apache.org/jira/secure/attachment/12692481/TEZ-1661-1.patch
  against master revision 61bb0f8.

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:red}-1 tests included{color}.  The patch doesn't appear to include 
any new or modified tests.
Please justify why no new tests are needed for this patch.
Also please list what manual steps were performed to verify this patch.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:red}-1 findbugs{color}.  The patch appears to introduce 260 new 
Findbugs (version 2.0.3) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in .

Test results: https://builds.apache.org/job/PreCommit-TEZ-Build/31//testReport/
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-TEZ-Build/31//artifact/patchprocess/newPatchFindbugsWarningstez-mapreduce.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-TEZ-Build/31//artifact/patchprocess/newPatchFindbugsWarningstez-dag.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-TEZ-Build/31//artifact/patchprocess/newPatchFindbugsWarningstez-runtime-internals.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-TEZ-Build/31//artifact/patchprocess/newPatchFindbugsWarningstez-tests.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-TEZ-Build/31//artifact/patchprocess/newPatchFindbugsWarningstez-examples.html
Console output: https://builds.apache.org/job/PreCommit-TEZ-Build/31//console

This message is automatically generated.



[jira] [Commented] (TEZ-1661) LocalTaskScheduler hangs when shutdown

2015-01-13 Thread Jeff Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/TEZ-1661?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14274956#comment-14274956
 ] 

Jeff Zhang commented on TEZ-1661:
-

[~ozhurakousky] Can you still reproduce this on master?



[jira] [Commented] (TEZ-1661) LocalTaskScheduler hangs when shutdown

2015-01-13 Thread Oleg Zhurakousky (JIRA)

[ 
https://issues.apache.org/jira/browse/TEZ-1661?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14275366#comment-14275366
 ] 

Oleg Zhurakousky commented on TEZ-1661:
---

Yeah, the issue appears to be in _org.apache.tez.client.LocalClient_, which has 
the following method:
{code}
@Override
public void stop() {
  // LocalClients are shared between TezClient and DAGClients, which can cause stop / start / close
  // to be invoked multiple times. If modifying these methods - this should be factored in.
}
{code}
Basically, in *local* mode a call to _TezClient.stop_ results in a call to the 
above method. This means the _LocalTaskSchedulerService.stopService_ method is 
never called, keeping _asyncDelegateRequestThread_ alive indefinitely. 
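
The comment in that method points at the constraint any real stop() has to respect: it may be invoked more than once. One common way to handle that, sketched below under the assumption that a guard flag is acceptable, is to let only the first call do the shutdown work. This is not the LocalClient implementation; IdempotentStopDemo and its members are hypothetical.

```java
import java.util.concurrent.atomic.AtomicBoolean;

public class IdempotentStopDemo {
    private final AtomicBoolean stopped = new AtomicBoolean(false);
    private final Thread handler;
    private int stopActions = 0; // counts how often the real shutdown ran

    IdempotentStopDemo() {
        handler = new Thread(() -> {
            try {
                Thread.sleep(Long.MAX_VALUE); // stand-in for a blocking take()
            } catch (InterruptedException e) {
                // interrupted by stop(): exit the thread
            }
        }, "handler");
        handler.start();
    }

    /** Safe to call repeatedly; only the first call interrupts and joins. */
    void stop() throws InterruptedException {
        if (!stopped.compareAndSet(false, true)) {
            return; // a previous call already performed the shutdown
        }
        stopActions++;
        handler.interrupt();
        handler.join(5000);
    }

    boolean handlerAlive() {
        return handler.isAlive();
    }

    int stopActions() {
        return stopActions;
    }

    public static void main(String[] args) throws Exception {
        IdempotentStopDemo client = new IdempotentStopDemo();
        client.stop();
        client.stop(); // second call is a no-op
        System.out.println("alive=" + client.handlerAlive()
                + " stopActions=" + client.stopActions());
    }
}
```

Note that a second caller returns immediately and does not wait for the first caller's join to finish; callers that need that guarantee would need additional synchronization.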
