[jira] [Commented] (SPARK-4694) Long-run user thread(such as HiveThriftServer2) causes the 'process leak' in yarn-client mode

2014-12-03 Thread Marcelo Vanzin (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-4694?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14233321#comment-14233321
 ] 

Marcelo Vanzin commented on SPARK-4694:
---

To answer your question, you can call System.exit() if you want. It's just 
recommended that you do so only after properly shutting down the SparkContext; 
otherwise Yarn won't report your app status correctly. But it seems your patch 
doesn't use System.exit(), so this is kinda moot.
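
A minimal sketch of the "stop first, exit afterwards" ordering, in plain Java. 
The class StopThenExit, the method runAndStop, and the stopContext hook are 
hypothetical names used only for illustration; the Runnable stands in for 
sc.stop() and is not Spark code.

```java
import java.util.concurrent.Callable;

/** Sketch: stop the context first, then exit with a status that reflects the outcome. */
final class StopThenExit {

    /**
     * Runs the job, always invokes the stop hook, and returns the exit status
     * (0 on success, 1 if the job threw an exception).
     */
    static int runAndStop(Callable<Void> job, Runnable stopContext) {
        int status = 0;
        try {
            job.call();
        } catch (Exception e) {
            status = 1;
        } finally {
            // Stand-in for sc.stop(): runs on both the success and the failure path.
            stopContext.run();
        }
        return status;
    }

    public static void main(String[] args) {
        int status = runAndStop(
            () -> { /* hypothetical job body */ return null; },
            () -> System.out.println("context stopped"));
        // Exit only after the context stand-in has been stopped.
        System.exit(status);
    }
}
```

The point of the finally block is only the ordering: the stop hook runs before 
the process exits, whichever path the job takes.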

> Long-run user thread(such as HiveThriftServer2) causes the 'process leak' in 
> yarn-client mode
> -
>
> Key: SPARK-4694
> URL: https://issues.apache.org/jira/browse/SPARK-4694
> Project: Spark
>  Issue Type: Bug
>  Components: YARN
>Reporter: SaintBacchus
>
> Recently, when I used Yarn HA mode to test HiveThriftServer2, I found that 
> the driver can't exit by itself.
> To reproduce it, do as follows:
> 1. Use Yarn HA mode and set am.maxAttemp = 1 for convenience.
> 2. Kill the active resource manager in the cluster.
> The expected result is that the app simply fails, because maxAttemp was 1.
> But the actual result is that all executors ended while the driver was 
> still there and never exited.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-4694) Long-run user thread(such as HiveThriftServer2) causes the 'process leak' in yarn-client mode

2014-12-03 Thread SaintBacchus (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-4694?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14232724#comment-14232724
 ] 

SaintBacchus commented on SPARK-4694:
-

Thanks for the reply, [~vanzin]. The cause is clear: the scheduler backend 
noticed that the AM had exited, so it called sc.stop to end the driver process, 
but a user thread was still alive and kept the process running.
To fix this, just use System.exit(-1) instead of sc.stop, so that the JVM does 
not wait for all the user threads to finish and exits cleanly.
Can I use System.exit() in Spark code?




[jira] [Commented] (SPARK-4694) Long-run user thread(such as HiveThriftServer2) causes the 'process leak' in yarn-client mode

2014-12-03 Thread Apache Spark (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-4694?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14232778#comment-14232778
 ] 

Apache Spark commented on SPARK-4694:
-

User 'SaintBacchus' has created a pull request for this issue:
https://github.com/apache/spark/pull/3576




[jira] [Commented] (SPARK-4694) Long-run user thread(such as HiveThriftServer2) causes the 'process leak' in yarn-client mode

2014-12-02 Thread Marcelo Vanzin (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-4694?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14231864#comment-14231864
 ] 

Marcelo Vanzin commented on SPARK-4694:
---

I'm not sure I understand the bug or the context, but there must be some code 
that manages both the SparkContext and the HiveThriftServer2 thread. That code 
is responsible for stopping the context and shutting down the HiveThriftServer2 
thread; if it can't do it cleanly because of some deficiency of the API, it can 
use Thread.stop() or some other less kosher approach.
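
As a rough sketch of that ownership idea in plain Java: one object owns both 
the context and the server thread, and its shutdown method stops the first and 
then ends the second. ServerOwner and its members are hypothetical names; the 
flag stands in for sc.stop(), and the sketch assumes the server loop honors 
interruption, which may not hold for HiveThriftServer2.

```java
import java.util.concurrent.atomic.AtomicBoolean;

/** Sketch: one owner stops the context and then shuts down the long-running thread. */
final class ServerOwner {
    private final Thread serverThread;
    private final AtomicBoolean contextStopped = new AtomicBoolean(false);

    ServerOwner() {
        // Stand-in for a long-running server loop that honors interruption.
        serverThread = new Thread(() -> {
            try {
                while (true) {
                    Thread.sleep(50);
                }
            } catch (InterruptedException e) {
                // Interrupted: fall through and let the thread end.
            }
        }, "hypothetical-server");
        serverThread.start();
    }

    /**
     * Stops the context stand-in, interrupts the server thread, and waits up to
     * timeoutMillis for it. Returns true if the server thread has ended.
     */
    boolean shutdown(long timeoutMillis) {
        contextStopped.set(true);   // stand-in for sc.stop()
        serverThread.interrupt();   // cooperative alternative to Thread.stop()
        try {
            serverThread.join(timeoutMillis);
        } catch (InterruptedException e) {
            // Preserve the caller's interrupt status.
            Thread.currentThread().interrupt();
        }
        return !serverThread.isAlive();
    }

    boolean isContextStopped() {
        return contextStopped.get();
    }
}
```

If shutdown returns false, the thread ignored the interrupt, and that is the 
case where a harsher approach would come into play.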

Using {{System.exit()}} is not recommended because there's no way for the 
backend to detect that without severe performance implications. Apps will 
always be reported as "succeeded" when using that approach.




[jira] [Commented] (SPARK-4694) Long-run user thread(such as HiveThriftServer2) causes the 'process leak' in yarn-client mode

2014-12-02 Thread SaintBacchus (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-4694?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14231440#comment-14231440
 ] 

SaintBacchus commented on SPARK-4694:
-

The reason is that Yarn had reported the status to the RM, and the 
YarnClientSchedulerBackend detects that status and stops sc in the function 
'asyncMonitorApplication'.
But HiveThriftServer2 is a long-running user thread. The JVM never exits until 
all non-daemon threads have ended or System.exit() is called.
That is what causes the problem.
The easiest way to resolve it is to use System.exit(0) instead of 
sc.stop in the function 'asyncMonitorApplication'.
But System.exit is not recommended, according to 
https://issues.apache.org/jira/browse/SPARK-4584
Do you have any ideas about this problem? [~vanzin]
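
The JVM rule mentioned above can be sketched in plain Java. DaemonDemo and 
startSleeper are hypothetical names for illustration and have nothing to do 
with the Spark code; the sleeping thread only stands in for a long-running 
user thread.

```java
/** Sketch: only non-daemon (user) threads keep the JVM alive after main returns. */
final class DaemonDemo {

    /** Starts a thread that sleeps until interrupted; the daemon flag is set before start(). */
    static Thread startSleeper(boolean daemon) {
        Thread t = new Thread(() -> {
            try {
                Thread.sleep(Long.MAX_VALUE);
            } catch (InterruptedException e) {
                // Interrupted: let the thread end.
            }
        }, daemon ? "daemon-sleeper" : "user-sleeper");
        t.setDaemon(daemon);
        t.start();
        return t;
    }

    public static void main(String[] args) {
        // With 'true' the JVM exits as soon as main returns.
        // With 'false' the process would linger until the thread ends
        // or System.exit() is called.
        startSleeper(true);
    }
}
```

Passing false to startSleeper in main gives a small-scale version of the 
lingering driver process described in this issue.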
