[jira] [Commented] (SPARK-4694) Long-run user thread(such as HiveThriftServer2) causes the 'process leak' in yarn-client mode
[ https://issues.apache.org/jira/browse/SPARK-4694?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14233321#comment-14233321 ]

Marcelo Vanzin commented on SPARK-4694:
---------------------------------------

To answer your question, you can call System.exit() if you want. It's just recommended that you do it after you properly shut down the SparkContext; otherwise YARN won't report your app status correctly.

But it seems your patch doesn't use System.exit(), so this is kinda moot.

> Long-run user thread(such as HiveThriftServer2) causes the 'process leak' in yarn-client mode
> ---------------------------------------------------------------------------------------------
>
>          Key: SPARK-4694
>          URL: https://issues.apache.org/jira/browse/SPARK-4694
>      Project: Spark
>   Issue Type: Bug
>   Components: YARN
>     Reporter: SaintBacchus
>
> Recently, when I used the YARN HA mode to test HiveThriftServer2, I found a problem: the driver can't exit by itself.
> To reproduce it, do the following:
> 1. Use YARN HA mode and set am.maxAttemp = 1 for convenience.
> 2. Kill the active resource manager in the cluster.
> The expected result is that the application simply fails, because maxAttemp was 1.
> But the actual result is that all executors ended, but the driver was still there and never closed.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org
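The ordering recommended in the comment above (shut down the context first, call System.exit() afterwards) can be illustrated with a minimal Java sketch. Everything here is hypothetical and is not Spark code: `stopThenExit` is an invented helper, the `Runnable` stands in for a call such as `sc.stop()`, and the exit call is passed in as an `IntConsumer` so the ordering can be observed without actually terminating the JVM. Calling the exit step in a `finally` block is an assumption of this sketch, chosen so that a failing stop cannot leave the process running.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.IntConsumer;

public class StopThenExitSketch {

    /**
     * Runs the stop action first, then the exit action.
     * stopContext is a stand-in for something like sc.stop();
     * exit is a stand-in for System::exit.
     */
    public static void stopThenExit(Runnable stopContext, IntConsumer exit, int status) {
        try {
            // Give the context a chance to shut down and report its final state.
            stopContext.run();
        } finally {
            // Exit even if stopping failed, so leftover user threads
            // cannot keep the process alive.
            exit.accept(status);
        }
    }

    public static void main(String[] args) {
        // Record the order of calls instead of really exiting.
        List<String> events = new ArrayList<>();
        stopThenExit(
            () -> events.add("stop"),
            status -> events.add("exit:" + status),
            1);
        System.out.println(events);

        // In a real driver the last argument pair would be System::exit and a status code.
    }
}
```

In an actual program the second argument would be `System::exit`; the recording lambda is only there to make the order of the two steps visible.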
[jira] [Commented] (SPARK-4694) Long-run user thread(such as HiveThriftServer2) causes the 'process leak' in yarn-client mode
[ https://issues.apache.org/jira/browse/SPARK-4694?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14232724#comment-14232724 ]

SaintBacchus commented on SPARK-4694:
-------------------------------------

Thanks for the reply. [~vanzin], the problem is clear: the scheduler backend became aware that the AM had exited, so it called sc.stop to exit the driver process, but a user thread was still alive, and that caused this problem.

To fix this, just use System.exit(-1) instead of sc.stop, so that the JVM does not wait for all the user threads to finish and exits cleanly.

Can I use System.exit() in Spark code?
[jira] [Commented] (SPARK-4694) Long-run user thread(such as HiveThriftServer2) causes the 'process leak' in yarn-client mode
[ https://issues.apache.org/jira/browse/SPARK-4694?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14232778#comment-14232778 ]

Apache Spark commented on SPARK-4694:
-------------------------------------

User 'SaintBacchus' has created a pull request for this issue:
https://github.com/apache/spark/pull/3576
[jira] [Commented] (SPARK-4694) Long-run user thread(such as HiveThriftServer2) causes the 'process leak' in yarn-client mode
[ https://issues.apache.org/jira/browse/SPARK-4694?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14231864#comment-14231864 ]

Marcelo Vanzin commented on SPARK-4694:
---------------------------------------

I'm not sure I understand the bug or the context, but there must be some code that manages both the SparkContext and the HiveThriftServer2 thread. That code is responsible for stopping the context and shutting down the HiveThriftServer2 thread; if it can't do it cleanly because of some deficiency of the API, it can use Thread.stop() or some other less kosher approach.

Using {{System.exit()}} is not recommended because there's no way for the backend to detect that without severe performance implications. Apps will always be reported as "succeeded" when using that approach.
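The idea in the comment above, that the code owning both the context and the server thread is responsible for shutting both down, might look roughly like the following Java sketch. `FakeContext` and `FakeServer` are hypothetical stand-ins, not Spark or Hive classes, and the sketch assumes the server thread responds to interruption; the order (server first, then context) is also an assumption, since the comment does not specify one.

```java
import java.util.concurrent.atomic.AtomicBoolean;

public class CoordinatedShutdownSketch {

    /** Hypothetical stand-in for a context with a stop() method; not Spark's API. */
    public static class FakeContext {
        private final AtomicBoolean stopped = new AtomicBoolean(false);
        public void stop() { stopped.set(true); }
        public boolean isStopped() { return stopped.get(); }
    }

    /** Hypothetical long-running server loop on a non-daemon thread. */
    public static class FakeServer {
        private final Thread thread = new Thread(() -> {
            try {
                while (true) {
                    Thread.sleep(50); // pretend to serve requests
                }
            } catch (InterruptedException e) {
                // Interruption is treated as the shutdown signal; fall through and end.
            }
        }, "fake-server");

        public void start() { thread.start(); }
        public boolean isRunning() { return thread.isAlive(); }

        /** Interrupts the server thread and waits up to timeoutMillis for it to end. */
        public boolean shutdown(long timeoutMillis) throws InterruptedException {
            thread.interrupt();
            thread.join(timeoutMillis);
            return !thread.isAlive();
        }
    }

    /** The owning code stops the server thread first, then the context. */
    public static boolean shutdownAll(FakeServer server, FakeContext context, long timeoutMillis)
            throws InterruptedException {
        boolean serverStopped = server.shutdown(timeoutMillis);
        context.stop();
        return serverStopped;
    }

    public static void main(String[] args) throws InterruptedException {
        FakeServer server = new FakeServer();
        FakeContext context = new FakeContext();
        server.start();
        boolean clean = shutdownAll(server, context, 2000);
        System.out.println("server stopped cleanly: " + clean
            + ", context stopped: " + context.isStopped());
        // With no non-daemon thread left, the JVM exits when main() returns.
    }
}
```

If the server thread ignored interruption, `shutdownAll` would return false after the timeout, and the owning code would have to fall back on a harsher approach, as the comment notes.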
[jira] [Commented] (SPARK-4694) Long-run user thread(such as HiveThriftServer2) causes the 'process leak' in yarn-client mode
[ https://issues.apache.org/jira/browse/SPARK-4694?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14231440#comment-14231440 ]

SaintBacchus commented on SPARK-4694:
-------------------------------------

The reason is that YARN had reported the status to the RM, and YarnClientSchedulerBackend detects that status and stops sc in the function 'asyncMonitorApplication'.

But HiveThriftServer2 is a long-running user thread. The JVM never exits until all non-daemon threads have ended or System.exit() is called, which causes this problem.

The easiest way to resolve this problem is to use System.exit(0) instead of sc.stop in the function 'asyncMonitorApplication'.

But System.exit is not recommended in https://issues.apache.org/jira/browse/SPARK-4584

Do you have any ideas about this problem? [~vanzin]
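The JVM behavior described in the comment above (the process stays up while any non-daemon thread is alive) can be seen in a small, self-contained Java sketch. The class and method names are made up for illustration, and the blocked threads merely stand in for a long-running server thread such as HiveThriftServer2.

```java
import java.util.concurrent.CountDownLatch;

public class DaemonThreadSketch {

    /** Starts a thread that blocks until released; stands in for a long-running server thread. */
    public static Thread startBlockingThread(String name, boolean daemon, CountDownLatch release) {
        Thread t = new Thread(() -> {
            try {
                release.await();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }, name);
        t.setDaemon(daemon); // must be set before start()
        t.start();
        return t;
    }

    /** True if this thread would keep the JVM running after main() returns. */
    public static boolean keepsJvmAlive(Thread t) {
        return t.isAlive() && !t.isDaemon();
    }

    public static void main(String[] args) throws InterruptedException {
        CountDownLatch release = new CountDownLatch(1);
        Thread user = startBlockingThread("user-thread", false, release);
        Thread daemon = startBlockingThread("daemon-thread", true, release);

        System.out.println("user thread keeps JVM alive: " + keepsJvmAlive(user));
        System.out.println("daemon thread keeps JVM alive: " + keepsJvmAlive(daemon));

        // Without this release, the JVM would hang after main() returns because
        // the non-daemon thread is still blocked. Releasing it lets the process end.
        release.countDown();
        user.join();
    }
}
```

Removing the `release.countDown()` call reproduces the hang in miniature: main() returns, but the process stays alive until the non-daemon thread ends or something calls System.exit().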