[jira] [Comment Edited] (SPARK-27812) kubernetes client import non-daemon thread which block jvm exit.
[ https://issues.apache.org/jira/browse/SPARK-27812?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17303304#comment-17303304 ]

Sergey edited comment on SPARK-27812 at 3/17/21, 11:18 AM:
---

It seems I have reproduced this bug in Spark 3.1.1.

If I don't call sparkContext.stop() explicitly, the Spark driver process doesn't terminate even after its main method has completed.

If I don't call sparkContext.stop(), there are two non-daemon threads:
{code:java}
Thread[OkHttp kubernetes.default.svc,5,main]
Thread[OkHttp kubernetes.default.svc Writer,5,main]
{code}
It looks like they prevent the driver JVM process from terminating.

The Spark app is started on Amazon EKS (Kubernetes version 1.17) by _spark-on-k8s-operator: v1beta2-1.2.0-3.0.0_ [https://github.com/GoogleCloudPlatform/spark-on-k8s-operator].

The Spark Docker image is built from the official release of spark-3.1.1 hadoop3.2.

was (Author: kotlov):
It seems, I have reproduced this bug in Spark 3.1.1.

If I don't call sparkContext.stop() explicitly, then a Spark driver process doesn't terminate even after its Main method has been completed.

There are two non-daemon threads, if I don't call sparkContext.stop():
{code:java}
Thread[OkHttp kubernetes.default.svc,5,main]
Thread[OkHttp kubernetes.default.svc Writer,5,main]
{code}
It looks like, it prevents the driver jvm process from terminating.

Spark app is started on Amazon EKS (Kubernetes version - 1.17) by [spark-on-k8s-operator: v1beta2-1.2.0-3.0.0|https://github.com/GoogleCloudPlatform/spark-on-k8s-operator].

Spark docker image is built from the official release of spark-3.1.1 hadoop3.2.

> kubernetes client import non-daemon thread which block jvm exit.
>
> Key: SPARK-27812
> URL: https://issues.apache.org/jira/browse/SPARK-27812
> Project: Spark
> Issue Type: Bug
> Components: Kubernetes, Spark Core
> Affects Versions: 2.4.3, 2.4.4
> Reporter: Henry Yu
> Assignee: Igor Calabria
> Priority: Major
> Fix For: 2.4.5, 3.0.0
>
> I try spark-submit to k8s with cluster mode. Driver pod failed to exit with
> An Okhttp Websocket Non-Daemon Thread.

--
This message was sent by Atlassian Jira (v8.3.4#803005)

To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org
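The diagnosis in the comment above (non-daemon threads still alive after the main method returns) can be checked with plain JDK calls. The following is a minimal sketch and not code from the ticket; the class name `NonDaemonThreads` and the method `liveNonDaemonThreads` are hypothetical names chosen for illustration. It only lists threads and does nothing to stop them.

```java
import java.util.List;
import java.util.stream.Collectors;

public class NonDaemonThreads {

    /**
     * Returns every live, non-daemon thread except the calling thread.
     * Any thread in this list can keep the JVM running after main() returns.
     */
    public static List<Thread> liveNonDaemonThreads() {
        Thread current = Thread.currentThread();
        return Thread.getAllStackTraces().keySet().stream()
                .filter(t -> t.isAlive() && !t.isDaemon() && t != current)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        // Hypothetical usage: call this as the last statement of a driver's
        // main method to see which threads might block JVM exit.
        for (Thread t : liveNonDaemonThreads()) {
            System.out.println(t);
        }
    }
}
```

Called at the end of a driver's main method, a listing like this would be one way to spot threads such as the two OkHttp threads quoted in the comment.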
[jira] [Comment Edited] (SPARK-27812) kubernetes client import non-daemon thread which block jvm exit.
[ https://issues.apache.org/jira/browse/SPARK-27812?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16883716#comment-16883716 ]

Stavros Kontopoulos edited comment on SPARK-27812 at 7/12/19 11:24 AM:
---

There is a general issue with setting a driver SparkUncaughtExceptionHandler, as described here: [https://github.com/apache/spark/pull/24796]. I am also in favor of a handler, though one where we can just sys.exit without running any shutdown hook logic, because of the deadlock issue. Btw, I don't think we should downgrade: we need to move forward, and K8s moves fast, so we need to do the same. The upgrade happened because the client was very old, but JVM exception handling is a pita in general.

was (Author: skonto):
There is an issue in general with setting a driver SparkUncaughtExceptionHandler as described in here: [https://github.com/apache/spark/pull/24796]. I am also in favor of a handler, where we can just sys.exit without running any shutdown hook logic though because of the deadlock issue. Btw I dont think we should downgrade we need to move forward and K8s moves fast so we need to do the same. The upgrade happened because the client was very old.

> kubernetes client import non-daemon thread which block jvm exit.
>
> Key: SPARK-27812
> URL: https://issues.apache.org/jira/browse/SPARK-27812
> Project: Spark
> Issue Type: Bug
> Components: Kubernetes
> Affects Versions: 2.4.3
> Reporter: Henry Yu
> Priority: Major
>
> I try spark-submit to k8s with cluster mode. Driver pod failed to exit with
> An Okhttp Websocket Non-Daemon Thread.
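The comment above asks for a handler that exits without running any shutdown hook logic. In the JDK, `System.exit` runs shutdown hooks while `Runtime.halt` skips them, so one way to read the suggestion is a handler built on `Runtime.halt`. The sketch below rests on that assumption; it is not the change proposed in the linked pull request. The class name, the `haltingHandler` method, and the exit status value are all hypothetical.

```java
import java.util.function.IntConsumer;

public class HaltOnUncaughtException {

    // Illustrative exit status only; not a value taken from Spark.
    public static final int EXIT_CODE = 1;

    /**
     * Builds a handler that logs the error and then invokes the supplied
     * terminate action. The action is injected so the handler can be
     * exercised without actually killing the JVM.
     */
    public static Thread.UncaughtExceptionHandler haltingHandler(IntConsumer halt) {
        return (thread, error) -> {
            try {
                System.err.println("Uncaught exception in " + thread.getName() + ": " + error);
            } finally {
                halt.accept(EXIT_CODE);
            }
        };
    }

    public static void main(String[] args) {
        // This hook is skipped, because Runtime.halt does not run shutdown hooks.
        Runtime.getRuntime().addShutdownHook(
                new Thread(() -> System.out.println("shutdown hook ran")));

        Thread.setDefaultUncaughtExceptionHandler(
                haltingHandler(Runtime.getRuntime()::halt));

        throw new IllegalStateException("simulated failure");
    }
}
```

The trade-off is that anything normally cleaned up in a shutdown hook is skipped as well, which is the price of avoiding a hook that might deadlock.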
[jira] [Comment Edited] (SPARK-27812) kubernetes client import non-daemon thread which block jvm exit.
[ https://issues.apache.org/jira/browse/SPARK-27812?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16883716#comment-16883716 ]

Stavros Kontopoulos edited comment on SPARK-27812 at 7/12/19 11:20 AM:
---

There is a general issue with setting a driver SparkUncaughtExceptionHandler, as described here: [https://github.com/apache/spark/pull/24796]. I am also in favor of a handler, though one where we can just sys.exit without running any shutdown hook logic, because of the deadlock issue. Btw, I don't think we should downgrade: we need to move forward, and K8s moves fast, so we need to do the same. The upgrade happened because the client was very old.

was (Author: skonto):
There is an issue in general with setting a driver SparkUncaughtExceptionHandler as described in here: [https://github.com/apache/spark/pull/24796]. I am also in favor of a handler, where we can just sys.exit without running any shutdown hook logic though because of the deadlock issue.

> kubernetes client import non-daemon thread which block jvm exit.
>
> Key: SPARK-27812
> URL: https://issues.apache.org/jira/browse/SPARK-27812
> Project: Spark
> Issue Type: Bug
> Components: Kubernetes
> Affects Versions: 2.4.3
> Reporter: Henry Yu
> Priority: Major
>
> I try spark-submit to k8s with cluster mode. Driver pod failed to exit with
> An Okhttp Websocket Non-Daemon Thread.
[jira] [Comment Edited] (SPARK-27812) kubernetes client import non-daemon thread which block jvm exit.
[ https://issues.apache.org/jira/browse/SPARK-27812?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16883716#comment-16883716 ]

Stavros Kontopoulos edited comment on SPARK-27812 at 7/12/19 11:19 AM:
---

There is a general issue with setting a driver SparkUncaughtExceptionHandler, as described here: [https://github.com/apache/spark/pull/24796]. I am also in favor of a handler, though one where we can just sys.exit without running any shutdown hook logic, because of the deadlock issue.

was (Author: skonto):
There is an issue in general with setting a driver SparkUncaughtExceptionHandler as described in here: [https://github.com/apache/spark/pull/24796]. I am also in favor of handler and where just sys.exit without running any shutdown hook logic though because of the deadlock issue.

> kubernetes client import non-daemon thread which block jvm exit.
>
> Key: SPARK-27812
> URL: https://issues.apache.org/jira/browse/SPARK-27812
> Project: Spark
> Issue Type: Bug
> Components: Kubernetes
> Affects Versions: 2.4.3
> Reporter: Henry Yu
> Priority: Major
>
> I try spark-submit to k8s with cluster mode. Driver pod failed to exit with
> An Okhttp Websocket Non-Daemon Thread.
[jira] [Comment Edited] (SPARK-27812) kubernetes client import non-daemon thread which block jvm exit.
[ https://issues.apache.org/jira/browse/SPARK-27812?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16883716#comment-16883716 ]

Stavros Kontopoulos edited comment on SPARK-27812 at 7/12/19 11:18 AM:
---

There is an issue in general with setting a driver SparkUncaughtExceptionHandler as described in here: [https://github.com/apache/spark/pull/24796]. I am also in favor of handler and where just sys.exit without running any shutdown hook logic though because of the deadlock issue.

was (Author: skonto):
There is an issue in general with setting a driver SparkUncaughtExceptionHandler as described in here: [https://github.com/apache/spark/pull/24796]. I am also in favor of it and just exit without doing running any shutdown hook logic though.

> kubernetes client import non-daemon thread which block jvm exit.
>
> Key: SPARK-27812
> URL: https://issues.apache.org/jira/browse/SPARK-27812
> Project: Spark
> Issue Type: Bug
> Components: Kubernetes
> Affects Versions: 2.4.3
> Reporter: Henry Yu
> Priority: Major
>
> I try spark-submit to k8s with cluster mode. Driver pod failed to exit with
> An Okhttp Websocket Non-Daemon Thread.
[jira] [Comment Edited] (SPARK-27812) kubernetes client import non-daemon thread which block jvm exit.
[ https://issues.apache.org/jira/browse/SPARK-27812?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16883716#comment-16883716 ]

Stavros Kontopoulos edited comment on SPARK-27812 at 7/12/19 11:16 AM:
---

There is an issue in general with setting a driver SparkUncaughtExceptionHandler as described in here: [https://github.com/apache/spark/pull/24796]. I am also in favor of it and just exit without doing running any shutdown hook logic though.

was (Author: skonto):
There is an issue in general with setting a driver SparkUncaughtExceptionHandler as described in here: [https://github.com/apache/spark/pull/24796] btw.

> kubernetes client import non-daemon thread which block jvm exit.
>
> Key: SPARK-27812
> URL: https://issues.apache.org/jira/browse/SPARK-27812
> Project: Spark
> Issue Type: Bug
> Components: Kubernetes
> Affects Versions: 2.4.3
> Reporter: Henry Yu
> Priority: Major
>
> I try spark-submit to k8s with cluster mode. Driver pod failed to exit with
> An Okhttp Websocket Non-Daemon Thread.
[jira] [Comment Edited] (SPARK-27812) kubernetes client import non-daemon thread which block jvm exit.
[ https://issues.apache.org/jira/browse/SPARK-27812?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16883716#comment-16883716 ]

Stavros Kontopoulos edited comment on SPARK-27812 at 7/12/19 11:12 AM:
---

There is an issue in general with setting a driver SparkUncaughtExceptionHandler as described in here: [https://github.com/apache/spark/pull/24796] btw.

was (Author: skonto):
There is a bigger issue in general with the SparkUncaughtExceptionHandler as described in here: [https://github.com/apache/spark/pull/24796]

> kubernetes client import non-daemon thread which block jvm exit.
>
> Key: SPARK-27812
> URL: https://issues.apache.org/jira/browse/SPARK-27812
> Project: Spark
> Issue Type: Bug
> Components: Kubernetes
> Affects Versions: 2.4.3
> Reporter: Henry Yu
> Priority: Major
>
> I try spark-submit to k8s with cluster mode. Driver pod failed to exit with
> An Okhttp Websocket Non-Daemon Thread.
[jira] [Comment Edited] (SPARK-27812) kubernetes client import non-daemon thread which block jvm exit.
[ https://issues.apache.org/jira/browse/SPARK-27812?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16859710#comment-16859710 ]

Henry Yu edited comment on SPARK-27812 at 6/10/19 5:55 AM:
---

In our private branch, I fixed this, along with potential non-daemon threads introduced by other third-party libraries, by adding SparkUncaughtExceptionHandler to KubernetesClusterSchedulerBackend.

[~dongjoon] The driver pod doesn't exit because there is an uncaught exception: kubernetes-client fails to call its close method, and the non-daemon thread blocks the shutdown hook from being executed.

With SparkUncaughtExceptionHandler, we can catch uncaught user/Spark exceptions and call System.exit, which triggers the shutdown hook.

How about this solution?

was (Author: andrew huali):
In our private branch, I fix this and potential non-daemon thread introduced by other third party lib by adding SparkUncaughtExceptionHandler to KubernetesClusterSchedulerBackend .

[~dongjoon] Driver Pod doesn't exit because there is a uncaught exception in user code, and non-daemon thread block shutdownhook to get executed.

With SparkUncaughtExceptionHandler, we can catch user code exception and Call System.exit which triggers shutdownhook to make things better.

How about this solution?

> kubernetes client import non-daemon thread which block jvm exit.
>
> Key: SPARK-27812
> URL: https://issues.apache.org/jira/browse/SPARK-27812
> Project: Spark
> Issue Type: Bug
> Components: Kubernetes
> Affects Versions: 2.4.3
> Reporter: Henry Yu
> Priority: Major
>
> I try spark-submit to k8s with cluster mode. Driver pod failed to exit with
> An Okhttp Websocket Non-Daemon Thread.
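The approach described in the comment above (catch the uncaught exception, then call System.exit so that shutdown hooks run and the JVM terminates despite lingering non-daemon threads) can be sketched with the JDK alone. This is not Spark's SparkUncaughtExceptionHandler and not the private-branch patch; the class name, the `exitingHandler` method, and the exit status are hypothetical, and the sleeping thread merely stands in for a leftover client thread.

```java
import java.util.function.IntConsumer;

public class ExitOnUncaughtException {

    // Illustrative exit status only; not a value taken from Spark.
    public static final int EXIT_CODE = 1;

    /**
     * Builds a handler that logs the error and then invokes the supplied
     * exit action. Passing System::exit gives the behaviour described in
     * the comment; a test can pass a recording action instead.
     */
    public static Thread.UncaughtExceptionHandler exitingHandler(IntConsumer exit) {
        return (thread, error) -> {
            System.err.println("Uncaught exception in " + thread.getName() + ": " + error);
            exit.accept(EXIT_CODE);
        };
    }

    public static void main(String[] args) {
        Thread.setDefaultUncaughtExceptionHandler(exitingHandler(System::exit));

        // Stand-in for a leftover non-daemon thread, such as a client's
        // websocket thread that was never closed.
        Thread lingering = new Thread(() -> {
            try {
                Thread.sleep(Long.MAX_VALUE);
            } catch (InterruptedException ignored) {
                // Nothing to clean up in this sketch.
            }
        }, "lingering-non-daemon");
        lingering.start();

        // Without the handler, the JVM would stay alive after this exception
        // because "lingering-non-daemon" is still running.
        throw new IllegalStateException("simulated failure in user code");
    }
}
```

Because System.exit runs shutdown hooks, this variant still depends on those hooks completing.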
[jira] [Comment Edited] (SPARK-27812) kubernetes client import non-daemon thread which block jvm exit.
[ https://issues.apache.org/jira/browse/SPARK-27812?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16852402#comment-16852402 ]

Dongjoon Hyun edited comment on SPARK-27812 at 5/30/19 10:06 PM:
---

Thank you for reporting, [~Andrew HUALI] and thank you for the investigation, [~igor.calabria].

Can we move forward to resolve the issue by upgrading the libraries?

was (Author: dongjoon):
Thank you for reporting, [~Andrew HUALI] and thank you for the investigation, [~igor.calabria].

> kubernetes client import non-daemon thread which block jvm exit.
>
> Key: SPARK-27812
> URL: https://issues.apache.org/jira/browse/SPARK-27812
> Project: Spark
> Issue Type: Bug
> Components: Kubernetes
> Affects Versions: 2.4.3
> Reporter: Henry Yu
> Priority: Major
>
> I try spark-submit to k8s with cluster mode. Driver pod failed to exit with
> An Okhttp Websocket Non-Daemon Thread.