[jira] [Commented] (SPARK-26365) spark-submit for k8s cluster doesn't propagate exit code
[ https://issues.apache.org/jira/browse/SPARK-26365?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17691973#comment-17691973 ]

Apache Spark commented on SPARK-26365:

User 'zwangsheng' has created a pull request for this issue:
https://github.com/apache/spark/pull/40118

> spark-submit for k8s cluster doesn't propagate exit code
>
>              Key: SPARK-26365
>              URL: https://issues.apache.org/jira/browse/SPARK-26365
>          Project: Spark
>       Issue Type: Bug
>       Components: Kubernetes, Spark Core, Spark Submit
> Affects Versions: 2.3.2, 2.4.0, 3.0.0, 3.1.0
>         Reporter: Oscar Bonilla
>         Priority: Major
>      Attachments: spark-2.4.5-raise-exception-k8s-failure.patch, spark-3.0.0-raise-exception-k8s-failure.patch
>
> When launching apps using spark-submit in a kubernetes cluster, if the Spark
> application fails (returns exit code = 1 for example), spark-submit will
> still exit gracefully and return exit code = 0.
> This is problematic, since there's no way to know if there's been a problem
> with the Spark application.

--
This message was sent by Atlassian Jira (v8.20.10#820010)

To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-26365) spark-submit for k8s cluster doesn't propagate exit code
[ https://issues.apache.org/jira/browse/SPARK-26365?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17679807#comment-17679807 ]

Mayank Asthana commented on SPARK-26365:

{quote}Spark submit command exit code ($?) as 0 is okay as there is no error in job submission.
{quote}
[jira] [Commented] (SPARK-26365) spark-submit for k8s cluster doesn't propagate exit code
[ https://issues.apache.org/jira/browse/SPARK-26365?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17601827#comment-17601827 ]

Shrikant Prasad commented on SPARK-26365:

Spark submit command exit code ($?) as 0 is okay as there is no error in job submission. It is the job that failed, and we do get that information in the container exit code (1). When job submission fails, we do get a proper exit code. So it doesn't seem to be a bug.
{code:java}
container status:
container name: spark-kubernetes-driver
container image: **
container state: terminated
container started at: 2022-09-08T13:40:39Z
container finished at: 2022-09-08T13:40:43Z
exit code: 1
termination reason: Error
{code}
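As a rough sketch of how a caller could pick up the container exit code mentioned above without relying on the spark-submit status: the driver pod can be queried with kubectl once it has terminated. The function names, the pod name, and the namespace are illustrative assumptions, not part of Spark. The sketch also assumes the driver pod still exists and that its first container is the driver container.

```shell
# Hypothetical helper (not part of Spark): print the exit code of the first
# container in a terminated driver pod. Prints nothing if the container has
# not terminated yet.
#   $1 = driver pod name, $2 = namespace (both supplied by the caller)
driver_pod_exit_code() {
  kubectl get pod "$1" --namespace "$2" \
    -o jsonpath='{.status.containerStatuses[0].state.terminated.exitCode}'
}

# Hypothetical helper: return the driver exit code as this function's status.
# A failed lookup or a missing exit code is reported as failure (1).
exit_with_driver_status() {
  code=$(driver_pod_exit_code "$1" "$2") || return 1
  [ -n "$code" ] || return 1
  return "$code"
}
```

One possible use, assuming the driver pod name was fixed at submission time (for example through spark.kubernetes.driver.pod.name): run spark-submit, then call `exit_with_driver_status my-driver-pod spark` and branch on its status.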
[jira] [Commented] (SPARK-26365) spark-submit for k8s cluster doesn't propagate exit code
[ https://issues.apache.org/jira/browse/SPARK-26365?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17529867#comment-17529867 ]

Mayank Asthana commented on SPARK-26365:

[~oscar.bonilla] Your change looks good. Can you open a pull request on [https://github.com/apache/spark] for an official review?
[jira] [Commented] (SPARK-26365) spark-submit for k8s cluster doesn't propagate exit code
[ https://issues.apache.org/jira/browse/SPARK-26365?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17490154#comment-17490154 ]

Yilun Fan commented on SPARK-26365:

Can anyone review and merge this patch? We had the same problem with exit code.
[jira] [Commented] (SPARK-26365) spark-submit for k8s cluster doesn't propagate exit code
[ https://issues.apache.org/jira/browse/SPARK-26365?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17438174#comment-17438174 ]

Naresh commented on SPARK-26365:

Yes. It's not fixed in 3.x yet. I am using Spark 3.2 and still see the issue.
[jira] [Commented] (SPARK-26365) spark-submit for k8s cluster doesn't propagate exit code
[ https://issues.apache.org/jira/browse/SPARK-26365?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17437830#comment-17437830 ]

Oscar Bonilla commented on SPARK-26365:

I've changed the priority to Major, to see if someone can pick it up and fix it.
[jira] [Commented] (SPARK-26365) spark-submit for k8s cluster doesn't propagate exit code
[ https://issues.apache.org/jira/browse/SPARK-26365?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17437825#comment-17437825 ]

Vivien Brissat commented on SPARK-26365:

Hi [~oscar.bonilla], it is not fixed: I ran tests on version 3.1 and found this Jira issue while looking for a solution to my problem.
[jira] [Commented] (SPARK-26365) spark-submit for k8s cluster doesn't propagate exit code
[ https://issues.apache.org/jira/browse/SPARK-26365?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17437197#comment-17437197 ]

Oscar Bonilla commented on SPARK-26365:

Hi [~Gangishetty], unfortunately I'm not a contributor to the Apache Spark project. I only reported the issue because I found it. You'll have to follow the usual channels if you want this issue prioritized, although I assume it's probably already been fixed in the 3.x branch.
[jira] [Commented] (SPARK-26365) spark-submit for k8s cluster doesn't propagate exit code
[ https://issues.apache.org/jira/browse/SPARK-26365?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17436144#comment-17436144 ]

Naresh commented on SPARK-26365:

[~oscar.bonilla] Any plans to prioritize this issue? This will definitely block the use of Spark with K8s.
[jira] [Commented] (SPARK-26365) spark-submit for k8s cluster doesn't propagate exit code
[ https://issues.apache.org/jira/browse/SPARK-26365?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17422001#comment-17422001 ]

Vivien Brissat commented on SPARK-26365:

Hello,

I have the same issue: in a Kubernetes cluster, spark-submit does not receive the driver's state in cluster deploy mode.

Since this issue is old and has no "official" answer, I am just bumping the subject :)

Thanks in advance. It's a real issue; I don't know why it is "minor".

Regards,
Vivien
[jira] [Commented] (SPARK-26365) spark-submit for k8s cluster doesn't propagate exit code
[ https://issues.apache.org/jira/browse/SPARK-26365?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17224799#comment-17224799 ]

Itay Bittan commented on SPARK-26365:

Thanks [~oscar.bonilla].
We ended up with a temporary solution:
{code:java}
spark-submit .. 2>&1 | tee output.log ; grep -q "exit code: 0" output.log{code}
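A slightly more defensive variant of the grep workaround above, as a sketch only. It assumes the client prints a line of the form "exit code: N" for the driver container, which is based solely on the log excerpts quoted in this thread (they show both "Exit code: 1" and "exit code: 1") and may not hold for every Spark version. The function names are hypothetical.

```shell
# Hypothetical helper: print the last "exit code: N" value found in a log file.
# Matching is case-insensitive because the logs quoted in this thread use both
# "Exit code:" and "exit code:".
driver_exit_code() {
  grep -ioE 'exit code: *[0-9]+' "$1" | tail -n 1 | grep -oE '[0-9]+'
}

# Hypothetical wrapper: run a command, echo its captured output, and return the
# driver exit code parsed from that output instead of the command's own status.
# The body runs in a subshell, so "exit" only sets the function's status.
run_and_propagate() (
  log=$(mktemp) || exit 1
  "$@" >"$log" 2>&1 && rc=0 || rc=$?
  cat "$log"
  code=$(driver_exit_code "$log" || true)
  rm -f "$log"
  # A failed submission keeps its own non-zero status.
  [ "$rc" -eq 0 ] || exit "$rc"
  # No exit code in the output is treated as a failure, not as a success.
  [ -n "$code" ] || exit 1
  exit "$code"
)
```

It would be called as `run_and_propagate spark-submit ..` with the usual arguments. Unlike the tee pipeline, the output is only printed after the command finishes; the trade-off is that the sketch stays within plain POSIX shell features apart from `grep -o`.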
[jira] [Commented] (SPARK-26365) spark-submit for k8s cluster doesn't propagate exit code
[ https://issues.apache.org/jira/browse/SPARK-26365?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17224648#comment-17224648 ]

Oscar Cassetti commented on SPARK-26365:

[~itayb] I have a patch [^spark-3.0.0-raise-exception-k8s-failure.patch] which I tested for spark-3.0.0. It is not pretty but it does the job.

I also have one for v2.4.5 [^spark-2.4.5-raise-exception-k8s-failure.patch]; again, the code is a bit ugly, but I have been using it in production since June.
[jira] [Commented] (SPARK-26365) spark-submit for k8s cluster doesn't propagate exit code
[ https://issues.apache.org/jira/browse/SPARK-26365?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17224569#comment-17224569 ]

Itay Bittan commented on SPARK-26365:

Hi, we are having the same issue. It's critical in a scenario that triggers another job based on the first app success/failure. Any idea for a workaround meanwhile?
[jira] [Commented] (SPARK-26365) spark-submit for k8s cluster doesn't propagate exit code
[ https://issues.apache.org/jira/browse/SPARK-26365?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17135130#comment-17135130 ]

Oscar Cassetti commented on SPARK-26365:

I think I identified the issue but I am not 100% sure how to fix it.

In org/apache/spark/deploy/k8s/submit/KubernetesClientApplication.scala, Client.run only loads the error code but does not check it.

I tried to add the following code, but it does not work because getContainerStatuses() returns an empty list:
{code:java}
    if (waitForAppCompletion) {
      logInfo(s"Waiting for application $appName to finish...")
      watcher.awaitCompletion()
      logInfo(s"Application $appName finished.")
      if (createdDriverPod != null) {
        val statuses = createdDriverPod.getStatus.getContainerStatuses
        logInfo(s"Current Driver pod ${statuses.size()} ")
        statuses.asScala.map(s => parseStatusCode(s))
      }
    } else {
      logInfo(s"Deployed Spark application $appName into Kubernetes.")
    }
  }
}

private def parseStatusCode(containerStatus: ContainerStatus): Unit = {
  val exitCode = containerStatus.getState.getTerminated.getExitCode
  logInfo(s"Container exit $exitCode ")
  if (exitCode != 0) {
    logInfo("Container exited with non zero code")
    throw new SparkException(s"Unexpected container exit status ${exitCode}.")
  }
}
{code}
[~liyinan926], could you advise what should be the right way to handle this?
[jira] [Commented] (SPARK-26365) spark-submit for k8s cluster doesn't propagate exit code
[ https://issues.apache.org/jira/browse/SPARK-26365?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17134862#comment-17134862 ]

Oscar Cassetti commented on SPARK-26365:

I can see the same issue and I think it is due to this

[https://github.com/apache/spark/blob/f535004e14b197ceb1f2108a67b033c052d65bcb/resource-managers/kubernetes/core/src/main/scala/org/apache/spark/deploy/k8s/submit/KubernetesClientApplication.scala#L214]

and the `io.fabric8.kubernetes.client.KubernetesClient`

The watcher

Steps to reproduce:
{code:java}
spark-submit \
  --master k8s://https://172.17.0.2:8443 \
  --deploy-mode cluster \
  --name ocassetti-test \
  --conf spark.executor.instances=2 \
  --conf spark.kubernetes.namespace=spark \
  --conf spark.kubernetes.authenticate.driver.serviceAccountName=spark-sa \
  --py-files https://raw.githubusercontent.com/ocassetti/spark-docker/master/samples/lib.zip \
  --conf spark.kubernetes.pyspark.pythonVersion="3" \
  --files https://raw.githubusercontent.com/ocassetti/spark-docker/master/samples/data.txt \
  --conf spark.kubernetes.container.image=gcr.io/spark-operator/spark-py:v2.4.5 \
  https://raw.githubusercontent.com/ocassetti/spark-docker/master/samples/main.py
{code}
{code:java}
Container name: spark-kubernetes-driver
Container name: spark-kubernetes-driver
Container image: gcr.io/spark-operator/spark-py:v2.4.5
Container state: Terminated
Exit code: 1
20/06/14 00:29:48 INFO submit.Client: Application ocassetti-test finished.
20/06/14 00:29:48 INFO util.ShutdownHookManager: Shutdown hook called
20/06/14 00:29:48 INFO util.ShutdownHookManager: Deleting directory /tmp/spark-3924793f-9b83-4361-9491-c858f26ae9e0
{code}
[jira] [Commented] (SPARK-26365) spark-submit for k8s cluster doesn't propagate exit code
[ https://issues.apache.org/jira/browse/SPARK-26365?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17094932#comment-17094932 ]

Lorenzo Pisani commented on SPARK-26365:

I'm also seeing this behavior specifically with a "cluster" deploy mode. The driver pod is failing properly but the pod that executed spark-submit is exiting with a status code of 0. This makes it very difficult to monitor the job and detect failures.
[jira] [Commented] (SPARK-26365) spark-submit for k8s cluster doesn't propagate exit code
[ https://issues.apache.org/jira/browse/SPARK-26365?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16802887#comment-16802887 ]

Maxime Nannan commented on SPARK-26365:

I had a similar problem and I've created a docker image if you want to reproduce it.

The docker image was created with the following script:
{code:java}
curl http://mirror.easyname.ch/apache/spark/spark-2.4.0/spark-2.4.0-bin-hadoop2.7.tgz | tar -xz
echo "raise RuntimeError()" > test.py
cp test.py spark-2.4.0-bin-hadoop2.7/python/lib
cd spark-2.4.0-bin-hadoop2.7
./bin/docker-image-tool.sh -t issue_exit_code -r mnannanph/spark build
./bin/docker-image-tool.sh -t issue_exit_code -r mnannanph/spark push{code}
The docker image is public on dockerhub [https://cloud.docker.com/repository/docker/mnannanph/spark-py] and can be pulled by running
{noformat}
docker pull mnannanph/spark-py:issue_exit_code{noformat}
If you run the following command on a kubernetes cluster that can run spark on kubernetes
{code:java}
MASTER_URL= # Fill with your kubernetes master ip
bin/spark-submit \
  --master ${MASTER_URL} \
  --deploy-mode cluster \
  --conf spark.kubernetes.authenticate.driver.serviceAccountName=spark \
  --name spark-test \
  --conf spark.executor.instances=1 \
  --conf spark.kubernetes.container.image=mnannanph/spark-py:issue_exit_code \
  /opt/spark/python/lib/test.py{code}
you will obtain an exit code of 0 even though the Spark application has failed. The script executed raises an error, so the exit code should be 1.

test.py is just a one-line python script with this command
{code:java}
raise RuntimeError()
{code}
Find below one part of the driver pod logs:
{code:java}
2019-03-27 14:38:43 WARN NativeCodeLoader:62 - Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Traceback (most recent call last):
  File "/opt/spark/python/lib/test.py", line 1, in <module>
    raise RuntimeError()
RuntimeError
2019-03-27 14:38:44 INFO ShutdownHookManager:54 - Shutdown hook called
2019-03-27 14:38:44 INFO ShutdownHookManager:54 - Deleting directory /tmp/spark-9d85097e-c848-4c29-8e67-a0d0c0cdeb71
{code}
Hope this will help.
[jira] [Commented] (SPARK-26365) spark-submit for k8s cluster doesn't propagate exit code
[ https://issues.apache.org/jira/browse/SPARK-26365?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16774789#comment-16774789 ]

Udbhav Agrawal commented on SPARK-26365:

Can you specify some scenarios? I tried the same and I am able to get correct exit codes.