Re: spark-submit exit status on k8s

2020-04-05 Thread Yinan Li
Not sure if you are aware of this new feature in Airflow: https://issues.apache.org/jira/browse/AIRFLOW-6542. It's a way to use Airflow to orchestrate Spark applications that run via the Spark K8s operator (https://github.com/GoogleCloudPlatform/spark-on-k8s-operator). On Sun, Apr 5, 2020 at 8:25
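Since this thread is about detecting job success from spark-submit's exit status, here is a minimal sketch of the operator-based alternative: poll the SparkApplication object's state rather than the exit code. The application name and status field path follow the operator's CRD and should be treated as assumptions for your operator version.

```sh
# Hedged sketch: read the application state from the operator's
# SparkApplication object instead of relying on spark-submit's exit code.
# "spark-pi" is a hypothetical application name.
kubectl get sparkapplication spark-pi \
  -o jsonpath='{.status.applicationState.state}'
# Prints e.g. RUNNING, COMPLETED, or FAILED, which a scheduler such as
# Airflow can poll to determine success or failure.
```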

Re: Spark Kubernetes Architecture: Deployments vs Pods that create Pods

2019-01-29 Thread Yinan Li
Hi Wilson, The behavior of a Deployment doesn't fit the way Spark executor pods are run and managed. For example, executor pods are created and deleted dynamically at the driver's request, and they normally run to completion. A Deployment assumes uniformity and statelessness of the
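A quick sketch of what this looks like on a live cluster: executor pods are plain pods owned by the driver pod, not by a Deployment or ReplicaSet. The spark-role label is the one Spark on K8s sets on its pods.

```sh
# Executor pods are created directly by the driver, so there is no
# Deployment/ReplicaSet managing them:
kubectl get pods -l spark-role=executor
# Each executor pod's ownerReference points at the driver pod, so deleting
# the driver garbage-collects the executors:
kubectl get pod <executor-pod> \
  -o jsonpath='{.metadata.ownerReferences[0].name}'
```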

Re: [K8S] Option to keep the executor pods after job finishes

2018-10-09 Thread Yinan Li
There is currently no such option. But this has been raised before in https://issues.apache.org/jira/browse/SPARK-25515. On Tue, Oct 9, 2018 at 2:17 PM Li Gao wrote: > Hi, > > Is there an option to keep the executor pods on k8s after the job > finishes? We want to extract the logs and stats
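A hedged workaround sketch, since executor pods are deleted on completion: stream their logs off-cluster while the job is still running.

```sh
# Tail every executor pod's logs into local files while the job runs,
# so nothing is lost when the pods are deleted at completion.
for pod in $(kubectl get pods -l spark-role=executor -o name); do
  kubectl logs -f "$pod" > "$(basename "$pod").log" &
done
wait
```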

Re: Spark 2.3.1: k8s driver pods stuck in Initializing state

2018-09-26 Thread Yinan Li
The spark-init ConfigMap is used by the init-container that is responsible for downloading remote dependencies. The k8s submission client run by spark-submit should create the ConfigMap and add a ConfigMap volume to the driver pod. Can you provide the command you used to run the job? On Wed, Sep
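A sketch of how one might debug a driver pod stuck in Initializing. The ConfigMap and init-container names below are illustrative; Spark 2.3 derives them from the application name.

```sh
# Check that the submission client actually created the init ConfigMap:
kubectl get configmaps | grep init
# Inspect the Init Containers section for the stuck pod:
kubectl describe pod <driver-pod>
# Pull the init-container's own logs (container name assumed "spark-init"):
kubectl logs <driver-pod> -c spark-init
```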

Re: Python kubernetes spark 2.4 branch

2018-09-25 Thread Yinan Li
Can you give more details on how you ran your app? Did you build your own image, and which image are you using? On Tue, Sep 25, 2018 at 10:23 AM Garlapati, Suryanarayana (Nokia - IN/Bangalore) wrote: > Hi, > > I am trying to run the Spark Python test cases on k8s based on tag > spark-2.4-rc1. When
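For context, a sketch of building the images from a Spark 2.4 checkout with the bundled tool; the repository name and tag are placeholders.

```sh
# Build and push the JVM, PySpark, and SparkR images from a Spark 2.4
# distribution; <your-repo> and the tag are illustrative.
./bin/docker-image-tool.sh -r <your-repo> -t 2.4-rc1 build
./bin/docker-image-tool.sh -r <your-repo> -t 2.4-rc1 push
# Python apps should then use the PySpark image:
#   --conf spark.kubernetes.container.image=<your-repo>/spark-py:2.4-rc1
```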

Re: [K8S] Spark initContainer custom bootstrap support for Spark master

2018-08-16 Thread Yinan Li
Yes, the init-container has been removed in the master branch. The init-container was used in 2.3.x only for downloading remote dependencies, which is now handled by running spark-submit in the driver. If you need to run custom bootstrap scripts using an init-container, the best option would be to
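The message is truncated here, so the following is an assumption, not Yinan's stated recommendation: one common way to run custom bootstrap logic without an init-container is to bake it into a derived image that wraps the stock entrypoint.

```sh
# Assumption: derive a custom image that runs bootstrap.sh before handing
# off to the Spark image's normal /opt/entrypoint.sh. Names illustrative.
cat > Dockerfile <<'EOF'
FROM <your-spark-image>
COPY bootstrap.sh /opt/bootstrap.sh
ENTRYPOINT ["/bin/sh", "-c", "/opt/bootstrap.sh && exec /opt/entrypoint.sh \"$@\"", "--"]
EOF
docker build -t my-spark-custom .
```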

Re: Kubernetes security context when submitting job through k8s servers

2018-07-09 Thread Yinan Li
… I really want to try out this feature. On Mon, Jul 9, 2018 at 4:48 PM Yinan Li wrote: > Spark on k8s currently doesn't support specifying a custom > SecurityContext of the driver/executor pods. This will be supported by the > solution to https://issues.apache.org/jira/browse/SPARK-24434

Re: Kubernetes security context when submitting job through k8s servers

2018-07-09 Thread Yinan Li
Spark on k8s currently doesn't support specifying a custom SecurityContext for the driver/executor pods. This will be supported by the solution to https://issues.apache.org/jira/browse/SPARK-24434. On Mon, Jul 9, 2018 at 2:06 PM trung kien wrote: > Dear all, > > Is there any way to include
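A sketch of the pod template mechanism proposed in SPARK-24434, as it landed in later Spark releases (the podTemplateFile configs below exist from Spark 3.0 onward; values here are illustrative):

```sh
# Put the desired securityContext in a pod template file and point the
# driver and executors at it.
cat > driver-template.yaml <<'EOF'
apiVersion: v1
kind: Pod
spec:
  securityContext:
    runAsUser: 1000
    fsGroup: 2000
EOF
spark-submit ... \
  --conf spark.kubernetes.driver.podTemplateFile=driver-template.yaml \
  --conf spark.kubernetes.executor.podTemplateFile=driver-template.yaml
```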

Re: Spark 2.3 driver pod stuck in Running state — Kubernetes

2018-06-08 Thread Yinan Li
Yes, it looks like there are not enough resources to run the executor pods. Have you seen pending executor pods? On Fri, Jun 8, 2018, 11:49 AM Thodoris Zois wrote: > As far as I know from Mesos with Spark, it is a running state and not a > pending one. What you see is normal, but if
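A sketch of confirming the resource-shortage diagnosis; the event text in the comment is a typical example, not guaranteed output.

```sh
# Check whether executors are stuck Pending and why:
kubectl get pods -l spark-role=executor
kubectl describe pod <executor-pod> | tail -n 20
# A scheduling failure typically shows up in Events, e.g.:
#   Warning  FailedScheduling  0/3 nodes are available: 3 Insufficient cpu.
```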

Re: [Spark on Google Kubernetes Engine] Properties File Error

2018-04-30 Thread Yinan Li
See https://spark.apache.org/docs/latest/running-on-kubernetes.html. On Mon, Apr 30, 2018 at 12:09 PM, Yinan Li <liyinan...@gmail.com> wrote: > Which version of Spark are you using to run spark-submit, and which > version of Spark is your container image based on? This looks to be caused > by mismatch

Re: [Spark on Google Kubernetes Engine] Properties File Error

2018-04-30 Thread Yinan Li
Which version of Spark are you using to run spark-submit, and which version of Spark is your container image based on? This looks to be caused by mismatched versions of Spark used for spark-submit and for the driver/executor at runtime. On Mon, Apr 30, 2018 at 12:00 PM, Holden Karau
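A quick sketch of checking both versions; the in-image path assumes the standard Spark image layout under /opt/spark.

```sh
# Version of the local spark-submit:
bin/spark-submit --version
# Version baked into the container image (<image> is your image):
docker run --rm --entrypoint /opt/spark/bin/spark-submit <image> --version
# The two reported versions should match exactly.
```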

Re: Spark Kubernetes Volumes

2018-04-12 Thread Yinan Li
Hi Marius, Spark on Kubernetes does not yet support mounting user-specified volumes natively. But mounting volumes is supported in https://github.com/GoogleCloudPlatform/spark-on-k8s-operator. Please see
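A hedged sketch of volume mounting via the operator's SparkApplication CRD. Field names follow the operator's API; the hostPath volume, image, and all names are illustrative.

```sh
kubectl apply -f - <<'EOF'
apiVersion: sparkoperator.k8s.io/v1beta2
kind: SparkApplication
metadata:
  name: spark-pi
spec:
  type: Scala
  mode: cluster
  image: spark:2.3.0
  mainClass: org.apache.spark.examples.SparkPi
  mainApplicationFile: local:///opt/spark/examples/jars/spark-examples_2.11-2.3.0.jar
  sparkVersion: "2.3.0"
  volumes:
    - name: data
      hostPath:
        path: /tmp/data
  driver:
    cores: 1
    memory: 512m
    volumeMounts:
      - name: data
        mountPath: /data
  executor:
    instances: 1
    memory: 512m
    volumeMounts:
      - name: data
        mountPath: /data
EOF
```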

Re: Spark on Kubernetes (minikube) 2.3 fails with class not found exception

2018-04-10 Thread Yinan Li
The example jar path should be local:///opt/spark/examples/jars/spark-examples_2.11-2.3.0.jar. On Tue, Apr 10, 2018 at 1:34 AM, Dmitry wrote: > Hello, I spent a lot of time trying to find what I did wrong, but found > nothing. I have a minikube Windows-based cluster (Hyper-V as
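A minimal sketch of running the 2.3.0 SparkPi example on minikube with the corrected local:// path, following the Spark 2.3 docs; the image name is illustrative.

```sh
bin/spark-submit \
  --master k8s://https://$(minikube ip):8443 \
  --deploy-mode cluster \
  --name spark-pi \
  --class org.apache.spark.examples.SparkPi \
  --conf spark.executor.instances=2 \
  --conf spark.kubernetes.container.image=spark:2.3.0 \
  local:///opt/spark/examples/jars/spark-examples_2.11-2.3.0.jar
```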

Re: Scala program to spark-submit on k8 cluster

2018-04-04 Thread Yinan Li
Hi Kittu, What do you mean by "a Scala program"? Do you mean a program that submits a Spark job to a k8s cluster by running spark-submit programmatically, or an example Scala application that runs on the cluster? On Wed, Apr 4, 2018 at 4:45 AM, Kittu M wrote: > Hi,

Re: Rest API for Spark2.3 submit on kubernetes(version 1.8.*) cluster

2018-03-20 Thread Yinan Li
One option is the Spark Operator. It allows specifying and running Spark applications on Kubernetes using Kubernetes custom resource objects. It takes SparkApplication CRD objects and automatically submits the applications to run on a
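A sketch of the workflow the operator enables (a fuller SparkApplication manifest appears under the volumes thread above); spark-pi.yaml is a hypothetical file containing a SparkApplication object.

```sh
kubectl apply -f spark-pi.yaml               # submit by creating the CRD object
kubectl get sparkapplications                # list applications and their state
kubectl describe sparkapplication spark-pi   # events, driver info, status
```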

Re: Spark 2.3 submit on Kubernetes error

2018-03-11 Thread Yinan Li
Spark on Kubernetes requires the kube-dns add-on to be properly configured. The executors connect to the driver through a headless Kubernetes service, using the DNS name of the service. Can you check if you have the add-on installed in your cluster? This issue
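A sketch of verifying the add-on and cluster DNS; the service name in the lookup is a placeholder.

```sh
# Check that kube-dns is installed and healthy:
kubectl get pods -n kube-system -l k8s-app=kube-dns
# Test DNS resolution of a headless service from inside the cluster:
kubectl run -it --rm dns-test --image=busybox --restart=Never -- \
  nslookup <driver-service-name>.<namespace>.svc.cluster.local
```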

Re: handling Remote dependencies for spark-submit in spark 2.3 with kubernetes

2018-03-08 Thread Yinan Li
One thing to note: you may need the S3 credentials in the init-container unless you use a publicly accessible URL. If credentials are needed, you can either create a Kubernetes secret and use the Spark config option for mounting secrets (secrets will be mounted into the init-container as well
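A hedged sketch using the Spark 2.3 secret-mounting configs; the secret name, keys, and mount path are illustrative.

```sh
# Create a secret holding S3 credentials:
kubectl create secret generic s3-creds \
  --from-literal=AWS_ACCESS_KEY_ID=... \
  --from-literal=AWS_SECRET_ACCESS_KEY=...
# Mount it into the driver and executor pods (and their init-containers):
spark-submit ... \
  --conf spark.kubernetes.driver.secrets.s3-creds=/etc/s3 \
  --conf spark.kubernetes.executor.secrets.s3-creds=/etc/s3
```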

Re: Spark on K8s - using files fetched by init-container?

2018-02-26 Thread Yinan Li
The files specified through --files are localized by the init-container to /var/spark-data/spark-files by default. So in your case, the file should be located at /var/spark-data/spark-files/flights.csv locally in the container. On Mon, Feb 26, 2018 at 10:51 AM, Jenna Hoole
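A sketch of the flow being described; the remote URL is a placeholder for wherever flights.csv is actually hosted.

```sh
# The init-container downloads --files dependencies into the default
# spark-files directory of each container:
spark-submit ... \
  --files https://example.com/flights.csv \
  ...
# Inside the driver/executor containers the file is then readable at:
#   /var/spark-data/spark-files/flights.csv
```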

Re: Spark on K8s with Romana

2018-02-12 Thread Yinan Li
We actually moved away from using the driver pod IP because of https://github.com/apache-spark-on-k8s/spark/issues/482. The way this currently works is that the driver URL is constructed from the value of "spark.driver.host", which is set to the DNS name of the headless driver service in the
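A sketch of spotting the headless driver service; the "-driver-svc" suffix is how the Spark 2.3 submission client names it, but treat that as an assumption for your version.

```sh
# The headless service fronting the driver:
kubectl get svc | grep driver-svc
# Executors reach the driver at the service's cluster DNS name, e.g.:
#   <name>-driver-svc.<namespace>.svc.cluster.local
```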

Announcing Spark on Kubernetes release 0.5.0

2017-11-01 Thread Yinan Li
The Spark on Kubernetes development community is pleased to announce release 0.5.0 of Apache Spark with Kubernetes as a native scheduler back-end! This release includes a few bug fixes and the following features: Spark R support; Kubernetes 1.8 support; mounts emptyDir volumes for