Re: [k8s] Spark operator (the Java one)

2019-10-16 Thread Yinan Li
completed. > > *) It represents an additional channel for exposing kube-specific > features, that might otherwise need to be plumbed through spark-submit or > the k8s backend. > > Cheers, > Erik > > On Thu, Oct 10, 2019 at 9:23 PM Yinan Li wrote: > >> +1. This a

Re: [k8s] Spark operator (the Java one)

2019-10-10 Thread Yinan Li
+1. This and the GCP Spark Operator, although being very useful for k8s users, are not something needed by all Spark users, not even by all Spark on k8s users. On Thu, Oct 10, 2019 at 6:34 PM Stavros Kontopoulos < stavros.kontopou...@lightbend.com> wrote: > Hi all, > > I also left a comment on t

Re: [VOTE][SPARK-25299] SPIP: Shuffle Storage API

2019-06-17 Thread Yinan Li
+1 (non-binding) On Mon, Jun 17, 2019 at 1:58 PM Ryan Blue wrote: > +1 (non-binding) > > On Sun, Jun 16, 2019 at 11:11 PM Dongjoon Hyun > wrote: > >> +1 >> >> Bests, >> Dongjoon. >> >> >> On Sun, Jun 16, 2019 at 9:41 PM Saisai Shao >> wrote: >> >>> +1 (binding) >>> >>> Thanks >>> Saisai >>> >>

Re: [VOTE] [SPARK-24615] SPIP: Accelerator-aware Scheduling

2019-03-01 Thread Yinan Li
+1 On Fri, Mar 1, 2019 at 12:37 PM Tom Graves wrote: > +1 for the SPIP. > > Tom > > On Friday, March 1, 2019, 8:14:43 AM CST, Xingbo Jiang < > jiangxb1...@gmail.com> wrote: > > > Hi all, > > I want to call for a vote of SPARK-24615 > . It improv

Re: [DISCUSS][K8S][TESTS] Include Kerberos integration tests for Spark 2.4

2018-10-16 Thread Yinan Li
Yep, the Kerberos support for k8s is in the master but not in branch-2.4. I see no reason to get the integration tests into 2.4, which depend on the feature in the master. On Tue, Oct 16, 2018 at 9:32 AM Rob Vesse wrote: > Right now the Kerberos support for Spark on K8S is only on master AFAICT

Re: [DISCUSS][K8S] Local dependencies with Kubernetes

2018-10-08 Thread Yinan Li
rst step would be to get the > basics working and then look at the HA aspect. Although if the above > theoretical approach is practical that could simply be part of restarting > the driver. > > > > Rob > > > > > > *From: *Felix Cheung > *Date: *Sunday, 7 O

Re: [DISCUSS][K8S] Local dependencies with Kubernetes

2018-10-05 Thread Yinan Li
> Just to be clear: in client mode things work right? (Although I'm not really familiar with how client mode works in k8s - never tried it.) If the driver runs on the submission client machine, yes, it should just work. If the driver runs in a pod, however, it faces the same problem as in cluster

Re: [DISCUSS][K8S] Local dependencies with Kubernetes

2018-10-05 Thread Yinan Li
Agreed with Marcelo that this is not a problem unique to Spark on k8s. For a lot of organizations, hosting dependencies on HDFS seems to be the choice. One thing the Spark Operator does is automatically upload ap

Re: Python kubernetes spark 2.4 branch

2018-09-25 Thread Yinan Li
Can you give more details on how you ran your app, did you build your own image, and which image are you using? On Tue, Sep 25, 2018 at 10:23 AM Garlapati, Suryanarayana (Nokia - IN/Bangalore) wrote: > Hi, > > I am trying to run spark python testcases on k8s based on tag > spark-2.4-rc1. When th

Re: [DISCUSS][K8S] Supporting advanced pod customisation

2018-09-19 Thread Yinan Li
Thanks for bringing this up. My opinion is that this feature is really targeting advanced use cases that need more customization than what the basic k8s-related Spark config properties offer. So I think it's fair to assume that users who would like to use this feature know the risks and are respons

Re: [VOTE] SPARK 2.4.0 (RC1)

2018-09-18 Thread Yinan Li
FYI: SPARK-23200 has been resolved. On Tue, Sep 18, 2018 at 8:49 AM Felix Cheung wrote: > If we could work on this quickly - it might get on to future RCs. > > > > -- > *From:* Stavros Kontopoulos > *Sent:* Monday, September 17, 2018 2:35 PM

Re: [VOTE] SPARK 2.4.0 (RC1)

2018-09-17 Thread Yinan Li
We can merge the PR and get SPARK-23200 resolved if the whole point is to make streaming on k8s work first. But given that this is not a blocker for 2.4, I think we can take a bit more time here and get it right. With that being said, I would expect it to be resolved soon. On Mon, Sep 17, 2018 at

Spark on Kubernetes plan for 2.4

2018-05-30 Thread Yinan Li
On behalf of folks who work on Spark on Kubernetes, I would like to share a doc on the plan for Spark on Kubernetes features and changes for the upcoming 2.4 release. Please take a look if you are interested. Fe

Re: [Kubernetes] Resource requests and limits for Driver and Executor Pods

2018-03-30 Thread Yinan Li
oes. >> >> The only remaining question would then be what a sensible default for >> *spark.kubernetes.executor.cores >> *would be. Seeing that I wanted more than 1 and Yinan wants less, >> leaving it at 1 might be best. >> >> >> >> Thanks, >

Re: [Kubernetes] Resource requests and limits for Driver and Executor Pods

2018-03-30 Thread Yinan Li
PR #20553 is more for allowing users to use a fractional value for CPU requests. The existing spark.executor.cores is sufficient for specifying more than one CPU. > One way to solve this could be to request more than 1 core from Kubernetes per task. Th
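As background on what a fractional CPU request looks like: Kubernetes accepts CPU quantities either as decimal cores (for example "0.5") or in millicores (for example "500m"). The following is only a minimal sketch of that conversion; the helper name cpu_quantity_to_cores is hypothetical and is not part of Spark or of PR #20553.

```python
def cpu_quantity_to_cores(quantity: str) -> float:
    """Convert a Kubernetes CPU quantity string to a number of cores.

    Hypothetical helper for illustration only; handles plain decimal
    values ("2", "0.5") and the millicore suffix ("500m").
    """
    quantity = quantity.strip()
    if quantity.endswith("m"):
        # "500m" means 500 millicores, i.e. half a core.
        return int(quantity[:-1]) / 1000
    return float(quantity)


if __name__ == "__main__":
    print(cpu_quantity_to_cores("500m"))  # 0.5
    print(cpu_quantity_to_cores("2"))     # 2.0
```

A value such as "500m" is the kind of request that an integer-only core setting cannot express.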

Re: [Kubernetes] Resource requests and limits for Driver and Executor Pods

2018-03-29 Thread Yinan Li
Hi David, Regarding cpu limit, in Spark 2.3, we do have the following config properties to specify cpu limit for the driver and executors. See http://spark.apache.org/docs/latest/running-on-kubernetes.html. spark.kubernetes.driver.limit.cores spark.kubernetes.executor.limit.cores On Thu, Mar 29,
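The two properties named above can be passed to spark-submit with its --conf flag. The sketch below only assembles the argument list; the helper name limit_core_confs and the limit values "1" and "2" are illustrative assumptions, not recommendations from the thread.

```python
from typing import List


def limit_core_confs(driver_limit: str, executor_limit: str) -> List[str]:
    """Build spark-submit --conf arguments for the CPU limit properties.

    Hypothetical helper: the property names come from the message above,
    the values passed in are up to the caller.
    """
    return [
        "--conf", f"spark.kubernetes.driver.limit.cores={driver_limit}",
        "--conf", f"spark.kubernetes.executor.limit.cores={executor_limit}",
    ]


if __name__ == "__main__":
    # Illustrative values only.
    print(" ".join(limit_core_confs("1", "2")))
```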

Re: Build issues with apache-spark-on-k8s.

2018-03-29 Thread Yinan Li
For 2.3, the dockerfile is under kubernetes/ in the tarball, not under the directory where you started the build. Once you successfully build, copy the tarball out, untar it, and you should see the directory kubernetes/ in it. On Thu, Mar 29, 2018 at 3:00 AM, Atul Sowani wrote: > Thanks all for

Re: Kubernetes: why use init containers?

2018-01-10 Thread Yinan Li
ne thing: downloading dependencies. There could be others that I'm not aware of. On Wed, Jan 10, 2018 at 2:21 PM, Marcelo Vanzin wrote: > On Wed, Jan 10, 2018 at 2:16 PM, Yinan Li wrote: > > but we can not rule out the benefits init-containers bring either. > > Sorry, but what are

Re: Kubernetes: why use init containers?

2018-01-10 Thread Yinan Li
her. Again, I would suggest we look at this more thoroughly post 2.3. On Wed, Jan 10, 2018 at 2:06 PM, Marcelo Vanzin wrote: > On Wed, Jan 10, 2018 at 2:00 PM, Yinan Li wrote: > > I want to re-iterate on one point, that the init-container achieves a > clear > > separation bet

Re: Kubernetes: why use init containers?

2018-01-10 Thread Yinan Li
I want to re-iterate on one point, that the init-container achieves a clear separation between preparing an application and actually running the application. It's a guarantee provided by the K8s admission control and scheduling components that if the init-container fails, the main container won't b
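To make the separation concrete: in a Kubernetes pod spec, init containers are listed under initContainers and run to completion before the containers under containers are started. The sketch below builds such a manifest as a plain Python dict; the function name, container names, and image names are all hypothetical and do not reflect the manifests that Spark itself generates.

```python
import json


def pod_with_init_container(name: str, init_image: str, main_image: str) -> dict:
    """Return a minimal pod manifest with one init container.

    Hypothetical example: the init container stands in for the
    "prepare the application" step, the main container for "run it".
    """
    return {
        "apiVersion": "v1",
        "kind": "Pod",
        "metadata": {"name": name},
        "spec": {
            # Runs first; the main container starts only after this succeeds.
            "initContainers": [
                {"name": "fetch-dependencies", "image": init_image}
            ],
            "containers": [
                {"name": "spark-driver", "image": main_image}
            ],
        },
    }


if __name__ == "__main__":
    manifest = pod_with_init_container(
        "example-driver", "example/init:latest", "example/spark:latest"
    )
    print(json.dumps(manifest, indent=2))
```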

Re: Kubernetes: why use init containers?

2018-01-09 Thread Yinan Li
The init-container is required for use with the resource staging server ( https://github.com/apache-spark-on-k8s/userdocs/blob/master/src/jekyll/running-on-kubernetes.md#resource-staging-server). The resource staging server (RSS) is a spark-on-k8s component running in a Kubernetes cluster for stagi

Re: Kubernetes backend and docker images

2018-01-05 Thread Yinan Li
This is neat. With some code cleanup and as long as users can still use custom driver/executor/init-container images if they want to, I think this is great to have. I don't think there's a particular reason why having a single image wouldn't work. Thanks for doing this! On Fri, Jan 5, 2018 at 5:06

Announcing Spark on Kubernetes release 0.5.0

2017-11-01 Thread Yinan Li
The Spark on Kubernetes development community is pleased to announce release 0.5.0 of Apache Spark with Kubernetes as a native scheduler back-end! This release includes a few bug fixes and the following features: - Spark R support - Kubernetes 1.8 support - Mounts emptyDir volumes for te