Re: SPIP: Spark on Kubernetes

2017-09-02 Thread Erik Erlandson
We have started discussions about upstreaming and merge strategy at our weekly meetings. The associated GitHub issue is https://github.com/apache-spark-on-k8s/spark/issues/441. There is general consensus that breaking it up into smaller components will be important for upstream review. Our

Re: SPIP: Spark on Kubernetes

2017-09-01 Thread Reynold Xin
Anirudh (or somebody else familiar with spark-on-k8s), can you create a short plan on how we would integrate and do code review to merge the project? If the diff is too large it'd be difficult to review and merge in one shot. Once we have a plan we can create subtickets to track the progress.

Re: SPIP: Spark on Kubernetes

2017-08-31 Thread Anirudh Ramanathan
The proposal is being updated to include the details on our testing that Imran pointed out. Please expect an update on SPARK-18278. Mridul had a couple of points as well, about exposing an SPI and we've been

Re: SPIP: Spark on Kubernetes

2017-08-30 Thread Reynold Xin
This has passed, hasn't it? On Tue, Aug 15, 2017 at 5:33 PM Anirudh Ramanathan wrote: > Spark on Kubernetes effort has been developed separately in a fork, and > linked back from the Apache Spark project as an experimental backend >

Re: SPIP: Spark on Kubernetes

2017-08-30 Thread vaquar khan
+1 (non-binding) Regards, Vaquar khan On Mon, Aug 28, 2017 at 5:09 PM, Erik Erlandson wrote: > > In addition to the engineering & software aspects of the native Kubernetes > community project, we have also worked at building out the community, with > the goal of providing

Re: SPIP: Spark on Kubernetes

2017-08-28 Thread Erik Erlandson
In addition to the engineering & software aspects of the native Kubernetes community project, we have also worked at building out the community, with the goal of providing the foundation for sustaining engineering on the Kubernetes scheduler back-end. That said, I agree 100% with your point that

Re: SPIP: Spark on Kubernetes

2017-08-28 Thread Mark Hamstra
> > In my opinion, the fact that there are nearly no changes to spark-core, > and most of our changes are additive should go to prove that this adds > little complexity to the workflow of the committers. Actually (and somewhat perversely), the otherwise praiseworthy isolation of the Kubernetes

Re: SPIP: Spark on Kubernetes

2017-08-23 Thread Chen YongHua
From: yonzhang2012 <yonzhang2...@apache.org> Sent: Wednesday, August 23, 2017 5:47:16 AM To: dev@spark.apache.org Subject: Re: SPIP: Spark on Kubernetes +1 (non-binding) I am specifically interested in setting up testing environment for my company's Spark use and also expecting more

Re: SPIP: Spark on Kubernetes

2017-08-22 Thread yonzhang2012
+1 (non-binding) I am specifically interested in setting up testing environment for my company's Spark use and also expecting more comprehensive documents on getting development env setup in case of bug fix or new feature development, now it is only briefly documented in

Re: SPIP: Spark on Kubernetes

2017-08-21 Thread Anirudh Ramanathan
Thank you for your comments, Imran. Regarding integration tests, what you inferred from the documentation is correct: integration tests do not require any prior setup or a Kubernetes cluster to run. Minikube is a single binary that brings up a one-node cluster and exposes the full Kubernetes

Re: SPIP: Spark on Kubernetes

2017-08-21 Thread Erik Erlandson
Speaking to integration testing: the integration tests can either attach to an existing cluster, or they can spin up their own minikube cluster to run themselves against. Spark-on-kube can definitely operate without the RSS, as long as spark can find the files it needs using some other

Re: SPIP: Spark on Kubernetes

2017-08-21 Thread Imran Rashid
Overall this looks like a good proposal. I do have some concerns which I'd like to discuss -- please understand I'm taking a "devil's advocate" stance here for discussion, not that I'm giving a -1. My primary concern is about testing and maintenance. My concerns might be addressed if the doc

Re: SPIP: Spark on Kubernetes

2017-08-18 Thread Sudarshan Kadambi
+1 (non-binding) We are evaluating Kubernetes for a variety of data processing workloads. Spark is the natural choice for some of these workloads. Native Spark on Kubernetes is of interest to us as it brings in dynamic allocation, resource isolation and improved notions of security. -- View

Re: SPIP: Spark on Kubernetes

2017-08-18 Thread Matt Cheah
: "dev@spark.apache.org" <dev@spark.apache.org> Subject: Re: SPIP: Spark on Kubernetes There are a fair number of people (myself included) who have interest in making scheduler back-ends fully pluggable. That will represent a significant impact to core spark architecture, with corr

Re: SPIP: Spark on Kubernetes

2017-08-18 Thread varunkatta
+1 (non-binding) -- View this message in context: http://apache-spark-developers-list.1001551.n3.nabble.com/SPIP-Spark-on-Kubernetes-tp22147p22195.html Sent from the Apache Spark Developers List mailing list archive at Nabble.com.

Re: SPIP: Spark on Kubernetes

2017-08-18 Thread Erik Erlandson
There are a fair number of people (myself included) who have interest in making scheduler back-ends fully pluggable. That will represent a significant impact to core spark architecture, with corresponding risk. Adding the kubernetes back-end in a manner similar to the other three back-ends has

Re: SPIP: Spark on Kubernetes

2017-08-17 Thread Mridul Muralidharan
While I definitely support the idea of Apache Spark being able to leverage kubernetes, IMO it is better for long term evolution of spark to expose appropriate SPI such that this support need not necessarily live within Apache Spark code base. It will allow for multiple backends to evolve,

Re: SPIP: Spark on Kubernetes

2017-08-17 Thread Chris Fregly
@reynold: Databricks runs their proprietary product on Kubernetes. how about contributing some of that work back to the Open Source Community? — Chris Fregly Founder and Research Engineer @ PipelineAI Founder @ Advanced Spark and TensorFlow Meetup

Re: SPIP: Spark on Kubernetes

2017-08-17 Thread michael mccune
+1 (non-binding) peace o/ - To unsubscribe e-mail: dev-unsubscr...@spark.apache.org

Re: SPIP: Spark on Kubernetes

2017-08-17 Thread Marcelo Vanzin
I have just some very high level knowledge of kubernetes, so I can't really comment on the details of the proposal that relate to it. But I have some comments about other areas of the linked documents: - It's good to know that there's a community behind this effort and mentions of lots of

Re: SPIP: Spark on Kubernetes

2017-08-17 Thread Matei Zaharia
+1 from me as well. Matei > On Aug 17, 2017, at 10:55 AM, Reynold Xin wrote: > > +1 on adding Kubernetes support in Spark (as a separate module similar to how > YARN is done) > > I talk with a lot of developers and teams that operate cloud services, and > k8s in the

Re: SPIP: Spark on Kubernetes

2017-08-17 Thread Reynold Xin
+1 on adding Kubernetes support in Spark (as a separate module similar to how YARN is done) I talk with a lot of developers and teams that operate cloud services, and k8s in the last year has definitely become one of the key projects, if not the one with the strongest momentum in this space. I'm

Re: SPIP: Spark on Kubernetes

2017-08-16 Thread Alexander Bezzubov
+1 (non-binding) Looking forward to using it as part of an Apache Spark release, instead of a Standalone cluster deployed on top of k8s. -- Alex On Wed, Aug 16, 2017 at 11:11 AM, Ismaël Mejía wrote: > +1 (non-binding) > > This is something really great to have. More schedulers

Re: SPIP: Spark on Kubernetes

2017-08-16 Thread Jean-Baptiste Onofré
+1 as well. Regards JB On Aug 16, 2017, 10:12, at 10:12, "Ismaël Mejía" wrote: >+1 (non-binding) > >This is something really great to have. More schedulers and runtime >environments are a HUGE win for the Spark ecosystem. >Amazing work, Big kudos for the guys who created and

Re: SPIP: Spark on Kubernetes

2017-08-16 Thread Ismaël Mejía
+1 (non-binding) This is something really great to have. More schedulers and runtime environments are a HUGE win for the Spark ecosystem. Amazing work, Big kudos for the guys who created and continue working on this. On Wed, Aug 16, 2017 at 2:07 AM, lucas.g...@gmail.com

Re: SPIP: Spark on Kubernetes

2017-08-15 Thread lucas.g...@gmail.com
From our perspective, we have invested heavily in Kubernetes as our cluster manager of choice. We also make quite heavy use of spark. We've been experimenting with using these builds (2.1 with pyspark enabled) quite heavily. Given that we've already 'paid the price' to operate Kubernetes in

Re: SPIP: Spark on Kubernetes

2017-08-15 Thread Andrew Ash
+1 (non-binding) We're moving large amounts of infrastructure from a combination of open source and homegrown cluster management systems to unify on Kubernetes and want to bring Spark workloads along with us. On Tue, Aug 15, 2017 at 2:29 PM, liyinan926 wrote: > +1

Re: SPIP: Spark on Kubernetes

2017-08-15 Thread liyinan926
+1 (non-binding)

Re: SPIP: Spark on Kubernetes

2017-08-15 Thread Shubham Chopra
+1 (non-binding) ~Shubham. On Tue, Aug 15, 2017 at 2:11 PM, Erik Erlandson wrote: > > Kubernetes has evolved into an important container orchestration platform; > it has a large and growing user base and an active ecosystem. Users of > Apache Spark who are also deploying

Re: SPIP: Spark on Kubernetes

2017-08-15 Thread Erik Erlandson
Kubernetes has evolved into an important container orchestration platform; it has a large and growing user base and an active ecosystem. Users of Apache Spark who are also deploying applications on Kubernetes (or are planning to) will have convergence-related motivations for migrating their Spark

Re: SPIP: Spark on Kubernetes

2017-08-15 Thread Daniel Imberman
+1 (non-binding) Glad to see this moving forward :D On Tue, Aug 15, 2017 at 10:10 AM Holden Karau wrote: > +1 (non-binding) > > I (personally) think that Kubernetes as a scheduler backend should > eventually get merged in and there is clearly a community interested in the

Re: SPIP: Spark on Kubernetes

2017-08-15 Thread Holden Karau
+1 (non-binding) I (personally) think that Kubernetes as a scheduler backend should eventually get merged in and there is clearly a community interested in the work required to maintain it. On Tue, Aug 15, 2017 at 9:51 AM William Benton wrote: > +1 (non-binding) > > On Tue,

Re: SPIP: Spark on Kubernetes

2017-08-15 Thread William Benton
+1 (non-binding) On Tue, Aug 15, 2017 at 10:32 AM, Anirudh Ramanathan < fox...@google.com.invalid> wrote: > Spark on Kubernetes effort has been developed separately in a fork, and > linked back from the Apache Spark project as an experimental backend >

Re: SPIP: Spark on Kubernetes

2017-08-15 Thread Timothy Chen
+1 (non-binding) Tim On Tue, Aug 15, 2017 at 9:20 AM, Kimoon Kim wrote: > +1 (non-binding) > > Thanks, > Kimoon > > On Tue, Aug 15, 2017 at 9:19 AM, Sean Suchter > wrote: >> >> +1 (non-binding) >> >> >> >> -- >> View this message in context:

Re: SPIP: Spark on Kubernetes

2017-08-15 Thread Kimoon Kim
+1 (non-binding) Thanks, Kimoon On Tue, Aug 15, 2017 at 9:19 AM, Sean Suchter wrote: > +1 (non-binding) > > > > -- > View this message in context: http://apache-spark- > developers-list.1001551.n3.nabble.com/SPIP-Spark-on- > Kubernetes-tp22147p22150.html > Sent

Re: SPIP: Spark on Kubernetes

2017-08-15 Thread Sean Suchter
+1 (non-binding)

Re: SPIP: Spark on Kubernetes

2017-08-15 Thread Erik Erlandson
+1 (non-binding) On Tue, Aug 15, 2017 at 8:32 AM, Anirudh Ramanathan wrote: > Spark on Kubernetes effort has been developed separately in a fork, and > linked back from the Apache Spark project as an experimental backend >