[ https://issues.apache.org/jira/browse/SPARK-18278?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15734033#comment-15734033 ]
Reynold Xin commented on SPARK-18278:
-------------------------------------

In the past few days I've given this a lot of thought.

I'm personally very interested in this work, and would actually use it myself. That said, based on my experience, the real work starts after the initial thing works, i.e. the maintenance and enhancement work in the future will be much larger than the initial commit. Adding another officially supported scheduler definitely has some serious (and maybe disruptive) impacts on Spark. Some examples:

1. Testing becomes more complicated.

2. Related to 1, releases become more likely to be delayed. In the past, many Spark releases were delayed due to bugs in the Mesos integration or the YARN integration, because those are harder to test reliably in an automated fashion.

3. The release process has to change.

Given that Kubernetes is still very young, and it is unclear how successful it will be in the future (I personally think it will be, but you never know), I would make the following concrete recommendations on moving this forward:

1. See if we can implement this as an add-on (library) outside Spark. If that is not possible, what about a fork?

2. Publish some non-official docker images so it is easy to use Spark on Kubernetes this way.

3. Encourage users to use it and get feedback. Have the contributors who are really interested in this work maintain it for a couple of Spark releases (this includes testing the implementation, publishing new docker images, and writing documentation).

4. Evaluate later (say, after 2 releases) how well this has been received, and decide whether we make a coordinated effort to merge this into Spark, since it might become the most popular cluster manager.

> Support native submission of spark jobs to a kubernetes cluster
> ---------------------------------------------------------------
>
>                 Key: SPARK-18278
>                 URL: https://issues.apache.org/jira/browse/SPARK-18278
>             Project: Spark
>          Issue Type: Umbrella
>          Components: Build, Deploy, Documentation, Scheduler, Spark Core
>            Reporter: Erik Erlandson
>         Attachments: SPARK-18278 - Spark on Kubernetes Design Proposal.pdf
>
>
> A new Apache Spark sub-project that enables native support for submitting
> Spark applications to a kubernetes cluster. The submitted application runs
> in a driver executing on a kubernetes pod, and executors lifecycles are also
> managed as pods.

--
This message was sent by Atlassian JIRA (v6.3.4#6332)