[ 
https://issues.apache.org/jira/browse/SPARK-18278?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15734033#comment-15734033
 ] 

Reynold Xin commented on SPARK-18278:
-------------------------------------

In the past few days I've given this a lot of thought.

I'm personally very interested in this work, and would actually use it myself.
That said, based on my experience, the real work starts after the initial
version works, i.e. the future maintenance and enhancement work will be much
larger than the initial commit. Adding another officially supported scheduler
definitely has some serious (and maybe disruptive) impacts on Spark. Some
examples:

1. Testing becomes more complicated.
2. Related to 1, releases become more likely to be delayed. In the past many
Spark releases were delayed due to bugs in the Mesos integration or the YARN
integration, because those are harder to test reliably in an automated
fashion.
3. The release process has to change.

Given that Kubernetes is still very young, and it is unclear how successful it
will be in the future (I personally think it will be, but you never know), I
would make the following concrete recommendations on moving this forward:

1. See if we can implement this as an add-on (library) outside Spark. If that
is not possible, what about a fork?
2. Publish some non-official docker images so it is easy to use Spark on 
Kubernetes this way.
3. Encourage users to use it and get feedback. Have the contributors that are
really interested in this work maintain it for a couple of Spark releases (this
includes testing the implementation, publishing new docker images, writing
documentation).
4. Evaluate later (say, after 2 releases) how well this has been received, and
decide whether we make a coordinated effort to merge this into Spark, since it
might become the most popular cluster manager.



> Support native submission of spark jobs to a kubernetes cluster
> ---------------------------------------------------------------
>
>                 Key: SPARK-18278
>                 URL: https://issues.apache.org/jira/browse/SPARK-18278
>             Project: Spark
>          Issue Type: Umbrella
>          Components: Build, Deploy, Documentation, Scheduler, Spark Core
>            Reporter: Erik Erlandson
>         Attachments: SPARK-18278 - Spark on Kubernetes Design Proposal.pdf
>
>
> A new Apache Spark sub-project that enables native support for submitting
> Spark applications to a Kubernetes cluster. The submitted application runs
> in a driver executing on a Kubernetes pod, and executor lifecycles are also
> managed as pods.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
