Re: Spark on Mesos vs Yarn

2015-06-09 Thread Timothy Chen
Hi Nik,

Bharath is mostly referring to Spark committers in this thread.

Tim

On Tue, Jun 9, 2015 at 9:51 PM, Niklas Nielsen nik...@mesosphere.io wrote:
 Hi Bharath (and rest of Spark dev list!),

 Just a small shout-out: I am an Apache Mesos Committer and would love to help
 out with anything you need to get this going.

 Cheers,
 Nik

 On 9 June 2015 at 21:10, Bharath Ravi Kumar reachb...@gmail.com wrote:

 All,

 Despite the common origin of spark & mesos, the stability and adoption of
 mesos, and the age of the spark-mesos binding, I find the mesos support less
 mature, with fundamental shortcomings (like framework auth) remaining
 unresolved. If there's a shortage of developer time, I'd be glad to
 contribute, but it's unclear if the committer group has sufficient time (and
 priority) to take the mesos support forward. While it has been stated often
 that support for mesos & yarn are equally important, that doesn't seem to
 translate to visible progress. I'd be glad if my observation is incorrect as
 I seek better focus and long term commitment on the mesos support.
 As for the specific issue (6284), I'm happy to build, test & eventually
 deploy the patch in our production cluster, but I'd rather see it becoming
 mainstream.
 Thanks for your consideration.

 -Bharath



Re: Spark on Mesos vs Yarn

2015-06-09 Thread Bharath Ravi Kumar
All,

Despite the common origin of spark & mesos, the stability and adoption of
mesos, and the age of the spark-mesos binding, I find the mesos support
less mature, with fundamental shortcomings (like framework auth
https://issues.apache.org/jira/browse/SPARK-6284) remaining unresolved.
If there's a shortage of developer time, I'd be glad to contribute, but it's
unclear if the committer group has sufficient time (and priority) to take
the mesos support forward. While it has been stated often that support for
mesos & yarn are equally important, that doesn't seem to translate to
visible progress. I'd be glad if my observation is incorrect as I seek
better focus and long term commitment on the mesos support.
As for the specific issue (6284), I'm happy to build, test & eventually
deploy the patch in our production cluster, but I'd rather see it becoming
mainstream.
Thanks for your consideration.

-Bharath


On Thu, May 28, 2015 at 9:18 AM, Bharath Ravi Kumar reachb...@gmail.com
wrote:

 A follow-up: considering that spark on mesos is indeed important to
 databricks, its partners and the community, fundamental issues like
 SPARK-6284 shouldn't be languishing for this long. A mesos cluster hosting
 diverse (i.e. multi-tenant) workloads is a common scenario in production
 for serious users. The ability to auth a framework & assign roles would be
 a fairly basic ask, one would imagine. Is the lack of time / effort a
 constraint? If so, I'd be glad to help (as mentioned in the jira).

 On Fri, May 15, 2015 at 5:29 PM, Iulian Dragoș iulian.dra...@typesafe.com
  wrote:

 Hi Ankur,

 Just to add a thought to Tim's excellent answer, Spark on Mesos is very
 important to us and is the recommended deployment for our customers at
 Typesafe.

 Thanks for pointing to your PR, I see Tim already went through a round of
 reviews. It seems very useful, I'll give it a try as well.

 thanks,
 iulian



 On Fri, May 15, 2015 at 9:53 AM, Ankur Chauhan an...@malloc64.com
 wrote:


 Hi Tim,

 Thanks for such a detailed email. I am excited to hear about the new
 features. I had a pull request going for adding attribute-based
 filtering in the mesos scheduler but it hasn't received much love -
 https://github.com/apache/spark/pull/5563 . I am a fan of the
 mesos/marathon/mesosphere and spark ecosystems and am trying to push
 adoption at my workplace.

 I would love to see documentation, tutorials (anything actually) that
 would make mesos + spark a better and more fleshed out solution. Would
 it be possible for you to share some links to the JIRA and pull
 requests so that I can keep track of the progress/features?

 Again, thanks for replying.

 -- Ankur Chauhan

 On 15/05/2015 00:39, Tim Chen wrote:
  Hi Ankur,
 
  This is a great question as I've heard similar concerns about Spark
  on Mesos.
 
  At the time when I started to contribute to Spark on Mesos, approximately
  half a year ago, the Mesos scheduler and related code hadn't really
  gotten much attention from anyone and were pretty much in
  maintenance mode.
 
  As a Mesos PMC member who is really interested in Spark, I started to
  refactor and check out different JIRAs and PRs around the Mesos
  scheduler, and after that started to fix various bugs in Spark,
  add documentation and also fix related Mesos issues.
 
  Just recently, for 1.4, we've merged in Cluster mode and Docker
  support, and there are also pending PRs around framework
  authentication, multi-role support, dynamic allocation, finer-tuned
  coarse-grained mode scheduling configurations, etc.
 
  And finally, I just want to mention that Mesosphere and Typesafe are
  collaborating to bring a certified distribution
  (https://databricks.com/spark/certification/certified-spark-distribution)
  of Spark on Mesos and DCOS, and we will be pouring resources into
  not just maintaining Spark on Mesos but driving more features into the
  Mesos scheduler and also into Mesos, so stateful services can leverage
  new APIs and features to make better scheduling decisions and
  optimizations.
 
  I don't have a solidified roadmap to share yet, but we will be
  discussing this and hopefully can share with the community soon.
 
  In summary, Spark on Mesos is not dead or in maintenance mode, and
  I look forward to seeing a lot more changes from us and the community.
 
  Tim
 
  On Thu, May 14, 2015 at 11:30 PM, Ankur Chauhan
  an...@malloc64.com wrote:
 
  Hi,
 
  This is both a survey-type question and a roadmap query. Of the
  cluster options for running spark (i.e. via YARN and Mesos), YARN
  seems to be getting a lot more attention and patches than Mesos.
 
  Would it be correct to assume that spark on mesos is more or less
  dead, or something like a maintenance-only feature, and that YARN is
  the recommended way to go?
 
  What is the roadmap for spark on mesos, and what is the roadmap
  for spark on yarn? I like mesos so as much as I would like 

Re: Spark on Mesos vs Yarn

2015-06-09 Thread Niklas Nielsen
Hi Bharath (and rest of Spark dev list!),

Just a small shout-out: I am an Apache Mesos Committer and would love to
help out with anything you need to get this going.

Cheers,
Nik

On 9 June 2015 at 21:10, Bharath Ravi Kumar reachb...@gmail.com wrote:

 All,

 Despite the common origin of spark & mesos, the stability and adoption of
 mesos, and the age of the spark-mesos binding, I find the mesos support
 less mature, with fundamental shortcomings (like framework auth
 https://issues.apache.org/jira/browse/SPARK-6284) remaining unresolved.
 If there's a shortage of developer time, I'd be glad to contribute, but it's
 unclear if the committer group has sufficient time (and priority) to take
 the mesos support forward. While it has been stated often that support for
 mesos & yarn are equally important, that doesn't seem to translate to
 visible progress. I'd be glad if my observation is incorrect as I seek
 better focus and long term commitment on the mesos support.
 As for the specific issue (6284), I'm happy to build, test & eventually
 deploy the patch in our production cluster, but I'd rather see it becoming
 mainstream.
 Thanks for your consideration.

 -Bharath

