Re: Spark on Mesos vs Yarn
Hi Nik,

Bharath is mostly referring to Spark committers in this thread.

Tim

On Tue, Jun 9, 2015 at 9:51 PM, Niklas Nielsen nik...@mesosphere.io wrote:

Hi Bharath (and rest of Spark dev list!),

Just a small shout out: I am an Apache Mesos committer and would love to help out with anything you need to get this going.

Cheers,
Nik

On 9 June 2015 at 21:10, Bharath Ravi Kumar reachb...@gmail.com wrote:

All,

Despite the common origin of Spark and Mesos, the stability and adoption of Mesos, and the age of the Spark-Mesos binding, I find the Mesos support less mature, with fundamental shortcomings (like framework auth, https://issues.apache.org/jira/browse/SPARK-6284) remaining unresolved. If there's a shortage of developer time, I'd be glad to contribute, but it's unclear if the committer group has sufficient time (and priority) to take the Mesos support forward. While it has been stated often that support for Mesos and YARN are equally important, that doesn't seem to translate to visible progress. I'd be glad if my observation is incorrect, as I seek better focus and long-term commitment on the Mesos support. As for the specific issue (6284), I'm happy to build, test and eventually deploy the patch in our production cluster, but I'd rather see it become mainstream. Thanks for your consideration.

-Bharath

On Thu, May 28, 2015 at 9:18 AM, Bharath Ravi Kumar reachb...@gmail.com wrote:

A follow-up: considering that Spark on Mesos is indeed important to Databricks, its partners and the community, fundamental issues like SPARK-6284 shouldn't be languishing for this long. A Mesos cluster hosting diverse (i.e. multi-tenant) workloads is a common scenario in production for serious users. The ability to authenticate a framework and assign roles would be a fairly basic ask, one would imagine. Is the lack of time / effort a constraint? If so, I'd be glad to help (as mentioned in the JIRA).
On Fri, May 15, 2015 at 5:29 PM, Iulian Dragoș iulian.dra...@typesafe.com wrote:

Hi Ankur,

Just to add a thought to Tim's excellent answer, Spark on Mesos is very important to us and is the recommended deployment for our customers at Typesafe.

Thanks for pointing to your PR; I see Tim already went through a round of reviews. It seems very useful, and I'll give it a try as well.

thanks,
iulian

On Fri, May 15, 2015 at 9:53 AM, Ankur Chauhan an...@malloc64.com wrote:

Hi Tim,

Thanks for such a detailed email. I am excited to hear about the new features. I had a pull request going for adding attribute-based filtering in the Mesos scheduler, but it hasn't received much love: https://github.com/apache/spark/pull/5563 . I am a fan of the Mesos/Marathon/Mesosphere and Spark ecosystems and am trying to push adoption at my workplace.

I would love to see documentation, tutorials (anything, actually) that would make Mesos + Spark a better and more fleshed-out solution. Would it be possible for you to share some links to the JIRAs and pull requests so that I can keep track of the progress/features?

Again, thanks for replying.

-- Ankur Chauhan

On 15/05/2015 00:39, Tim Chen wrote:

Hi Ankur,

This is a great question, as I've heard similar concerns about Spark on Mesos.

When I started to contribute to Spark on Mesos approximately half a year ago, the Mesos scheduler and related code hadn't really gotten much attention from anyone and was pretty much in maintenance mode. As a Mesos PMC member who is really interested in Spark, I started to refactor and check out different JIRAs and PRs around the Mesos scheduler, and after that started to fix various bugs in Spark, added documentation and also fixed related Mesos issues.
Just recently for 1.4 we've merged in cluster mode and Docker support, and there are also pending PRs around framework authentication, multi-role support, dynamic allocation, finer-tuned coarse-grained mode scheduling configurations, etc.

And finally, I just want to mention that Mesosphere and Typesafe are collaborating to bring a certified distribution (https://databricks.com/spark/certification/certified-spark-distribution) of Spark on Mesos and DCOS, and we will be pouring resources into not just maintaining Spark on Mesos but also driving more features into the Mesos scheduler and into Mesos itself, so stateful services can leverage new APIs and features to make better scheduling decisions and optimizations.

I don't have a solidified roadmap to share yet, but we will be discussing this and hopefully can share it with the community soon.

In summary, Spark on Mesos is not dead or in maintenance mode, and I look forward to seeing a lot more changes from us and the community.

Tim

On Thu, May 14, 2015 at 11:30 PM, Ankur Chauhan an...@malloc64.com wrote:

Hi,

This is both a survey-type and a roadmap query question. It seems like, of the cluster options to run Spark (i.e. via YARN and
Mesos), YARN seems to be getting a lot more attention and patches when compared to Mesos. Would it be correct to assume that Spark on Mesos is more or less dead, or something like a maintenance-only feature, and YARN is the recommended way to go? What is the roadmap for Spark on Mesos, and what is the roadmap for Spark on YARN? I like mesos so as much as I would like