Thank you! Looking forward to it.
On Tue, Jan 19, 2016 at 4:03 PM Tim Chen <t...@mesosphere.io> wrote:

> Hi Sathish,
>
> Sorry about that. I think that's a good idea, and I'll write up a section
> in the Spark documentation page to explain how it can work. We (Mesosphere)
> have been doing this for our DCOS Spark for our past releases and it has
> been working well so far.
>
> Thanks!
>
> Tim
>
> On Tue, Jan 19, 2016 at 12:28 PM, Sathish Kumaran Vairavelu
> <vsathishkuma...@gmail.com> wrote:
>
>> Hi Tim,
>>
>> Do you have any materials/blog posts on running Spark in a container in a
>> Mesos cluster environment? I have googled it but couldn't find info on it.
>> The Spark documentation says it is possible, but no details are provided.
>> Please help.
>>
>> Thanks,
>>
>> Sathish
>>
>> On Mon, Sep 21, 2015 at 11:54 AM Tim Chen <t...@mesosphere.io> wrote:
>>
>>> Hi John,
>>>
>>> There is no other blog post yet; I'm thinking of doing a series of posts
>>> but so far haven't had time to do that.
>>>
>>> Running Spark in Docker containers makes distributing Spark versions
>>> easy: it's simple to upgrade, and the image is automatically cached on the
>>> slaves, so the same image just runs right away. Most of the Docker
>>> performance overhead is usually related to the network and the filesystem,
>>> but I think with the recent changes in Spark to make the Mesos sandbox the
>>> default temp dir, the filesystem won't be a big concern, as it's mostly
>>> writing to the mounted-in Mesos sandbox. Also, Mesos uses the host network
>>> by default, so the network isn't affected much.
>>>
>>> The main cluster-mode limitation is that you need to make the Spark job
>>> files available somewhere that all the slaves can access remotely (HTTP,
>>> S3, HDFS, etc.) or available on all slaves locally by path.
>>>
>>> I'll try to make more doc efforts once I get my existing patches and
>>> testing infra work done.
>>>
>>> Let me know if you have more questions,
>>>
>>> Tim
>>>
>>> On Sat, Sep 19, 2015 at 5:42 AM, John Omernik <j...@omernik.com> wrote:
>>>
>>>> I was searching the 1.5.0 docs on the Docker-on-Mesos capabilities and
>>>> just found you CAN run it this way. Are there any user posts, blog
>>>> posts, etc. on why and how you'd do this?
>>>>
>>>> Basically, at first I was questioning why you'd run Spark in a Docker
>>>> container, i.e., if you run with a tarballed executor, what are you
>>>> really gaining? And in this setup, are you losing out on performance
>>>> somehow? (I am guessing smarter people than I have figured that out.)
>>>>
>>>> Then I came across a situation where I wanted to use a Python library
>>>> with Spark, and it had to be installed on every node, and I realized one
>>>> big advantage of Dockerized Spark would be that Spark apps that needed
>>>> other libraries could be contained and built well.
>>>>
>>>> OK, that's huge, let's do that. For my next question, there are a lot
>>>> of questions I have on how this actually works. Does cluster mode/client
>>>> mode apply here? If so, how? Is there a good walkthrough on getting this
>>>> set up? Limitations? Gotchas? Should I just dive in and start working
>>>> with it? Has anyone done any stories/rough documentation? This seems
>>>> like a really helpful feature for scaling out Spark, and letting
>>>> developers truly build what they need without tons of admin overhead, so
>>>> I really want to explore.
>>>>
>>>> Thanks!
>>>>
>>>> John
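[Editor's note: what Tim describes deeper in the thread can be sketched as a single spark-submit invocation. This is a hypothetical sketch, not something from the thread: the Docker image name, the ZooKeeper master URL, and the HTTP file-server URLs are all placeholders; it assumes Spark's `spark.mesos.executor.docker.image` and `spark.mesos.executor.home` configuration properties.]

```shell
# Hypothetical sketch: submit a Spark job whose executors run inside a
# Docker image on a Mesos cluster. The image "mycompany/spark-deps:1.5.0"
# (Spark plus the extra Python libraries baked in), the ZooKeeper URL, and
# "fileserver.internal" are placeholders.
spark-submit \
  --master mesos://zk://zk1:2181,zk2:2181/mesos \
  --deploy-mode cluster \
  --conf spark.mesos.executor.docker.image=mycompany/spark-deps:1.5.0 \
  --conf spark.mesos.executor.home=/opt/spark \
  http://fileserver.internal/jobs/my_job.py
```

Note that the job file itself is referenced by an HTTP URL rather than a local path, matching Tim's point that in cluster mode the job files must be reachable from every slave (HTTP, S3, HDFS, etc.) or pre-staged on all of them.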