Thank you! Looking forward to it.

On Tue, Jan 19, 2016 at 4:03 PM Tim Chen <> wrote:

> Hi Sathish,
> Sorry about that, I think that's a good idea and I'll write up a section
> in the Spark documentation page to explain how it can work. We (Mesosphere)
> have been doing this for DCOS Spark in our past releases, and it has been
> working well so far.
> Thanks!
> Tim
> On Tue, Jan 19, 2016 at 12:28 PM, Sathish Kumaran Vairavelu <> wrote:
>> Hi Tim
>> Do you have any materials or blog posts on running Spark in a container in a
>> Mesos cluster environment? I have googled it but couldn't find any info. The
>> Spark documentation says it is possible, but provides no details. Please help.
>> Thanks
>> Sathish
>> On Mon, Sep 21, 2015 at 11:54 AM Tim Chen <> wrote:
>>> Hi John,
>>> There is no other blog post yet. I'm thinking of doing a series of posts,
>>> but so far I haven't had time to do that.
>>> Running Spark in Docker containers makes distributing Spark versions
>>> easy: it's simple to upgrade, and the image is automatically cached on the
>>> slaves, so the same image just runs right away. Most of the Docker perf
>>> overhead is usually related to the network and filesystem, but I think that
>>> with the recent change in Spark to make the Mesos sandbox the default temp
>>> dir, the filesystem won't be a big concern, as it's mostly writing to the
>>> mounted-in Mesos sandbox. Also, Mesos uses host networking by default, so
>>> the network isn't affected much.
>>> The main cluster mode limitation is that you need to make the Spark
>>> job files available somewhere that all the slaves can access remotely
>>> (HTTP, S3, HDFS, etc.) or available on all slaves locally by path.
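
As a rough illustration of the setup described above, here is a minimal sketch that assembles a spark-submit command line for a Docker-based executor on Mesos. It assumes the documented `spark.mesos.executor.docker.image` property and the standard `--master`, `--deploy-mode`, `--class`, and `--conf` options; the helper name `build_spark_submit`, the host names, the image tag, and the jar location are all hypothetical. The cluster mode check only approximates the limitation mentioned above (the job file must be a remote URI or an absolute path present on every slave).

```python
import shlex
from typing import Dict, List, Optional
from urllib.parse import urlparse


def build_spark_submit(
    master: str,
    docker_image: str,
    app_resource: str,
    deploy_mode: str = "client",
    main_class: Optional[str] = None,
    extra_conf: Optional[Dict[str, str]] = None,
) -> List[str]:
    """Build a spark-submit argv for Spark on Mesos with a Docker executor image.

    Hypothetical helper for illustration only; it does not run anything.
    """
    if deploy_mode not in ("client", "cluster"):
        raise ValueError("deploy_mode must be 'client' or 'cluster'")

    if deploy_mode == "cluster":
        # In cluster mode the job file has to be reachable from every slave:
        # either a URI with a scheme (http, s3, hdfs, ...) or an absolute
        # local path that exists on all slaves. A relative path is rejected.
        has_scheme = bool(urlparse(app_resource).scheme)
        if not has_scheme and not app_resource.startswith("/"):
            raise ValueError(
                "cluster mode needs a remote URI or an absolute path: "
                + app_resource
            )

    # The Docker image that Mesos should launch the executors in.
    conf = {"spark.mesos.executor.docker.image": docker_image}
    conf.update(extra_conf or {})

    argv = ["spark-submit", "--master", master, "--deploy-mode", deploy_mode]
    if main_class:
        argv += ["--class", main_class]
    for key in sorted(conf):
        argv += ["--conf", key + "=" + conf[key]]
    argv.append(app_resource)
    return argv


if __name__ == "__main__":
    # All values below are placeholders, not real endpoints or images.
    argv = build_spark_submit(
        master="mesos://dispatcher.example:7077",
        docker_image="example/spark:latest",
        app_resource="hdfs://namenode.example/jobs/app.jar",
        deploy_mode="cluster",
        main_class="com.example.App",
    )
    print(" ".join(shlex.quote(arg) for arg in argv))
```

Extra Python libraries would be baked into the image itself, so nothing has to be installed on each node; the command only names the image to use.
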
>>> I'll try to make more doc efforts once I get my existing patches and
>>> testing infra work done.
>>> Let me know if you have more questions,
>>> Tim
>>> On Sat, Sep 19, 2015 at 5:42 AM, John Omernik <> wrote:
>>>> I was searching in the 1.5.0 docs on the Docker on Mesos capabilities
>>>> and just found you CAN run it this way.  Are there any user posts, blog
>>>> posts, etc on why and how you'd do this?
>>>> Basically, at first I was questioning why you'd run spark in a docker
>>>> container, i.e., if you run with a tarballed executor, what are you really
>>>> gaining?  And in this setup, are you losing out on performance somehow? (I
>>>> am guessing smarter people than I have figured that out).
>>>> Then I came across a situation where I wanted to use a Python library
>>>> with spark, and it had to be installed on every node, and I realized one
>>>> big advantage of dockerized spark would be that spark apps that needed
>>>> other libraries could be contained and built well.
>>>> OK, that's huge, let's do that.  Next, I have a lot of "questions" on
>>>> how this actually works.  Does cluster mode/client mode apply here? If so,
>>>> how?  Is there a good walkthrough on getting this set up? Limitations?
>>>> Gotchas?  Should I just dive in and start working with it? Has anyone
>>>> written any stories/rough documentation? This seems like a really helpful
>>>> feature for scaling out Spark, and for letting developers truly build what
>>>> they need without tons of admin overhead, so I really want to explore.
>>>> Thanks!
>>>> John
