I was searching the 1.5.0 docs for the Docker-on-Mesos capabilities and
just found you CAN run it this way.  Are there any user posts, blog posts,
etc. on why and how you'd do this?
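
For reference, the knob I found in the Mesos docs is
spark.mesos.executor.docker.image.  A minimal sketch of what I think the
setup looks like (the ZooKeeper host and image name are placeholders):

    from pyspark import SparkConf, SparkContext

    # Sketch: run executors inside a Docker image on a Mesos cluster.
    # "zk1:2181" and "example/spark:1.5.0" are made-up placeholders.
    conf = (SparkConf()
            .setAppName("docker-on-mesos-test")
            .setMaster("mesos://zk://zk1:2181/mesos")
            .set("spark.mesos.executor.docker.image", "example/spark:1.5.0"))
    sc = SparkContext(conf=conf)
    print(sc.parallelize(range(100)).sum())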

Basically, at first I was questioning why you'd run Spark in a Docker
container at all, i.e., if you run with a tarballed executor, what are you
really gaining?  And in this setup, are you losing out on performance
somehow? (I am guessing smarter people than I have figured that out.)
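
As I understand it (and I may be wrong), the two options look something
like this side by side; the URLs and hosts below are placeholders:

    from pyspark import SparkConf

    # Option A (sketch): tarballed executor -- Mesos fetches the Spark
    # distribution from spark.executor.uri on each slave.
    tarball_conf = (SparkConf()
                    .setMaster("mesos://zk://zk1:2181/mesos")
                    .set("spark.executor.uri",
                         "http://repo.example.com/spark-1.5.0-bin-hadoop2.6.tgz"))

    # Option B (sketch): executors run inside a pre-built Docker image,
    # which can carry extra dependencies the plain tarball doesn't.
    docker_conf = (SparkConf()
                   .setMaster("mesos://zk://zk1:2181/mesos")
                   .set("spark.mesos.executor.docker.image",
                        "example/spark:1.5.0"))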

Then I came across a situation where I wanted to use a Python library with
Spark, and it had to be installed on every node, and I realized one big
advantage of Dockerized Spark would be that Spark apps needing extra
libraries could ship them in a single, self-contained, well-built image.
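
Concretely, what I'm imagining (the image name is hypothetical): bake
the library into the image once, e.g. a Dockerfile that pip-installs
numpy on top of a Spark base image, and then every executor container
already has it:

    from pyspark import SparkConf, SparkContext
    import numpy as np

    # Sketch: "example/spark-numpy:1.5.0" is a hypothetical image built
    # with numpy pre-installed, so no per-node install is needed.
    conf = (SparkConf()
            .setAppName("lib-in-image-test")
            .setMaster("mesos://zk://zk1:2181/mesos")
            .set("spark.mesos.executor.docker.image",
                 "example/spark-numpy:1.5.0"))
    sc = SparkContext(conf=conf)

    # This lambda runs inside the Docker'd executors, where numpy lives.
    rdd = sc.parallelize(range(1000), 4)
    stds = rdd.mapPartitions(lambda it: [float(np.std(list(it)))]).collect()
    print(stds)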

OK, that's huge, let's do that.  That said, I have a lot of questions
about how this actually works.  Do cluster mode and client mode apply
here?  If so, how?  Is there a good walkthrough on getting this set up?
Limitations?  Gotchas?  Should I just dive in and start working with it?
Has anyone written up their experiences or rough documentation?  This
seems like a really helpful feature for scaling out Spark and letting
developers truly build what they need without tons of admin overhead, so
I really want to explore it.
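
On the cluster/client question, my current (unverified) read of the
docs: in client mode the driver runs wherever you launch it and only the
executors land in Docker containers on Mesos, while cluster mode submits
the driver through the MesosClusterDispatcher.  Roughly (hosts are
placeholders):

    from pyspark import SparkConf, SparkContext

    # Client mode (sketch): driver runs here; Docker'd executors on Mesos.
    conf = (SparkConf()
            .setMaster("mesos://zk://zk1:2181/mesos")
            .set("spark.mesos.executor.docker.image", "example/spark:1.5.0"))
    sc = SparkContext(conf=conf)

    # Cluster mode (sketch): submit the driver itself through the
    # MesosClusterDispatcher instead, e.g. (host is a placeholder):
    #   spark-submit --master mesos://dispatcher-host:7077 \
    #       --deploy-mode cluster my_app.py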

Thanks!

John
