I was searching the 1.5.0 docs on the Docker-on-Mesos capabilities and just found that you CAN run Spark this way. Are there any user posts, blog posts, etc. on why and how you'd do this?
Basically, at first I was questioning why you'd run Spark in a Docker container at all: if you run with a tarballed executor, what are you really gaining? And in this setup, are you losing out on performance somehow? (I am guessing smarter people than I have figured that out.)

Then I came across a situation where I wanted to use a Python library with Spark, and it had to be installed on every node, and I realized one big advantage of Dockerized Spark: apps that need extra libraries could be contained and built cleanly, with their dependencies baked into the image. OK, that's huge, let's do that.

That leads to a lot of questions I have on how this actually works. Do cluster mode and client mode apply here? If so, how? Is there a good walkthrough on getting this set up? Limitations? Gotchas? Should I just dive in and start working with it? Has anyone written up stories or rough documentation? This seems like a really helpful feature for scaling out Spark and letting developers truly build what they need without tons of admin overhead, so I really want to explore it.

Thanks!
John
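For what it's worth, here's a minimal sketch of what I gather the setup looks like from the 1.5.0 "Running Spark on Mesos" docs: you point `spark.mesos.executor.docker.image` at an image that has Spark plus your Python libraries baked in. The image name, ZooKeeper URL, and script name below are all made up for illustration; this also assumes the Mesos agents have the Docker containerizer enabled.

```shell
# Hypothetical Dockerfile for the executor image:
#   FROM my-spark-base:1.5.0          <- assumed base image with Spark installed
#   RUN pip install numpy             <- the per-node library problem, solved in the image
#
# Then submit against Mesos, telling executors to run inside that image:
spark-submit \
  --master mesos://zk://zk1:2181/mesos \
  --conf spark.mesos.executor.docker.image=mycompany/spark-numpy:1.5.0 \
  my_job.py
```

If that's roughly right, the appeal is exactly what I described above: the app and its dependencies travel together, and nothing has to be pre-installed on the nodes. Happy to be corrected if I've misread the docs.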