Hey all - Curious about the best way to include Python packages (such as NLTK) in my Spark installation. I'm running on Mesos, and I'd like to include the package in the binary distribution so that I don't have to install packages on every node. We should be able to include it in the distribution, right?
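For what it's worth, here's a minimal sketch of what I was hoping would work: zip up the package and ship it with the job via addPyFile, so executors resolve the import from the shipped zip. The file name "nltk.zip" is just a placeholder for a zip built with the nltk package directory at its root.

    from pyspark import SparkConf, SparkContext

    conf = SparkConf().setAppName("nltk-shipping-test")
    sc = SparkContext(conf=conf)

    # Ship a zipped copy of the package with the job; Spark adds it
    # to sys.path on every executor. "nltk.zip" is a placeholder for
    # a zip containing the nltk package directory at its top level.
    sc.addPyFile("nltk.zip")

    def nltk_version(_):
        import nltk  # imported on the executor, resolved from the shipped zip
        return nltk.__version__

    # If every partition reports a version, the package made it to the workers.
    print(sc.parallelize(range(4), 4).map(nltk_version).distinct().collect())

I believe spark-submit's --py-files option does the same thing at submit time. What I'm less sure about is whether this approach holds up for packages with native extensions or bundled data files, which is part of why I'd rather bake the package into the distribution itself.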
I thought of using the Docker Mesos integration, but I have been unable to find information on this (see my other question on Docker/Mesos/Spark). Any other thoughts on the best way to include packages in Spark WITHOUT installing them on each node would be appreciated! John