Hi wonderful Python + Spark folks,

I'm excited to announce that with Spark 2.2.0 we finally have PySpark published on PyPI (see https://pypi.python.org/pypi/pyspark and https://twitter.com/holdenkarau/status/885207416173756417). This has been a long time coming: previous releases included pip-installable artifacts that, for a variety of reasons, couldn't be published to PyPI. So if you (or your friends) want to work with PySpark locally on your laptop, you now have an easier path to getting started: pip install pyspark.
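
For anyone who wants to check that a pip-installed PySpark works on their laptop, here is a minimal sketch of a local smoke test. The helper names (pyspark_available, run_local_smoke_test), the app name, and the choice of local[2] are illustrative assumptions, not part of the release; it also assumes a Java runtime is already installed, since Spark needs one even in local mode.

```python
import importlib.util


def pyspark_available():
    """Return True if the pip-installed pyspark package can be found."""
    return importlib.util.find_spec("pyspark") is not None


def run_local_smoke_test(app_name="pip-pyspark-smoke-test"):
    """Start a local Spark session, sum the integers 0..99, then stop it.

    The function name and app_name are hypothetical, chosen for this example.
    """
    # Imported lazily so this module still loads when pyspark is missing.
    from pyspark.sql import SparkSession

    spark = (
        SparkSession.builder
        .master("local[2]")  # two worker threads on this machine, no cluster
        .appName(app_name)
        .getOrCreate()
    )
    try:
        # Distribute the numbers across local threads and add them up.
        return spark.sparkContext.parallelize(range(100)).sum()
    finally:
        # Always release the local JVM resources.
        spark.stop()
```

After pip install pyspark, calling run_local_smoke_test() should return 4950 if everything is wired up. Calling pyspark_available() first is an easy way to tell a missing package apart from a Java or configuration problem.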
If you are setting up a standalone cluster, the cluster itself will still need the "full" Spark packaging, but the pip-installed PySpark should be able to work with YARN or an existing standalone cluster installation (of the same version).

Happy Sparking y'all!

Holden :)

--
Cell : 425-233-8271
Twitter: https://twitter.com/holdenkarau