Hi PySparkers,
What is currently the best way of shipping self-contained PySpark jobs
with 3rd-party dependencies?
There are some open JIRA issues [1], [2] as well as corresponding PRs
[3], [4] and articles [5], [6], [7] regarding setting up the Python
environment with conda and virtualenv