At the risk of repeating myself, this is what I was hoping to avoid when I
suggested deploying a full, zipped, conda venv.
What is your motivation for running an install process on the nodes and
risking the process failing, instead of pushing a validated environment
artifact and not having that risk?
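Roughly what I mean, as a minimal sketch (the environment and file names
are placeholders; this assumes conda-pack and a YARN-style cluster):

  # build and pack the environment once, on a machine that has conda
  conda create -y -n my_env -c conda-forge pandas conda-pack
  conda activate my_env
  conda pack -f -o my_env.tar.gz

  # ship the packed env as an archive and point the executors at its python
  export PYSPARK_DRIVER_PYTHON=python
  export PYSPARK_PYTHON=./environment/bin/python
  spark-submit --archives my_env.tar.gz#environment my_job.py

The environment is validated once when it is built; the nodes only unpack
an archive, so there is nothing left to fail at install time.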
Hi Patrick/Users,
I am exploring packaging this with wheel files, as this seems simple:
https://bytes.grubhub.com/managing-dependencies-and-artifacts-in-pyspark-7641aa89ddb7
However, I am facing another issue: I am using pandas, which needs numpy.
NumPy is giving an error:
ImportError:
A wheel is used for package management and for setting up your virtual
environment; it is not used as a library package. To run spark-submit in a
virtual env, use the --py-files option instead. Usage:
  --py-files PY_FILES         Comma-separated list of .zip, .egg, or .py
                              files to place on the PYTHONPATH for Python apps.
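For example, one common pattern (file names here are placeholders) is to
bundle the pure-Python dependencies into a zip and ship that:

  pip install -r requirements.txt -t deps/
  cd deps && zip -r ../deps.zip . && cd ..
  spark-submit --py-files deps.zip my_job.py

The caveat is that this only works cleanly for pure-Python packages;
libraries with compiled extensions, such as numpy, generally cannot be
imported straight from a zip on the PYTHONPATH.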
I'm not very familiar with the environments on cloud clusters, but in
general I'd be reluctant to lean on setuptools or other Python install
mechanisms. In the worst case, you might find that /usr/bin/pip does not
have permission to install new packages, or, even if it does, a package
might require
Hi Users,
I have a wheel file; while creating it, I mentioned the dependencies in the
setup.py file.
Now I have two virtual envs: one was already there, and another one I
created just now.
I have switched to the new virtual env, and I want Spark to download the
dependencies while doing spark-submit using the wheel.
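Roughly what I have in mind (the project and file names below are only
placeholders):

  # build the wheel; dependencies are declared via install_requires in setup.py
  python setup.py bdist_wheel
  # what I would like: submit only the wheel and have the declared
  # dependencies pulled in for the executors at submit time
  spark-submit --py-files dist/myjob-0.1.0-py3-none-any.whl my_job.py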