I've never tried to run a standalone cluster alongside Hadoop, but why not
run Spark as a YARN application? That way it can absolutely (in fact
preferably) use the distributed file system.
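For example, a submission along these lines (paths and file names are
illustrative) lets YARN pull the Python dependencies from HDFS and
distribute them to the executors:

  spark-submit \
    --master yarn \
    --deploy-mode cluster \
    --py-files hdfs:///libs/deps-1.0.egg,hdfs:///libs/extra.whl \
    my_job.py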
On Fri, Nov 9, 2018 at 5:04 PM, Arijit Tarafdar wrote:
Hello All,
We have a requirement to run PySpark in standalone cluster mode and also
reference Python libraries (egg/wheel) that are not local but are placed in
distributed storage such as HDFS. From the code, it looks like neither case
is supported.
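For reference, what we are attempting looks roughly like this (the host,
port, and paths are illustrative):

  spark-submit \
    --master spark://master-host:7077 \
    --deploy-mode cluster \
    --py-files hdfs:///libs/deps-1.0.egg \
    job.py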
Our questions are:
1. Why is PySpark