Re: Questions on Python support with Spark

2018-11-12 Thread Patrick McCarthy
I've never tried to run a standalone cluster alongside Hadoop, but why not run Spark as a YARN application? That way it can absolutely (in fact, preferably) use the distributed file system. On Fri, Nov 9, 2018 at 5:04 PM, Arijit Tarafdar wrote: > Hello All, > > We have a requirement to run
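
The suggestion in this reply, running Spark on YARN so that Python dependencies can be read from the distributed file system, might look roughly like the sketch below. It only assembles a `spark-submit` command line and does not launch anything. The helper `build_submit_command` and all paths are hypothetical placeholders, not taken from the thread. `--master`, `--deploy-mode`, and `--py-files` are standard `spark-submit` options. Whether this setup meets the original requirement is not settled by the messages shown here.

```python
import shlex


def build_submit_command(app, py_files, master="yarn", deploy_mode="cluster"):
    """Assemble a spark-submit argument list (hypothetical helper)."""
    cmd = ["spark-submit", "--master", master, "--deploy-mode", deploy_mode]
    if py_files:
        # --py-files takes a single comma-separated list of dependencies
        cmd += ["--py-files", ",".join(py_files)]
    cmd.append(app)
    return cmd


# Placeholder paths for illustration only; substitute real locations.
cmd = build_submit_command(
    app="job.py",
    py_files=["hdfs:///libs/example_dep.egg"],
)

# Print a shell-safe rendering of the command for inspection.
print(" ".join(shlex.quote(arg) for arg in cmd))
```

The list form can be passed to `subprocess.run` unchanged, which avoids shell-quoting problems if a path contains spaces.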

Questions on Python support with Spark

2018-11-09 Thread Arijit Tarafdar
Hello All, We have a requirement to run PySpark in standalone cluster mode and also to reference Python libraries (egg/wheel) that are not local but are placed in distributed storage such as HDFS. From the code, it looks like neither of these cases is supported. Questions are: 1. Why is PySpark