Hi,
I am currently using PySpark 1.6.1 on my cluster. When a PySpark application
is run, the load on the workers goes higher than what was allocated to it. When I
ran top, I noticed that there were too many pyspark.daemon processes
running. There was another mail thread regarding the same issue.
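In case it is useful, my understanding is that each executor forks roughly one
pyspark.daemon worker per concurrently running task, so capping the cores per
executor is one way to keep the load close to what was allocated. A minimal
sketch, assuming a standalone cluster and a placeholder application name:

from pyspark import SparkConf, SparkContext

# Assumption: limiting cores per executor also limits how many pyspark.daemon
# workers get forked on each machine, since roughly one Python worker runs
# per concurrently executing task.
conf = (SparkConf()
        .setAppName("capped-load-example")   # placeholder app name
        .set("spark.executor.cores", "4")    # cap task slots per executor
        .set("spark.cores.max", "16"))       # cap total cores for this app
sc = SparkContext(conf=conf)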
I am using PySpark 1.6.1 for my Spark application. I have additional modules
which I am loading using the --py-files argument. I also have an h5 file
which I need to access from one of the modules for initializing
ApolloNet.
Is there any way I could access those files from the modules if I put
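In case a sketch helps with the file-access part: one common pattern is to ship
the file with --files (or sc.addFile) and resolve its local path on the
executors with SparkFiles.get. The file and archive names below are
placeholders, and this only illustrates the mechanism, not how ApolloNet
itself is initialized:

# Sketch, assuming the job was submitted with something like:
#   spark-submit --py-files modules.zip --files apollo_weights.h5 app.py
# where "modules.zip" and "apollo_weights.h5" are placeholder names.
from pyspark import SparkContext, SparkFiles

sc = SparkContext(appName="h5-access-example")

def init_on_executor(_):
    # SparkFiles.get resolves the local path of a file distributed with
    # --files / sc.addFile on whichever node this task runs.
    h5_path = SparkFiles.get("apollo_weights.h5")
    # A module could open h5_path here to initialize its model (hypothetical).
    return h5_path

print(sc.parallelize(range(2), 2).map(init_on_executor).collect())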