Limiting Pyspark.daemons

2016-07-04 Thread ar7
Hi, I am currently using PySpark 1.6.1 on my cluster. When a PySpark application runs, the load on the workers seems to exceed what was allocated to it. When I ran top, I noticed that there were too many pyspark.daemon processes running. There was another mail thread regarding the same issue:
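One way to cap the number of concurrent pyspark.daemon workers is to limit how many tasks each executor may run at once, since Spark forks one Python worker per concurrently running task. A minimal sketch, assuming a standalone cluster and the standard spark.executor.cores, spark.cores.max and spark.python.worker.reuse settings (the archived thread is cut off, so this is one common approach, not necessarily the answer given there):

    from pyspark import SparkConf, SparkContext

    # Cap concurrent tasks per executor; each concurrently running Python task
    # forks its own pyspark.daemon worker, so this also caps the daemon count.
    conf = (SparkConf()
            .setAppName("limit-python-workers")
            .set("spark.executor.cores", "2")          # at most 2 concurrent tasks per executor
            .set("spark.cores.max", "8")               # total cores the app may take (standalone mode)
            .set("spark.python.worker.reuse", "true")) # reuse Python workers between tasks (default)

    sc = SparkContext(conf=conf)

The same properties can be passed on the command line with spark-submit --conf, e.g. --conf spark.executor.cores=2.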

Adding h5 files in a zip to use with PySpark

2016-06-15 Thread ar7
I am using PySpark 1.6.1 for my Spark application. I have additional modules which I am loading using the argument --py-files. I also have an h5 file which I need to access from one of the modules for initializing the ApolloNet. Is there any way I could access those files from the modules if I put
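One common approach, sketched below, is to ship the data file separately with SparkContext.addFile (or --files on spark-submit) and resolve its node-local path with SparkFiles.get, rather than bundling it inside the --py-files zip. The file name weights.h5 and the local path are placeholders, and the comment about h5py is only indicative of how the module that builds ApolloNet might open the file:

    from pyspark import SparkContext, SparkFiles

    sc = SparkContext(appName="ship-h5-sketch")

    # Distribute the data file to every executor; "weights.h5" is a placeholder name.
    sc.addFile("/local/path/to/weights.h5")

    def locate_h5(_):
        # SparkFiles.get resolves the node-local copy of the shipped file,
        # so the module initializing ApolloNet could open it, e.g. with h5py.
        return SparkFiles.get("weights.h5")

    print(sc.parallelize([0], 1).map(locate_h5).first())

With spark-submit the equivalent is --py-files modules.zip --files weights.h5, after which SparkFiles.get("weights.h5") works the same way from inside the shipped modules.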