Hi guys, I'm trying to run a Spark job written in Python on a YARN cluster (managed by Cloudera Manager). The Python script uses a set of Python programs installed on each cluster node, and those programs need a properties file located at a local filesystem path.
My problem is: when the script is submitted with spark-submit, the programs can't find the properties file. When I run the script locally as a standalone job, there is no problem and the file is found. My questions are: 1 - What is the problem here? 2 - In this scenario (a script running on a Spark YARN cluster that uses Python programs sharing the same properties file), what is the best approach? Thanks, taka
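For context, one common way to handle this kind of setup (a sketch only, assuming a file name `app.properties` and script name `my_job.py`, which are placeholders for my actual files) would be to ship the properties file with the job using spark-submit's `--files` option, so YARN copies it into each executor's working directory:

```shell
# Ship the properties file to every executor's working directory.
# "app.properties" and "my_job.py" are placeholder names.
spark-submit \
  --master yarn \
  --deploy-mode cluster \
  --files /local/path/app.properties \
  my_job.py
```

Inside the job, the localized copy can then be resolved with pyspark's `SparkFiles.get("app.properties")`, or via the relative path `app.properties`, since YARN places `--files` artifacts in the container's working directory. Is something along these lines the recommended approach here?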