When one runs Spark in local mode (a single JVM) on an edge node (the host from which users access the cluster), it is possible to put an additional JAR file, say one for accessing Oracle RDBMS tables, on $SPARK_CLASSPATH. This works:
export SPARK_CLASSPATH=~/user_jars/ojdbc6.jar

Normally a group of users has read access to a shared directory like the one above, and once they log in their shell sources an environment file that sets this classpath along with additional parameters such as $JAVA_HOME.

However, if the user chooses to run Spark through spark-submit on YARN, then the only way I have found in my research to make this work is to add the JAR path on every node of the Spark cluster in $SPARK_HOME/conf/spark-defaults.conf:

  spark.executor.extraClassPath /user_jars/ojdbc6.jar

Note that setting both spark.executor.extraClassPath and SPARK_CLASSPATH will cause an initialisation error:

  ERROR SparkContext: Error initializing SparkContext.
  org.apache.spark.SparkException: Found both spark.executor.extraClassPath and SPARK_CLASSPATH. Use only the former.

I was wondering if there are other ways of making this work in YARN mode, where every node of the cluster would otherwise require this JAR file?

Thanks
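For concreteness, a sketch of the two setups described above is shown below. The application JAR name and main class are hypothetical placeholders, and the spark-submit invocation uses the standard --jars option, which ships the listed JAR with the application so YARN distributes it to the executors; this is one commonly used alternative to editing spark-defaults.conf on every node, though I cannot say it covers every case:

```shell
# Local mode on the edge node: a shared environment file sourced at
# login puts the Oracle driver JAR on the classpath.
# (~/user_jars is the shared read-only directory from the post.)
export SPARK_CLASSPATH="$HOME/user_jars/ojdbc6.jar"

# YARN mode: SPARK_CLASSPATH must be unset first, since Spark refuses
# to start when both it and spark.executor.extraClassPath are in play.
unset SPARK_CLASSPATH

# --jars uploads the driver JAR alongside the application, so it does
# not need to pre-exist on every worker node. app.jar and
# com.example.OracleReader are hypothetical placeholders.
# Guarded so the sketch is a no-op on a machine without Spark.
if command -v spark-submit >/dev/null 2>&1; then
  spark-submit \
    --master yarn \
    --deploy-mode client \
    --jars "$HOME/user_jars/ojdbc6.jar" \
    --class com.example.OracleReader \
    app.jar
fi
```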