Hello everyone,
I recently encountered a situation where I needed to add a custom
classpath resource to my driver and access it from an included library
(specifically a configuration file for a custom DataFrame reader).
I need to use it both from inside an application that I submit to the
cluster using /spark-submit --deploy-mode cluster/ and from the Spark
shell when doing computations manually.
Adding it to the application was easy. I just distributed the file
throughout the cluster using /--files <path_to_file>/ and added it to
the driver's classpath using /--driver-class-path <filename>/. The file
would then be obtainable using, for example,
/Thread.currentThread.getContextClassLoader.getResourceAsStream(...)/.
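To make sure I understand the mechanism correctly: my mental model is that /--driver-class-path <dir>/ effectively puts the directory on the driver's classloader, after which any resource in it can be looked up by name. The following standalone Java sketch simulates that (the directory, the file name /reader.conf/, and its contents are made up for the example; no Spark involved):

```java
import java.io.InputStream;
import java.net.URL;
import java.net.URLClassLoader;
import java.nio.file.Files;
import java.nio.file.Path;

public class ClasspathDemo {

    // Read a resource by name from a classloader that has `dir` on its
    // classpath -- the same lookup the library performs on the driver.
    static String readResource(Path dir, String name) throws Exception {
        // A directory URL on a URLClassLoader acts like a classpath entry.
        URLClassLoader cl = new URLClassLoader(
                new URL[] { dir.toUri().toURL() },
                Thread.currentThread().getContextClassLoader());
        try (InputStream in = cl.getResourceAsStream(name)) {
            return new String(in.readAllBytes());
        }
    }

    public static void main(String[] args) throws Exception {
        // Simulate distributing a config file and adding its directory
        // to the driver classpath (names here are hypothetical).
        Path dir = Files.createTempDirectory("cp-demo");
        Files.writeString(dir.resolve("reader.conf"), "format=custom");
        System.out.println(readResource(dir, "reader.conf"));
    }
}
```

This works as expected in a plain JVM, which is why I suspect the difference lies in how the Spark shell wires up its own classloaders rather than in the lookup itself.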
The problem arose when I wanted to do the same thing in the Spark shell.
I tried adding the file to the classpath in the same way, as well as
with /--driver-class-path <path_to_file>/, etc., but I cannot access it
in any way, either from the library or directly from the shell.
How do the Spark shell's classloading facilities work, and how should I
solve this problem?
Thanks,
Michal