Github user sryza commented on a diff in the pull request:
https://github.com/apache/spark/pull/119#discussion_r10507644
--- Diff: core/src/main/scala/org/apache/spark/SparkContext.scala ---
@@ -130,6 +130,16 @@ class SparkContext(
val isLocal = (master == "local" || master.startsWith("local["))
+ // Create a classLoader for use by the driver so that jars added via addJar are available to the
+ // driver. Do this before all other initialization so that any thread pools created for this
+ // SparkContext uses the class loader.
+ // Note that this is config-enabled as classloaders can introduce subtle side effects
+ private[spark] val classLoader = if (conf.getBoolean("spark.driver.loadAddedJars", false)) {
+ val loader = new SparkURLClassLoader(Array.empty[URL], this.getClass.getClassLoader)
+ Thread.currentThread.setContextClassLoader(loader)
--- End diff --
Ah, ok, I understand now. In that case, to make things simpler, would it make
sense to not set the loader on the current thread and only make the added jars
available to the SparkContext/executors? Classloader behavior can be confusing
to deal with, and keeping it as isolated as possible could make things easier
for users. This would also line up a little more with how the MR distributed
cache works: jars that get added to it don't become accessible to driver code.
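
A rough sketch of the kind of isolation I have in mind, not a proposed patch. It assumes a plain java.net.URLClassLoader as a stand-in for SparkURLClassLoader, and withContextClassLoader is a hypothetical helper name: the loader is installed only for the duration of a block and the caller's original context class loader is restored afterwards, so user driver code is left untouched.

```scala
import java.net.{URL, URLClassLoader}

object IsolatedLoaderSketch {
  // Hypothetical helper: run `body` with `loader` as the context class loader,
  // then restore whatever loader the calling thread had before.
  def withContextClassLoader[T](loader: ClassLoader)(body: => T): T = {
    val thread = Thread.currentThread
    val previous = thread.getContextClassLoader
    thread.setContextClassLoader(loader)
    try body finally thread.setContextClassLoader(previous)
  }

  def main(args: Array[String]): Unit = {
    val before = Thread.currentThread.getContextClassLoader
    // Stand-in for SparkURLClassLoader (assumption): no URLs added yet.
    val loader = new URLClassLoader(Array.empty[URL], getClass.getClassLoader)

    // Only code inside the block sees the new loader.
    val seenInside = withContextClassLoader(loader) {
      Thread.currentThread.getContextClassLoader
    }

    println(seenInside eq loader)                                  // true
    println(Thread.currentThread.getContextClassLoader eq before) // true
  }
}
```

Thread pools created inside such a block would pick up the loader through their thread factory, while the thread that constructed the SparkContext keeps its original one.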