I don't think a custom classloader is necessary. Well, it occurs to me that this is no new problem: Hadoop, Tomcat, etc. all run custom user code that creates new user objects without reflection. I should go see how that's done. Maybe it's totally valid to set the thread's context classloader for just this purpose, and I am not thinking clearly.
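(For reference, the "check yourself before your parent" delegation that Andrew describes below would look roughly like this. This is a sketch of the general container pattern, not Spark's or Tomcat's actual code; the class name is hypothetical, and a real implementation would also have to exempt more than the `java.` namespace.)

```java
import java.net.URL;
import java.net.URLClassLoader;

// A parent-last (child-first) classloader sketch: try this loader's own
// URLs before delegating to the parent, the reverse of the default
// ClassLoader delegation model.
public class ChildFirstClassLoader extends URLClassLoader {
    public ChildFirstClassLoader(URL[] urls, ClassLoader parent) {
        super(urls, parent);
    }

    @Override
    protected Class<?> loadClass(String name, boolean resolve)
            throws ClassNotFoundException {
        synchronized (getClassLoadingLock(name)) {
            // Already defined by this loader?
            Class<?> c = findLoadedClass(name);
            // Core classes must always come from the parent chain.
            if (c == null && !name.startsWith("java.")) {
                try {
                    // Check our own URLs first...
                    c = findClass(name);
                } catch (ClassNotFoundException e) {
                    // ...and only then fall back to the parent below.
                }
            }
            if (c == null) {
                c = super.loadClass(name, false);
            }
            if (resolve) {
                resolveClass(c);
            }
            return c;
        }
    }
}
```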
On Mon, May 19, 2014 at 8:26 AM, Andrew Ash <and...@andrewash.com> wrote:
> Sounds like the problem is that classloaders always look in their parents
> before themselves, and Spark users want executors to pick up classes from
> their custom code before the ones in Spark plus its dependencies.
>
> Would a custom classloader that delegates to the parent only after first
> checking itself fix this up?
>
>
> On Mon, May 19, 2014 at 12:17 AM, DB Tsai <dbt...@stanford.edu> wrote:
>
>> Hi Sean,
>>
>> It's true that the issue here is the classloader, and due to the
>> classloader delegation model, users have to use reflection in the
>> executors to pick up the classloader in order to use the classes added
>> by the sc.addJars API. However, this is very inconvenient for users, and
>> it isn't documented in Spark.
>>
>> I'm working on a patch that solves it by calling the protected method
>> addURL in URLClassLoader to update the current default classloader, so
>> no custom classloader is needed anymore. I wonder if this is a good way
>> to go.
>>
>> private def addURL(url: URL, loader: URLClassLoader) {
>>   try {
>>     val method: Method =
>>       classOf[URLClassLoader].getDeclaredMethod("addURL", classOf[URL])
>>     method.setAccessible(true)
>>     method.invoke(loader, url)
>>   } catch {
>>     case t: Throwable =>
>>       throw new IOException("Error, could not add URL to system classloader")
>>   }
>> }
>>
>> Sincerely,
>>
>> DB Tsai
>> -------------------------------------------------------
>> My Blog: https://www.dbtsai.com
>> LinkedIn: https://www.linkedin.com/in/dbtsai
>>
>>
>> On Sun, May 18, 2014 at 11:57 PM, Sean Owen <so...@cloudera.com> wrote:
>>
>> > I might be stating the obvious for everyone, but the issue here is not
>> > reflection or the source of the JAR, but the ClassLoader. The basic
>> > rules are these.
>> >
>> > "new Foo" will use the ClassLoader that defines Foo.
>> > This is usually
>> > the ClassLoader that loaded whatever it is that first referenced Foo
>> > and caused it to be loaded -- usually the ClassLoader holding your
>> > other app classes.
>> >
>> > ClassLoaders can have a parent-child relationship. ClassLoaders always
>> > look in their parent before themselves.
>> >
>> > (Careful then -- in contexts like Hadoop or Tomcat where your app is
>> > loaded in a child ClassLoader, and you reference a class that Hadoop
>> > or Tomcat also has (like a lib class), you will get the container's
>> > version!)
>> >
>> > When you load an external JAR, it has a separate ClassLoader which does
>> > not necessarily bear any relation to the one containing your app
>> > classes, so yeah, it is not generally going to make "new Foo" work.
>> >
>> > Reflection lets you pick the ClassLoader, yes.
>> >
>> > I would not call setContextClassLoader.
>> >
>> > On Mon, May 19, 2014 at 12:00 AM, Sandy Ryza <sandy.r...@cloudera.com>
>> > wrote:
>> > > I spoke with DB offline about this a little while ago and he
>> > > confirmed that he was able to access the jar from the driver.
>> > >
>> > > The issue appears to be a general Java issue: you can't directly
>> > > instantiate a class from a dynamically loaded jar.
>> > >
>> > > I reproduced it locally outside of Spark with:
>> > > ---
>> > > URLClassLoader urlClassLoader = new URLClassLoader(new URL[] {
>> > >     new File("myotherjar.jar").toURI().toURL() }, null);
>> > > Thread.currentThread().setContextClassLoader(urlClassLoader);
>> > > MyClassFromMyOtherJar obj = new MyClassFromMyOtherJar();
>> > > ---
>> > >
>> > > I was able to load the class with reflection.
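Sean's rules can be demonstrated without a jar at all. The sketch below is mine, not from the thread: it replaces Sandy's `myotherjar.jar` with a loader whose parent is null (so it sees only bootstrap classes), and shows that setting the context classloader has no effect on what `new` binds to.

```java
import java.net.URL;
import java.net.URLClassLoader;

public class LoaderVisibility {
    // Hypothetical app class, defined by the application classloader.
    static class Foo {}

    public static void main(String[] args) throws Exception {
        // Like Sandy's repro: a loader with a null parent sees only the
        // bootstrap classes (its URL list here is empty for simplicity).
        URLClassLoader isolated = new URLClassLoader(new URL[0], null);

        // Bootstrap classes are visible through any loader.
        System.out.println(Class.forName("java.lang.String", false, isolated)
                .getName());  // prints java.lang.String

        // "new Foo()" works here because it uses Foo's defining loader,
        // the same app loader that loaded LoaderVisibility...
        Foo f = new Foo();

        // ...and setting the context classloader changes none of that;
        // the isolated loader still cannot see Foo.
        Thread.currentThread().setContextClassLoader(isolated);
        try {
            Class.forName(LoaderVisibility.class.getName() + "$Foo",
                    false, isolated);
            System.out.println("visible");
        } catch (ClassNotFoundException e) {
            System.out.println("not visible to isolated loader");
        }
    }
}
```

This is why reflection "works" in the executor case: `Class.forName(name, true, loader)` lets you name the loader that actually holds the added jar, whereas `new` is resolved against the defining loader of the calling class.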