Hi everyone,

I'm running into a strange class loading issue when running a Spark job on Spark 1.0.2.

I'm running a process where some Java code is compiled dynamically into a
jar and added to the Spark context via addJar(). The jar is also added to the
class loader of the thread that created the Spark context. When I run a job
that only references the dynamically compiled class on the workers, and
converts those objects to some other type (say, integers) before collecting
the result at the driver, the job completes successfully.
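
Roughly, the working flow looks like this (a simplified sketch: compileToJar(), the TBEjava1 constructor signature, the intValue() method, and conf/inputs are stand-ins for our real code, not the actual names):

    import java.io.File;
    import java.net.URL;
    import java.net.URLClassLoader;
    import org.apache.spark.api.java.JavaSparkContext;

    File jar = compileToJar("TBEjava1");                    // dynamic compilation step
    URLClassLoader loader = new URLClassLoader(
        new URL[] { jar.toURI().toURL() },
        Thread.currentThread().getContextClassLoader());
    Thread.currentThread().setContextClassLoader(loader);   // driver thread can now see the class

    JavaSparkContext sc = new JavaSparkContext(conf);
    sc.addJar(jar.getAbsolutePath());                        // ship the jar to the workers

    // Succeeds: TBEjava1 is only touched on the workers, and plain
    // integers are what come back to the driver.
    int total = sc.parallelize(inputs)
        .map(x -> {
            Class<?> cls = Class.forName("TBEjava1", true,
                Thread.currentThread().getContextClassLoader());
            Object obj = cls.getConstructor(Integer.class).newInstance(x);
            return (Integer) cls.getMethod("intValue").invoke(obj);
        })
        .reduce((a, b) -> a + b);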

We're using Kryo serialization. When I run a similar workflow but bring
objects containing fields of the dynamically compiled class's type back to
the driver (say, via collect()), the job fails with the following exception:

"Caused by: java.lang.RuntimeException:
java.security.PrivilegedActionException: org.apache.spark.SparkException:
Job aborted due to stage failure: Exception while getting task result:
com.esotericsoftware.kryo.KryoException: Unable to find class: TBEjava1"

where "TBEjava1" is the name of the dynamically compiled class.
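
Continuing the sketch above, the failing shape is roughly this (Wrapper is a made-up stand-in for our real result type, which holds a field whose runtime type is TBEjava1):

    import java.util.List;
    import org.apache.spark.api.java.JavaRDD;

    JavaRDD<Wrapper> rdd = sc.parallelize(inputs).map(x -> {
        Object dyn = Class.forName("TBEjava1", true,
            Thread.currentThread().getContextClassLoader())
            .getConstructor(Integer.class).newInstance(x);
        return new Wrapper(dyn);           // field's runtime type is TBEjava1
    });
    List<Wrapper> results = rdd.collect(); // Kryo now has to resolve TBEjava1 on the driver side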

Here is what I can deduce from my debugging:
1. The class is loaded on the thread that launches the Spark context and
calls reduce(). To check, I put a breakpoint right before my reduce() call
and used Class.forName("TBEjava1", false,
Thread.currentThread().getContextClassLoader()); and got back a valid class
object without ClassNotFoundException being thrown.
2. The worker threads can also refer to the class. I put breakpoints in the
worker methods (using local[N] mode for the context for now) and they
complete the mapping and reducing functions successfully.
3. The Kryo serializer calls readClass() and then calls Class.forName(...)
inside Kryo, using the class loader held by that Kryo object. The
ClassNotFoundException is thrown there, although the stack trace doesn't
show it that way. (I sketch a minimal non-Spark reproduction of this
symptom right after this list.)
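
As far as I can tell, the symptom in point 3 boils down to what you would see with plain Kryo whenever the reading Kryo instance holds a class loader that can't see the class. Something like the following, outside Spark (jarLoader and tbeInstance stand for the dynamic jar's loader and an instance of the dynamically compiled class; Spark 1.0.2 bundles Kryo 2.x, where unregistered classes are written by name by default):

    import com.esotericsoftware.kryo.Kryo;
    import com.esotericsoftware.kryo.io.Input;
    import com.esotericsoftware.kryo.io.Output;
    import java.io.ByteArrayInputStream;
    import java.io.ByteArrayOutputStream;

    Kryo writeKryo = new Kryo();
    writeKryo.setClassLoader(jarLoader);               // can see TBEjava1
    ByteArrayOutputStream bytes = new ByteArrayOutputStream();
    Output out = new Output(bytes);
    writeKryo.writeClassAndObject(out, tbeInstance);   // records the class name
    out.close();

    Kryo readKryo = new Kryo();                        // default loader, like the result-getter threads
    Input in = new Input(new ByteArrayInputStream(bytes.toByteArray()));
    readKryo.readClassAndObject(in);                   // KryoException: Unable to find class: TBEjava1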
I'm wondering if we might be running into
https://issues.apache.org/jira/browse/SPARK-3046 or something similar. I
looked a bit at the Spark code, and from what I understand, the thread pool
created for the task result getter does not inherit the context class loader
of the thread that created the SparkContext, which would explain why the
task result getter threads can't find classes even though they are available
via addJar().
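
If that's right, it would be consistent with how context class loaders generally behave in the JDK: a thread picks up its context class loader when it is constructed, so a pool whose worker threads were created without the right loader in place never sees classes that only some other thread's loader can find. A plain-Java illustration of that mechanism (nothing Spark-specific; the URLClassLoader is an empty stand-in for the loader holding the dynamic jar):

    import java.net.URL;
    import java.net.URLClassLoader;
    import java.util.concurrent.Callable;
    import java.util.concurrent.ExecutorService;
    import java.util.concurrent.Executors;

    public class PoolLoaderDemo {
        public static void main(String[] args) throws Exception {
            ExecutorService pool = Executors.newFixedThreadPool(1);
            pool.submit(() -> { }).get();                // force the worker thread to be created now

            // Stand-in for the loader that holds the dynamically compiled jar.
            ClassLoader jarLoader = new URLClassLoader(new URL[0]);
            Thread.currentThread().setContextClassLoader(jarLoader);

            ClassLoader seenByPool = pool.submit(
                (Callable<ClassLoader>) () -> Thread.currentThread().getContextClassLoader()).get();

            System.out.println(seenByPool == jarLoader); // false: the worker kept the loader it was born with
            pool.shutdown();
        }
    }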

Any suggestions for a workaround? Feel free to correct me on any incorrect
observations I've made as well.

Thanks,

-Matt Cheah
