Dan LaRocque created TINKERPOP3-988:
---------------------------------------
Summary: SparkGraphComputer.submit shouldn't use ForkJoinPool.commonPool
Key: TINKERPOP3-988
URL: https://issues.apache.org/jira/browse/TINKERPOP3-988
Project: TinkerPop 3
Issue Type: Bug
Affects Versions: 3.1.0-incubating
Reporter: Dan LaRocque
{{SparkGraphComputer.submit}} delegates most of its work to a closure that
executes on the common fork/join pool ({{ForkJoinPool.commonPool}}). The
closure does a lot of work and calls into both Spark and Hadoop.
This approach has two problems:
1. Inability to customize the context classloader used within the closure
The context classloader of the thread that called {{submit}} is not necessarily
the same as the context classloader of the common fork/join pool threads. This
matters because several pieces of code reachable from {{submit}}'s closure rely
on the context classloader. SparkMemory is one; Hadoop's UserGroupInformation
(UGI) is another, depending on the credentials configuration (UGI is reached
indirectly via {{FileSystem.get}}). In effect, the caller has to use whatever
context classloader the common fork/join pool threads are currently using, or
risk failures such as nonsensical-looking ClassCastExceptions.
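One way to avoid depending on the pool's classloader is to run the closure on a dedicated executor whose thread is given the caller's context classloader explicitly. The following is only a minimal sketch of that idea, not the actual TinkerPop code: the class name {{CallerClassLoaderExecutor}} and both helper methods are hypothetical, and the empty {{URLClassLoader}} merely stands in for whatever custom classloader the caller uses.

```java
import java.net.URL;
import java.net.URLClassLoader;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// Hypothetical helper, for illustration only.
class CallerClassLoaderExecutor {

    // Builds a single-thread executor whose worker uses the context
    // classloader (CCL) of the thread that calls this method.
    static ExecutorService newExecutorWithCallerCcl() {
        final ClassLoader callerCcl = Thread.currentThread().getContextClassLoader();
        return Executors.newSingleThreadExecutor(runnable -> {
            Thread worker = new Thread(runnable, "submit-worker");
            worker.setContextClassLoader(callerCcl); // explicit, not left to inheritance
            worker.setDaemon(true);
            return worker;
        });
    }

    // Runs a trivial task on the executor and reports the CCL it observed.
    static ClassLoader cclSeenByTask(ExecutorService executor) {
        return CompletableFuture
                .supplyAsync(() -> Thread.currentThread().getContextClassLoader(), executor)
                .join();
    }

    public static void main(String[] args) {
        // Stand-in for a caller-specific classloader.
        ClassLoader custom = new URLClassLoader(
                new URL[0], CallerClassLoaderExecutor.class.getClassLoader());
        Thread.currentThread().setContextClassLoader(custom);

        ExecutorService executor = newExecutorWithCallerCcl();
        try {
            // The task sees the caller's classloader, not the pool's.
            System.out.println(cclSeenByTask(executor) == custom);
        } finally {
            executor.shutdown();
        }
    }
}
```

Passing an explicit executor to {{CompletableFuture.supplyAsync}} keeps the asynchronous style while taking the common pool out of the picture. Whether a single thread per submission is appropriate is a separate design question.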
2. Inability to override the context classloader inside the closure
When {{System.getSecurityManager() != null}}, the common fork/join pool switches
from its default worker thread factory to a more restrictive alternative called
InnocuousForkJoinWorkerThreadFactory. Threads created by this factory can't
call {{setContextClassLoader}}; attempting to do so throws a SecurityException.
However, UserGroupInformation.newLoginContext must be able to call
{{setContextClassLoader}}: it saves the context classloader (CCL) to a
variable, does some work, then restores the CCL from that variable. This is
impossible if {{setContextClassLoader}} throws a SecurityException. So, if a
security manager is present in the VM, {{submit}}'s closure can die in
{{FileSystem.get}} -> UGI before any useful work even begins.
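The failure mode can be reproduced in isolation without a security manager. The sketch below is an approximation under stated assumptions: {{RestrictedThread}} is a hypothetical stand-in for the pool's restricted worker threads (it simply rejects every {{setContextClassLoader}} call), and {{withContextClassLoader}} mirrors the save / swap / restore pattern described above rather than reproducing Hadoop's actual code.

```java
import java.util.concurrent.atomic.AtomicReference;

// Hypothetical demo class, for illustration only.
class RestrictedThreadDemo {

    // Stand-in for a restricted pool thread: any attempt to change the
    // context classloader (CCL) is rejected.
    static final class RestrictedThread extends Thread {
        RestrictedThread(Runnable task) {
            super(task);
        }

        @Override
        public void setContextClassLoader(ClassLoader cl) {
            throw new SecurityException("setContextClassLoader");
        }
    }

    // Save the CCL, swap in a temporary one, do the work, restore the original.
    static void withContextClassLoader(ClassLoader temporary, Runnable work) {
        Thread current = Thread.currentThread();
        ClassLoader saved = current.getContextClassLoader();
        current.setContextClassLoader(temporary); // throws on a restricted thread
        try {
            work.run();
        } finally {
            current.setContextClassLoader(saved);
        }
    }

    // Runs the pattern on a restricted thread and returns the failure, if any.
    static Throwable runOnRestrictedThread() {
        AtomicReference<Throwable> failure = new AtomicReference<>();
        Thread thread = new RestrictedThread(() -> {
            try {
                withContextClassLoader(RestrictedThreadDemo.class.getClassLoader(), () -> { });
            } catch (SecurityException e) {
                failure.set(e);
            }
        });
        thread.start();
        try {
            thread.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return failure.get();
    }

    public static void main(String[] args) {
        // Reports the SecurityException raised before any work could run.
        System.out.println(runOnRestrictedThread());
    }
}
```

On an ordinary thread the same helper completes and restores the original classloader, so the pattern itself is sound. It only breaks when the executing thread forbids the call.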
I set the Affects Version to the version on which I observed it, but it might
affect earlier versions too.