-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/30107/
-----------------------------------------------------------
(Updated Jan. 22, 2015, 9:23 a.m.)
Review request for hive and Xuefu Zhang.
Changes
-------
The Spark driver may need to load added classes in two places: first, while
executing GetJobStatusJob, it needs to deserialize SparkWork; second, while
HiveInputFormat computes splits, it needs to deserialize MapWork.
The RemoteDriver executes AddJarJob directly in the Netty RPC thread, since it
is a SyncJobRequest, and executes GetJobStatusJob (which wraps the Spark job)
in its thread pool. HiveInputFormat's split computation may happen in an Akka
thread pool, as Spark sends messages through Akka between SparkContext and
DAGScheduler. So we may need to reset the classloaders of both kinds of
threads to enable dynamic add jar in the RSC.
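As a rough illustration of the classloader reset described above, here is a
minimal sketch. The class and method names (AddedJarClassLoaderUtil,
addJarsToContextClassLoader) are hypothetical, not the names used in the
patch; the idea is simply to wrap the current thread context classloader in a
URLClassLoader that also sees the added jars, in every thread that may
deserialize SparkWork or MapWork.

import java.io.File;
import java.net.URL;
import java.net.URLClassLoader;
import java.util.List;

// Hypothetical helper, for illustration only.
public final class AddedJarClassLoaderUtil {

  private AddedJarClassLoaderUtil() {}

  // Rebuild the calling thread's context classloader so it also sees the given
  // local jar files. Any thread that may deserialize SparkWork or MapWork
  // (the RPC thread, the job-execution pool, the split-computation path)
  // would call this before deserialization happens.
  public static void addJarsToContextClassLoader(List<String> localJarPaths)
      throws Exception {
    ClassLoader parent = Thread.currentThread().getContextClassLoader();
    URL[] urls = new URL[localJarPaths.size()];
    for (int i = 0; i < localJarPaths.size(); i++) {
      urls[i] = new File(localJarPaths.get(i)).toURI().toURL();
    }
    // A URLClassLoader that delegates to the existing loader keeps everything
    // that was already visible while adding the new jars on top.
    URLClassLoader extended = new URLClassLoader(urls, parent);
    Thread.currentThread().setContextClassLoader(extended);
  }
}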
Bugs: HIVE-9410
https://issues.apache.org/jira/browse/HIVE-9410
Repository: hive-git
Description
-------
The RemoteDriver does not have added jars in its classpath, so it fails to
deserialize SparkWork with a ClassNotFoundException. For Hive on MR, when a
jar is added through the Hive CLI, Hive adds it to the CLI classpath (through
the thread context classloader) and to the distributed cache as well. Compared
to Hive on MR, Hive on Spark has an extra RemoteDriver component, so we should
add the added jars to its classpath as well.
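For context, a sketch of how the driver side might use such a helper, again
with hypothetical names and an assumed local jar path (the actual patch adds a
SparkClientUtilities class instead):

import java.util.Arrays;
import java.util.List;

// Illustrative usage only; not the code from this patch.
public class RemoteDriverAddJarExample {
  public static void main(String[] args) throws Exception {
    // Assumed: the added jars were already downloaded to the driver's local disk.
    List<String> localJars = Arrays.asList("/tmp/hive-added-jars/custom-udf.jar");
    AddedJarClassLoaderUtil.addJarsToContextClassLoader(localJars);
    // From this point on, deserializing SparkWork or MapWork in this thread can
    // resolve classes that live in the added jars.
  }
}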
Diffs (updated)
-----
ql/src/java/org/apache/hadoop/hive/ql/exec/Utilities.java d7cb111
ql/src/java/org/apache/hadoop/hive/ql/exec/spark/RemoteHiveSparkClient.java
30a00a7
spark-client/src/main/java/org/apache/hive/spark/client/JobContext.java
00aa4ec
spark-client/src/main/java/org/apache/hive/spark/client/JobContextImpl.java
1eb3ff2
spark-client/src/main/java/org/apache/hive/spark/client/SparkClientImpl.java
5f9be65
spark-client/src/main/java/org/apache/hive/spark/client/SparkClientUtilities.java
PRE-CREATION
Diff: https://reviews.apache.org/r/30107/diff/
Testing
-------
Thanks,
chengxiang li