[ https://issues.apache.org/jira/browse/HIVE-9425?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14288666#comment-14288666 ]
Xin Hao commented on HIVE-9425:
-------------------------------
Double-checked with the Big-Bench Q1 case (which includes the HQL statement 'ADD FILE ${env:BIG_BENCH_QUERIES_DIR}/Resources/bigbenchqueriesmr.jar;'), and it failed with the latest code on the Spark branch.
Error message in the Hive log:
====================================================================================================
2015-01-23 10:19:21,205 INFO [main]: exec.Task (SessionState.java:printInfo(852)) - set hive.exec.reducers.max=<number>
2015-01-23 10:19:21,205 INFO [main]: exec.Task (SessionState.java:printInfo(852)) - In order to set a constant number of reducers:
2015-01-23 10:19:21,206 INFO [main]: exec.Task (SessionState.java:printInfo(852)) - set mapreduce.job.reduces=<number>
2015-01-23 10:19:21,208 INFO [main]: log.PerfLogger (PerfLogger.java:PerfLogBegin(121)) - <PERFLOG method=SparkSubmitJob from=org.apache.hadoop.hive.ql.exec.spark.SparkTask>
2015-01-23 10:19:21,278 INFO [main]: ql.Context (Context.java:getMRScratchDir(328)) - New scratch dir is hdfs://bhx1:8020/tmp/hive/root/0357a036-8988-489b-85cf-329023a567c7/hive_2015-01-23_10-18-27_797_5566502876180681874-1
2015-01-23 10:19:21,432 WARN [RPC-Handler-3]: rpc.RpcDispatcher (RpcDispatcher.java:handleError(142)) - Received error message:java.io.FileNotFoundException: /HiveOnSpark/Big-Bench/engines/hive/queries/Resources/bigbenchqueriesmr.jar (No such file or directory)
    at java.io.FileInputStream.open(Native Method)
    at java.io.FileInputStream.<init>(FileInputStream.java:146)
    at org.spark-project.guava.common.io.Files$FileByteSource.openStream(Files.java:124)
    at org.spark-project.guava.common.io.Files$FileByteSource.openStream(Files.java:114)
    at org.spark-project.guava.common.io.ByteSource.copyTo(ByteSource.java:202)
    at org.spark-project.guava.common.io.Files.copy(Files.java:436)
    at org.apache.spark.HttpFileServer.addFileToDir(HttpFileServer.scala:72)
    at org.apache.spark.HttpFileServer.addFile(HttpFileServer.scala:55)
    at org.apache.spark.SparkContext.addFile(SparkContext.scala:961)
    at org.apache.spark.api.java.JavaSparkContext.addFile(JavaSparkContext.scala:646)
    at org.apache.hive.spark.client.SparkClientImpl$AddFileJob.call(SparkClientImpl.java:553)
    at org.apache.hive.spark.client.RemoteDriver$DriverProtocol.handle(RemoteDriver.java:305)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:606)
    at org.apache.hive.spark.client.rpc.RpcDispatcher.handleCall(RpcDispatcher.java:120)
    at org.apache.hive.spark.client.rpc.RpcDispatcher.channelRead0(RpcDispatcher.java:79)
    at io.netty.channel.SimpleChannelInboundHandler.channelRead(SimpleChannelInboundHandler.java:105)
    at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:333)
    at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:319)
    at io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:163)
    at io.netty.handler.codec.ByteToMessageCodec.channelRead(ByteToMessageCodec.java:108)
    at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:333)
    at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:319)
    at io.netty.channel.ChannelInboundHandlerAdapter.channelRead(ChannelInboundHandlerAdapter.java:86)
    at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:333)
    at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:319)
    at io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:787)
    at io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:130)
    at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:511)
    at io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:468)
    at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:382)
    at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:354)
    at io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:116)
    at java.lang.Thread.run(Thread.java:745)
.
2015-01-23 10:19:21,606 INFO [main]: log.PerfLogger (PerfLogger.java:PerfLogEnd(148)) - </PERFLOG method=SparkSubmitJob start=1421979561208 end=1421979561606 duration=398 from=org.apache.hadoop.hive.ql.exec.spark.SparkTask>
====================================================================================================
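Reading the trace above: the FileNotFoundException is raised on the remote side. In yarn-cluster mode the RemoteDriver runs on an arbitrary YARN node, so when SparkClientImpl$AddFileJob calls JavaSparkContext.addFile with a path that exists only on the Hive client host, the driver cannot open it. A minimal sketch of that failure mode, assuming a Spark-on-YARN classpath (the class name here is made up for illustration; the path is the one from the log):

    // Illustrative sketch only, not code from Hive; class name is hypothetical.
    import org.apache.spark.SparkConf;
    import org.apache.spark.api.java.JavaSparkContext;

    public class AddFileRepro {
        public static void main(String[] args) {
            JavaSparkContext jsc =
                    new JavaSparkContext(new SparkConf().setAppName("add-file-repro"));
            // In yarn-cluster mode this runs on a YARN node, not on the Hive
            // client host, so a client-local path is not visible here and
            // addFile fails with java.io.FileNotFoundException, as in the log.
            jsc.addFile("/HiveOnSpark/Big-Bench/engines/hive/queries/Resources/bigbenchqueriesmr.jar");
            jsc.stop();
        }
    }

This matches the issue title below: the same call would resolve when the driver runs on the client, since the driver then shares the client's local filesystem.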
> External Function Jar files are not available for Driver when running with yarn-cluster mode [Spark Branch]
> -----------------------------------------------------------------------------------------------------------
>
> Key: HIVE-9425
> URL: https://issues.apache.org/jira/browse/HIVE-9425
> Project: Hive
> Issue Type: Sub-task
> Components: spark-branch
> Reporter: Xiaomin Zhang
>
> 15/01/20 00:27:31 INFO cluster.YarnClusterScheduler: YarnClusterScheduler.postStartHook done
> 15/01/20 00:27:31 ERROR spark.SparkContext: Error adding jar (java.io.FileNotFoundException: hive-exec-0.15.0-SNAPSHOT.jar (No such file or directory)), was the --addJars option used?
> 15/01/20 00:27:31 ERROR spark.SparkContext: Error adding jar (java.io.FileNotFoundException: opennlp-maxent-3.0.3.jar (No such file or directory)), was the --addJars option used?
> 15/01/20 00:27:31 ERROR spark.SparkContext: Error adding jar (java.io.FileNotFoundException: bigbenchqueriesmr.jar (No such file or directory)), was the --addJars option used?
> 15/01/20 00:27:31 ERROR spark.SparkContext: Error adding jar (java.io.FileNotFoundException: opennlp-tools-1.5.3.jar (No such file or directory)), was the --addJars option used?
> 15/01/20 00:27:31 ERROR spark.SparkContext: Error adding jar (java.io.FileNotFoundException: jcl-over-slf4j-1.7.5.jar (No such file or directory)), was the --addJars option used?
> 15/01/20 00:27:31 INFO client.RemoteDriver: Received job request fef081b0-5408-4804-9531-d131fdd628e6
> 15/01/20 00:27:31 INFO Configuration.deprecation: mapred.max.split.size is deprecated. Instead, use mapreduce.input.fileinputformat.split.maxsize
> 15/01/20 00:27:31 INFO Configuration.deprecation: mapred.min.split.size is deprecated. Instead, use mapreduce.input.fileinputformat.split.minsize
> 15/01/20 00:27:31 INFO client.RemoteDriver: Failed to run job fef081b0-5408-4804-9531-d131fdd628e6
> org.apache.hive.com.esotericsoftware.kryo.KryoException: Unable to find class: de.bankmark.bigbench.queries.q10.SentimentUDF
> Serialization trace:
> genericUDTF (org.apache.hadoop.hive.ql.plan.UDTFDesc)
> conf (org.apache.hadoop.hive.ql.exec.UDTFOperator)
> childOperators (org.apache.hadoop.hive.ql.exec.SelectOperator)
> childOperators (org.apache.hadoop.hive.ql.exec.TableScanOperator)
> aliasToWork (org.apache.hadoop.hive.ql.plan.MapWork)
> invertedWorkGraph (org.apache.hadoop.hive.ql.plan.SparkWork)
>     at org.apache.hive.com.esotericsoftware.kryo.util.DefaultClassResolver.readName(DefaultClassResolver.java:138)
>     at org.apache.hive.com.esotericsoftware.kryo.util.DefaultClassResolver.readClass(DefaultClassResolver.java:115)
> It seems the additional jar files are not uploaded to the DistributedCache, so the Driver cannot access them.
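The conclusion above suggests the direction a fix would take: client-local resources need to be staged somewhere the remote driver (and the executors) can read before they are registered with Spark. Purely as an illustrative sketch under that assumption, and not the actual HIVE-9425 patch, staging could look like this with the Hadoop FileSystem API (the class and method names are hypothetical):

    // Hypothetical helper, not Hive code: stage a client-local jar in HDFS so
    // a yarn-cluster driver can read it, then register the returned URI.
    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;

    public class ResourceStager {
        /** Copies a local jar into an HDFS scratch dir; returns its qualified URI. */
        public static String stage(String localJar, String hdfsScratchDir) throws Exception {
            FileSystem fs = FileSystem.get(new Configuration());
            Path dst = new Path(hdfsScratchDir, new Path(localJar).getName());
            // delSrc=false, overwrite=true: keep the client copy, refresh the staged one.
            fs.copyFromLocalFile(false, true, new Path(localJar), dst);
            return fs.makeQualified(dst).toUri().toString();
        }
    }

A manual workaround in the same spirit is to hand the jars to spark-submit via --jars (or pre-place them on every node) so that YARN localizes them, rather than relying on a path that exists only on the client.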