Hive,
I am trying out Hive on Spark with Hive 1.2.1 and Spark 1.5.2. Could
someone help me with this? Thanks!
These are my steps:
1. Build Spark 1.5.2 without Hive and the Hive Thrift Server. At this point, I can
use it to submit applications with spark-submit --master yarn-client.
2. Add the built Spark assembly jar to $HIVE_HOME/lib.
3. Start Hive and set the following parameters:
hive > set spark.master=yarn-client
hive > set spark.executor.memory=512M
hive > set spark.driver.memory=512M
hive > set spark.executor.instances=1
4. Run a simple query: select count(1) from t1;
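For reference, steps 3 and 4 can also be run as one non-interactive command. This is only a sketch: it assumes the hive command is on the PATH, and it sets hive.execution.engine=spark explicitly, which the steps above do not show (it is assumed to be configured elsewhere, e.g. in hive-site.xml). The function name run_count_on_spark is made up for illustration.

```shell
# Sketch: pass the session settings from step 3 as --hiveconf flags
# and run the query from step 4 with -e.
run_count_on_spark() {
  # Default to the query from step 4; a different query can be passed as $1.
  query="${1:-select count(1) from t1;}"

  # hive.execution.engine=spark is an ASSUMPTION (not listed in the steps above).
  hive \
    --hiveconf hive.execution.engine=spark \
    --hiveconf spark.master=yarn-client \
    --hiveconf spark.executor.memory=512M \
    --hiveconf spark.driver.memory=512M \
    --hiveconf spark.executor.instances=1 \
    -e "$query"
}
```

Calling run_count_on_spark with no argument runs the same count query as step 4.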
The job fails with the following error:
===============================================================================
YARN executor launch context:
env:
CLASSPATH ->
{{PWD}}<CPS>{{PWD}}/__spark__.jar<CPS>$HADOOP_CONF_DIR<CPS>$HADOOP_COMMON_HOME/share/hadoop/common/*<CPS>$HADOOP_COMMON_HOME/share/hadoop/common/lib/*<CPS>$HADOOP_HDFS_HOME/share/hadoop/hdfs/*<CPS>$HADOOP_HDFS_HOME/share/hadoop/hdfs/lib/*<CPS>$HADOOP_YARN_HOME/share/hadoop/yarn/*<CPS>$HADOOP_YARN_HOME/share/hadoop/yarn/lib/*<CPS>$HADOOP_MAPRED_HOME/share/hadoop/mapreduce/*<CPS>$HADOOP_MAPRED_HOME/share/hadoop/mapreduce/lib/*
SPARK_LOG_URL_STDERR ->
http://hadoop-Aspire-TC-606:8042/node/containerlogs/container_1452320323183_0007_01_000003/hadoop/stderr?start=-4096
SPARK_YARN_STAGING_DIR -> .sparkStaging/application_1452320323183_0007
SPARK_YARN_CACHE_FILES_FILE_SIZES -> 142746538
SPARK_USER -> hadoop
SPARK_YARN_CACHE_FILES_VISIBILITIES -> PRIVATE
SPARK_YARN_MODE -> true
SPARK_YARN_CACHE_FILES_TIME_STAMPS -> 1452496343550
SPARK_LOG_URL_STDOUT ->
http://hadoop-Aspire-TC-606:8042/node/containerlogs/container_1452320323183_0007_01_000003/hadoop/stdout?start=-4096
SPARK_YARN_CACHE_FILES ->
hdfs://hadoop.bit.com:9000/user/hadoop/.sparkStaging/application_1452320323183_0007/spark-assembly-1.5.2-hadoop2.6.0.jar#__spark__.jar
command:
{{JAVA_HOME}}/bin/java -server -XX:OnOutOfMemoryError='kill %p' -Xms512m
-Xmx512m
'-Dhive.spark.log.dir=/home/hadoop/software/bigdata/spark-1.5.2-bin-hadoop2.6.0-withouthive/logs/'
-Djava.io.tmpdir={{PWD}}/tmp '-Dspark.driver.port=43675'
-Dspark.yarn.app.container.log.dir=<LOG_DIR>
org.apache.spark.executor.CoarseGrainedExecutorBackend --driver-url
akka.tcp://[email protected]:43675/user/CoarseGrainedScheduler
--executor-id 2 --hostname hadoop-Aspire-TC-606 --cores 1 --app-id
application_1452320323183_0007 --user-class-path file:$PWD/__app__.jar 1>
<LOG_DIR>/stdout 2> <LOG_DIR>/stderr
===============================================================================
16/01/11 15:12:37 INFO impl.ContainerManagementProtocolProxy: Opening proxy :
hadoop-Aspire-TC-606:50804
16/01/11 15:12:40 INFO yarn.YarnAllocator: Completed container
container_1452320323183_0007_01_000003 (state: COMPLETE, exit status: 1)
16/01/11 15:12:40 INFO yarn.YarnAllocator: Container marked as failed:
container_1452320323183_0007_01_000003. Exit status: 1. Diagnostics: Exception
from container-launch.
Container id: container_1452320323183_0007_01_000003
Exit code: 1
Stack trace: ExitCodeException exitCode=1:
at org.apache.hadoop.util.Shell.runCommand(Shell.java:538)
at org.apache.hadoop.util.Shell.run(Shell.java:455)
at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:715)
at org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor.launchContainer(DefaultContainerExecutor.java:211)
at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:302)
at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:82)
at java.util.concurrent.FutureTask.run(FutureTask.java:262)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)
Container exited with a non-zero exit code 1
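
The NodeManager trace above only reports the exit code; the executor's own stderr (the SPARK_LOG_URL_STDERR link in the launch context) probably has the real exception. A sketch of pulling the container logs with the YARN CLI, assuming log aggregation is enabled (yarn.log-aggregation-enable=true) and the yarn command is on the PATH. The function name fetch_executor_logs is hypothetical.

```shell
# Sketch: dump the aggregated container logs for the failed application.
fetch_executor_logs() {
  # Default to the application id from the log above; another id can be passed as $1.
  app_id="${1:-application_1452320323183_0007}"

  # Prints stdout/stderr of every container of the application.
  yarn logs -applicationId "$app_id"
}
```

The output can be long, so piping it through a pager or grep (for example fetch_executor_logs | grep -i exception) may help.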