You can probably avoid the problem by setting the environment variable SPARK_HOME, or the JVM property spark.home, so that it points to your Spark installation.
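For example, something like this at the top of your main(), before the Driver is created (a minimal sketch; the installation path below is just a placeholder, substitute your actual Spark directory):

    // Assumption: an unpacked Spark 1.4.1 distribution lives at this
    // (hypothetical) path on the client machine.
    System.setProperty("spark.home", "/opt/spark-1.4.1-bin-hadoop2.6");

    // Equivalent alternative: export the environment variable in the
    // shell before launching the JVM:
    //   export SPARK_HOME=/opt/spark-1.4.1-bin-hadoop2.6

Either way, the Hive Spark client should then resolve bin/spark-submit under a real Spark installation instead of the relative "target/spark" path in your config.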
--Xuefu

On Thu, Mar 10, 2016 at 3:11 AM, Stana <st...@is-land.com.tw> wrote:

> I am trying out Hive on Spark with Hive 2.0.0 and Spark 1.4.1,
> executing org.apache.hadoop.hive.ql.Driver from a Java application.
>
> My setup is the following:
> 1. Build the Spark 1.4.1 assembly jar without Hive.
> 2. Upload the Spark assembly jar to the Hadoop cluster.
> 3. Run the Java application from the Eclipse IDE on my client computer.
>
> The application worked well and submitted the MR job to the YARN
> cluster successfully when using
> hiveConf.set("hive.execution.engine", "mr"), but it threw exceptions
> with the Spark engine.
>
> Finally, I traced the Hive source code and came to this conclusion:
>
> In my situation, the SparkClientImpl class generates the spark-submit
> shell command and executes it. The command sets --class to
> RemoteDriver.class.getName() and the application jar to
> SparkContext.jarOfClass(this.getClass()).get(), which is why my
> application threw the exception.
>
> Is that right? And what can I do to execute the application with the
> Spark engine successfully from my client computer? Thanks a lot!
>
>
> Java application code:
>
> import org.apache.hadoop.hive.cli.CliSessionState;
> import org.apache.hadoop.hive.conf.HiveConf;
> import org.apache.hadoop.hive.ql.CommandNeedRetryException;
> import org.apache.hadoop.hive.ql.Driver;
> import org.apache.hadoop.hive.ql.processors.CommandProcessorResponse;
> import org.apache.hadoop.hive.ql.session.SessionState;
>
> public class TestHiveDriver {
>
>     private static HiveConf hiveConf;
>     private static Driver driver;
>     private static CliSessionState ss;
>
>     public static void main(String[] args) {
>
>         String sql = "select * from hadoop0263_0 as a join hadoop0263_0 as b"
>                 + " on (a.key = b.key)";
>
>         ss = new CliSessionState(new HiveConf(SessionState.class));
>         hiveConf = new HiveConf(Driver.class);
>
>         // Hadoop / YARN cluster endpoints
>         hiveConf.set("fs.default.name", "hdfs://storm0:9000");
>         hiveConf.set("yarn.resourcemanager.address", "storm0:8032");
>         hiveConf.set("yarn.resourcemanager.scheduler.address", "storm0:8030");
>         hiveConf.set("yarn.resourcemanager.resource-tracker.address", "storm0:8031");
>         hiveConf.set("yarn.resourcemanager.admin.address", "storm0:8033");
>         hiveConf.set("mapreduce.framework.name", "yarn");
>         hiveConf.set("mapreduce.jobhistory.address", "storm0:10020");
>
>         // Metastore connection
>         hiveConf.set("javax.jdo.option.ConnectionURL",
>                 "jdbc:mysql://storm0:3306/stana_metastore");
>         hiveConf.set("javax.jdo.option.ConnectionDriverName", "com.mysql.jdbc.Driver");
>         hiveConf.set("javax.jdo.option.ConnectionUserName", "root");
>         hiveConf.set("javax.jdo.option.ConnectionPassword", "123456");
>
>         // Spark engine settings
>         hiveConf.setBoolean("hive.auto.convert.join", false);
>         hiveConf.set("spark.yarn.jar",
>                 "hdfs://storm0:9000/tmp/spark-assembly-1.4.1-hadoop2.6.0.jar");
>         hiveConf.set("spark.home", "target/spark");
>         hiveConf.set("hive.execution.engine", "spark");
>         hiveConf.set("hive.dbname", "default");
>
>         driver = new Driver(hiveConf);
>         SessionState.start(hiveConf);
>
>         CommandProcessorResponse res = null;
>         try {
>             res = driver.run(sql);
>         } catch (CommandNeedRetryException e) {
>             e.printStackTrace();
>         }
>
>         if (res != null) {
>             System.out.println("Response Code:" + res.getResponseCode());
>             System.out.println("Error Message:" + res.getErrorMessage());
>             System.out.println("SQL State:" + res.getSQLState());
>         }
>     }
> }
>
>
> Exception from the Spark engine:
>
> 16/03/10 18:32:58 INFO SparkClientImpl: Running client driver with argv:
> /Volumes/Sdhd/Documents/project/island/java/apache/hive-200-test/hive-release-2.0.0/itests/hive-unit/target/spark/bin/spark-submit
> --properties-file /var/folders/vt/cjcdhms903x7brn1kbh558s40000gn/T/spark-submit.7697089826296920539.properties
> --class org.apache.hive.spark.client.RemoteDriver
> /Users/stana/.m2/repository/org/apache/hive/hive-exec/2.0.0/hive-exec-2.0.0.jar
> --remote-host MacBook-Pro.local --remote-port 51331
> --conf hive.spark.client.connect.timeout=1000
> --conf hive.spark.client.server.connect.timeout=90000
> --conf hive.spark.client.channel.log.level=null
> --conf hive.spark.client.rpc.max.size=52428800
> --conf hive.spark.client.rpc.threads=8
> --conf hive.spark.client.secret.bits=256
> 16/03/10 18:33:09 INFO SparkClientImpl: 16/03/10 18:33:09 INFO Client:
> 16/03/10 18:33:09 INFO SparkClientImpl:     client token: N/A
> 16/03/10 18:33:09 INFO SparkClientImpl:     diagnostics: N/A
> 16/03/10 18:33:09 INFO SparkClientImpl:     ApplicationMaster host: N/A
> 16/03/10 18:33:09 INFO SparkClientImpl:     ApplicationMaster RPC port: -1
> 16/03/10 18:33:09 INFO SparkClientImpl:     queue: default
> 16/03/10 18:33:09 INFO SparkClientImpl:     start time: 1457180833494
> 16/03/10 18:33:09 INFO SparkClientImpl:     final status: UNDEFINED
> 16/03/10 18:33:09 INFO SparkClientImpl:     tracking URL: http://storm0:8088/proxy/application_1457002628102_0043/
> 16/03/10 18:33:09 INFO SparkClientImpl:     user: stana
> 16/03/10 18:33:10 INFO SparkClientImpl: 16/03/10 18:33:10 INFO Client: Application report for application_1457002628102_0043 (state: FAILED)
> 16/03/10 18:33:10 INFO SparkClientImpl: 16/03/10 18:33:10 INFO Client:
> 16/03/10 18:33:10 INFO SparkClientImpl:     client token: N/A
> 16/03/10 18:33:10 INFO SparkClientImpl:     diagnostics: Application application_1457002628102_0043 failed 1 times due to AM Container for appattempt_1457002628102_0043_000001 exited with exitCode: -1000
> 16/03/10 18:33:10 INFO SparkClientImpl: For more detailed output, check application tracking page: http://storm0:8088/proxy/application_1457002628102_0043/ Then, click on links to logs of each attempt.
> 16/03/10 18:33:10 INFO SparkClientImpl: Diagnostics: java.io.FileNotFoundException: File file:/Users/stana/.m2/repository/org/apache/hive/hive-exec/2.0.0/hive-exec-2.0.0.jar does not exist
> 16/03/10 18:33:10 INFO SparkClientImpl: Failing this attempt. Failing the application.
> 16/03/10 18:33:10 INFO SparkClientImpl:     ApplicationMaster host: N/A
> 16/03/10 18:33:10 INFO SparkClientImpl:     ApplicationMaster RPC port: -1
> 16/03/10 18:33:10 INFO SparkClientImpl:     queue: default
> 16/03/10 18:33:10 INFO SparkClientImpl:     start time: 1457180833494
> 16/03/10 18:33:10 INFO SparkClientImpl:     final status: FAILED
> 16/03/10 18:33:10 INFO SparkClientImpl:     tracking URL: http://storm0:8088/cluster/app/application_1457002628102_0043
> 16/03/10 18:33:10 INFO SparkClientImpl:     user: stana
> 16/03/10 18:33:10 INFO SparkClientImpl: Exception in thread "main" org.apache.spark.SparkException: Application application_1457002628102_0043 finished with failed status
> 16/03/10 18:33:10 INFO SparkClientImpl:     at org.apache.spark.deploy.yarn.Client.run(Client.scala:920)
> 16/03/10 18:33:10 INFO SparkClientImpl:     at org.apache.spark.deploy.yarn.Client$.main(Client.scala:966)
> 16/03/10 18:33:10 INFO SparkClientImpl:     at org.apache.spark.deploy.yarn.Client.main(Client.scala)
> 16/03/10 18:33:10 INFO SparkClientImpl:     at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> 16/03/10 18:33:10 INFO SparkClientImpl:     at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
> 16/03/10 18:33:10 INFO SparkClientImpl:     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> 16/03/10 18:33:10 INFO SparkClientImpl:     at java.lang.reflect.Method.invoke(Method.java:606)
> 16/03/10 18:33:10 INFO SparkClientImpl:     at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:672)
> 16/03/10 18:33:10 INFO SparkClientImpl:     at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:180)
> 16/03/10 18:33:10 INFO SparkClientImpl:     at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:205)
> 16/03/10 18:33:10 INFO SparkClientImpl:     at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:120)
> 16/03/10 18:33:10 INFO SparkClientImpl:     at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
> 16/03/10 18:33:10 INFO SparkClientImpl: 16/03/10 18:33:10 INFO ShutdownHookManager: Shutdown hook called
> 16/03/10 18:33:10 INFO SparkClientImpl: 16/03/10 18:33:10 INFO ShutdownHookManager: Deleting directory /private/var/folders/vt/cjcdhms903x7brn1kbh558s40000gn/T/spark-5b92ce20-b6f8-4832-8b15-5e98bd0e0705
> 16/03/10 18:33:10 WARN SparkClientImpl: Error while waiting for client to connect.
> java.util.concurrent.ExecutionException: java.lang.RuntimeException: Cancel client '5bda93c0-865b-48a8-b368-c2fcc30e81e8'.
> Error: Child process exited before connecting back
>     at io.netty.util.concurrent.AbstractFuture.get(AbstractFuture.java:37) ~[netty-all-4.0.23.Final.jar:4.0.23.Final]
>     at org.apache.hive.spark.client.SparkClientImpl.<init>(SparkClientImpl.java:101) [hive-exec-2.0.0.jar:2.0.0]
>     at org.apache.hive.spark.client.SparkClientFactory.createClient(SparkClientFactory.java:80) [hive-exec-2.0.0.jar:2.0.0]
>     at org.apache.hadoop.hive.ql.exec.spark.RemoteHiveSparkClient.createRemoteClient(RemoteHiveSparkClient.java:98) [hive-exec-2.0.0.jar:2.0.0]
>     at org.apache.hadoop.hive.ql.exec.spark.RemoteHiveSparkClient.<init>(RemoteHiveSparkClient.java:94) [hive-exec-2.0.0.jar:2.0.0]
>     at org.apache.hadoop.hive.ql.exec.spark.HiveSparkClientFactory.createHiveSparkClient(HiveSparkClientFactory.java:63) [hive-exec-2.0.0.jar:2.0.0]
>     at org.apache.hadoop.hive.ql.exec.spark.session.SparkSessionImpl.open(SparkSessionImpl.java:55) [hive-exec-2.0.0.jar:2.0.0]
>     at org.apache.hadoop.hive.ql.exec.spark.session.SparkSessionManagerImpl.getSession(SparkSessionManagerImpl.java:114) [hive-exec-2.0.0.jar:2.0.0]
>     at org.apache.hadoop.hive.ql.exec.spark.SparkUtilities.getSparkSession(SparkUtilities.java:131) [hive-exec-2.0.0.jar:2.0.0]
>     at org.apache.hadoop.hive.ql.optimizer.spark.SetSparkReducerParallelism.process(SetSparkReducerParallelism.java:117) [hive-exec-2.0.0.jar:2.0.0]
>     at org.apache.hadoop.hive.ql.lib.DefaultRuleDispatcher.dispatch(DefaultRuleDispatcher.java:90) [hive-exec-2.0.0.jar:2.0.0]
>     at org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.dispatchAndReturn(DefaultGraphWalker.java:105) [hive-exec-2.0.0.jar:2.0.0]
>     at org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.dispatch(DefaultGraphWalker.java:89) [hive-exec-2.0.0.jar:2.0.0]
>     at org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.walk(DefaultGraphWalker.java:158) [hive-exec-2.0.0.jar:2.0.0]
>     at org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.startWalking(DefaultGraphWalker.java:120) [hive-exec-2.0.0.jar:2.0.0]
>     at org.apache.hadoop.hive.ql.parse.spark.SparkCompiler.runJoinOptimizations(SparkCompiler.java:181) [hive-exec-2.0.0.jar:2.0.0]
>     at org.apache.hadoop.hive.ql.parse.spark.SparkCompiler.optimizeOperatorPlan(SparkCompiler.java:119) [hive-exec-2.0.0.jar:2.0.0]
>     at org.apache.hadoop.hive.ql.parse.TaskCompiler.compile(TaskCompiler.java:102) [hive-exec-2.0.0.jar:2.0.0]
>     at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:10195) [hive-exec-2.0.0.jar:2.0.0]
>     at org.apache.hadoop.hive.ql.parse.CalcitePlanner.analyzeInternal(CalcitePlanner.java:229) [hive-exec-2.0.0.jar:2.0.0]
>     at org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:239) [hive-exec-2.0.0.jar:2.0.0]
>     at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:479) [hive-exec-2.0.0.jar:?]
>     at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:319) [hive-exec-2.0.0.jar:?]
>     at org.apache.hadoop.hive.ql.Driver.compileInternal(Driver.java:1255) [hive-exec-2.0.0.jar:?]
>     at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1301) [hive-exec-2.0.0.jar:?]
>     at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1184) [hive-exec-2.0.0.jar:?]
>     at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1172) [hive-exec-2.0.0.jar:?]
>     at org.apache.hadoop.hive.ql.TestHiveDriver.main(TestHiveDriver.java:41) [test-classes/:?]
> Caused by: java.lang.RuntimeException: Cancel client '5bda93c0-865b-48a8-b368-c2fcc30e81e8'.
> Error: Child process exited before connecting back
>     at org.apache.hive.spark.client.rpc.RpcServer.cancelClient(RpcServer.java:179) ~[hive-exec-2.0.0.jar:2.0.0]
>     at org.apache.hive.spark.client.SparkClientImpl$3.run(SparkClientImpl.java:450) ~[hive-exec-2.0.0.jar:2.0.0]
>     at java.lang.Thread.run(Thread.java:745) ~[?:1.7.0_67]
> 16/03/10 18:33:10 WARN SparkClientImpl: Child process exited with code 1.
> FAILED: SemanticException Failed to get a spark session: org.apache.hadoop.hive.ql.metadata.HiveException: Failed to create spark client.
> 16/03/10 18:33:10 ERROR Driver: FAILED: SemanticException Failed to get a spark session: org.apache.hadoop.hive.ql.metadata.HiveException: Failed to create spark client.
> org.apache.hadoop.hive.ql.parse.SemanticException: Failed to get a spark session: org.apache.hadoop.hive.ql.metadata.HiveException: Failed to create spark client.
>     at org.apache.hadoop.hive.ql.optimizer.spark.SetSparkReducerParallelism.process(SetSparkReducerParallelism.java:121)
>     at org.apache.hadoop.hive.ql.lib.DefaultRuleDispatcher.dispatch(DefaultRuleDispatcher.java:90)
>     at org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.dispatchAndReturn(DefaultGraphWalker.java:105)
>     at org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.dispatch(DefaultGraphWalker.java:89)
>     at org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.walk(DefaultGraphWalker.java:158)
>     at org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.startWalking(DefaultGraphWalker.java:120)
>     at org.apache.hadoop.hive.ql.parse.spark.SparkCompiler.runJoinOptimizations(SparkCompiler.java:181)
>     at org.apache.hadoop.hive.ql.parse.spark.SparkCompiler.optimizeOperatorPlan(SparkCompiler.java:119)
>     at org.apache.hadoop.hive.ql.parse.TaskCompiler.compile(TaskCompiler.java:102)
>     at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:10195)
>     at org.apache.hadoop.hive.ql.parse.CalcitePlanner.analyzeInternal(CalcitePlanner.java:229)
>     at org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:239)
>     at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:479)
>     at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:319)
>     at org.apache.hadoop.hive.ql.Driver.compileInternal(Driver.java:1255)
>     at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1301)
>     at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1184)
>     at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1172)
>     at org.apache.hadoop.hive.ql.TestHiveDriver.main(TestHiveDriver.java:41)