Are you sure you are using Spark 2.2.0? Based on the stack trace it looks like your call to spark-submit is using an older version of Spark (looks like some early 1.x version). Do you have SPARK_HOME set locally? Do you have older versions of Spark installed locally?
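A quick way to check which Spark installation spark-submit actually resolves to (the search paths below are only examples; adjust them to your hosts):

```shell
# Show which spark-submit is first on the PATH, if any
command -v spark-submit || echo "spark-submit not on PATH"

# Show where SPARK_HOME points; "<unset>" means the variable is not exported
echo "SPARK_HOME=${SPARK_HOME:-<unset>}"

# Ask the resolved binary which Spark version it really is -- Hive launches
# whatever this reports, regardless of which tarball you unpacked
spark-submit --version 2>&1 | grep -i version || true

# Look for stray older installs (search roots are just examples)
ls -d /usr/lib/spark* /opt/spark* 2>/dev/null || true
```

If the version printed here is a 1.x release, spark-submit is being picked up from an older install earlier on the PATH (or via a stale SPARK_HOME), which would match the 1.x-era stack trace described above.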
--Sahil

On Tue, Sep 26, 2017 at 3:33 PM, Stephen Sprague <sprag...@gmail.com> wrote:

> thanks Sahil. here it is.
>
> Exception in thread "main" java.lang.NoClassDefFoundError: org/apache/spark/scheduler/SparkListenerInterface
>         at java.lang.Class.forName0(Native Method)
>         at java.lang.Class.forName(Class.java:344)
>         at org.apache.spark.deploy.SparkSubmit$.launch(SparkSubmit.scala:318)
>         at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:75)
>         at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
> Caused by: java.lang.ClassNotFoundException: org.apache.spark.scheduler.SparkListenerInterface
>         at java.net.URLClassLoader$1.run(URLClassLoader.java:372)
>         at java.net.URLClassLoader$1.run(URLClassLoader.java:361)
>         at java.security.AccessController.doPrivileged(Native Method)
>         at java.net.URLClassLoader.findClass(URLClassLoader.java:360)
>         at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
>         at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
>         ... 5 more
>
>         at org.apache.hive.spark.client.rpc.RpcServer.cancelClient(RpcServer.java:212) ~[hive-exec-2.3.0.jar:2.3.0]
>         at org.apache.hive.spark.client.SparkClientImpl$3.run(SparkClientImpl.java:500) ~[hive-exec-2.3.0.jar:2.3.0]
>         at java.lang.Thread.run(Thread.java:745) ~[?:1.8.0_25]
> FAILED: SemanticException Failed to get a spark session: org.apache.hadoop.hive.ql.metadata.HiveException: Failed to create spark client.
> 2017-09-26T14:04:46,470 ERROR [4cb82b6d-9568-4518-8e00-f0cf7ac58cd3 main] ql.Driver: FAILED: SemanticException Failed to get a spark session: org.apache.hadoop.hive.ql.metadata.HiveException: Failed to create spark client.
> org.apache.hadoop.hive.ql.parse.SemanticException: Failed to get a spark session: org.apache.hadoop.hive.ql.metadata.HiveException: Failed to create spark client.
>         at org.apache.hadoop.hive.ql.optimizer.spark.SetSparkReducerParallelism.getSparkMemoryAndCores(SetSparkReducerParallelism.java:240)
>         at org.apache.hadoop.hive.ql.optimizer.spark.SetSparkReducerParallelism.process(SetSparkReducerParallelism.java:173)
>         at org.apache.hadoop.hive.ql.lib.DefaultRuleDispatcher.dispatch(DefaultRuleDispatcher.java:90)
>         at org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.dispatchAndReturn(DefaultGraphWalker.java:105)
>         at org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.dispatch(DefaultGraphWalker.java:89)
>         at org.apache.hadoop.hive.ql.lib.PreOrderWalker.walk(PreOrderWalker.java:56)
>         at org.apache.hadoop.hive.ql.lib.PreOrderWalker.walk(PreOrderWalker.java:61)
>         at org.apache.hadoop.hive.ql.lib.PreOrderWalker.walk(PreOrderWalker.java:61)
>         at org.apache.hadoop.hive.ql.lib.PreOrderWalker.walk(PreOrderWalker.java:61)
>         at org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.startWalking(DefaultGraphWalker.java:120)
>         at org.apache.hadoop.hive.ql.parse.spark.SparkCompiler.runSetReducerParallelism(SparkCompiler.java:288)
>         at org.apache.hadoop.hive.ql.parse.spark.SparkCompiler.optimizeOperatorPlan(SparkCompiler.java:122)
>         at org.apache.hadoop.hive.ql.parse.TaskCompiler.compile(TaskCompiler.java:140)
>         at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:11253)
>         at org.apache.hadoop.hive.ql.parse.CalcitePlanner.analyzeInternal(CalcitePlanner.java:286)
>         at org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:258)
>         at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:511)
>         at org.apache.hadoop.hive.ql.Driver.compileInternal(Driver.java:1316)
>         at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1456)
>         at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1236)
>         at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1226)
>         at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:233)
>         at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:184)
>         at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:403)
>         at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:336)
>         at org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:787)
>         at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:759)
>         at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:686)
>         at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>         at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>         at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>         at java.lang.reflect.Method.invoke(Method.java:483)
>         at org.apache.hadoop.util.RunJar.run(RunJar.java:221)
>         at org.apache.hadoop.util.RunJar.main(RunJar.java:136)
>
> It bugs me that that class is in spark-core_2.11-2.2.0.jar yet so seemingly out of reach. :(
>
> On Tue, Sep 26, 2017 at 2:44 PM, Sahil Takiar <takiar.sa...@gmail.com> wrote:
>
>> Hey Stephen,
>>
>> Can you send the full stack trace for the NoClassDefFoundError? For Hive 2.3.0, we only support Spark 2.0.0. Hive may work with more recent versions of Spark, but we only test with Spark 2.0.0.
>>
>> --Sahil
>>
>> On Tue, Sep 26, 2017 at 2:35 PM, Stephen Sprague <sprag...@gmail.com> wrote:
>>
>>> * i've installed hive 2.3 and spark 2.2
>>>
>>> * i've read this doc plenty of times -> https://cwiki.apache.org/confluence/display/Hive/Hive+on+Spark%3A+Getting+Started
>>>
>>> * i run this query:
>>>
>>>   hive --hiveconf hive.root.logger=DEBUG,console -e 'set hive.execution.engine=spark; select date_key, count(*) from fe_inventory.merged_properties_hist group by 1 order by 1;'
>>>
>>> * i get this error:
>>>
>>>   Exception in thread "main" java.lang.NoClassDefFoundError: org/apache/spark/scheduler/SparkListenerInterface
>>>
>>> * this class is in: /usr/lib/spark-2.2.0-bin-hadoop2.6/jars/spark-core_2.11-2.2.0.jar
>>>
>>> * i have copied all the spark jars to hdfs://dwrdevnn1/spark-2.2-jars
>>>
>>> * i have updated hive-site.xml to set spark.yarn.jars to it.
>>>
>>> * i see this in the console:
>>>
>>>   2017-09-26T13:34:15,505 INFO [334aa7db-ad0c-48c3-9ada-467aaf05cff3 main] spark.HiveSparkClientFactory: load spark property from hive configuration (spark.yarn.jars -> hdfs://dwrdevnn1.sv2.trulia.com:8020/spark-2.2-jars/*).
>>>
>>> * i see this on the console:
>>>
>>>   2017-09-26T14:04:45,678 INFO [4cb82b6d-9568-4518-8e00-f0cf7ac58cd3 main] client.SparkClientImpl: Running client driver with argv: /usr/lib/spark-2.2.0-bin-hadoop2.6/bin/spark-submit --properties-file /tmp/spark-submit.6105784757200912217.properties --class org.apache.hive.spark.client.RemoteDriver /usr/lib/apache-hive-2.3.0-bin/lib/hive-exec-2.3.0.jar --remote-host dwrdevnn1.sv2.trulia.com --remote-port 53393 --conf hive.spark.client.connect.timeout=1000 --conf hive.spark.client.server.connect.timeout=90000 --conf hive.spark.client.channel.log.level=null --conf hive.spark.client.rpc.max.size=52428800 --conf hive.spark.client.rpc.threads=8 --conf hive.spark.client.secret.bits=256 --conf hive.spark.client.rpc.server.address=null
>>>
>>> * i even print out CLASSPATH in this script: /usr/lib/spark-2.2.0-bin-hadoop2.6/bin/spark-submit
>>>   and /usr/lib/spark-2.2.0-bin-hadoop2.6/jars/spark-core_2.11-2.2.0.jar is in it.
>>>
>>> so i ask... what am i missing?
>>>
>>> thanks,
>>> Stephen
>>
>> --
>> Sahil Takiar
>> Software Engineer at Cloudera
>> takiar.sa...@gmail.com | (510) 673-0309

--
Sahil Takiar
Software Engineer at Cloudera
takiar.sa...@gmail.com | (510) 673-0309
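For reference, the hive-site.xml entry Stephen describes, with the URI taken verbatim from the HiveSparkClientFactory log line quoted above, would look something like this (host, port, and path are specific to his cluster):

```xml
<!-- hive-site.xml: point Hive on Spark at the Spark jars uploaded to HDFS.
     The value below is Stephen's site-specific location, shown only for orientation. -->
<property>
  <name>spark.yarn.jars</name>
  <value>hdfs://dwrdevnn1.sv2.trulia.com:8020/spark-2.2-jars/*</value>
</property>
```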