You should try Tez with LLAP as well.
Additionally, you will need to compare different configurations.
Finally, an arbitrary comparison is meaningless:
you should use the queries, data, and file formats that your users will actually be using.
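Since the execution engine is just a session-level config, the same query file can be run under each engine for a rough comparison. A minimal sketch, assuming the hive CLI is on the PATH; `bench.sql` is a placeholder for one of your real production queries:

```shell
# Hypothetical benchmark loop: run the same representative query under
# each execution engine and compare wall-clock times.
# bench.sql is a placeholder for a real query against real data/formats.
for engine in mr tez spark; do
  echo "=== engine: $engine ==="
  time hive --hiveconf hive.execution.engine=$engine -f bench.sql
done
```

This only captures end-to-end latency; a real comparison would also repeat runs to smooth out caching effects.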
> On 2. Oct 2017, at 03:06, Stephen Sprague wrote:
so... i made some progress after much copying of jar files around (as
alluded to by Gopal previously on this thread).
following the instructions here:
https://cwiki.apache.org/confluence/display/Hive/Hive+on+Spark%3A+Getting+Started
and doing this as instructed will leave off about a dozen or so
ok.. getting further. seems now i have to deploy hive to all nodes in the
cluster - don't think i had to do that before but not a big deal to do it
now.
for me:
HIVE_HOME=/usr/lib/apache-hive-2.3.0-bin/
SPARK_HOME=/usr/lib/spark-2.2.0-bin-hadoop2.6
on all three nodes now.
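Deploying the same layout to every node can be scripted; a sketch, assuming passwordless ssh/rsync and hypothetical hostnames:

```shell
# Hypothetical: mirror the Hive and Spark installs to each worker node
# so HIVE_HOME and SPARK_HOME resolve identically everywhere.
# Hostnames are placeholders; assumes passwordless ssh for rsync.
for host in node1 node2 node3; do
  rsync -a /usr/lib/apache-hive-2.3.0-bin/ "$host":/usr/lib/apache-hive-2.3.0-bin/
  rsync -a /usr/lib/spark-2.2.0-bin-hadoop2.6/ "$host":/usr/lib/spark-2.2.0-bin-hadoop2.6/
done
```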
i started spark
thanks. I haven't had a chance to dig into this again today but i do
appreciate the pointer. I'll keep you posted.
On Wed, Sep 27, 2017 at 10:14 AM, Sahil Takiar wrote:
You can try increasing the value of hive.spark.client.connect.timeout.
Would also suggest taking a look at the HoS Remote Driver logs. The driver
gets launched in a YARN container (assuming you are running Spark in
yarn-client mode), so you just have to find the logs for that container.
--Sahil
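The timeout can be raised per invocation, and the remote driver's container logs pulled with the yarn CLI once you have the application id. A sketch; the application id and query are placeholders:

```shell
# Raise the Hive-on-Spark client connect timeout for this invocation
# (value is illustrative; default is much lower).
hive --hiveconf hive.spark.client.connect.timeout=30000ms \
     -e 'set hive.execution.engine=spark; select 1;'

# Fetch the remote driver's YARN container logs.
# The application id below is a placeholder; find the real one in the
# ResourceManager UI or via `yarn application -list`.
yarn logs -applicationId application_1506000000000_0001 | less
```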
i _seem_ to be getting closer. Maybe it's just wishful thinking. Here's
where i'm at now.
2017-09-26T21:10:38,892 INFO [stderr-redir-1] client.SparkClientImpl:
17/09/26 21:10:38 INFO rest.RestSubmissionClient: Server responded with
CreateSubmissionResponse:
oh. i missed Gopal's reply. oy... that sounds foreboding. I'll keep you
posted on my progress.
On Tue, Sep 26, 2017 at 4:40 PM, Gopal Vijayaraghavan wrote:
well this is the spark-submit line from above:
2017-09-26T14:04:45,678 INFO [4cb82b6d-9568-4518-8e00-f0cf7ac58cd3
main] client.SparkClientImpl: Running client driver with argv:
/usr/lib/spark-2.2.0-bin-hadoop2.6/bin/spark-submit
and that's pretty clearly v2.2
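A quick way to confirm which Spark a given spark-submit actually belongs to is its version banner:

```shell
# Print the version banner of the spark-submit binary Hive is invoking
# (path taken from the argv line above).
/usr/lib/spark-2.2.0-bin-hadoop2.6/bin/spark-submit --version
```

If an older spark-submit appears earlier on the PATH, `which spark-submit` will reveal it.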
I do have other versions of
Hi,
> org.apache.hadoop.hive.ql.parse.SemanticException: Failed to get a spark
> session: org.apache.hadoop.hive.ql.metadata.HiveException: Failed to create
> spark client.
I get inexplicable errors with Hive-on-Spark unless I do a three-step build.
Build Hive first, use that version to build
Are you sure you are using Spark 2.2.0? Based on the stack trace it looks
like your call to spark-submit is using an older version of Spark (looks
like some early 1.x version). Do you have SPARK_HOME set locally? Do you
have older versions of Spark installed locally?
--Sahil
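A quick sanity check along those lines; the expected path is this thread's install location and is purely illustrative:

```shell
# Illustrative check: warn if SPARK_HOME does not look like the Spark 2.2
# install used elsewhere in this thread.
SPARK_HOME=/usr/lib/spark-2.2.0-bin-hadoop2.6   # in practice: echo "$SPARK_HOME"
case "$SPARK_HOME" in
  *spark-2.2.0*) echo "SPARK_HOME points at Spark 2.2" ;;
  *)             echo "SPARK_HOME points elsewhere: $SPARK_HOME" ;;
esac
```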
On Tue, Sep 26, 2017
thanks Sahil. here it is.
Exception in thread "main" java.lang.NoClassDefFoundError:
org/apache/spark/scheduler/SparkListenerInterface
at java.lang.Class.forName0(Native Method)
at java.lang.Class.forName(Class.java:344)
at
org.apache.spark.deploy.SparkSubmit$.launch(Spark
Hey Stephen,
Can you send the full stack trace for the NoClassDefFoundError? For Hive
2.3.0, we only support Spark 2.0.0. Hive may work with more recent versions
of Spark, but we only test with Spark 2.0.0.
--Sahil
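When several Spark installs coexist, Hive can be pointed at one explicitly; per the Getting Started wiki, either SPARK_HOME or the `spark.home` property selects the install. The path below is illustrative only:

```shell
# Point Hive at one specific Spark install (path is illustrative);
# spark.home / SPARK_HOME tell Hive which spark-submit to launch.
export SPARK_HOME=/usr/lib/spark-2.0.0-bin-hadoop2.6
hive --hiveconf spark.home="$SPARK_HOME" \
     -e 'set hive.execution.engine=spark; select 1;'
```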
On Tue, Sep 26, 2017 at 2:35 PM, Stephen Sprague wrote:
* i've installed hive 2.3 and spark 2.2
* i've read this doc plenty of times ->
https://cwiki.apache.org/confluence/display/Hive/Hive+on+Spark%3A+Getting+Started
* i run this query:
hive --hiveconf hive.root.logger=DEBUG,console -e 'set
hive.execution.engine=spark; select date_key, count(*) f