SchemaRDD has a method insertInto(table). When the table is partitioned, it would be more sensible and convenient to extend it to accept a list of partition keys and values.
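For illustration, a minimal sketch of what such an extension might look like, next to the HiveQL route that a HiveContext in Spark 1.1 already supports. The Event row type, the "events" table and its schema, and the proposed insertInto(table, partitionSpec) signature are assumptions for this example, not an existing API:

import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.sql.hive.HiveContext

// Hypothetical row type for this example.
case class Event(id: Int, payload: String)

object PartitionedInsertSketch {
  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(
      new SparkConf().setMaster("local").setAppName("partitioned-insert"))
    val hive = new HiveContext(sc)
    import hive.createSchemaRDD  // implicit RDD[Event] => SchemaRDD

    // "events" is a hypothetical table partitioned by a date string.
    hive.sql("CREATE TABLE IF NOT EXISTS events (id INT, payload STRING) " +
      "PARTITIONED BY (dt STRING)")

    val batch = sc.parallelize(Seq(Event(1, "a"), Event(2, "b")))

    // The proposed convenience (does NOT exist in Spark 1.1):
    //   batch.insertInto("events", Map("dt" -> "2014-09-11"))

    // What works today: register the batch and route it through HiveQL
    // with a static partition spec.
    batch.registerTempTable("staging")
    hive.sql("INSERT INTO TABLE events PARTITION (dt = '2014-09-11') " +
      "SELECT id, payload FROM staging")
  }
}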
From: Denny Lee <denny.g....@gmail.com>
Date: Thursday, September 11, 2014 at 6:39 PM
To: Du Li <l...@yahoo-inc.com>
Cc: u...@spark.incubator.apache.org, alexandria1101 <alexandria.shea...@gmail.com>
Subject: Re: Table not found: using jdbc console to query sparksql hive thriftserver

It sort of depends on the definition of "efficiently." From a workflow perspective I would agree, but from an I/O perspective, wouldn't there still be the same multiple passes, since the Hive context needs to push the data into HDFS anyway? That said, if you're pushing the data into HDFS and then creating Hive tables via LOAD (vs. referencing the files in place, as with external tables), I would agree with you.

And thanks for correcting me: registerTempTable is in the SQLContext.

On September 10, 2014 at 13:47:24, Du Li (l...@yahoo-inc.com) wrote:

Hi Denny,

There is a related question, by the way. I have a program that reads in a stream of RDDs, each of which is to be loaded into a Hive table as one partition. Currently I do this by first writing the RDDs to HDFS and then loading them into Hive, which requires multiple passes of HDFS I/O and serialization/deserialization. I wonder whether it is possible to do this more efficiently with Spark 1.1 Streaming + SQL, e.g., by registering the RDDs in a Hive context so that the data is loaded directly into the Hive table in cache and is meanwhile visible to JDBC/ODBC clients. In the Spark source code, the method registerTempTable you mentioned works on SQLContext instead of HiveContext.

Thanks,
Du

On 9/10/14, 1:21 PM, "Denny Lee" <denny.g....@gmail.com> wrote:

>Actually, when a table is registered, it is only available within the sc
>context you are running it in. For Spark 1.1, the method was renamed to
>registerTempTable to better reflect that.
>
>The Thrift server runs as a separate process, which means it cannot see
>any of the tables registered within the sc context. You would need to
>save the sc table into Hive, and then the Thrift process would be able
>to see it.
>
>HTH!
>
>> On Sep 10, 2014, at 13:08, alexandria1101
>> <alexandria.shea...@gmail.com> wrote:
>>
>> I used the hiveContext to register the tables and the tables are still
>> not being found by the thrift server. Do I have to pass the hiveContext
>> to JDBC somehow?
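To make the visibility point in this thread concrete, here is a minimal sketch, assuming Spark 1.1 and a Thrift server configured against the same Hive metastore as the application; the Person type and the table names are illustrative only:

import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.sql.hive.HiveContext

// Hypothetical row type for this example.
case class Person(name: String, age: Int)

object ThriftVisibilitySketch {
  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(
      new SparkConf().setMaster("local").setAppName("thrift-visibility"))
    val hive = new HiveContext(sc)
    import hive.createSchemaRDD  // implicit RDD[Person] => SchemaRDD

    val people = sc.parallelize(Seq(Person("alice", 30), Person("bob", 25)))

    // Temp tables live only in this driver's context: a separately running
    // Thrift server process will NOT see "people_tmp".
    people.registerTempTable("people_tmp")

    // Persisting through the metastore makes the data visible to JDBC/ODBC
    // clients of the Thrift server, at the cost of a write to warehouse storage.
    hive.sql("CREATE TABLE people_saved AS SELECT name, age FROM people_tmp")
  }
}

A beeline or JDBC client connected to the Thrift server can then query people_saved with an ordinary SELECT, while people_tmp remains invisible to it.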