Thanks for the confirmation - I was just going to send a pointer to the documentation that talks about hive-site.xml. http://people.apache.org/~pwendell/spark-releases/latest/sql-programming-guide.html#hive-tables
Thanks,
Shivaram

On Mon, Jun 8, 2015 at 1:57 PM, Eskilson, Aleksander <alek.eskil...@cerner.com> wrote:

> Resolved, my hive-site.xml wasn't in the conf folder. I can load tables
> into DataFrames as expected.
>
> Thanks,
> Alek
>
> From: Eskilson, Aleksander <alek.eskil...@cerner.com>
> Date: Monday, June 8, 2015 at 3:38 PM
> To: "dev@spark.apache.org" <dev@spark.apache.org>
> Subject: SparkR Reading Tables from Hive
>
> Hi there,
>
> I'm testing out the new SparkR-Hive interop right now. I'm noticing an
> apparent disconnect between the Hive store my data is loaded into and the
> store that sparkRHive.init() connects to. For example, in beeline:
>
> 0: jdbc:hive2://quickstart.cloudera:10000> show databases;
> +---------------+--+
> | database_name |
> +---------------+--+
> | default       |
> +---------------+--+
> 0: jdbc:hive2://quickstart.cloudera:10000> show tables;
> +-----------+--+
> | tab_name  |
> +-----------+--+
> | my_table  |
> +-----------+--+
>
> But in SparkR:
>
> > hqlContext <- sparkRHive.init(sc)
> > showDF(sql(hqlContext, "show databases"))
> +---------+
> | result  |
> +---------+
> | default |
> +---------+
>
> > showDF(tables(hqlContext, "default"))
> +-----------+-------------+
> | tableName | isTemporary |
> +-----------+-------------+
> +-----------+-------------+
>
> > showDF(sql(hqlContext, "show tables"))
> +-----------+-------------+
> | tableName | isTemporary |
> +-----------+-------------+
> +-----------+-------------+
>
> The data in my_table was landed into Hive from a CSV via kite-dataset.
> The installation of Spark I'm working with was built separately and
> operates standalone. Could it be that sparkRHive.init() is getting the
> wrong address of the Hive metastore? How could I peer into the context and
> see what the address is set to, and if it's wrong, reset it?
> Ultimately, I'd like to be able to read my_table from Hive into a SparkR
> DataFrame, which ought to be possible with
>
> result <- sql(hqlContext, "SELECT * FROM my_table")
>
> But this fails with:
>
> org.apache.spark.sql.AnalysisException: no such table my_table; line 1 pos 14
>
> which is expected, I suppose, since we don't see the table in the listing
> above.
>
> Any thoughts?
>
> Thanks,
> Alek Eskilson
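
For readers finding this thread in the archives, the resolution above (hive-site.xml missing from Spark's conf directory) can be sketched roughly as follows. This is a Spark 1.4-era SparkR sketch, not a runnable standalone script: it assumes a live Spark standalone deployment with a Hive metastore, and the hive-site.xml path is an assumption to adjust for your installation.

```r
# Assumption: the Hive client config lives in /etc/hive/conf; adjust as needed.
# sparkRHive.init() reads hive-site.xml from $SPARK_HOME/conf; without it,
# Spark falls back to a local embedded metastore, which is why the tables
# visible in beeline did not appear in SparkR.
file.copy("/etc/hive/conf/hive-site.xml",
          file.path(Sys.getenv("SPARK_HOME"), "conf"))

# Start a fresh SparkR session so the Hive context picks up the metastore.
library(SparkR)
sc <- sparkR.init()
hqlContext <- sparkRHive.init(sc)

# my_table should now be listed, and the original query should succeed.
showDF(sql(hqlContext, "show tables"))
result <- sql(hqlContext, "SELECT * FROM my_table")
head(result)
```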