Hello Michael,

Thanks for your time.

1. "show tables" from the Spark program returns nothing.
2. What entries are you talking about? (I am actually new to Hive as well.)
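[Editor's note: Michael's two checks ("show tables" and "SET <optionName>") can be run together from a spark-shell. A minimal sketch, assuming Spark 1.3-era APIs; the property `hive.metastore.uris` is just one example of an entry that should be coming from hive-site.xml:]

```scala
import org.apache.spark.sql.hive.HiveContext

val hiveContext = new HiveContext(sc)  // sc is provided by spark-shell

// If hive-site.xml is being read, this echoes the metastore URI
// configured in /apache/hive/conf/hive-site.xml. An empty or
// Derby-based value means Spark silently fell back to a local
// embedded metastore that has never seen dw_bid.
hiveContext.sql("SET hive.metastore.uris").collect().foreach(println)

// Lists tables visible to this context (metastore tables plus any
// registered temp tables).
hiveContext.sql("show tables").collect().foreach(println)
```

If "show tables" is empty here as well, the problem is the metastore connection (hive-site.xml not on the driver's classpath), not the query.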
On Thu, Mar 26, 2015 at 8:35 PM, Michael Armbrust <mich...@databricks.com> wrote:
> What does "show tables" return? You can also run "SET <optionName>" to
> make sure that entries from your hive-site are being read correctly.
>
> On Thu, Mar 26, 2015 at 4:02 AM, ÐΞ€ρ@Ҝ (๏̯͡๏) <deepuj...@gmail.com> wrote:
>>
>> I have a table dw_bid that was created in Hive and has nothing to do with
>> Spark. I have data in Avro that I want to join with the dw_bid table; this
>> join needs to be done using Spark SQL. However, for some reason Spark says
>> the dw_bid table does not exist. How do I tell Spark that dw_bid is a table
>> created in Hive, and have it read it?
>>
>> Query that is run from Spark SQL
>> ==============================
>> insert overwrite table sojsuccessevents2_spark
>> select guid, sessionKey, sessionStartDate, sojDataDate, seqNum, eventTimestamp,
>>   siteId, successEventType, sourceType, itemId, shopCartId,
>>   b.transaction_Id as transactionId, offerId, b.bdr_id as userId,
>>   priorPage1SeqNum, priorPage1PageId, exclWMSearchAttemptSeqNum, exclPriorSearchPageId,
>>   exclPriorSearchSeqNum, exclPriorSearchCategory, exclPriorSearchL1, exclPriorSearchL2,
>>   currentImpressionId, sourceImpressionId, exclPriorSearchSqr, exclPriorSearchSort,
>>   isDuplicate, b.bid_date as transactionDate, auctionTypeCode, isBin,
>>   leafCategoryId, itemSiteId, b.qty_bid as bidQuantity,
>>   b.bid_amt_unit_lstg_curncy * b.bid_exchng_rate as bidAmtUsd,
>>   offerQuantity, offerAmountUsd, offerCreateDate, buyerSegment, buyerCountryId,
>>   sellerId, sellerCountryId, sellerStdLevel, cssSellerLevel, a.experimentChannel
>> from sojsuccessevents1 a
>> join dw_bid b
>>   on a.itemId = b.item_id and a.transactionId = b.transaction_id
>> where b.auct_end_dt >= '2015-02-16' AND b.bid_dt >= '2015-02-16'
>>   AND b.bid_type_code IN (1,9) AND b.bdr_id > 0 AND (b.bid_flags & 32) = 0
>>   and lower(a.successEventType) IN ('bid','bin')
>>
>> If I create sojsuccessevents2_spark from the Hive command line and run the
>> above command from the Spark SQL program, then I get the error
>> "sojsuccessevents2_spark table not found".
>>
>> Hence I dropped the table from Hive and ran the create table
>> sojsuccessevents2_spark from Spark SQL before running the above command,
>> and it works until it hits the next road block: "dw_bid table not found".
>>
>> This makes me believe that Spark for some reason is not able to
>> read/understand tables created outside Spark. I did copy
>> /apache/hive/conf/hive-site.xml into the Spark conf directory.
>>
>> Please suggest.
>>
>> Logs
>> ———
>> 15/03/26 03:50:40 INFO HiveMetaStore.audit: ugi=dvasthimal ip=unknown-ip-addr cmd=get_table : db=default tbl=dw_bid
>> 15/03/26 03:50:40 ERROR metadata.Hive: NoSuchObjectException(message:default.dw_bid table not found)
>>     at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.get_table(HiveMetaStore.java:1560)
>>
>> 15/03/26 03:50:40 ERROR yarn.ApplicationMaster: User class threw exception: no such table List(dw_bid); line 1 pos 843
>> org.apache.spark.sql.AnalysisException: no such table List(dw_bid); line 1 pos 843
>>     at org.apache.spark.sql.catalyst.analysis.package$AnalysisErrorAt.failAnalysis(package.scala:42)
>>     at org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveRelations$.getTable(Analyzer.scala:178)
>>     at org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveRelations$$anonfun$apply$6.applyOrElse(Analyzer.scala:187)
>>
>> Regards,
>> Deepak
>>
>> On Thu, Mar 26, 2015 at 4:27 PM, ÐΞ€ρ@Ҝ (๏̯͡๏) <deepuj...@gmail.com> wrote:
>>>
>>> [snip — same query and description as above]
>>>
>>> On Thu, Mar 26, 2015 at 1:26 PM, ÐΞ€ρ@Ҝ (๏̯͡๏) <deepuj...@gmail.com> wrote:
>>>>
>>>> I have a hive table named dw_bid; when I run hive from the command
>>>> prompt and run describe dw_bid, it works.
>>>>
>>>> I want to join an Avro file (table) in HDFS with this hive dw_bid
>>>> table, and I refer to it as dw_bid from the Spark SQL program; however I see
>>>>
>>>> 15/03/26 00:31:01 INFO HiveMetaStore.audit: ugi=dvasthimal ip=unknown-ip-addr cmd=get_table : db=default tbl=dw_bid
>>>> 15/03/26 00:31:01 ERROR metadata.Hive: NoSuchObjectException(message:default.dw_bid table not found)
>>>>     at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.get_table(HiveMetaStore.java:1560)
>>>>     at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>>>>     at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
>>>>
>>>> Code:
>>>>
>>>> val successDetail_S1 = sqlContext.avroFile(input)
>>>> successDetail_S1.registerTempTable("sojsuccessevents1")
>>>> val countS1 = sqlContext.sql("select guid,sessionKey,sessionStartDate,sojDataDate,seqNum,eventTimestamp,siteId,successEventType,sourceType,itemId," +
>>>>   " shopCartId,b.transaction_Id as transactionId,offerId,b.bdr_id as userId,priorPage1SeqNum,priorPage1PageId,exclWMSearchAttemptSeqNum,exclPriorSearchPageId," +
>>>>   " exclPriorSearchSeqNum,exclPriorSearchCategory,exclPriorSearchL1,exclPriorSearchL2,currentImpressionId,sourceImpressionId,exclPriorSearchSqr,exclPriorSearchSort," +
>>>>   " isDuplicate,b.bid_date as transactionDate,auctionTypeCode,isBin,leafCategoryId,itemSiteId,b.qty_bid as bidQuantity," +
>>>>   " b.bid_amt_unit_lstg_curncy * b.bid_exchng_rate as bidAmtUsd,offerQuantity,offerAmountUsd,offerCreateDate,buyerSegment,buyerCountryId,sellerId,sellerCountryId," +
>>>>   " sellerStdLevel,cssSellerLevel,a.experimentChannel" +
>>>>   " from sojsuccessevents1 a join dw_bid b " +
>>>>   " on a.itemId = b.item_id and a.transactionId = b.transaction_id " +
>>>>   " where b.bid_type_code IN (1,9) AND b.bdr_id > 0 AND (b.bid_flags & 32) = 0 and lower(a.successEventType) IN ('bid','bin')")
>>>> println("countS1.first:" + countS1.first)
>>>>
>>>> Any suggestions on how to refer to a hive table from a Spark SQL program?

--
Deepak
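[Editor's note: the quoted code calls `sqlContext.avroFile`, which suggests a plain SQLContext plus the spark-avro implicits. In Spark 1.x a plain SQLContext has no connection to the Hive metastore, so `dw_bid` can never resolve no matter where hive-site.xml sits; a HiveContext is needed. A sketch of the likely fix, assuming Spark 1.3-era APIs and the spark-avro package on the classpath; the `input` path and the trimmed-down query are hypothetical:]

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.sql.hive.HiveContext
import com.databricks.spark.avro._  // spark-avro adds avroFile to SQLContext

val sc = new SparkContext(new SparkConf().setAppName("successEvents"))

// HiveContext (not SQLContext) reads hive-site.xml and can resolve
// tables such as dw_bid that live in the Hive metastore.
val hiveContext = new HiveContext(sc)

// Placeholder path; substitute the real Avro location in HDFS.
val input = "/path/to/success_events.avro"

// Register the Avro data as a temp table on the SAME context, so one
// query can join it against the metastore-backed dw_bid table.
val successDetail_S1 = hiveContext.avroFile(input)
successDetail_S1.registerTempTable("sojsuccessevents1")

val joined = hiveContext.sql(
  "select a.itemId, b.transaction_id " +
  "from sojsuccessevents1 a join dw_bid b " +
  "on a.itemId = b.item_id and a.transactionId = b.transaction_id")
println("first row: " + joined.first)
```

The same reasoning explains the earlier "sojsuccessevents2_spark table not found" error: a table created from the Hive CLI only becomes visible to a Spark program that talks to the same metastore through a HiveContext.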