Finally got it working.

I was trying to access Hive using the JDBC driver, the same way I was
accessing Teradata.

It took me some time to figure out that the default sqlContext created by
Spark supports Hive, and that it uses the hive-site.xml in the Spark conf
folder to access Hive.
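
A quick way to confirm this from the shell (assuming a Spark 1.x build with
Hive support; the SHOW DATABASES call is just an illustration):

spark-shell> // sqlContext is actually a HiveContext when Spark is built with Hive support
spark-shell> sqlContext.isInstanceOf[org.apache.spark.sql.hive.HiveContext]
spark-shell> sqlContext.sql("show databases").show()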

I had to switch to my database in Hive:

spark-shell> sqlContext.sql("use terradata_live")

Then I registered my Teradata database tables as temporary tables:

spark-shell> val itemDF = sqlContext.load("jdbc", Map(
  "url" -> "jdbc:teradata://192.168.145.58/DBS_PORT=1025,DATABASE=BENCHQADS,LOB_SUPPORT=OFF,USER=BENCHQADS,PASSWORD=****",
  "dbtable" -> "item"))

spark-shell> itemDF.registerTempTable("itemterra")

spark-shell> sqlContext.sql("select store_sales.* from store_sales join itemterra on (store_sales.id = itemterra.sales_id)")
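
To sanity-check that the cross-source join actually returns rows, something
like this works (the "joined" val is just for illustration):

spark-shell> val joined = sqlContext.sql("select store_sales.* from store_sales join itemterra on (store_sales.id = itemterra.sales_id)")
spark-shell> joined.show(5)  // print the first few joined rows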

But there seems to be some issue when I try to do the same using the Hive
JDBC driver. Another difference I found was in the printSchema() output:
for a DataFrame created through the Hive JDBC driver, the column names are
prefixed with the table name, but the same does not happen for the Teradata
tables.
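
If that prefix is what breaks the join, one possible workaround (an untested
sketch; "hiveJdbcDF" here stands for the DataFrame loaded through the Hive
JDBC driver) would be to strip the table-name prefix from the column names
before registering the temp table:

spark-shell> // keep only the part after the dot, e.g. "store_sales.id" -> "id"
spark-shell> val cleaned = hiveJdbcDF.toDF(hiveJdbcDF.columns.map(_.split("\\.").last): _*)
spark-shell> cleaned.registerTempTable("store_sales_clean")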



