Spark connecting to Hive in another EMR cluster

2016-06-24 Thread Dave Maughan
Hi, We're trying to get a Spark (1.6.1) job running on EMR (4.7.1) that's connecting to the Hive metastore in another EMR cluster. A simplification of what we're doing is below: val sparkConf = new SparkConf().setAppName("MyApp") val sc = new SparkContext(sparkConf) val sqlContext = new

Re: Spark SQL - Encoders - case class

2016-06-06 Thread Dave Maughan
Hi, Thanks for the quick replies. I've tried those suggestions but Eclipse is showing: Unable to find encoder for type stored in a Dataset. Primitive types (Int, String, etc) and Product types (case classes) are supported by importing sqlContext.implicits._ Support for serializing other

Spark SQL - Encoders - case class

2016-06-06 Thread Dave Maughan
Hi, I've figured out how to select data from a remote Hive instance and encode the DataFrame -> Dataset using a Java POJO class: TestHive.sql("select foo_bar as `fooBar` from table1").as(Encoders.bean(classOf[Table1])).show() However, I'm struggling to find out how to do the equivalent in

Re: spark 1.6.0 connect to hive metastore

2016-03-09 Thread Dave Maughan
Hi, We're having a similar issue. We have a standalone cluster running 1.5.2 with Hive working fine having dropped hive-site.xml into the conf folder. We've just updated to 1.6.0, using the same configuration. Now when starting a spark-shell we get the following: java.lang.RuntimeException: