RE: [Spark 1.3.1 on YARN on EMR] Unable to instantiate org.apache.hadoop.hive.metastore.HiveMetaStoreClient

2015-06-20 Thread Andrew Lee
Hi Roberto, I'm not an EMR person, but it looks like option -h is deploying the necessary datanucleus JARs for you. The requirements for HiveContext are the hive-site.xml and the datanucleus JARs. As long as these two are there, and Spark is compiled with -Phive, it should work. spark-shell runs in
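
As a rough illustration of the two prerequisites listed above, here is a small sketch that checks a Spark installation directory for hive-site.xml and the datanucleus JARs. The function name check_hive_prereqs and the conf/ and lib/ subdirectories are assumptions made for illustration; they are not stated in this thread, so adjust them to wherever your installation keeps these files.

```python
import glob
import os


def check_hive_prereqs(spark_home, conf_dir="conf", jar_dir="lib"):
    """Return the list of missing HiveContext prerequisites (empty if none).

    conf_dir and jar_dir are assumed locations relative to spark_home;
    they are hypothetical defaults, not values taken from the thread.
    """
    missing = []

    # Prerequisite 1: hive-site.xml must be present in the config directory.
    if not os.path.isfile(os.path.join(spark_home, conf_dir, "hive-site.xml")):
        missing.append("hive-site.xml")

    # Prerequisite 2: at least one datanucleus JAR must be present.
    pattern = os.path.join(spark_home, jar_dir, "datanucleus-*.jar")
    if not glob.glob(pattern):
        missing.append("datanucleus JARs")

    return missing
```

An empty list only means both kinds of file were found in the assumed locations; it does not confirm that Spark was built with -Phive or that the metastore is reachable.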

Re: [Spark 1.3.1 on YARN on EMR] Unable to instantiate org.apache.hadoop.hive.metastore.HiveMetaStoreClient

2015-06-20 Thread Bozeman, Christopher
We worked it out. There were multiple items (like the location of the remote metastore and DB user auth) that had to be set to make HiveContext happy in yarn-cluster mode. For reference: https://github.com/awslabs/emr-bootstrap-actions/blob/master/spark/examples/using-hivecontext-yarn-cluster.md -Christopher Bozeman On

Re: [Spark 1.3.1 on YARN on EMR] Unable to instantiate org.apache.hadoop.hive.metastore.HiveMetaStoreClient

2015-06-20 Thread Roberto Coluccio
I can confirm that Christopher was very kind in helping me out here. The solution presented in the linked doc worked perfectly. IMO it should be linked in the official Spark documentation. Thanks again, Roberto On 20 Jun 2015, at 19:25, Bozeman, Christopher bozem...@amazon.com wrote: We worked it

[Spark 1.3.1 on YARN on EMR] Unable to instantiate org.apache.hadoop.hive.metastore.HiveMetaStoreClient

2015-06-10 Thread Roberto Coluccio
Hi! I'm struggling with an issue with Spark 1.3.1 running on YARN on an AWS EMR cluster. The cluster is based on AMI 3.7.0 (hence Amazon Linux 2015.03, with Hive 0.13 already installed and configured on the cluster, Hadoop 2.4, etc.). I make use of the AWS emr-bootstrap-action