RE: Spark SQL using Hive metastore
Hi Robert,

Spark SQL currently supports only Hive 0.12.0 (requires re-compiling the package) and Hive 0.13.1 (the default); I am not sure whether it supports a Hive 0.14 metastore service as the backend.

Another approach you can try is to configure $SPARK_HOME/conf/hive-site.xml to access the remote metastore database directly ("javax.jdo.option.ConnectionURL" and "javax.jdo.option.ConnectionDriverName" are required). Then you can start Spark SQL like:

bin/spark-sql --jars lib/mysql-connector-xxx.jar

As for the "SnappyError", it seems the Snappy native library is not configured correctly for Spark. Can you check the configuration file $SPARK_HOME/conf/spark-xxx.conf?

Cheng Hao

From: Grandl Robert [mailto:rgra...@yahoo.com.INVALID]
Sent: Thursday, March 12, 2015 5:07 AM
To: user@spark.apache.org
Subject: Spark SQL using Hive metastore
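To illustrate the direct-connection approach above, here is a minimal hive-site.xml sketch. The host, port, database name, and credentials below are placeholders for illustration, not values from this thread; only the two property names are the ones the reply calls out as required.

```xml
<configuration>
  <!-- JDBC URL of the (MySQL-backed) metastore database.
       Host, port, and database name are placeholders - adjust to your setup. -->
  <property>
    <name>javax.jdo.option.ConnectionURL</name>
    <value>jdbc:mysql://metastore-host:3306/hive_metastore</value>
  </property>
  <!-- JDBC driver class for the metastore database. -->
  <property>
    <name>javax.jdo.option.ConnectionDriverName</name>
    <value>com.mysql.jdbc.Driver</value>
  </property>
  <!-- Credentials for the metastore database (placeholders). -->
  <property>
    <name>javax.jdo.option.ConnectionUserName</name>
    <value>hive</value>
  </property>
  <property>
    <name>javax.jdo.option.ConnectionPassword</name>
    <value>hive_password</value>
  </property>
</configuration>
```

With this in place, the JDBC driver still has to be on the classpath when launching the CLI, hence the bin/spark-sql --jars lib/mysql-connector-xxx.jar invocation above.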
RE: Spark SQL using Hive metastore
You need to include the Hadoop native library path in your spark-shell/spark-sql invocation, assuming your Hadoop native libraries include the native Snappy library:

spark-sql --driver-library-path point_to_your_hadoop_native_library

Inside spark-sql, you can then use any command just as you would in the Hive CLI.

Yong

Date: Wed, 11 Mar 2015 21:06:54 +
From: rgra...@yahoo.com.INVALID
To: user@spark.apache.org
Subject: Spark SQL using Hive metastore
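As a concrete sketch of the invocation above: the directory shown is a common location for Hadoop native libraries on many installs, but it is an assumption here; substitute the path where your libhadoop.so / libsnappy.so actually live.

```shell
# Point Spark's driver at the directory containing the Hadoop/Snappy
# native libraries. /usr/lib/hadoop/lib/native is a typical location,
# not a value confirmed by this thread - adjust to your installation.
./bin/spark-sql --driver-library-path /usr/lib/hadoop/lib/native
```

If native Snappy still fails to load (the error reports os.arch=aarch64, for which prebuilt Snappy binaries may be missing), one hedged workaround is to switch Spark's internal compression codec away from Snappy, e.g. setting spark.io.compression.codec=lz4 in conf/spark-defaults.conf; check that the Spark version in use supports that codec.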
Spark SQL using Hive metastore
Hi guys,

I am a newbie in running Spark SQL / Spark. My goal is to run some TPC-H queries atop Spark SQL using the Hive metastore. It looks like the Spark 1.2.1 release has Spark SQL / Hive support. However, I am not able to fully connect all the dots. I did the following:

1. Copied hive-site.xml from Hive to spark/conf
2. Copied the MySQL connector to spark/lib
3. Started the Hive metastore service: hive --service metastore
4. Started ./bin/spark-sql
5. Typed: spark-sql> show tables;

However, the following error was thrown:

Job 0 failed: collect at SparkPlan.scala:84, took 0.241788 s
15/03/11 15:02:35 ERROR SparkSQLDriver: Failed in [show tables]
org.apache.spark.SparkException: Job aborted due to stage failure: Task serialization failed: org.xerial.snappy.SnappyError: [FAILED_TO_LOAD_NATIVE_LIBRARY] no native library is found for os.name=Linux and os.arch=aarch64

Do you know what I am doing wrong? I mention that I have hive-0.14 instead of hive-0.13.

And another question: what is the right command to run SQL queries with Spark SQL using the Hive metastore?

Thanks,
Robert