Spark-shell throws Hive error when SQLContext.parquetFile, v1.3
Hello,

sqlContext.parquetFile(dir)

throws the exception "Unable to instantiate org.apache.hadoop.hive.metastore.HiveMetaStoreClient".

The strange thing is that the second attempt to open the file succeeds:

try {
  sqlContext.parquetFile(dir)
} catch {
  case e: Exception => sqlContext.parquetFile(dir)
}

What should I do to make my script run flawlessly in spark-shell when opening Parquet files? Some dependency is probably missing. Or how should I write the code? This double attempt is awful, and I don't need HiveMetaStoreClient; I just need to open a Parquet file.

Many thanks for any ideas,
Petr
Re: Spark-shell throws Hive error when SQLContext.parquetFile, v1.3
If you don't need to interact with Hive, you can compile Spark without the -Phive flag to eliminate the Hive dependencies. That way, the sqlContext instance in the Spark shell will be of type SQLContext instead of HiveContext.

The Hive metastore error is probably caused by a Hive misconfiguration.

Cheng

On 9/10/15 6:02 PM, Petr Novak wrote:
> sqlContext.parquetFile(dir) throws exception "Unable to instantiate
> org.apache.hadoop.hive.metastore.HiveMetaStoreClient"

- To unsubscribe, e-mail: user-unsubscr...@spark.apache.org
For additional commands, e-mail: user-h...@spark.apache.org
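When rebuilding Spark is not convenient, one possible alternative is to construct a plain SQLContext by hand inside the shell. This is an assumption on top of the advice above, not something proposed in the thread, and it has not been verified against the poster's setup. The sketch below is meant to be pasted into spark-shell for Spark 1.3, where `sc` is predefined; the names `plainSqlContext` and `df` and the path in `dir` are hypothetical.

```scala
// Paste into spark-shell (Spark 1.3), where `sc` is the predefined SparkContext.
import org.apache.spark.sql.SQLContext

// Hypothetical path used only for illustration; substitute your own directory.
val dir = "/path/to/parquet/dir"

// A plain SQLContext, built explicitly so that it is not a HiveContext.
val plainSqlContext = new SQLContext(sc)

// parquetFile is the same call used in the question.
val df = plainSqlContext.parquetFile(dir)
df.printSchema()
```

Whether this sidesteps the metastore error depends on how the shell's own sqlContext was initialised, so treat it as something to try rather than a confirmed fix.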
Re: Spark-shell throws Hive error when SQLContext.parquetFile, v1.3
In addition to Cheng's comment, I ran into a similar problem when hive-site.xml was not on the classpath. A proper stack trace can pinpoint the problem.

In the meantime, you can add it to your environment through HADOOP_CLASSPATH.
(export HADOOP_CONF_DIR=/etc/hive/conf/)

See more at
http://www.cloudera.com/content/cloudera/en/documentation/core/latest/topics/cdh_rn_spark_ki.html
and look for "Spark not automatically picking up hive-site.xml".

On Thursday, September 10, 2015 5:01 AM, Cheng Lian wrote:
> If you don't need to interact with Hive, you may compile Spark without
> using the -Phive flag to eliminate Hive dependencies.
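A quick way to check whether hive-site.xml is visible at all is to ask the class loader for it from inside spark-shell. The snippet below is a minimal sketch that assumes Hive locates the file as a classpath resource; `findOnClasspath` is a hypothetical helper written for this illustration, not part of Spark or Hive.

```scala
import java.net.URL

// Hypothetical helper (not a Spark or Hive API): looks a resource up through
// the current thread's context class loader, falling back to the system
// class loader when no context loader is set.
def findOnClasspath(resource: String): Option[URL] = {
  val loader = Option(Thread.currentThread.getContextClassLoader)
    .getOrElse(ClassLoader.getSystemClassLoader)
  // getResource returns null when nothing matches; Option(...) turns that into None.
  Option(loader.getResource(resource))
}

findOnClasspath("hive-site.xml") match {
  case Some(url) => println(s"hive-site.xml found at $url")
  case None      => println("hive-site.xml is not on the classpath")
}
```

If the lookup reports that the file is missing, that is consistent with the situation described above, and the stack trace should still be checked to confirm the cause.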