Spark-shell throws Hive error when SQLContext.parquetFile, v1.3

2015-09-10 Thread Petr Novak
Hello,

sqlContext.parquetFile(dir)

throws the exception "Unable to instantiate
org.apache.hadoop.hive.metastore.HiveMetaStoreClient".

The strange thing is that on the second attempt to open the file it is
successful:

try {
  sqlContext.parquetFile(dir)
} catch {
  case e: Exception => sqlContext.parquetFile(dir)
}

What should I do to make my script run flawlessly in spark-shell when
opening Parquet files? It is probably missing some dependency. Or how
should I write the code? This double attempt is awful, and I don't need
HiveMetaStoreClient; I just need to open a Parquet file.

Many thanks for any idea,
Petr


Re: Spark-shell throws Hive error when SQLContext.parquetFile, v1.3

2015-09-10 Thread Cheng Lian
If you don't need to interact with Hive, you can compile Spark without
the -Phive flag to eliminate the Hive dependencies. The sqlContext
instance in the Spark shell will then be of type SQLContext instead of
HiveContext.
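
If rebuilding Spark is not an option, one workaround sketch (my
illustration, not something from this thread) is to construct a plain
SQLContext directly from the SparkContext, so Parquet reads never touch
the Hive metastore. This assumes the Spark 1.3 shell, where sc is
predefined and parquetFile is still the Parquet-reading API:

import org.apache.spark.sql.SQLContext

// Build a plain SQLContext so reads bypass the Hive metastore entirely.
// (Assumes Spark 1.3; sc is the SparkContext predefined by spark-shell.)
val plainSqlContext = new SQLContext(sc)

// "/path/to/parquet" is a hypothetical placeholder for your directory.
val df = plainSqlContext.parquetFile("/path/to/parquet")
df.printSchema()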


The Hive metastore error itself is probably caused by a Hive
misconfiguration.


Cheng

On 9/10/15 6:02 PM, Petr Novak wrote:

> Hello,
>
> sqlContext.parquetFile(dir)
>
> throws the exception "Unable to instantiate
> org.apache.hadoop.hive.metastore.HiveMetaStoreClient".
>
> The strange thing is that on the second attempt to open the file it is
> successful:
>
> try {
>   sqlContext.parquetFile(dir)
> } catch {
>   case e: Exception => sqlContext.parquetFile(dir)
> }
>
> What should I do to make my script run flawlessly in spark-shell when
> opening Parquet files? It is probably missing some dependency. Or how
> should I write the code? This double attempt is awful, and I don't need
> HiveMetaStoreClient; I just need to open a Parquet file.
>
> Many thanks for any idea,
> Petr








Re: Spark-shell throws Hive error when SQLContext.parquetFile, v1.3

2015-09-10 Thread Mohammad Islam
In addition to Cheng's comment --
I ran into a similar problem when hive-site.xml was not on the classpath. A
full stack trace can pinpoint the problem.

In the meantime, you can add it to the classpath through your environment
(e.g. export HADOOP_CONF_DIR=/etc/hive/conf/).
See more at
http://www.cloudera.com/content/cloudera/en/documentation/core/latest/topics/cdh_rn_spark_ki.html
and look for "Spark not automatically picking up hive-site.xml".
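
To verify whether the shell actually sees hive-site.xml, here is a quick
sketch you can paste into spark-shell (plain JVM resource lookup, nothing
Spark-specific):

// Returns a URL if hive-site.xml is on the driver classpath, null otherwise.
val hiveSite = getClass.getClassLoader.getResource("hive-site.xml")
println(if (hiveSite != null) s"Found: $hiveSite"
        else "hive-site.xml is not on the classpath")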


On Thursday, September 10, 2015 5:01 AM, Cheng Lian wrote:



If you don't need to interact with Hive, you can compile Spark without
the -Phive flag to eliminate the Hive dependencies. The sqlContext
instance in the Spark shell will then be of type SQLContext instead of
HiveContext.

The Hive metastore error itself is probably caused by a Hive
misconfiguration.

Cheng


On 9/10/15 6:02 PM, Petr Novak wrote:
> Hello,
>
> sqlContext.parquetFile(dir)
>
> throws the exception "Unable to instantiate
> org.apache.hadoop.hive.metastore.HiveMetaStoreClient".
>
> The strange thing is that on the second attempt to open the file it is
> successful:
>
> try {
>   sqlContext.parquetFile(dir)
> } catch {
>   case e: Exception => sqlContext.parquetFile(dir)
> }
>
> What should I do to make my script run flawlessly in spark-shell when
> opening Parquet files? It is probably missing some dependency. Or how
> should I write the code? This double attempt is awful, and I don't need
> HiveMetaStoreClient; I just need to open a Parquet file.
>
> Many thanks for any idea,
> Petr

