Can you reuse the HiveContext instead of making new ones with HiveContext(sc)?
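A minimal sketch of that pattern, assuming a Zeppelin pyspark paragraph where sc is the injected SparkContext (the NameError guard is just one way to create the context once and reuse it across paragraphs):

from pyspark.sql import HiveContext

# Create the HiveContext only once per interpreter session and
# reuse it, instead of calling HiveContext(sc) in every paragraph.
try:
    sqlCtx  # already defined by an earlier paragraph?
except NameError:
    sqlCtx = HiveContext(sc)

sqlCtx.sql('show databases').show()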


________________________________
From: Ruslan Dautkhanov <dautkha...@gmail.com>
Sent: Sunday, November 27, 2016 8:07:41 AM
To: users
Subject: Re: "You must build Spark with Hive. Export 'SPARK_HIVE=true'"

Also, once HiveContext(sc) has been assigned to a variable at least twice,
the only way to get rid of this problem is to restart Zeppelin :-(


--
Ruslan Dautkhanov

On Sun, Nov 27, 2016 at 9:00 AM, Ruslan Dautkhanov 
<dautkha...@gmail.com> wrote:
I found the pattern that triggers this.

When I run
sqlCtx = HiveContext(sc)

it works as expected.

Running it a second time, or any time after that, gives the exception stack I
reported in this email chain:

> sqlCtx = HiveContext(sc)
> sqlCtx.sql('select * from marketview.spend_dim')

You must build Spark with Hive. Export 'SPARK_HIVE=true' and run build/sbt assembly
Traceback (most recent call last):
  File "/tmp/zeppelin_pyspark-6752406810533348793.py", line 267, in <module>
    raise Exception(traceback.format_exc())
Exception: Traceback (most recent call last):
  File "/tmp/zeppelin_pyspark-6752406810533348793.py", line 265, in <module>
    exec(code)
  File "<stdin>", line 2, in <module>
  File "/opt/cloudera/parcels/CDH/lib/spark/python/pyspark/sql/context.py", line 580, in sql
    return DataFrame(self._ssql_ctx.sql(sqlQuery), self)
  File "/opt/cloudera/parcels/CDH/lib/spark/python/pyspark/sql/context.py", line 683, in _ssql_ctx
    self._scala_HiveContext = self._get_hive_ctx()
  File "/opt/cloudera/parcels/CDH/lib/spark/python/pyspark/sql/context.py", line 692, in _get_hive_ctx
    return self._jvm.HiveContext(self._jsc.sc())
  File "/opt/cloudera/parcels/CDH/lib/spark/python/lib/py4j-0.9-src.zip/py4j/java_gateway.py", line 1064, in __call__
    answer, self._gateway_client, None, self._fqn)
  File "/opt/cloudera/parcels/CDH/lib/spark/python/pyspark/sql/utils.py", line 45, in deco
    return f(*a, **kw)


The key piece to reproducing this issue: assign HiveContext(sc) to a variable
more than once, and use that variable between the assignments.
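Put together, a minimal repro (marketview.spend_dim is just the table from my
example; any Hive table should do):

from pyspark.sql import HiveContext

sqlCtx = HiveContext(sc)                          # first assignment: works
sqlCtx.sql('select * from marketview.spend_dim')  # use it between assignments

sqlCtx = HiveContext(sc)                          # second assignment
sqlCtx.sql('select * from marketview.spend_dim')  # fails with the stack above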


--
Ruslan Dautkhanov

On Mon, Nov 21, 2016 at 2:52 PM, Ruslan Dautkhanov 
<dautkha...@gmail.com> wrote:
Getting
You must build Spark with Hive. Export 'SPARK_HIVE=true'
See full stack [2] below.

I'm using the Spark 1.6 that comes with CDH 5.8.3,
so it's definitely compiled with Hive.
We use Jupyter notebooks without problems in the same environment.

Using Zeppelin 0.6.2, downloaded as zeppelin-0.6.2-bin-all.tgz from apache.org

Is Zeppelin compiled with Hive too? I guess so.
Not sure what else is missing.

Tried to play with ZEPPELIN_SPARK_USEHIVECONTEXT, but it does not make a
difference.
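One quick check from a pyspark paragraph to see what Zeppelin actually injects
(sqlContext is the variable the pyspark interpreter provides; if
ZEPPELIN_SPARK_USEHIVECONTEXT took effect, it should already be a HiveContext):

from pyspark.sql import HiveContext

print(type(sqlContext))
print(isinstance(sqlContext, HiveContext))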


[1]
$ cat zeppelin-env.sh
export JAVA_HOME=/usr/java/java7
export SPARK_HOME=/opt/cloudera/parcels/CDH/lib/spark
export SPARK_SUBMIT_OPTIONS="--principal xxxx --keytab yyy --conf spark.driver.memory=7g --conf spark.executor.cores=2 --conf spark.executor.memory=8g"
export SPARK_APP_NAME="Zeppelin notebook"
export HADOOP_CONF_DIR=/etc/hadoop/conf
export HIVE_CONF_DIR=/etc/hive/conf
export HIVE_HOME=/opt/cloudera/parcels/CDH/lib/hive
export PYSPARK_PYTHON="/opt/cloudera/parcels/Anaconda/bin/python2"
export PYTHONPATH="/opt/cloudera/parcels/CDH/lib/spark/python:/opt/cloudera/parcels/CDH/lib/spark/python/lib/py4j-0.9-src.zip"
export MASTER="yarn-client"
export ZEPPELIN_SPARK_USEHIVECONTEXT=true




[2]

You must build Spark with Hive. Export 'SPARK_HIVE=true' and run build/sbt assembly
Traceback (most recent call last):
  File "/tmp/zeppelin_pyspark-9143637669637506477.py", line 267, in <module>
    raise Exception(traceback.format_exc())
Exception: Traceback (most recent call last):
  File "/tmp/zeppelin_pyspark-9143637669637506477.py", line 265, in <module>
    exec(code)
  File "<stdin>", line 9, in <module>
  File "/opt/cloudera/parcels/CDH/lib/spark/python/pyspark/sql/context.py", line 580, in sql

[3]
I also have the correct symlinks in zeppelin_home/conf (a quick sanity check follows the list) for
- hive-site.xml
- hdfs-site.xml
- core-site.xml
- yarn-site.xml
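The sanity check, pasteable into a pyspark paragraph (the conf path below is a
placeholder for the actual zeppelin_home/conf):

import os

conf_dir = '/path/to/zeppelin/conf'  # placeholder: actual zeppelin_home/conf
for f in ['hive-site.xml', 'hdfs-site.xml', 'core-site.xml', 'yarn-site.xml']:
    p = os.path.join(conf_dir, f)
    print(f, os.path.islink(p), os.path.realpath(p))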



Thank you,
Ruslan Dautkhanov

