I can't reproduce this in %spark or %sql.

It seems to be %pyspark-specific.
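
For reference, the equivalent query completes fine in %sql (same table as in the %pyspark snippet below):

%sql
select count(*) from hivedb.someTable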

It also seems to run fine the first time after I start Zeppelin; after that it
shows this error:

You must build Spark with Hive. Export 'SPARK_HIVE=true' and run build/sbt assembly


from pyspark.sql import HiveContext  # %pyspark; sc is Zeppelin's SparkContext
sqlc = HiveContext(sc)
sqlc.sql("select count(*) from hivedb.someTable")

It runs fine only the first time; after that I get:

You must build Spark with Hive. Export 'SPARK_HIVE=true' and run build/sbt assembly
Traceback (most recent call last):
  File "/tmp/zeppelin_pyspark-8000586427786928449.py", line 267, in <module>
    raise Exception(traceback.format_exc())
Exception: Traceback (most recent call last):
  File "/tmp/zeppelin_pyspark-8000586427786928449.py", line 265, in <module>
    exec(code)
  File "<stdin>", line 2, in <module>
  File "/opt/cloudera/parcels/CDH/lib/spark/python/pyspark/sql/context.py", line 580, in sql
    return DataFrame(self._ssql_ctx.sql(sqlQuery), self)
  File "/opt/cloudera/parcels/CDH/lib/spark/python/pyspark/sql/context.py", line 683, in _ssql_ctx
    self._scala_HiveContext = self._get_hive_ctx()
  File "/opt/cloudera/parcels/CDH/lib/spark/python/pyspark/sql/context.py", line 692, in _get_hive_ctx
    return self._jvm.HiveContext(self._jsc.sc())
  File "/opt/cloudera/parcels/CDH/lib/spark/python/lib/py4j-0.9-src.zip/py4j/java_gateway.py", line 1064, in __call__
    answer, self._gateway_client, None, self._fqn)
  File "/opt/cloudera/parcels/CDH/lib/spark/python/pyspark/sql/utils.py", line 45, in deco
    return f(*a, **kw)



I don't see any more detail in the logs beyond the error stack above.
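
One workaround I still want to try (a sketch, not verified): reuse the
sqlContext that Zeppelin pre-creates for the pyspark interpreter, instead of
constructing a new HiveContext on every paragraph run. With
ZEPPELIN_SPARK_USEHIVECONTEXT=true it should already be a HiveContext.

# %pyspark -- sqlContext is injected by Zeppelin; no new HiveContext is built
sqlContext.sql("select count(*) from hivedb.someTable").show()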


-- 
Ruslan Dautkhanov

On Wed, Nov 23, 2016 at 7:02 AM, Felix Cheung <felixcheun...@hotmail.com>
wrote:

> Hmm, SPARK_HOME is set, so it should pick up the right Spark.
>
> Does this work with the Scala Spark interpreter instead of pyspark? If it
> doesn't, is there more info in the log?
>
>
> ------------------------------
> *From:* Ruslan Dautkhanov <dautkha...@gmail.com>
> *Sent:* Monday, November 21, 2016 1:52:36 PM
> *To:* users@zeppelin.apache.org
> *Subject:* "You must build Spark with Hive. Export 'SPARK_HIVE=true'"
>
> Getting
> You must *build Spark with Hive*. Export 'SPARK_HIVE=true'
> See full stack [2] below.
>
> I'm using Spark 1.6 that comes with CDH 5.8.3.
> So it's definitely compiled with Hive.
> We use Jupyter notebooks without problems in the same environment.
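>
> (A quick way to double-check the Hive support claim from a notebook,
> sketched here via py4j's JVM view; it should print the class if the
> assembly includes Hive support:)
>
> # %pyspark
> print(sc._jvm.java.lang.Class.forName(
>     "org.apache.spark.sql.hive.HiveContext"))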
>
> Using Zeppelin 0.6.2, downloaded as zeppelin-0.6.2-bin-all.tgz from
> apache.org.
>
> Is Zeppelin compiled with Hive too? I guess so.
> Not sure what else is missing.
>
> Tried to play with ZEPPELIN_SPARK_USEHIVECONTEXT, but it does not make a
> difference.
>
>
> [1]
> $ cat zeppelin-env.sh
> export JAVA_HOME=/usr/java/java7
> export SPARK_HOME=/opt/cloudera/parcels/CDH/lib/spark
> export SPARK_SUBMIT_OPTIONS="--principal xxxx --keytab yyy --conf
> spark.driver.memory=7g --conf spark.executor.cores=2 --conf
> spark.executor.memory=8g"
> export SPARK_APP_NAME="Zeppelin notebook"
> export HADOOP_CONF_DIR=/etc/hadoop/conf
> export HIVE_CONF_DIR=/etc/hive/conf
> export HIVE_HOME=/opt/cloudera/parcels/CDH/lib/hive
> export PYSPARK_PYTHON="/opt/cloudera/parcels/Anaconda/bin/python2"
> export PYTHONPATH="/opt/cloudera/parcels/CDH/lib/spark/python:/opt/cloudera/parcels/CDH/lib/spark/python/lib/py4j-0.9-src.zip"
> export MASTER="yarn-client"
> export ZEPPELIN_SPARK_USEHIVECONTEXT=true
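>
> A sanity check worth doing outside Zeppelin (a sketch; run
> $SPARK_HOME/bin/pyspark in the same environment, where sc is the shell's
> pre-created SparkContext):
>
> # fails at construction if this Spark build lacks Hive support
> from pyspark.sql import HiveContext
> hc = HiveContext(sc)
> hc.sql("show databases").show()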
>
>
>
>
> [2]
>
> You must build Spark with Hive. Export 'SPARK_HIVE=true' and run build/sbt assembly
> Traceback (most recent call last):
>   File "/tmp/zeppelin_pyspark-9143637669637506477.py", line 267, in <module>
>     raise Exception(traceback.format_exc())
> Exception: Traceback (most recent call last):
>   File "/tmp/zeppelin_pyspark-9143637669637506477.py", line 265, in <module>
>     exec(code)
>   File "<stdin>", line 9, in <module>
>   File "/opt/cloudera/parcels/CDH/lib/spark/python/pyspark/sql/context.py", line 580, in sql
>
> [3]
> Also, I have the correct symlinks in zeppelin_home/conf for:
> - hive-site.xml
> - hdfs-site.xml
> - core-site.xml
> - yarn-site.xml
>
>
>
> Thank you,
> Ruslan Dautkhanov
>
