Re: ZeppelinContext Not Found in yarn-cluster Mode

2018-07-17 Thread Jongyoul Lee
I have the same issue. We might need to investigate it deeply. Could you
please file an issue for it?

Regards,
JL

On Tue, Jul 17, 2018 at 7:27 PM, Chris Penny wrote:
> [quoted message trimmed]


-- 
이종열, Jongyoul Lee, 李宗烈
http://madeng.net


ZeppelinContext Not Found in yarn-cluster Mode

2018-07-17 Thread Chris Penny
Hi all,

Thanks for the 0.8.0 release!

We’re keen to take advantage of the yarn-cluster support to take the
pressure off our Zeppelin host. However, I am having some trouble with it.
The first problem was in following the documentation here:
https://zeppelin.apache.org/docs/0.8.0/interpreter/spark.html

This suggests that we need to modify the master configuration from
“yarn-client” to “yarn-cluster”. However, doing so results in the following
error:

Warning: Master yarn-cluster is deprecated since 2.0. Please use master
"yarn" with specified deploy mode instead.
Error: Client deploy mode is not compatible with master "yarn-cluster"
Run with --help for usage help or --verbose for debug output


I got past this error with the following settings:
master = yarn
spark.submit.deployMode = cluster
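
For cross-checking outside Zeppelin, the same pair of settings maps onto a plain spark-submit invocation (a sketch only; the application class and jar names are placeholders, not from this thread):

```
# Equivalent non-deprecated spark-submit invocation for Spark 2.x.
# com.example.Main and app.jar are illustrative placeholders.
spark-submit \
  --master yarn \
  --deploy-mode cluster \
  --class com.example.Main \
  app.jar
```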

I’m somewhat unclear whether I’m straying from the correct (documented)
configuration or whether the documentation needs an update. Anyway:

These settings appear to work for everything except the ZeppelinContext,
which is missing.
Code:
%spark
z

Output:
<console>:24: error: not found: value z

Using yarn-client mode I can identify that z is meant to be an instance of
org.apache.zeppelin.spark.SparkZeppelinContext
Code:
%spark
z

Output:
res4: org.apache.zeppelin.spark.SparkZeppelinContext =
org.apache.zeppelin.spark.SparkZeppelinContext@5b9282e1

However, this class is absent in cluster-mode:
Code:
%spark
org.apache.zeppelin.spark.SparkZeppelinContext

Output:
<console>:24: error: object zeppelin is not a member of package org.apache
       org.apache.zeppelin.spark.SparkZeppelinContext
                  ^
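
A minimal probe for this (run as a %spark paragraph in each mode) is to ask the driver's classloader directly, which separates "class not on the classpath" from other interpreter problems; this is my own sketch, not something from the thread:

```scala
// Probe whether the interpreter jar's classes are visible to the driver JVM.
// Prints which case applies instead of failing compilation.
try {
  Class.forName("org.apache.zeppelin.spark.SparkZeppelinContext")
  println("SparkZeppelinContext IS visible on the driver classpath")
} catch {
  case _: ClassNotFoundException =>
    println("SparkZeppelinContext is NOT visible on the driver classpath")
}
```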

Snooping around the Zeppelin installation I was able to locate this class
in ${ZEPPELIN_INSTALL_DIR}/interpreter/spark/spark-interpreter-0.8.0.jar. I
then uploaded this jar to HDFS and added it to spark.jars &
spark.driver.extraClassPath. Relevant entries in driver log:

…
Added JAR hdfs:/spark-interpreter-0.8.0.jar at
hdfs:/tmp/zeppelin/spark-interpreter-0.8.0.jar
with timestamp 1531732774379
…
CLASSPATH -> …:hdfs:/tmp/zeppelin/spark-interpreter-0.8.0.jar …
…
command:
…
file:$PWD/spark-interpreter-0.8.0.jar \
etc.
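
For clarity, the workaround attempted above corresponds roughly to interpreter properties like the following (paths are illustrative; in yarn-cluster mode jars listed in spark.jars are localized into the container working directory, which is why the extraClassPath entry is the bare filename, matching the file:$PWD/... line in the log):

```
spark.jars                   hdfs:///tmp/zeppelin/spark-interpreter-0.8.0.jar
spark.driver.extraClassPath  spark-interpreter-0.8.0.jar
```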

However, I still can’t use the ZeppelinContext or the
org.apache.zeppelin.spark.SparkZeppelinContext class. At this point I’ve
run out of ideas and would appreciate any help.

Does anyone have thoughts on how I could use the ZeppelinContext in
yarn-cluster mode?

Regards, Chris.