Hi Moon,

Sorry for the late reply. I missed this one.
If you are running in yarn-cluster mode, ZeppelinServer does not need to access nodes in the YARN cluster. That is the whole purpose of Spark's yarn-cluster mode, in my understanding. Right now you can achieve the same thing even when using Spark Standalone or Spark on Mesos.

Regards,
Sourav

On Tue, Oct 13, 2015 at 1:47 AM, moon soo Lee <m...@apache.org> wrote:

> Thanks for sharing your use case.
>
> Then, let's say Zeppelin runs the SparkInterpreter process using spark-submit
> in yarn-cluster mode without error. SparkInterpreter then runs inside an
> application master process which is managed by YARN on the cluster, and
> ZeppelinServer can somehow get the host and port and connect to the
> SparkInterpreter process using the thrift protocol.
>
> But that means ZeppelinServer still needs to access a node in the yarn cluster
> to connect to the SparkInterpreter process that runs in the application master.
>
> Would this be okay for your case?
>
> And I'm also curious how other people handle the situation, i.e. the case
> where the spark drivers need to have access to all data nodes/slave nodes.
>
> Thanks,
> moon
>
> On Mon, Oct 12, 2015 at 12:31 AM Sourav Mazumder <
> sourav.mazumde...@gmail.com> wrote:
>
>> Moon,
>>
>> It is to support an architecture where Zeppelin does not need to run on
>> the same machine/cluster where Spark/Hadoop is running.
>>
>> Right now it is not possible to achieve the same in yarn-client mode,
>> as in that case the spark driver needs to have access to all data
>> nodes/slave nodes.
>>
>> One can achieve the same with a remote Spark standalone cluster. But in
>> that case I cannot use YARN for workload management.
>>
>> Regards,
>> Souravu
>>
>> On Oct 11, 2015, at 12:25 PM, moon soo Lee <m...@apache.org> wrote:
>>
>> My apologies, I missed the most important part of the question:
>> yarn-cluster mode. Zeppelin is not expected to work with yarn-cluster mode
>> at the moment.
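[Editor's sketch: the spark-submit hand-off being discussed above can be illustrated as follows. This is a simplified, hypothetical illustration of the launch decision, not Zeppelin's actual interpreter.sh; the class name and paths are only examples.]

```shell
#!/bin/sh
# Simplified sketch of the launch decision discussed in this thread.
# NOT Zeppelin's actual interpreter.sh; names and paths are illustrative.
launch_interpreter() {
  if [ -n "$SPARK_HOME" ]; then
    # With SPARK_HOME set, delegate to spark-submit so that YARN deploy
    # modes are handled by Spark itself rather than by a bare SparkContext.
    echo "$SPARK_HOME/bin/spark-submit --class RemoteInterpreterServer"
  else
    # Without SPARK_HOME, launch the JVM directly; constructing a
    # SparkContext with master=yarn-cluster this way is what triggers the
    # "Please use spark-submit" SparkException seen later in this thread.
    echo "java -cp zeppelin-spark-*.jar RemoteInterpreterServer"
  fi
}

export SPARK_HOME=/opt/spark   # example path
launch_interpreter             # prints: /opt/spark/bin/spark-submit --class RemoteInterpreterServer
```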
>>
>> Is there any special reason you need to use yarn-cluster mode instead of
>> yarn-client mode?
>>
>> Thanks,
>> moon
>>
>> On Sun, Oct 11, 2015 at 8:41 PM Sourav Mazumder <
>> sourav.mazumde...@gmail.com> wrote:
>>
>>> Hi Moon,
>>>
>>> Yes, I have checked that.
>>>
>>> I put some debug statements in interpreter.sh to see what exactly is
>>> passed when I set SPARK_HOME in zeppelin-env.sh.
>>>
>>> The debug statements do show that it is using the spark-submit utility
>>> from the bin folder of the SPARK_HOME I set in zeppelin-env.sh.
>>>
>>> Regards,
>>> Sourav
>>>
>>> On Sun, Oct 11, 2015 at 2:55 AM, moon soo Lee <m...@apache.org> wrote:
>>>
>>>> Could you make sure your zeppelin-env.sh has SPARK_HOME exported?
>>>>
>>>> Zeppelin (0.6.0-SNAPSHOT) uses the spark-submit command when SPARK_HOME
>>>> is defined, but your error says "please use spark-submit".
>>>>
>>>> Thanks,
>>>> moon
>>>>
>>>> On Thu, Oct 8, 2015 at 9:14 PM Sourav Mazumder <
>>>> sourav.mazumde...@gmail.com> wrote:
>>>>
>>>>> Hi Deepak/Moon,
>>>>>
>>>>> After seeing the stack trace of the error and the code of
>>>>> org.apache.zeppelin.spark.SparkInterpreter.java, I think this is surely
>>>>> a bug in the Spark interpreter code.
>>>>>
>>>>> The SparkInterpreter code always calls the constructor of
>>>>> org.apache.spark.SparkContext to create a new SparkContext whenever the
>>>>> SparkInterpreter class is loaded by
>>>>> org.apache.zeppelin.interpreter.remote.RemoteInterpreterServer, hence
>>>>> this error.
>>>>>
>>>>> I'm not sure whether the check for yarn-cluster was newly added to
>>>>> SparkContext.
>>>>>
>>>>> Attaching the complete stack trace here for ease of reference.
>>>>>
>>>>> Regards,
>>>>> Sourav
>>>>>
>>>>> org.apache.spark.SparkException: Detected yarn-cluster mode, but isn't
>>>>> running on a cluster. Deployment to YARN is not supported directly by
>>>>> SparkContext. Please use spark-submit.
>>>>>   at org.apache.spark.SparkContext.<init>(SparkContext.scala:378)
>>>>>   at org.apache.zeppelin.spark.SparkInterpreter.createSparkContext(SparkInterpreter.java:339)
>>>>>   at org.apache.zeppelin.spark.SparkInterpreter.getSparkContext(SparkInterpreter.java:149)
>>>>>   at org.apache.zeppelin.spark.SparkInterpreter.open(SparkInterpreter.java:465)
>>>>>   at org.apache.zeppelin.interpreter.ClassloaderInterpreter.open(ClassloaderInterpreter.java:74)
>>>>>   at org.apache.zeppelin.interpreter.LazyOpenInterpreter.open(LazyOpenInterpreter.java:68)
>>>>>   at org.apache.zeppelin.interpreter.LazyOpenInterpreter.interpret(LazyOpenInterpreter.java:92)
>>>>>   at org.apache.zeppelin.interpreter.remote.RemoteInterpreterServer$InterpretJob.jobRun(RemoteInterpreterServer.java:276)
>>>>>   at org.apache.zeppelin.scheduler.Job.run(Job.java:170)
>>>>>   at org.apache.zeppelin.scheduler.FIFOScheduler$1.run(FIFOScheduler.java:118)
>>>>>   at java.util.concurrent.Executors$RunnableAdapter.call(Unknown Source)
>>>>>   at java.util.concurrent.FutureTask.run(Unknown Source)
>>>>>   at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(Unknown Source)
>>>>>   at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(Unknown Source)
>>>>>   at java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
>>>>>   at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
>>>>>   at java.lang.Thread.run(Unknown Source)
>>>>>
>>>>> On Mon, Oct 5, 2015 at 12:57 PM, Sourav Mazumder <
>>>>> sourav.mazumde...@gmail.com> wrote:
>>>>>
>>>>>> I could execute the following without any issue.
>>>>>>
>>>>>> spark-submit --class org.apache.spark.examples.SparkPi \
>>>>>>   --master yarn-cluster --num-executors 1 --driver-memory 512m \
>>>>>>   --executor-memory 512m --executor-cores 1 lib/spark-examples.jar 10
>>>>>>
>>>>>> Regards,
>>>>>> Sourav
>>>>>>
>>>>>> On Mon, Oct 5, 2015 at 12:04 PM, ÐΞ€ρ@Ҝ (๏̯͡๏) <deepuj...@gmail.com>
>>>>>> wrote:
>>>>>>
>>>>>>> Did you try a test job with yarn-cluster (outside Zeppelin)?
>>>>>>>
>>>>>>> On Mon, Oct 5, 2015 at 11:48 AM, Sourav Mazumder <
>>>>>>> sourav.mazumde...@gmail.com> wrote:
>>>>>>>
>>>>>>>> Yes, I have them set up appropriately.
>>>>>>>>
>>>>>>>> Where I am lost is that I can see the interpreter running
>>>>>>>> spark-submit, but at some point it switches to creating a
>>>>>>>> SparkContext.
>>>>>>>>
>>>>>>>> Maybe, as you rightly mentioned, it is not able to run the driver on
>>>>>>>> the YARN cluster because of some permission issue. But I'm not able
>>>>>>>> to figure out what that issue/required configuration is.
>>>>>>>>
>>>>>>>> Regards,
>>>>>>>> Sourav
>>>>>>>>
>>>>>>>> On Mon, Oct 5, 2015 at 11:38 AM, ÐΞ€ρ@Ҝ (๏̯͡๏) <deepuj...@gmail.com
>>>>>>>> > wrote:
>>>>>>>>
>>>>>>>>> Do you have these settings configured in zeppelin-env.sh?
>>>>>>>>>
>>>>>>>>> export JAVA_HOME=/usr/src/jdk1.7.0_79/
>>>>>>>>>
>>>>>>>>> export HADOOP_CONF_DIR=/etc/hadoop/conf
>>>>>>>>>
>>>>>>>>> Most likely you have these, as you're able to run with yarn-client.
>>>>>>>>>
>>>>>>>>> Looks like the issue is not being able to run the driver program on
>>>>>>>>> the cluster.
>>>>>>>>>
>>>>>>>>> On Mon, Oct 5, 2015 at 11:13 AM, Sourav Mazumder <
>>>>>>>>> sourav.mazumde...@gmail.com> wrote:
>>>>>>>>>
>>>>>>>>>> Yes, Spark is installed on the machine where Zeppelin is running.
>>>>>>>>>>
>>>>>>>>>> The location of spark.yarn.jar is very similar to what you have.
>>>>>>>>>> I'm using IOP as the distribution, and that directory naming
>>>>>>>>>> convention is specific to IOP, which is different from HDP.
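[Editor's sketch: pulled together, the zeppelin-env.sh settings mentioned in this thread would look like the fragment below. The SPARK_HOME path is inferred from this thread's IOP spark.yarn.jar location and is an assumption; adjust all paths for your own installation.]

```shell
# conf/zeppelin-env.sh -- values collected from this thread; the paths
# are examples/assumptions, adjust them for your own installation.
export JAVA_HOME=/usr/src/jdk1.7.0_79/
# Points Spark at the YARN/HDFS client configs so it can find the
# ResourceManager and NameNode.
export HADOOP_CONF_DIR=/etc/hadoop/conf
# When SPARK_HOME is set, Zeppelin launches the Spark interpreter through
# $SPARK_HOME/bin/spark-submit instead of a plain JVM process.
export SPARK_HOME=/usr/iop/current/spark-thriftserver
```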
>>>>>>>>>>
>>>>>>>>>> And yes, the setup works perfectly fine when I use master
>>>>>>>>>> yarn-client with the same setup for SPARK_HOME, HADOOP_CONF_DIR and
>>>>>>>>>> HADOOP_CLIENT.
>>>>>>>>>>
>>>>>>>>>> Regards,
>>>>>>>>>> Sourav
>>>>>>>>>>
>>>>>>>>>> On Mon, Oct 5, 2015 at 10:25 AM, ÐΞ€ρ@Ҝ (๏̯͡๏) <
>>>>>>>>>> deepuj...@gmail.com> wrote:
>>>>>>>>>>
>>>>>>>>>>> Is Spark installed on your Zeppelin machine?
>>>>>>>>>>>
>>>>>>>>>>> I would try these:
>>>>>>>>>>>
>>>>>>>>>>> master: yarn-client
>>>>>>>>>>> spark.home: the Spark installation home directory on your
>>>>>>>>>>> Zeppelin server.
>>>>>>>>>>>
>>>>>>>>>>> Looking at spark.yarn.jar, I see Spark is installed at
>>>>>>>>>>> /usr/iop/current/spark-thriftserver/. But why is it
>>>>>>>>>>> thriftserver (I do not know what that is)?
>>>>>>>>>>>
>>>>>>>>>>> I have Spark installed (unzipped) on the Zeppelin machine at
>>>>>>>>>>> /usr/hdp/2.3.1.0-2574/spark/spark/ (can be any location) and have
>>>>>>>>>>> spark.yarn.jar set to
>>>>>>>>>>> /usr/hdp/2.3.1.0-2574/spark/spark/lib/spark-assembly-1.4.1-hadoop2.6.0.jar.
>>>>>>>>>>>
>>>>>>>>>>> On Mon, Oct 5, 2015 at 10:20 AM, Sourav Mazumder <
>>>>>>>>>>> sourav.mazumde...@gmail.com> wrote:
>>>>>>>>>>>
>>>>>>>>>>>> Hi Deepu,
>>>>>>>>>>>>
>>>>>>>>>>>> Here you go.
>>>>>>>>>>>>
>>>>>>>>>>>> Regards,
>>>>>>>>>>>> Sourav
>>>>>>>>>>>>
>>>>>>>>>>>> *Properties*
>>>>>>>>>>>>
>>>>>>>>>>>> name                           value
>>>>>>>>>>>> args
>>>>>>>>>>>> master                         yarn-cluster
>>>>>>>>>>>> spark.app.name                 Zeppelin
>>>>>>>>>>>> spark.cores.max
>>>>>>>>>>>> spark.executor.memory          512m
>>>>>>>>>>>> spark.home
>>>>>>>>>>>> spark.yarn.jar                 /usr/iop/current/spark-thriftserver/lib/spark-assembly.jar
>>>>>>>>>>>> zeppelin.dep.localrepo         local-repo
>>>>>>>>>>>> zeppelin.pyspark.python        python
>>>>>>>>>>>> zeppelin.spark.concurrentSQL   false
>>>>>>>>>>>> zeppelin.spark.maxResult       1000
>>>>>>>>>>>> zeppelin.spark.useHiveContext  true
>>>>>>>>>>>>
>>>>>>>>>>>> On Mon, Oct 5, 2015 at 10:05 AM, ÐΞ€ρ@Ҝ (๏̯͡๏) <
>>>>>>>>>>>> deepuj...@gmail.com> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>> Can you share a screenshot of your Spark interpreter settings
>>>>>>>>>>>>> from the Zeppelin web interface?
>>>>>>>>>>>>>
>>>>>>>>>>>>> I have the exact same deployment structure and it runs fine
>>>>>>>>>>>>> with the right set of configurations.
>>>>>>>>>>>>>
>>>>>>>>>>>>> On Mon, Oct 5, 2015 at 7:56 AM, Sourav Mazumder <
>>>>>>>>>>>>> sourav.mazumde...@gmail.com> wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>>> Hi Moon,
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> I'm using 0.6.0-SNAPSHOT, which I built from the latest GitHub
>>>>>>>>>>>>>> source.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> I tried setting SPARK_HOME in zeppelin-env.sh. By adding some
>>>>>>>>>>>>>> debug statements I could also see that control goes to the
>>>>>>>>>>>>>> appropriate IF-ELSE block in interpreter.sh.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> But I get the same error as follows:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> org.apache.spark.SparkException: Detected yarn-cluster mode,
>>>>>>>>>>>>>> but isn't running on a cluster. Deployment to YARN is not
>>>>>>>>>>>>>> supported directly by SparkContext. Please use spark-submit.
>>>>>>>>>>>>>>   at org.apache.spark.SparkContext.<init>(SparkContext.scala:378)
>>>>>>>>>>>>>>   at org.apache.zeppelin.spark.SparkInterpreter.createSparkContext(SparkInterpreter.java:339)
>>>>>>>>>>>>>>   at org.apache.zeppelin.spark.SparkInterpreter.getSparkContext(SparkInterpreter.java:149)
>>>>>>>>>>>>>>   at org.apache.zeppelin.spark.SparkInterpreter.open(SparkInterpreter.java:465)
>>>>>>>>>>>>>>   at org.apache.zeppelin.interpreter.ClassloaderInterpreter.open(ClassloaderInterpreter.java:74)
>>>>>>>>>>>>>>   at org.apache.zeppelin.interpreter.LazyOpenInterpreter.open(LazyOpenInterpreter.java:68)
>>>>>>>>>>>>>>   at org.apache.zeppelin.interpreter.LazyOpenInterpreter.interpret(LazyOpenInterpreter.java:92)
>>>>>>>>>>>>>>   at org.apache.zeppelin.interpreter.remote.RemoteInterpreterServer$InterpretJob.jobRun(RemoteInterpreterServer.java:276)
>>>>>>>>>>>>>>   at org.apache.zeppelin.scheduler.Job.run(Job.java:170)
>>>>>>>>>>>>>>   at org.apache.zeppelin.scheduler.FIFOScheduler$1.run(FIFOScheduler.java:118)
>>>>>>>>>>>>>>   at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
>>>>>>>>>>>>>>   at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>>>>>>>>>>>>>>   at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)
>>>>>>>>>>>>>>   at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
>>>>>>>>>>>>>>   at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>>>>>>>>>>>>>>   at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>>>>>>>>>>>>>>   at java.lang.Thread.run(Thread.java:745)
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Let me know if you need any other details to figure out what
>>>>>>>>>>>>>> is going on.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Regards,
>>>>>>>>>>>>>> Sourav
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> On Wed, Sep 30, 2015 at 1:53 AM, moon soo Lee <
>>>>>>>>>>>>>> m...@apache.org> wrote:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Which version of Zeppelin are you using?
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> The master branch uses the spark-submit command when
>>>>>>>>>>>>>>> SPARK_HOME is defined in conf/zeppelin-env.sh.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> If you're not on the master branch, I recommend trying it
>>>>>>>>>>>>>>> with SPARK_HOME defined.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Hope this helps,
>>>>>>>>>>>>>>> moon
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> On Wed, Sep 23, 2015 at 10:21 PM Sourav Mazumder <
>>>>>>>>>>>>>>> sourav.mazumde...@gmail.com> wrote:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Hi,
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> When I try to run the Spark interpreter in yarn-cluster mode
>>>>>>>>>>>>>>>> from a remote machine, I always get the error saying to use
>>>>>>>>>>>>>>>> spark-submit rather than a SparkContext.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> My Zeppelin process runs on a separate machine, remote to
>>>>>>>>>>>>>>>> the YARN cluster.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Any idea why I get this error?
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Regards,
>>>>>>>>>>>>>>>> Sourav
>>>>>>>>>>>>>
>>>>>>>>>>>>> --
>>>>>>>>>>>>> Deepak
>>>>>>>>>>>
>>>>>>>>>>> --
>>>>>>>>>>> Deepak
>>>>>>>>>
>>>>>>>>> --
>>>>>>>>> Deepak
>>>>>>>
>>>>>>> --
>>>>>>> Deepak