I have run into the next issue. I ran a very simple Python command that prints the current date and time, and got the following error: "org.apache.spark.SparkException: Yarn application has already ended!"
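This error usually means the YARN ApplicationMaster never launched, so the real cause tends to be in the YARN application logs rather than in Zeppelin's own output. A sketch of the diagnostic commands I would try on the EMR master node (the commands are echoed here rather than run, and the application ID is a made-up placeholder; on a real cluster you would run them directly and substitute the ID reported by the first command):

```shell
# Sketch: find the failed/killed application Zeppelin launched, then pull its logs.
# The ID below is a placeholder; use the real one from the application list.
APP_ID="application_0000000000000_0001"
echo "yarn application -list -appStates FAILED,KILLED"
echo "yarn logs -applicationId ${APP_ID}"
```

The reason the ApplicationMaster failed to start (missing memory, bad spark.yarn settings, etc.) usually appears near the top of those logs.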
Has anyone seen this error before? I have not done any additional Zeppelin configuration; am I missing something in the configs?

Francis

*Command*

%pyspark
import datetime
print "Start Time: " + str(datetime.datetime.now())

*Error*

org.apache.spark.SparkException: Yarn application has already ended! It might have been killed or unable to launch application master.
	at org.apache.spark.scheduler.cluster.YarnClientSchedulerBackend.waitForApplication(YarnClientSchedulerBackend.scala:113)
	at org.apache.spark.scheduler.cluster.YarnClientSchedulerBackend.start(YarnClientSchedulerBackend.scala:59)
	at org.apache.spark.scheduler.TaskSchedulerImpl.start(TaskSchedulerImpl.scala:141)
	at org.apache.spark.SparkContext.<init>(SparkContext.scala:381)
	at org.apache.zeppelin.spark.SparkInterpreter.createSparkContext(SparkInterpreter.java:301)
	at org.apache.zeppelin.spark.SparkInterpreter.getSparkContext(SparkInterpreter.java:146)
	at org.apache.zeppelin.spark.SparkInterpreter.open(SparkInterpreter.java:423)
	at org.apache.zeppelin.interpreter.ClassloaderInterpreter.open(ClassloaderInterpreter.java:74)
	at org.apache.zeppelin.interpreter.LazyOpenInterpreter.open(LazyOpenInterpreter.java:68)
	at org.apache.zeppelin.spark.PySparkInterpreter.getSparkInterpreter(PySparkInterpreter.java:353)
	at org.apache.zeppelin.spark.PySparkInterpreter.getJavaSparkContext(PySparkInterpreter.java:374)
	at org.apache.zeppelin.spark.PySparkInterpreter.open(PySparkInterpreter.java:140)
	at org.apache.zeppelin.interpreter.ClassloaderInterpreter.open(ClassloaderInterpreter.java:74)
	at org.apache.zeppelin.interpreter.LazyOpenInterpreter.open(LazyOpenInterpreter.java:68)
	at org.apache.zeppelin.interpreter.LazyOpenInterpreter.interpret(LazyOpenInterpreter.java:92)
	at org.apache.zeppelin.interpreter.remote.RemoteInterpreterServer$InterpretJob.jobRun(RemoteInterpreterServer.java:276)
	at org.apache.zeppelin.scheduler.Job.run(Job.java:170)
	at org.apache.zeppelin.scheduler.FIFOScheduler$1.run(FIFOScheduler.java:118)
	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
	at java.util.concurrent.FutureTask.run(FutureTask.java:262)
	at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:178)
	at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:292)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
	at java.lang.Thread.run(Thread.java:745)

On Wed, Jul 29, 2015 at 11:54 AM, Francis Lau <francis....@smartsheet.com> wrote:
> Thanks Ranjit and Alexander,
>
> I added 8081 to my tunnel script and now it is connected. I will try to
> execute pyspark commands next.
>
> Just to offer a little value back to future newbies like me, here is the
> bash script that tunnels all the UI ports for EMR, Spark, iPython
> Notebook, and Zeppelin. I assume that this email thread will get archived
> in a Google-searchable location. These ports work for EMR release 4.0
> with Spark and others installed; Zeppelin and iPython Notebook were
> installed via custom bootstrap scripts.
>
> Francis
>
> # -------------------------------
> # TunnelSpark.sh
> # -------------------------------
>
> # This script is called with a single argument: the IP address of the
> # EMR master node (which also runs the Spark driver, Zeppelin, iPython
> # Notebook, and Hue)
>
> # For the list of AWS EMR ports, see:
> # https://docs.aws.amazon.com/ElasticMapReduce/latest/ReleaseGuide/emr-release-differences.html#d0e708
>
> echo 'Tunneling to iPython Notebook (port 8192)...'
> echo
> echo 'Tunneling to Spark UI (port 18080)...'
> echo 'Tunneling to Spark UI (ports 4040, 4041, 4042)...'
> echo
> echo 'Tunneling to Hadoop Resource Manager (port 8088)...'
> echo 'Tunneling to Hadoop Node Manager (port 8042)...'
> echo
> echo 'Tunneling to Hue (port 8888)...'
> echo
> echo 'Tunneling to Zeppelin (ports 8080, 8081)...'
>
> ssh -o ServerAliveInterval=10 -i ~/.ssh/POCMasterKey.pem -N \
>   -L 8192:ec2-$1.compute-1.amazonaws.com:8192 \
>   -L 18080:ec2-$1.compute-1.amazonaws.com:18080 \
>   -L 4040:ec2-$1.compute-1.amazonaws.com:4040 \
>   -L 4041:ec2-$1.compute-1.amazonaws.com:4041 \
>   -L 4042:ec2-$1.compute-1.amazonaws.com:4042 \
>   -L 8088:ec2-$1.compute-1.amazonaws.com:8088 \
>   -L 8042:ec2-$1.compute-1.amazonaws.com:8042 \
>   -L 8888:ec2-$1.compute-1.amazonaws.com:8888 \
>   -L 8080:ec2-$1.compute-1.amazonaws.com:8080 \
>   -L 8081:ec2-$1.compute-1.amazonaws.com:8081 \
>   hadoop@ec2-$1.compute-1.amazonaws.com
>
>
> On Tue, Jul 28, 2015 at 9:34 PM, Ranjit Manuel <ranjit.f.man...@gmail.com> wrote:
>> A couple of things to check:
>>
>> 1. The websocket port is available.
>> 2. Check the logs for any errors.
>> 3. The web browser you are using; this happened to me as well, and I
>> found that it works only with Mozilla Firefox.
>> On Jul 29, 2015 4:31 AM, "Francis Lau" <francis....@smartsheet.com> wrote:
>>
>>> Does anyone have Zeppelin working against AWS EMR 4.0 with Spark?
>>>
>>> The 4.0 release of EMR came out just last week:
>>> http://aws.amazon.com/about-aws/whats-new/2015/07/amazon-emr-release-4-0-0-with-new-versions-of-apache-hadoop-hive-and-spark-now-available/
>>>
>>> I found this bootstrap script and got a new cluster up and running
>>> without errors:
>>> https://gist.github.com/andershammar/224e1077021d0ea376dd#comments
>>>
>>> But the Zeppelin UI shows the "disconnected" red label, and I also
>>> cannot create a new notebook.
>>>
>>> I am very new to Zeppelin, so it may be a rookie issue :) i.e. configs
>>> or connections.
>>>
>>> Help?
>>>
>>> --
>>> *Francis*
>>
>
>
> --
> *Francis Lau* | *Smartsheet*
> Senior Director of Product Intelligence
> *c* 425-830-3889 (call/text)
> francis....@smartsheet.com <jason.terav...@smartsheet.com>

--
*Francis Lau* | *Smartsheet*
Senior Director of Product Intelligence
*c* 425-830-3889 (call/text)
francis....@smartsheet.com <jason.terav...@smartsheet.com>
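A note for future readers of the archive: the TunnelSpark.sh script above takes a single argument, the master node's public IP with the dots replaced by dashes, and splices it into the EC2 public DNS name via `$1`. A minimal sketch of that expansion (the IP address here is a made-up example, not a real host):

```shell
# Invocation would look like:  ./TunnelSpark.sh 54-210-12-34
# Inside the script, $1 expands into the EC2 public DNS name like so:
IP_DASHED="54-210-12-34"   # hypothetical master-node public IP, dots -> dashes
HOST="ec2-${IP_DASHED}.compute-1.amazonaws.com"
echo "$HOST"   # prints ec2-54-210-12-34.compute-1.amazonaws.com
```

With the tunnels up, each service is then reachable on localhost at the forwarded port (e.g. Zeppelin at http://localhost:8080).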