Additionally, I think this document ( https://spark.apache.org/docs/latest/building-spark.html ) should mention that protobuf.version may need to be changed to match the protobuf version used by the chosen Hadoop version. For instance, with Hadoop 2.7.0 I had to set protobuf.version to 2.5.0 before my application would run.
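
For reference, the override can be passed straight to Maven, since protobuf.version is just a property in Spark's pom. A sketch of the build invocation I used (the hadoop-2.6 profile name comes from the build docs; I'm assuming it is still the right base profile when overriding hadoop.version to 2.7.0):

«
mvn -Pyarn -Phadoop-2.6 -Dhadoop.version=2.7.0 -Dprotobuf.version=2.5.0 -DskipTests clean package
»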
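
And for anyone who finds this thread through the NoClassDefFoundError below: the fix was to launch through spark-submit rather than sbt run. My launch command ended up looking roughly like this (the class name and jar path are the ones from my project; substitute your own):

«
./bin/spark-submit \
  --master yarn-client \
  --class Benchmark \
  target/scala-2.10/benchmark-app_2.10-0.1-SNAPSHOT.jar
»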
On Thu, Jun 4, 2015 at 7:14 PM, Sandy Ryza <sandy.r...@cloudera.com> wrote:

> That might work, but there might also be other steps that are required.
>
> -Sandy
>
> On Thu, Jun 4, 2015 at 11:13 AM, Saiph Kappa <saiph.ka...@gmail.com> wrote:
>
>> Thanks! It is working fine now with spark-submit. Just out of curiosity,
>> how would you use org.apache.spark.deploy.yarn.Client? Adding that
>> spark_yarn jar to the configuration inside the application?
>>
>> On Thu, Jun 4, 2015 at 6:37 PM, Vova Shelgunov <vvs...@gmail.com> wrote:
>>
>>> You should run it with spark-submit or using
>>> org.apache.spark.deploy.yarn.Client.
>>>
>>> 2015-06-04 20:30 GMT+03:00 Saiph Kappa <saiph.ka...@gmail.com>:
>>>
>>>> No, I am not. I run it with sbt: «sbt "run-main Branchmark"». I thought
>>>> it was the same thing, since I am passing all the configuration through
>>>> the application code. Is that the problem?
>>>>
>>>> On Thu, Jun 4, 2015 at 6:26 PM, Sandy Ryza <sandy.r...@cloudera.com> wrote:
>>>>
>>>>> Hi Saiph,
>>>>>
>>>>> Are you launching using spark-submit?
>>>>>
>>>>> -Sandy
>>>>>
>>>>> On Thu, Jun 4, 2015 at 10:20 AM, Saiph Kappa <saiph.ka...@gmail.com> wrote:
>>>>>
>>>>>> Hi,
>>>>>>
>>>>>> I've been running my Spark Streaming application in standalone mode
>>>>>> without any worries. Now I've been trying to run it on YARN (Hadoop
>>>>>> 2.7.0), but I am having some problems.
>>>>>>
>>>>>> Here are the config parameters of my application:
>>>>>> «
>>>>>> val sparkConf = new SparkConf()
>>>>>>
>>>>>> sparkConf.setMaster("yarn-client")
>>>>>> sparkConf.set("spark.yarn.am.memory", "2g")
>>>>>> sparkConf.set("spark.executor.instances", "2")
>>>>>>
>>>>>> sparkConf.setAppName("Benchmark")
>>>>>> sparkConf.setJars(Array("target/scala-2.10/benchmark-app_2.10-0.1-SNAPSHOT.jar"))
>>>>>> sparkConf.set("spark.executor.memory", "4g")
>>>>>> sparkConf.set("spark.serializer", "org.apache.spark.serializer.KryoSerializer")
>>>>>> sparkConf.set("spark.executor.extraJavaOptions",
>>>>>>   " -XX:+UseCompressedOops -XX:+UseConcMarkSweepGC " +
>>>>>>   "-XX:+AggressiveOpts -XX:FreqInlineSize=300 -XX:MaxInlineSize=300 ")
>>>>>> if (sparkConf.getOption("spark.master") == None) {
>>>>>>   sparkConf.setMaster("local[*]")
>>>>>> }
>>>>>> »
>>>>>>
>>>>>> The jar I'm including there only contains the application classes.
>>>>>>
>>>>>> Here is the log of the application: http://pastebin.com/7RSktezA
>>>>>>
>>>>>> Here is the userlog on hadoop/YARN:
>>>>>> «
>>>>>> Exception in thread "main" java.lang.NoClassDefFoundError: org/apache/spark/Logging
>>>>>>     at java.lang.ClassLoader.defineClass1(Native Method)
>>>>>>     at java.lang.ClassLoader.defineClass(ClassLoader.java:800)
>>>>>>     at java.security.SecureClassLoader.defineClass(SecureClassLoader.java:142)
>>>>>>     at java.net.URLClassLoader.defineClass(URLClassLoader.java:449)
>>>>>>     at java.net.URLClassLoader.access$100(URLClassLoader.java:71)
>>>>>>     at java.net.URLClassLoader$1.run(URLClassLoader.java:361)
>>>>>>     at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
>>>>>>     at java.security.AccessController.doPrivileged(Native Method)
>>>>>>     at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
>>>>>>     at java.lang.ClassLoader.loadClass(ClassLoader.java:425)
>>>>>>     at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:308)
>>>>>>     at java.lang.ClassLoader.loadClass(ClassLoader.java:358)
>>>>>>     at org.apache.spark.deploy.yarn.ExecutorLauncher$.main(ApplicationMaster.scala:596)
>>>>>>     at org.apache.spark.deploy.yarn.ExecutorLauncher.main(ApplicationMaster.scala)
>>>>>> Caused by: java.lang.ClassNotFoundException: org.apache.spark.Logging
>>>>>>     at java.net.URLClassLoader$1.run(URLClassLoader.java:366)
>>>>>>     at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
>>>>>>     at java.security.AccessController.doPrivileged(Native Method)
>>>>>>     at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
>>>>>>     at java.lang.ClassLoader.loadClass(ClassLoader.java:425)
>>>>>>     at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:308)
>>>>>>     at java.lang.ClassLoader.loadClass(ClassLoader.java:358)
>>>>>>     ... 14 more
>>>>>> »
>>>>>>
>>>>>> I tried to add the spark core jar to ${HADOOP_HOME}/lib, but the error
>>>>>> persists. Am I doing something wrong?
>>>>>>
>>>>>> Thanks.