How to run spark streaming application on YARN?
Hi,

I've been running my spark streaming application in standalone mode without any worries. Now, I've been trying to run it on YARN (hadoop 2.7.0) but I am having some problems.

Here are the config parameters of my application:
«
val sparkConf = new SparkConf()

sparkConf.setMaster("yarn-client")
sparkConf.set("spark.yarn.am.memory", "2g")
sparkConf.set("spark.executor.instances", "2")

sparkConf.setAppName("Benchmark")
sparkConf.setJars(Array("target/scala-2.10/benchmark-app_2.10-0.1-SNAPSHOT.jar"))
sparkConf.set("spark.executor.memory", "4g")
sparkConf.set("spark.serializer", "org.apache.spark.serializer.KryoSerializer")
sparkConf.set("spark.executor.extraJavaOptions", " -XX:+UseCompressedOops -XX:+UseConcMarkSweepGC " +
  "-XX:+AggressiveOpts -XX:FreqInlineSize=300 -XX:MaxInlineSize=300 ")
if (sparkConf.getOption("spark.master") == None) {
  sparkConf.setMaster("local[*]")
}
»

The jar I'm including there only contains the application classes.

Here is the log of the application: http://pastebin.com/7RSktezA

Here is the userlog on hadoop/YARN:
«
Exception in thread "main" java.lang.NoClassDefFoundError: org/apache/spark/Logging
    at java.lang.ClassLoader.defineClass1(Native Method)
    at java.lang.ClassLoader.defineClass(ClassLoader.java:800)
    at java.security.SecureClassLoader.defineClass(SecureClassLoader.java:142)
    at java.net.URLClassLoader.defineClass(URLClassLoader.java:449)
    at java.net.URLClassLoader.access$100(URLClassLoader.java:71)
    at java.net.URLClassLoader$1.run(URLClassLoader.java:361)
    at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
    at java.security.AccessController.doPrivileged(Native Method)
    at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:425)
    at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:308)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:358)
    at org.apache.spark.deploy.yarn.ExecutorLauncher$.main(ApplicationMaster.scala:596)
    at org.apache.spark.deploy.yarn.ExecutorLauncher.main(ApplicationMaster.scala)
Caused by: java.lang.ClassNotFoundException: org.apache.spark.Logging
    at java.net.URLClassLoader$1.run(URLClassLoader.java:366)
    at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
    at java.security.AccessController.doPrivileged(Native Method)
    at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:425)
    at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:308)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:358)
    ... 14 more
»

I tried to add the spark core jar to ${HADOOP_HOME}/lib but the error persists. Am I doing something wrong?

Thanks.
Re: How to run spark streaming application on YARN?
Hi Saiph,

Are you launching using spark-submit?

-Sandy
Re: How to run spark streaming application on YARN?
No, I am not. I run it with sbt: «sbt "run-main Branchmark"». I thought it was the same thing, since I am passing all the configurations through the application code. Is that the problem?
Re: How to run spark streaming application on YARN?
spark-submit is the recommended way of launching Spark applications on YARN, because it takes care of submitting the right jars as well as setting up the classpath and environment variables appropriately.

-Sandy
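For reference, a minimal yarn-client submission in the spirit of the configuration from the original message might look roughly like the sketch below (the flags assume a Spark 1.x spark-submit; the main class name Benchmark and the jar path are assumptions taken from the thread):
«
# main class and jar path are assumptions based on the original message
./bin/spark-submit \
  --master yarn-client \
  --class Benchmark \
  --num-executors 2 \
  --executor-memory 4g \
  --conf spark.yarn.am.memory=2g \
  --conf spark.serializer=org.apache.spark.serializer.KryoSerializer \
  target/scala-2.10/benchmark-app_2.10-0.1-SNAPSHOT.jar
»
With spark-submit providing the master, jars, and resources, the corresponding setMaster/setJars calls can usually be dropped from the SparkConf in the application code.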
Re: How to run spark streaming application on YARN?
Thanks! It is working fine now with spark-submit. Just out of curiosity, how would you use org.apache.spark.deploy.yarn.Client? Adding that spark_yarn jar to the configuration inside the application?

On Thu, Jun 4, 2015 at 6:37 PM, Vova Shelgunov wrote:
> You should run it with spark-submit or using org.apache.spark.deploy.yarn.Client.
Re: How to run spark streaming application on YARN?
That might work, but there might also be other steps that are required.

-Sandy
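For what it's worth, in Spark 1.x org.apache.spark.deploy.yarn.Client is essentially what spark-submit drives when an application is submitted in yarn-cluster mode, so the supported way to go through it is still spark-submit rather than wiring it into the application code. A rough cluster-mode equivalent of the earlier sketch (same assumed class and jar names) would be:
«
# yarn-cluster mode: the driver runs inside the YARN ApplicationMaster
./bin/spark-submit \
  --master yarn-cluster \
  --class Benchmark \
  --num-executors 2 \
  --executor-memory 4g \
  target/scala-2.10/benchmark-app_2.10-0.1-SNAPSHOT.jar
»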
Re: How to run spark streaming application on YARN?
Additionally, I think this document ( https://spark.apache.org/docs/latest/building-spark.html ) should mention that protobuf.version might need to be changed to match the one used by the chosen hadoop version. For instance, with hadoop 2.7.0 I had to change protobuf.version to 2.5.0 to be able to run my application.
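A build invocation along those lines (a sketch, assuming a Maven build of a Spark 1.x source tree; the exact Hadoop profile to pair with 2.7.0 may differ by Spark version) would be:
«
# override hadoop.version and protobuf.version to match the target cluster
mvn -Pyarn -Phadoop-2.4 -Dhadoop.version=2.7.0 -Dprotobuf.version=2.5.0 -DskipTests clean package
»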
Re: How to run spark streaming application on YARN?
I was able to run my application by just using a hadoop/YARN cluster with 1 machine. Today I tried to extend the cluster to use one more machine, but I got some problems on the YARN node manager of the newly added machine:

Node Manager Log:
«
2015-06-06 01:41:33,379 INFO org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor: Initializing user myuser
2015-06-06 01:41:33,382 INFO org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor: Copying from /tmp/hadoop-myuser/nm-local-dir/nmPrivate/container_1433549642381_0004_01_03.tokens to /tmp/hadoop-myuser/nm-local-dir/usercache/myuser/appcache/application_1433549642381_0004/container_1433549642381_0004_01_03.tokens
2015-06-06 01:41:33,382 INFO org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor: Localizer CWD set to /tmp/hadoop-myuser/nm-local-dir/usercache/myuser/appcache/application_1433549642381_0004 = file:/tmp/hadoop-myuser/nm-local-dir/usercache/myuser/appcache/application_1433549642381_0004
2015-06-06 01:41:33,405 WARN org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.ResourceLocalizationService: { file:/home/myuser/my-spark/assembly/target/scala-2.10/spark-assembly-1.3.2-SNAPSHOT-hadoop2.7.0.jar, 1433441011000, FILE, null } failed: Resource file:/home/myuser/my-spark/assembly/target/scala-2.10/spark-assembly-1.3.2-SNAPSHOT-hadoop2.7.0.jar changed on src filesystem (expected 1433441011000, was 1433531913000
java.io.IOException: Resource file:/home/myuser/my-spark/assembly/target/scala-2.10/spark-assembly-1.3.2-SNAPSHOT-hadoop2.7.0.jar changed on src filesystem (expected 1433441011000, was 1433531913000
    at org.apache.hadoop.yarn.util.FSDownload.copy(FSDownload.java:255)
    at org.apache.hadoop.yarn.util.FSDownload.access$000(FSDownload.java:63)
    at org.apache.hadoop.yarn.util.FSDownload$2.run(FSDownload.java:361)
    at org.apache.hadoop.yarn.util.FSDownload$2.run(FSDownload.java:359)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:415)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
    at org.apache.hadoop.yarn.util.FSDownload.call(FSDownload.java:358)
    at org.apache.hadoop.yarn.util.FSDownload.call(FSDownload.java:62)
    at java.util.concurrent.FutureTask.run(FutureTask.java:262)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
    at java.util.concurrent.FutureTask.run(FutureTask.java:262)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:745)
2015-06-06 01:41:33,405 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.LocalizedResource: Resource file:/home/myuser/my-spark/assembly/target/scala-2.10/spark-assembly-1.3.2-SNAPSHOT-hadoop2.7.0.jar(->/tmp/hadoop-myuser/nm-local-dir/usercache/myuser/filecache/15/spark-assembly-1.3.2-SNAPSHOT-hadoop2.7.0.jar) transitioned from DOWNLOADING to FAILED
2015-06-06 01:41:33,406 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.container.ContainerImpl: Container container_1433549642381_0004_01_03 transitioned from LOCALIZING to LOCALIZATION_FAILED
2015-06-06 01:41:33,406 INFO org.apache.hadoop.yarn.server.nodemanager.containermanager.localizer.LocalResourcesTrackerImpl: Container container_1433549642381_0004_01_03 sent RELEASE event on a resource request { file:/home/myuser/my-spark/assembly/target/scala-2.10/spark-assembly-1.3.2-SNAPSHOT-hadoop2.7.0.jar, 1433441011000, FILE, null } not present in cache.
2015-06-06 01:41:33,406 WARN org.apache.hadoop.ipc.Client: interrupted waiting to send rpc request to server
»

I have this jar on both machines: /home/myuser/my-spark/assembly/target/scala-2.10/spark-assembly-1.3.2-SNAPSHOT-hadoop2.7.0.jar

However, I simply copied the my-spark folder from machine1 to machine2 so that YARN could find the jar.

Any ideas of what can be wrong? Isn't this the correct way to share spark jars across a YARN cluster?

Thanks.
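One approach that sidesteps the timestamp check on per-machine file:// copies (a sketch, assuming Spark 1.x's spark.yarn.jar property; the HDFS paths below are chosen purely for illustration) is to upload the assembly to HDFS once and point the application at it, so every node localizes the same file:
«
# upload the Spark assembly to a shared HDFS location (paths are illustrative)
hdfs dfs -mkdir -p /user/myuser/spark
hdfs dfs -put /home/myuser/my-spark/assembly/target/scala-2.10/spark-assembly-1.3.2-SNAPSHOT-hadoop2.7.0.jar /user/myuser/spark/

# reference the shared assembly at submission time
./bin/spark-submit \
  --master yarn-client \
  --conf spark.yarn.jar=hdfs:///user/myuser/spark/spark-assembly-1.3.2-SNAPSHOT-hadoop2.7.0.jar \
  --class Benchmark \
  target/scala-2.10/benchmark-app_2.10-0.1-SNAPSHOT.jar
»
The "changed on src filesystem" error comes from YARN comparing the modification timestamp recorded at submission time with the file it finds on each node, so plain local copies would at least need identical timestamps on every machine.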