I'm checking the logs in YARN and I found this error as well Application application_1434976209271_15614 failed 2 times due to AM Container for appattempt_1434976209271_15614_000002 exited with exitCode: 255
Diagnostics: Exception from container-launch. Container id: container_1434976209271_15614_02_000001 Exit code: 255 Stack trace: ExitCodeException exitCode=255: at org.apache.hadoop.util.Shell.runCommand(Shell.java:538) at org.apache.hadoop.util.Shell.run(Shell.java:455) at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:715) at org.apache.hadoop.yarn.server.nodemanager.LinuxContainerExecutor.launchContainer(LinuxContainerExecutor.java:293) at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:302) at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:82) at java.util.concurrent.FutureTask.run(FutureTask.java:262) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) at java.lang.Thread.run(Thread.java:745) Shell output: Requested user hdfs is not whitelisted and has id 496,which is below the minimum allowed 1000 Container exited with a non-zero exit code 255 Failing this attempt. Failing the application. 2015-06-27 11:25 GMT+02:00 Guillermo Ortiz <konstt2...@gmail.com>: > Well SPARK_CLASSPATH it's just a random name, the complete script is this: > > export HADOOP_CONF_DIR=/etc/hadoop/conf > > SPARK_CLASSPATH="file:/usr/metrics/conf/elasticSearch.properties,file:/usr/metrics/conf/redis.properties,/etc/spark/conf.cloudera.spark_on_yarn/yarn-conf/" > for lib in `ls /usr/metrics/lib/*.jar` > do > if [ -z "$SPARK_CLASSPATH" ]; then > SPARK_CLASSPATH=$lib > else > SPARK_CLASSPATH=$SPARK_CLASSPATH,$lib > fi > done > spark-submit --name "Metrics".... > > I need to add all the jars as you know,, maybe it was a bad name > SPARK_CLASSPATH > > The code doesn't have any stateful operation, yo I guess that it¡s okay > doesn't have checkpoint. I have executed hundres of times thiscode in VM > from Cloudera and never got this error. > > 2015-06-27 11:21 GMT+02:00 Tathagata Das <t...@databricks.com>: > >> 1. you need checkpointing mostly for recovering from driver failures, and >> in some cases also for some stateful operations. >> >> 2. Could you try not using the SPARK_CLASSPATH environment variable. >> >> TD >> >> On Sat, Jun 27, 2015 at 1:00 AM, Guillermo Ortiz <konstt2...@gmail.com> >> wrote: >> >>> I don't have any checkpoint on my code. Really, I don't have to save any >>> state. It's just a log processing of a PoC. >>> I have been testing the code in a VM from Cloudera and I never got that >>> error.. Not it's a real cluster. >>> >>> The command to execute Spark >>> spark-submit --name "PoC Logs" --master yarn-client --class >>> com.metrics.MetricsSpark --jars $SPARK_CLASSPATH --executor-memory 1g >>> /usr/metrics/ex/metrics-spark.jar $1 $2 $3 >>> >>> val sparkConf = new SparkConf() >>> val ssc = new StreamingContext(sparkConf, Seconds(5)) >>> val kafkaParams = Map[String, String]("metadata.broker.list" -> >>> args(0)) >>> val topics = args(1).split("\\,") >>> val directKafkaStream = KafkaUtils.createDirectStream[String, >>> String, StringDecoder, StringDecoder](ssc, kafkaParams, topics.toSet) >>> >>> directKafkaStream.foreachRDD { rdd => >>> val offsets = rdd.asInstanceOf[HasOffsetRanges].offsetRanges >>> val documents = rdd.mapPartitionsWithIndex { (i, kafkaEvent) => >>> ..... >>> } >>> >>> I understand that I just need a checkpoint if I need to recover the task >>> it something goes wrong, right? >>> >>> >>> 2015-06-27 9:39 GMT+02:00 Tathagata Das <t...@databricks.com>: >>> >>>> How are you trying to execute the code again? From checkpoints, or >>>> otherwise? >>>> Also cc'ed Hari who may have a better idea of YARN related issues. >>>> >>>> On Sat, Jun 27, 2015 at 12:35 AM, Guillermo Ortiz <konstt2...@gmail.com >>>> > wrote: >>>> >>>>> Hi, >>>>> >>>>> I'm executing a SparkStreamig code with Kafka. IçThe code was working >>>>> but today I tried to execute the code again and I got an exception, I dn't >>>>> know what's it happening. right now , there are no jobs executions on >>>>> YARN. >>>>> How could it fix it? >>>>> >>>>> Exception in thread "main" org.apache.spark.SparkException: Yarn >>>>> application has already ended! It might have been killed or unable to >>>>> launch application master. >>>>> at >>>>> org.apache.spark.scheduler.cluster.YarnClientSchedulerBackend.waitForApplication(YarnClientSchedulerBackend.scala:113) >>>>> at >>>>> org.apache.spark.scheduler.cluster.YarnClientSchedulerBackend.start(YarnClientSchedulerBackend.scala:59) >>>>> at >>>>> org.apache.spark.scheduler.TaskSchedulerImpl.start(TaskSchedulerImpl.scala:141) >>>>> at org.apache.spark.SparkContext.<init>(SparkContext.scala:379) >>>>> at >>>>> org.apache.spark.streaming.StreamingContext$.createNewSparkContext(StreamingContext.scala:642) >>>>> at >>>>> org.apache.spark.streaming.StreamingContext.<init>(StreamingContext.scala:75) >>>>> at >>>>> com.produban.metrics.MetricsTransfInternationalSpark$.main(MetricsTransfInternationalSpark.scala:66) >>>>> at >>>>> com.produban.metrics.MetricsTransfInternationalSpark.main(MetricsTransfInternationalSpark.scala) >>>>> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) >>>>> at >>>>> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) >>>>> at >>>>> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) >>>>> at java.lang.reflect.Method.invoke(Method.java:606) >>>>> at >>>>> org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:569) >>>>> at >>>>> org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:166) >>>>> at >>>>> org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:189) >>>>> at >>>>> org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:110) >>>>> at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala) >>>>> *15/06/27 09:27:09 ERROR Utils: Uncaught exception in thread delete >>>>> Spark local dirs* >>>>> java.lang.NullPointerException >>>>> at org.apache.spark.storage.DiskBlockManager.org >>>>> $apache$spark$storage$DiskBlockManager$$doStop(DiskBlockManager.scala:161) >>>>> at >>>>> org.apache.spark.storage.DiskBlockManager$$anon$1$$anonfun$run$1.apply$mcV$sp(DiskBlockManager.scala:141) >>>>> at >>>>> org.apache.spark.storage.DiskBlockManager$$anon$1$$anonfun$run$1.apply(DiskBlockManager.scala:139) >>>>> at >>>>> org.apache.spark.storage.DiskBlockManager$$anon$1$$anonfun$run$1.apply(DiskBlockManager.scala:139) >>>>> at >>>>> org.apache.spark.util.Utils$.logUncaughtExceptions(Utils.scala:1617) >>>>> at >>>>> org.apache.spark.storage.DiskBlockManager$$anon$1.run(DiskBlockManager.scala:139) >>>>> Exception in thread "delete Spark local dirs" >>>>> java.lang.NullPointerException >>>>> at org.apache.spark.storage.DiskBlockManager.org >>>>> $apache$spark$storage$DiskBlockManager$$doStop(DiskBlockManager.scala:161) >>>>> at >>>>> org.apache.spark.storage.DiskBlockManager$$anon$1$$anonfun$run$1.apply$mcV$sp(DiskBlockManager.scala:141) >>>>> at >>>>> org.apache.spark.storage.DiskBlockManager$$anon$1$$anonfun$run$1.apply(DiskBlockManager.scala:139) >>>>> at >>>>> org.apache.spark.storage.DiskBlockManager$$anon$1$$anonfun$run$1.apply(DiskBlockManager.scala:139) >>>>> at >>>>> org.apache.spark.util.Utils$.logUncaughtExceptions(Utils.scala:1617) >>>>> at >>>>> org.apache.spark.storage.DiskBlockManager$$anon$1.run(DiskBlockManager.scala:139) >>>>> >>>>> >>>>> >>>>> >>>> >>> >> >