[Apache Spark][Streaming Job][Checkpoint] Spark job failed on checkpoint recovery with "batch not found" error
Hi,

We have a Spark 2.4 Structured Streaming job that fails on checkpoint recovery every few hours with the following error (from the driver log):

driver spark-kubernetes-driver ERROR 20:38:51 ERROR MicroBatchExecution: Query impressionUpdate [id = 54614900-4145-4d60-8156-9746ffc13d1f, runId = 3637c2f3-49b6-40c2-b6d0-7edb28361c5d] terminated with error
java.lang.IllegalStateException: batch 946 doesn't exist
at org.apache.spark.sql.execution.streaming.MicroBatchExecution$$anonfun$org$apache$spark$sql$execution$streaming$MicroBatchExecution$$constructNextBatch$1$$anonfun$apply$mcZ$sp$3.apply$mcV$sp(MicroBatchExecution.scala:406)
at org.apache.spark.sql.execution.streaming.MicroBatchExecution$$anonfun$org$apache$spark$sql$execution$streaming$MicroBatchExecution$$constructNextBatch$1$$anonfun$apply$mcZ$sp$3.apply(MicroBatchExecution.scala:381)
at org.apache.spark.sql.execution.streaming.MicroBatchExecution$$anonfun$org$apache$spark$sql$execution$streaming$MicroBatchExecution$$constructNextBatch$1$$anonfun$apply$mcZ$sp$3.apply(MicroBatchExecution.scala:381)
at org.apache.spark.sql.execution.streaming.ProgressReporter$class.reportTimeTaken(ProgressReporter.scala:351)
at org.apache.spark.sql.execution.streaming.StreamExecution.reportTimeTaken(StreamExecution.scala:58)
at org.apache.spark.sql.execution.streaming.MicroBatchExecution$$anonfun$org$apache$spark$sql$execution$streaming$MicroBatchExecution$$constructNextBatch$1.apply$mcZ$sp(MicroBatchExecution.scala:381)
at org.apache.spark.sql.execution.streaming.MicroBatchExecution$$anonfun$org$apache$spark$sql$execution$streaming$MicroBatchExecution$$constructNextBatch$1.apply(MicroBatchExecution.scala:337)
at org.apache.spark.sql.execution.streaming.MicroBatchExecution$$anonfun$org$apache$spark$sql$execution$streaming$MicroBatchExecution$$constructNextBatch$1.apply(MicroBatchExecution.scala:337)
at org.apache.spark.sql.execution.streaming.MicroBatchExecution.withProgressLocked(MicroBatchExecution.scala:557)
at org.apache.spark.sql.execution.streaming.MicroBatchExecution.org$apache$spark$sql$execution$streaming$MicroBatchExecution$$constructNextBatch(MicroBatchExecution.scala:337)
at org.apache.spark.sql.execution.streaming.MicroBatchExecution$$anonfun$runActivatedStream$1$$anonfun$apply$mcZ$sp$1.apply$mcV$sp(MicroBatchExecution.scala:183)
at org.apache.spark.sql.execution.streaming.MicroBatchExecution$$anonfun$runActivatedStream$1$$anonfun$apply$mcZ$sp$1.apply(MicroBatchExecution.scala:166)
at org.apache.spark.sql.execution.streaming.MicroBatchExecution$$anonfun$runActivatedStream$1$$anonfun$apply$mcZ$sp$1.apply(MicroBatchExecution.scala:166)
at org.apache.spark.sql.execution.streaming.ProgressReporter$class.reportTimeTaken(ProgressReporter.scala:351)
at org.apache.spark.sql.execution.streaming.StreamExecution.reportTimeTaken(StreamExecution.scala:58)
at org.apache.spark.sql.execution.streaming.MicroBatchExecution$$anonfun$runActivatedStream$1.apply$mcZ$sp(MicroBatchExecution.scala:166)
at org.apache.spark.sql.execution.streaming.ProcessingTimeExecutor.execute(TriggerExecutor.scala:56)
at org.apache.spark.sql.execution.streaming.MicroBatchExecution.runActivatedStream(MicroBatchExecution.scala:160)
at org.apache.spark.sql.execution.streaming.StreamExecution.org$apache$spark$sql$execution$streaming$StreamExecution$$runStream(StreamExecution.scala:281)
at org.apache.spark.sql.execution.streaming.StreamExecution$$anon$1.run(StreamExecution.scala:193)

The executor logs show this error:

ERROR CoarseGrainedExecutorBackend: RECEIVED SIGNAL TERM

How should I fix this?
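For reference, a minimal sketch of the kind of query described above (the job's code is not shown in the post, so the source, sink, paths and trigger interval below are all assumptions; only the checkpointLocation option, which the recovery path in the stack trace reads batch metadata from, is the relevant part). It assumes the spark-sql-kafka-0-10 package is on the classpath.

import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.streaming.Trigger

val spark = SparkSession.builder().appName("impressionUpdate").getOrCreate()

// Assumed Kafka source; brokers and topic are placeholders.
val impressions = spark.readStream
  .format("kafka")
  .option("kafka.bootstrap.servers", "kafka01:9092")
  .option("subscribe", "impressions")
  .load()

// Assumed sink and paths. On driver restart, Spark replays from the offset log
// kept under checkpointLocation; the stack trace above points at
// constructNextBatch reading that log when it reports "batch 946 doesn't exist".
val query = impressions
  .selectExpr("CAST(key AS STRING)", "CAST(value AS STRING)")
  .writeStream
  .format("parquet")
  .option("path", "/data/impressions")
  .option("checkpointLocation", "/checkpoints/impressionUpdate")
  .trigger(Trigger.ProcessingTime("1 minute"))
  .start()

query.awaitTermination()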
Spark Job Failed with FileNotFoundException
I have a Spark cluster consisting of 5 nodes, and a Spark job that should process the files in a directory and send their contents to Kafka. I am trying to submit the job using the following command:

bin$ ./spark-submit --total-executor-cores 20 --executor-memory 5G --class org.css.java.FileMigration.FileSparkMigrator --master spark://spark-master:7077 /home/me/FileMigrator-0.1.1-jar-with-dependencies.jar /home/me/shared kafka01,kafka02,kafka03,kafka04,kafka05 kafka_topic

The directory /home/me/shared is mounted on all 5 nodes, but when I submit the job I get the following exception:

java.io.FileNotFoundException: File file:/home/me/shared/input_1.txt does not exist
at org.apache.hadoop.fs.RawLocalFileSystem.deprecatedGetFileStatus(RawLocalFileSystem.java:534)
at org.apache.hadoop.fs.RawLocalFileSystem.getFileLinkStatusInternal(RawLocalFileSystem.java:747)
at org.apache.hadoop.fs.RawLocalFileSystem.getFileStatus(RawLocalFileSystem.java:524)
at org.apache.hadoop.fs.FilterFileSystem.getFileStatus(FilterFileSystem.java:409)
at org.apache.hadoop.fs.ChecksumFileSystem$ChecksumFSInputChecker.<init>(ChecksumFileSystem.java:140)
at org.apache.hadoop.fs.ChecksumFileSystem.open(ChecksumFileSystem.java:341)
at org.apache.hadoop.fs.FileSystem.open(FileSystem.java:766)
at org.apache.hadoop.mapred.LineRecordReader.<init>(LineRecordReader.java:108)
at org.apache.hadoop.mapred.TextInputFormat.getRecordReader(TextInputFormat.java:67)
at org.apache.spark.rdd.HadoopRDD$$anon$1.<init>(HadoopRDD.scala:239)
at org.apache.spark.rdd.HadoopRDD.compute(HadoopRDD.scala:216)
at org.apache.spark.rdd.HadoopRDD.compute(HadoopRDD.scala:101)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:300)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:264)
at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:300)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:264)
at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:300)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:264)
at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:300)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:264)
at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:300)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:264)
at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:66)
at org.apache.spark.scheduler.Task.run(Task.scala:88)
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:214)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)

After some tries, I noticed another weird behavior. When I submit the job while the directory contains 1 or 2 files, the same exception is thrown on the driver machine but the job continues and the files are processed successfully. Once I add another file, the exception is thrown and the job fails.
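Since the FileSparkMigrator code is not shown, here is a hedged sketch of that kind of job under the assumptions implied by the submit command (input directory as the first argument, broker list second, topic third; the Kafka producer API is assumed, the original may use a different client). The point it illustrates is that a file:/ path is opened by each executor on its own local filesystem, so the mount must be present and fully populated on every worker, not just the driver.

import java.util.Properties
import org.apache.kafka.clients.producer.{KafkaProducer, ProducerRecord}
import org.apache.spark.{SparkConf, SparkContext}

// Hypothetical reconstruction; only the argument order comes from the submit command.
object FileSparkMigrator {
  def main(args: Array[String]): Unit = {
    val Array(inputDir, brokers, topic) = args
    val sc = new SparkContext(new SparkConf().setAppName("FileSparkMigrator"))

    // With file://, every task opens its split on the node it happens to run on,
    // which is why a file that is missing (or mounted late) on one worker surfaces
    // as FileNotFoundException even though the driver can see it.
    val lines = sc.textFile(s"file://$inputDir")

    lines.foreachPartition { partition =>
      // One producer per partition.
      val props = new Properties()
      props.put("bootstrap.servers", brokers)
      props.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer")
      props.put("value.serializer", "org.apache.kafka.common.serialization.StringSerializer")
      val producer = new KafkaProducer[String, String](props)
      partition.foreach(line => producer.send(new ProducerRecord(topic, line)))
      producer.close()
    }
  }
}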
Fwd: Spark job failed
-- Forwarded message --
From: Renu Yadav <yren...@gmail.com>
Date: Mon, Sep 14, 2015 at 4:51 PM
Subject: Spark job failed
To: d...@spark.apache.org

I am getting the below error while running a Spark job:

storage.DiskBlockObjectWriter: Uncaught exception while reverting partial writes to file /data/vol5/hadoop/yarn/local/usercache/renu_yadav/appcache/application_1438196554863_31545/spark-4686a622-82be-418e-a8b0-1653458bc8cb/22/temp_shuffle_8c437ba7-55d2-4520-80ec-adcfe932b3bd
java.io.FileNotFoundException: /data/vol5/hadoop/yarn/local/usercache/renu_yadav/appcache/application_1438196554863_31545/spark-4686a622-82be-418e-a8b0-1653458bc8cb/22/temp_shuffle_8c437ba7-55d2-4520-80ec-adcfe932b3bd (No such file or directory)

I am processing 1.3 TB of data. The transformations are: read from Hadoop -> map to (key, value) -> coalesce(2000) -> groupByKey, then sort the records in each group by server_ts, select the most recent one, and save the data into Parquet.

Following is the command:

spark-submit --class com.test.Myapp --master yarn-cluster --driver-memory 16g --executor-memory 20g --executor-cores 5 --num-executors 150 --files /home/renu_yadav/fmyapp/hive-site.xml --conf spark.yarn.preserve.staging.files=true --conf spark.shuffle.memoryFraction=0.6 --conf spark.storage.memoryFraction=0.1 --conf SPARK_SUBMIT_OPTS="-XX:MaxPermSize=768m" --conf SPARK_SUBMIT_OPTS="-XX:MaxPermSize=768m" --conf spark.akka.timeout=40 --conf spark.locality.wait=10 --conf spark.yarn.executor.memoryOverhead=8000 --conf SPARK_JAVA_OPTS="-verbose:gc -XX:+PrintGCDetails -XX:+PrintGCTimeStamps" --conf spark.reducer.maxMbInFlight=96 --conf spark.shuffle.file.buffer.kb=64 --conf spark.core.connection.ack.wait.timeout=120 --jars /usr/hdp/2.2.6.0-2800/hive/lib/datanucleus-api-jdo-3.2.6.jar,/usr/hdp/2.2.6.0-2800/hive/lib/datanucleus-core-3.2.10.jar,/usr/hdp/2.2.6.0-2800/hive/lib/datanucleus-rdbms-3.2.9.jar myapp_2.10-1.0.jar

Cluster configuration:
20 nodes
32 cores per node
125 GB RAM per node

Please help.

Thanks & Regards,
Renu Yadav
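For readers, this is roughly the pipeline the post describes, written out as a spark-shell style sketch. The record layout, key field and delimiter are assumptions (the post does not show the code), and the DataFrame Parquet writer assumes Spark 1.4 or later; only the shape (map to key/value, coalesce(2000), groupByKey, keep the latest record per key by server_ts, write Parquet) follows the description.

import org.apache.spark.sql.SQLContext

val sqlContext = new SQLContext(sc)
import sqlContext.implicits._

// Hypothetical record; the real schema is not shown in the post.
case class Event(key: String, serverTs: Long, payload: String)

def parseEvent(line: String): Event = {
  val parts = line.split('\t')        // assumed delimiter
  Event(parts(0), parts(1).toLong, parts(2))
}

val latest = sc.textFile("hdfs:///path/to/input")    // placeholder path
  .map(parseEvent)
  .map(e => (e.key, e))
  .coalesce(2000)
  .groupByKey()                                      // as described; pulls each key's records into memory
  .mapValues(_.maxBy(_.serverTs))                    // keep only the most recent record per key
  .values

latest.toDF().write.parquet("hdfs:///path/to/output")  // placeholder path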
Re: Spark job failed
Have you considered posting on a vendor forum?

FYI

On Mon, Sep 14, 2015 at 6:09 AM, Renu Yadav <yren...@gmail.com> wrote:
> -- Forwarded message --
> From: Renu Yadav <yren...@gmail.com>
> Date: Mon, Sep 14, 2015 at 4:51 PM
> Subject: Spark job failed
> To: d...@spark.apache.org
>
> I am getting the below error while running a Spark job: [...]
Re: Spark Job Failed (Executor Lost then FS closed)
Can you look more into the worker logs and see what's going on? It looks like a memory issue (GC overhead, etc.); you need to look in the worker logs.

Thanks
Best Regards

On Fri, Aug 7, 2015 at 3:21 AM, ÐΞ€ρ@Ҝ (๏̯͡๏) <deepuj...@gmail.com> wrote:
> Re-attaching the images.
>
> On Thu, Aug 6, 2015 at 2:50 PM, ÐΞ€ρ@Ҝ (๏̯͡๏) <deepuj...@gmail.com> wrote:
>> Code: [...]
Spark Job Failed (Executor Lost then FS closed)
Code:

import java.text.SimpleDateFormat
import java.util.Calendar
import java.sql.Date
import org.apache.spark.storage.StorageLevel

def extract(array: Array[String], index: Integer) = {
  if (index < array.length) {
    array(index).replaceAll("\"", "")
  } else {
    ""
  }
}

case class GuidSess(
  guid: String,
  sessionKey: String,
  sessionStartDate: String,
  siteId: String,
  eventCount: String,
  browser: String,
  browserVersion: String,
  operatingSystem: String,
  experimentChannel: String,
  deviceName: String)

val rowStructText = sc.textFile("/user/zeppelin/guidsess/2015/08/05/part-m-1.gz")
val guidSessRDD = rowStructText.filter(s => s.length != 1).map(s => s.split(",")).map({ s =>
  GuidSess(extract(s, 0), extract(s, 1), extract(s, 2), extract(s, 3), extract(s, 4),
    extract(s, 5), extract(s, 6), extract(s, 7), extract(s, 8), extract(s, 9))
})
val guidSessDF = guidSessRDD.toDF()
guidSessDF.registerTempTable("guidsess")

Once the temp table is created, I wrote this query:

select siteid, count(distinct guid) total_visitor, count(sessionKey) as total_visits from guidsess group by siteid

Metrics:
Data size: 170 MB
Spark version: 1.3.1
YARN: 2.7.x

Timeline: there is 1 job, 2 stages with 1 task each.

1st stage: mapPartitions
[image: Inline image 1]

1st stage: Task 1 started to fail. A second attempt started for the 1st task of the first stage. The first attempt failed with "Executor LOST". When I go to the YARN ResourceManager and look at that particular host, I see that it is running fine.

Attempt #1
[image: Inline image 2]

Attempt #2: Executor LOST again
[image: Inline image 3]

Attempt #3
[image: Inline image 4]

2nd stage runJob: SKIPPED
[image: Inline image 5]

Any suggestions?

-- Deepak
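For completeness, this is how the aggregation quoted above would typically be executed against the registered temp table in Spark 1.3 (sqlContext here is assumed to be the SQLContext/HiveContext already in scope in the notebook):

// Runs the same query shown above against the "guidsess" temp table.
val visits = sqlContext.sql("""
  SELECT siteId,
         COUNT(DISTINCT guid) AS total_visitor,
         COUNT(sessionKey)    AS total_visits
  FROM guidsess
  GROUP BY siteId
""")
visits.show()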
Re: Spark Job Failed - Class not serializable
This thread might give you some insights: http://mail-archives.apache.org/mod_mbox/incubator-spark-user/201311.mbox/%3CCA+WVT8WXbEHac=N0GWxj-s9gqOkgG0VRL5B=ovjwexqm8ev...@mail.gmail.com%3E

Thanks
Best Regards

On Fri, Apr 3, 2015 at 3:53 PM, ÐΞ€ρ@Ҝ (๏̯͡๏) <deepuj...@gmail.com> wrote:
> My Spark Job failed with: Task 0.0 in stage 2.0 (TID 0) had a not serializable result: com.ebay.ep.poc.spark.reporting.process.model.dw.SpsLevelMetricSum [...]
>
> Can anyone suggest how to fix this?
>
> -- Deepak
Re: Spark Job Failed - Class not serializable
I was able to write a record that extends SpecificRecord (Avro); that class was not auto-generated. Do we need to do something extra for auto-generated classes?

Sent from my iPhone

On 03-Apr-2015, at 5:06 pm, Akhil Das <ak...@sigmoidanalytics.com> wrote:
> This thread might give you some insights: http://mail-archives.apache.org/mod_mbox/incubator-spark-user/201311.mbox/%3CCA+WVT8WXbEHac=N0GWxj-s9gqOkgG0VRL5B=ovjwexqm8ev...@mail.gmail.com%3E [...]
Re: Spark Job Failed - Class not serializable
I meant that I did not have to use Kryo. Why will Kryo help fix this issue now?

Sent from my iPhone

On 03-Apr-2015, at 5:36 pm, Deepak Jain <deepuj...@gmail.com> wrote:
> I was able to write a record that extends SpecificRecord (Avro); that class was not auto-generated. Do we need to do something extra for auto-generated classes? [...]
Re: Spark Job Failed - Class not serializable
Because it's throwing serialization exceptions, and Kryo is a serializer to serialize your objects.

Thanks
Best Regards

On Fri, Apr 3, 2015 at 5:37 PM, Deepak Jain <deepuj...@gmail.com> wrote:
> I meant that I did not have to use Kryo. Why will Kryo help fix this issue now? [...]
Spark Job Failed - Class not serializable
My Spark Job failed with:

15/04/03 03:15:36 INFO scheduler.DAGScheduler: Job 0 failed: saveAsNewAPIHadoopFile at AbstractInputHelper.scala:103, took 2.480175 s
15/04/03 03:15:36 ERROR yarn.ApplicationMaster: User class threw exception: Job aborted due to stage failure: Task 0.0 in stage 2.0 (TID 0) had a not serializable result: com.ebay.ep.poc.spark.reporting.process.model.dw.SpsLevelMetricSum
Serialization stack:
- object not serializable (class: com.ebay.ep.poc.spark.reporting.process.model.dw.SpsLevelMetricSum, value: {userId: 0, spsPrgrmId: 0, spsSlrLevelCd: 0, spsSlrLevelSumStartDt: null, spsSlrLevelSumEndDt: null, currPsLvlId: null})
- field (class: scala.Tuple2, name: _2, type: class java.lang.Object)
- object (class scala.Tuple2, (0,{userId: 0, spsPrgrmId: 0, spsSlrLevelCd: 0, spsSlrLevelSumStartDt: null, spsSlrLevelSumEndDt: null, currPsLvlId: null}))
org.apache.spark.SparkException: Job aborted due to stage failure: Task 0.0 in stage 2.0 (TID 0) had a not serializable result: com.ebay.ep.poc.spark.reporting.process.model.dw.SpsLevelMetricSum
Serialization stack:
- object not serializable (class: com.ebay.ep.poc.spark.reporting.process.model.dw.SpsLevelMetricSum, value: {userId: 0, spsPrgrmId: 0, spsSlrLevelCd: 0, spsSlrLevelSumStartDt: null, spsSlrLevelSumEndDt: null, currPsLvlId: null})
- field (class: scala.Tuple2, name: _2, type: class java.lang.Object)
- object (class scala.Tuple2, (0,{userId: 0, spsPrgrmId: 0, spsSlrLevelCd: 0, spsSlrLevelSumStartDt: null, spsSlrLevelSumEndDt: null, currPsLvlId: null}))
at org.apache.spark.scheduler.DAGScheduler.org$apache$spark$scheduler$DAGScheduler$$failJobAndIndependentStages(DAGScheduler.scala:1203)
at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1192)
at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1191)
at scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59)
at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:47)
at org.apache.spark.scheduler.DAGScheduler.abortStage(DAGScheduler.scala:1191)
at org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:693)
at org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:693)
at scala.Option.foreach(Option.scala:236)
at org.apache.spark.scheduler.DAGScheduler.handleTaskSetFailed(DAGScheduler.scala:693)
at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:1393)
at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:1354)
at org.apache.spark.util.EventLoop$$anon$1.run(EventLoop.scala:48)
15/04/03 03:15:36 INFO yarn.ApplicationMaster: Final app status: FAILED, exitCode: 15, (reason: User class threw exception: Job aborted due to stage failure: Task 0.0 in stage 2.0 (TID 0) had a not serializable result: com.ebay.ep.poc.spark.reporting.process.model.dw.SpsLevelMetricSum
Serialization stack:

com.ebay.ep.poc.spark.reporting.process.model.dw.SpsLevelMetricSum is auto-generated from an Avro schema using the Avro generate-sources Maven plugin:

package com.ebay.ep.poc.spark.reporting.process.model.dw;

@SuppressWarnings("all")
@org.apache.avro.specific.AvroGenerated
public class SpsLevelMetricSum extends org.apache.avro.specific.SpecificRecordBase implements org.apache.avro.specific.SpecificRecord {
  ...
  ...
}

Can anyone suggest how to fix this?

-- Deepak
Re: Spark Job Failed - Class not serializable
You'll definitely want to use a Kryo-based serializer for Avro. We have a Kryo-based serializer that wraps the Avro efficient serializer here.

Frank Austin Nothaft
fnoth...@berkeley.edu
fnoth...@eecs.berkeley.edu
202-340-0466

On Apr 3, 2015, at 5:41 AM, Akhil Das <ak...@sigmoidanalytics.com> wrote:
> Because it's throwing serialization exceptions, and Kryo is a serializer to serialize your objects. [...]
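A minimal sketch of the Kryo route suggested above, assuming the job builds its own SparkConf: switch spark.serializer to Kryo and register the generated Avro class. This is only the generic Kryo setup; the Avro-wrapping Kryo serializer Frank mentions is a separate library and is not reproduced here, and whether plain Kryo round-trips a SpecificRecord correctly depends on the class.

import org.apache.spark.{SparkConf, SparkContext}
import com.ebay.ep.poc.spark.reporting.process.model.dw.SpsLevelMetricSum

// App name below is assumed; the class registration mirrors the thread above.
val conf = new SparkConf()
  .setAppName("SpsLevelMetricReport")
  .set("spark.serializer", "org.apache.spark.serializer.KryoSerializer")
  .registerKryoClasses(Array(classOf[SpsLevelMetricSum]))

val sc = new SparkContext(conf)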