Re: Spark job failed

2015-09-14 Thread Ted Yu
Have you considered posting on the vendor forum?

FYI

On Mon, Sep 14, 2015 at 6:09 AM, Renu Yadav  wrote:

>
> -- Forwarded message --
> From: Renu Yadav 
> Date: Mon, Sep 14, 2015 at 4:51 PM
> Subject: Spark job failed
> To: d...@spark.apache.org
>
>
> I am getting the below error while running a Spark job:
>
> storage.DiskBlockObjectWriter: Uncaught exception while reverting partial
> writes to file
> /data/vol5/hadoop/yarn/local/usercache/renu_yadav/appcache/application_1438196554863_31545/spark-4686a622-82be-418e-a8b0-1653458bc8cb/22/temp_shuffle_8c437ba7-55d2-4520-80ec-adcfe932b3bd
> java.io.FileNotFoundException:
> /data/vol5/hadoop/yarn/local/usercache/renu_yadav/appcache/application_1438196554863_31545/spark-4686a622-82be-418e-a8b0-1653458bc8cb/22/temp_shuffle_8c437ba7-55d2-4520-80ec-adcfe932b3bd
> (No such file or directory)
>
>
>
> I am running on 1.3 TB of data.
> The transformations are as follows:
>
> read from Hadoop -> map(key/value).coalesce(2000).groupByKey,
> then sort the records for each key by server_ts and select the most recent,
>
> and save the result as Parquet.
>
>
> Following is the command
> spark-submit --class com.test.Myapp --master yarn-cluster  --driver-memory
> 16g  --executor-memory 20g --executor-cores 5   --num-executors 150
> --files /home/renu_yadav/fmyapp/hive-site.xml --conf
> spark.yarn.preserve.staging.files=true --conf
> spark.shuffle.memoryFraction=0.6  --conf spark.storage.memoryFraction=0.1
> --conf SPARK_SUBMIT_OPTS="-XX:MaxPermSize=768m"  --conf
> SPARK_SUBMIT_OPTS="-XX:MaxPermSize=768m"   --conf
> spark.akka.timeout=40  --conf spark.locality.wait=10 --conf
> spark.yarn.executor.memoryOverhead=8000   --conf
> SPARK_JAVA_OPTS="-verbose:gc -XX:+PrintGCDetails -XX:+PrintGCTimeStamps"
> --conf spark.reducer.maxMbInFlight=96  --conf
> spark.shuffle.file.buffer.kb=64 --conf
> spark.core.connection.ack.wait.timeout=120  --jars
> /usr/hdp/2.2.6.0-2800/hive/lib/datanucleus-api-jdo-3.2.6.jar,/usr/hdp/2.2.6.0-2800/hive/lib/datanucleus-core-3.2.10.jar,/usr/hdp/2.2.6.0-2800/hive/lib/datanucleus-rdbms-3.2.9.jar
> myapp_2.10-1.0.jar
>
>
>
>
>
>
>
> Cluster configuration
>
> 20 Nodes
> 32 cores per node
> 125 GB ram per node
>
>
> Please Help.
>
> Thanks & Regards,
> Renu Yadav
>
>
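
A minimal Scala sketch of the pipeline Renu describes (map to key/value,
coalesce(2000), pick the most recent record per key by server_ts, save as
Parquet). This is not the original job: the input path and field positions are
hypothetical, and it uses reduceByKey, which combines map-side and keeps only
the latest record per key, instead of groupByKey, which has to shuffle and hold
every record for a key.

    import org.apache.spark.{SparkConf, SparkContext}
    import org.apache.spark.sql.SQLContext

    object LatestPerKey {
      def main(args: Array[String]): Unit = {
        val sc = new SparkContext(new SparkConf().setAppName("latest-per-key"))
        val sqlContext = new SQLContext(sc)
        import sqlContext.implicits._

        // Hypothetical layout: column 0 is the key, column 3 is server_ts.
        val keyed = sc.textFile("hdfs:///path/to/input")
          .map(_.split(","))
          .map(f => (f(0), (f(3).toLong, f.mkString(","))))
          .coalesce(2000)

        // Keep only the most recent row per key (combined on the map side).
        val latest = keyed.reduceByKey((a, b) => if (a._1 >= b._1) a else b)

        latest.map { case (key, (ts, row)) => (key, ts, row) }
              .toDF("key", "server_ts", "row")
              .saveAsParquetFile("hdfs:///path/to/output")
      }
    }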


Re: Spark Job Failed (Executor Lost then FS closed)

2015-08-09 Thread Akhil Das
Can you look in the worker logs and see what's going on? It looks like
a memory issue (GC overhead or similar; the worker logs should show it).

Thanks
Best Regards
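
A minimal sketch, assuming a YARN deployment, of settings that surface that GC
behaviour in the executor (worker) logs and give the executors extra headroom.
The values are illustrative, not tuned recommendations:

    import org.apache.spark.SparkConf

    val conf = new SparkConf()
      // GC details end up in each executor's stderr log
      .set("spark.executor.extraJavaOptions",
           "-verbose:gc -XX:+PrintGCDetails -XX:+PrintGCTimeStamps")
      // extra off-heap headroom for YARN containers, in MB
      .set("spark.yarn.executor.memoryOverhead", "1024")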

On Fri, Aug 7, 2015 at 3:21 AM, ÐΞ€ρ@Ҝ (๏̯͡๏) deepuj...@gmail.com wrote:

 Re attaching the images.

 On Thu, Aug 6, 2015 at 2:50 PM, ÐΞ€ρ@Ҝ (๏̯͡๏) deepuj...@gmail.com wrote:

 Code:
 import java.text.SimpleDateFormat
 import java.util.Calendar
 import java.sql.Date
 import org.apache.spark.storage.StorageLevel

 def extract(array: Array[String], index: Integer) = {
   if (index < array.length) {
     array(index).replaceAll("\"", "")
   } else {
     ""
   }
 }


 case class GuidSess(
   guid: String,
   sessionKey: String,
   sessionStartDate: String,
   siteId: String,
   eventCount: String,
   browser: String,
   browserVersion: String,
   operatingSystem: String,
   experimentChannel: String,
   deviceName: String)

 val rowStructText =
   sc.textFile("/user/zeppelin/guidsess/2015/08/05/part-m-1.gz")
 val guidSessRDD = rowStructText.filter(s => s.length != 1).map(s =>
   s.split(",")).map(
   {
     s =>
   GuidSess(extract(s, 0),
 extract(s, 1),
 extract(s, 2),
 extract(s, 3),
 extract(s, 4),
 extract(s, 5),
 extract(s, 6),
 extract(s, 7),
 extract(s, 8),
 extract(s, 9))
   })

 val guidSessDF = guidSessRDD.toDF()
 guidSessDF.registerTempTable("guidsess")

 Once the temp table is created, I ran this query:

 select siteid, count(distinct guid) total_visitor,
 count(sessionKey) as total_visits
 from guidsess
 group by siteid
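
For reference, a hedged sketch of the same aggregation through the DataFrame
API (available since Spark 1.3), reusing the guidSessDF defined above; column
names follow the GuidSess case class:

    import org.apache.spark.sql.functions.{col, count, countDistinct}

    val visitorsPerSite = guidSessDF.groupBy("siteId")
      .agg(countDistinct(col("guid")).as("total_visitor"),
           count(col("sessionKey")).as("total_visits"))
    visitorsPerSite.show()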

 *Metrics:*

 Data Size: 170 MB
 Spark Version: 1.3.1
 YARN: 2.7.x



 Timeline:
 There is 1 Job, 2 stages with 1 task each.

 *1st Stage : mapPartitions*

 1st Stage: Task 1 started to fail, and a second attempt was started for the
 first task of the first stage. The first attempt failed with Executor LOST.
 When I go to the YARN ResourceManager and look at that particular host, I see
 that it is running fine.

 *Attempt #1*

 *Attempt #2* Executor LOST AGAIN
 *Attempt 34*




 *2nd Stage runJob : SKIPPED*


 Any suggestions ?


 --
 Deepak




 --
 Deepak






Re: Spark Job Failed - Class not serializable

2015-04-03 Thread Akhil Das
This thread might give you some insights
http://mail-archives.apache.org/mod_mbox/incubator-spark-user/201311.mbox/%3CCA+WVT8WXbEHac=N0GWxj-s9gqOkgG0VRL5B=ovjwexqm8ev...@mail.gmail.com%3E

Thanks
Best Regards

On Fri, Apr 3, 2015 at 3:53 PM, ÐΞ€ρ@Ҝ (๏̯͡๏) deepuj...@gmail.com wrote:

 My Spark Job failed with


 15/04/03 03:15:36 INFO scheduler.DAGScheduler: Job 0 failed:
 saveAsNewAPIHadoopFile at AbstractInputHelper.scala:103, took 2.480175 s
 15/04/03 03:15:36 ERROR yarn.ApplicationMaster: User class threw
 exception: Job aborted due to stage failure: Task 0.0 in stage 2.0 (TID 0)
 had a not serializable result:
 com.ebay.ep.poc.spark.reporting.process.model.dw.SpsLevelMetricSum
 Serialization stack:
 - object not serializable (class:
 com.ebay.ep.poc.spark.reporting.process.model.dw.SpsLevelMetricSum, value:
 {userId: 0, spsPrgrmId: 0, spsSlrLevelCd: 0, spsSlrLevelSumStartDt:
 null, spsSlrLevelSumEndDt: null, currPsLvlId: null})
 - field (class: scala.Tuple2, name: _2, type: class java.lang.Object)
 - object (class scala.Tuple2, (0,{userId: 0, spsPrgrmId: 0,
 spsSlrLevelCd: 0, spsSlrLevelSumStartDt: null, spsSlrLevelSumEndDt:
 null, currPsLvlId: null}))
 org.apache.spark.SparkException: Job aborted due to stage failure: Task
 0.0 in stage 2.0 (TID 0) had a not serializable result:
 com.ebay.ep.poc.spark.reporting.process.model.dw.SpsLevelMetricSum
 Serialization stack:
 - object not serializable (class:
 com.ebay.ep.poc.spark.reporting.process.model.dw.SpsLevelMetricSum, value:
 {userId: 0, spsPrgrmId: 0, spsSlrLevelCd: 0, spsSlrLevelSumStartDt:
 null, spsSlrLevelSumEndDt: null, currPsLvlId: null})
 - field (class: scala.Tuple2, name: _2, type: class java.lang.Object)
 - object (class scala.Tuple2, (0,{userId: 0, spsPrgrmId: 0,
 spsSlrLevelCd: 0, spsSlrLevelSumStartDt: null, spsSlrLevelSumEndDt:
 null, currPsLvlId: null}))
 at org.apache.spark.scheduler.DAGScheduler.org
 $apache$spark$scheduler$DAGScheduler$$failJobAndIndependentStages(DAGScheduler.scala:1203)
 at
 org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1192)
 at
 org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1191)
 at
 scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59)
 at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:47)
 at
 org.apache.spark.scheduler.DAGScheduler.abortStage(DAGScheduler.scala:1191)
 at
 org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:693)
 at
 org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:693)
 at scala.Option.foreach(Option.scala:236)
 at
 org.apache.spark.scheduler.DAGScheduler.handleTaskSetFailed(DAGScheduler.scala:693)
 at
 org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:1393)
 at
 org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:1354)
 at org.apache.spark.util.EventLoop$$anon$1.run(EventLoop.scala:48)
 15/04/03 03:15:36 INFO yarn.ApplicationMaster: Final app status: FAILED,
 exitCode: 15, (reason: User class threw exception: Job aborted due to stage
 failure: Task 0.0 in stage 2.0 (TID 0) had a not serializable result:
 com.ebay.ep.poc.spark.reporting.process.model.dw.SpsLevelMetricSum
 Serialization stack:

 

 com.ebay.ep.poc.spark.reporting.process.model.dw.SpsLevelMetricSum is
 auto-generated from an Avro schema using the avro-generate-sources Maven plugin.


 package com.ebay.ep.poc.spark.reporting.process.model.dw;

 @SuppressWarnings("all")

 @org.apache.avro.specific.AvroGenerated

 public class SpsLevelMetricSum extends
 org.apache.avro.specific.SpecificRecordBase implements
 org.apache.avro.specific.SpecificRecord {
 ...
 ...
 }

 Can anyone suggest how to fix this ?



 --
 Deepak




Re: Spark Job Failed - Class not serializable

2015-04-03 Thread Deepak Jain
I was able to write a record that extends SpecificRecord (Avro); that class was
not auto-generated. Do we need to do something extra for auto-generated classes?

Sent from my iPhone

 On 03-Apr-2015, at 5:06 pm, Akhil Das ak...@sigmoidanalytics.com wrote:
 
 This thread might give you some insights 
 http://mail-archives.apache.org/mod_mbox/incubator-spark-user/201311.mbox/%3CCA+WVT8WXbEHac=N0GWxj-s9gqOkgG0VRL5B=ovjwexqm8ev...@mail.gmail.com%3E
 
 Thanks
 Best Regards
 
 On Fri, Apr 3, 2015 at 3:53 PM, ÐΞ€ρ@Ҝ (๏̯͡๏) deepuj...@gmail.com wrote:
 My Spark Job failed with
 
 
 15/04/03 03:15:36 INFO scheduler.DAGScheduler: Job 0 failed: 
 saveAsNewAPIHadoopFile at AbstractInputHelper.scala:103, took 2.480175 s
 15/04/03 03:15:36 ERROR yarn.ApplicationMaster: User class threw exception: 
 Job aborted due to stage failure: Task 0.0 in stage 2.0 (TID 0) had a not 
 serializable result: 
 com.ebay.ep.poc.spark.reporting.process.model.dw.SpsLevelMetricSum
 Serialization stack:
  - object not serializable (class: 
 com.ebay.ep.poc.spark.reporting.process.model.dw.SpsLevelMetricSum, value: 
 {userId: 0, spsPrgrmId: 0, spsSlrLevelCd: 0, spsSlrLevelSumStartDt: 
 null, spsSlrLevelSumEndDt: null, currPsLvlId: null})
  - field (class: scala.Tuple2, name: _2, type: class java.lang.Object)
  - object (class scala.Tuple2, (0,{userId: 0, spsPrgrmId: 0, 
 spsSlrLevelCd: 0, spsSlrLevelSumStartDt: null, spsSlrLevelSumEndDt: 
 null, currPsLvlId: null}))
 org.apache.spark.SparkException: Job aborted due to stage failure: Task 0.0 
 in stage 2.0 (TID 0) had a not serializable result: 
 com.ebay.ep.poc.spark.reporting.process.model.dw.SpsLevelMetricSum
 Serialization stack:
  - object not serializable (class: 
 com.ebay.ep.poc.spark.reporting.process.model.dw.SpsLevelMetricSum, value: 
 {userId: 0, spsPrgrmId: 0, spsSlrLevelCd: 0, spsSlrLevelSumStartDt: 
 null, spsSlrLevelSumEndDt: null, currPsLvlId: null})
  - field (class: scala.Tuple2, name: _2, type: class java.lang.Object)
  - object (class scala.Tuple2, (0,{userId: 0, spsPrgrmId: 0, 
 spsSlrLevelCd: 0, spsSlrLevelSumStartDt: null, spsSlrLevelSumEndDt: 
 null, currPsLvlId: null}))
  at 
 org.apache.spark.scheduler.DAGScheduler.org$apache$spark$scheduler$DAGScheduler$$failJobAndIndependentStages(DAGScheduler.scala:1203)
  at 
 org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1192)
  at 
 org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1191)
  at 
 scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59)
  at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:47)
  at 
 org.apache.spark.scheduler.DAGScheduler.abortStage(DAGScheduler.scala:1191)
  at 
 org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:693)
  at 
 org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:693)
  at scala.Option.foreach(Option.scala:236)
  at 
 org.apache.spark.scheduler.DAGScheduler.handleTaskSetFailed(DAGScheduler.scala:693)
  at 
 org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:1393)
  at 
 org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:1354)
  at org.apache.spark.util.EventLoop$$anon$1.run(EventLoop.scala:48)
 15/04/03 03:15:36 INFO yarn.ApplicationMaster: Final app status: FAILED, 
 exitCode: 15, (reason: User class threw exception: Job aborted due to stage 
 failure: Task 0.0 in stage 2.0 (TID 0) had a not serializable result: 
 com.ebay.ep.poc.spark.reporting.process.model.dw.SpsLevelMetricSum
 Serialization stack:
 
 
 
  com.ebay.ep.poc.spark.reporting.process.model.dw.SpsLevelMetricSum is
  auto-generated from an Avro schema using the avro-generate-sources Maven plugin.
 
 
 package com.ebay.ep.poc.spark.reporting.process.model.dw;  
 
  @SuppressWarnings("all")
 
 @org.apache.avro.specific.AvroGenerated
 
 public class SpsLevelMetricSum extends 
 org.apache.avro.specific.SpecificRecordBase implements 
 org.apache.avro.specific.SpecificRecord {
 
 ...
 ...
 }
 
 Can anyone suggest how to fix this ?
 
 
 
 -- 
 Deepak
 


Re: Spark Job Failed - Class not serializable

2015-04-03 Thread Deepak Jain
I meant that I did not have to use Kryo. Why will Kryo help fix this issue now?

Sent from my iPhone

 On 03-Apr-2015, at 5:36 pm, Deepak Jain deepuj...@gmail.com wrote:
 
  I was able to write a record that extends SpecificRecord (Avro); that class
  was not auto-generated. Do we need to do something extra for auto-generated
  classes?
 
 Sent from my iPhone
 
 On 03-Apr-2015, at 5:06 pm, Akhil Das ak...@sigmoidanalytics.com wrote:
 
 This thread might give you some insights 
 http://mail-archives.apache.org/mod_mbox/incubator-spark-user/201311.mbox/%3CCA+WVT8WXbEHac=N0GWxj-s9gqOkgG0VRL5B=ovjwexqm8ev...@mail.gmail.com%3E
 
 Thanks
 Best Regards
 
 On Fri, Apr 3, 2015 at 3:53 PM, ÐΞ€ρ@Ҝ (๏̯͡๏) deepuj...@gmail.com wrote:
 My Spark Job failed with
 
 
 15/04/03 03:15:36 INFO scheduler.DAGScheduler: Job 0 failed: 
 saveAsNewAPIHadoopFile at AbstractInputHelper.scala:103, took 2.480175 s
 15/04/03 03:15:36 ERROR yarn.ApplicationMaster: User class threw exception: 
 Job aborted due to stage failure: Task 0.0 in stage 2.0 (TID 0) had a not 
 serializable result: 
 com.ebay.ep.poc.spark.reporting.process.model.dw.SpsLevelMetricSum
 Serialization stack:
 - object not serializable (class: 
 com.ebay.ep.poc.spark.reporting.process.model.dw.SpsLevelMetricSum, value: 
 {userId: 0, spsPrgrmId: 0, spsSlrLevelCd: 0, spsSlrLevelSumStartDt: 
 null, spsSlrLevelSumEndDt: null, currPsLvlId: null})
 - field (class: scala.Tuple2, name: _2, type: class java.lang.Object)
 - object (class scala.Tuple2, (0,{userId: 0, spsPrgrmId: 0, 
 spsSlrLevelCd: 0, spsSlrLevelSumStartDt: null, spsSlrLevelSumEndDt: 
 null, currPsLvlId: null}))
 org.apache.spark.SparkException: Job aborted due to stage failure: Task 0.0 
 in stage 2.0 (TID 0) had a not serializable result: 
 com.ebay.ep.poc.spark.reporting.process.model.dw.SpsLevelMetricSum
 Serialization stack:
 - object not serializable (class: 
 com.ebay.ep.poc.spark.reporting.process.model.dw.SpsLevelMetricSum, value: 
 {userId: 0, spsPrgrmId: 0, spsSlrLevelCd: 0, spsSlrLevelSumStartDt: 
 null, spsSlrLevelSumEndDt: null, currPsLvlId: null})
 - field (class: scala.Tuple2, name: _2, type: class java.lang.Object)
 - object (class scala.Tuple2, (0,{userId: 0, spsPrgrmId: 0, 
 spsSlrLevelCd: 0, spsSlrLevelSumStartDt: null, spsSlrLevelSumEndDt: 
 null, currPsLvlId: null}))
 at 
 org.apache.spark.scheduler.DAGScheduler.org$apache$spark$scheduler$DAGScheduler$$failJobAndIndependentStages(DAGScheduler.scala:1203)
 at 
 org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1192)
 at 
 org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1191)
 at 
 scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59)
 at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:47)
 at 
 org.apache.spark.scheduler.DAGScheduler.abortStage(DAGScheduler.scala:1191)
 at 
 org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:693)
 at 
 org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:693)
 at scala.Option.foreach(Option.scala:236)
 at 
 org.apache.spark.scheduler.DAGScheduler.handleTaskSetFailed(DAGScheduler.scala:693)
 at 
 org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:1393)
 at 
 org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:1354)
 at org.apache.spark.util.EventLoop$$anon$1.run(EventLoop.scala:48)
 15/04/03 03:15:36 INFO yarn.ApplicationMaster: Final app status: FAILED, 
 exitCode: 15, (reason: User class threw exception: Job aborted due to stage 
 failure: Task 0.0 in stage 2.0 (TID 0) had a not serializable result: 
 com.ebay.ep.poc.spark.reporting.process.model.dw.SpsLevelMetricSum
 Serialization stack:
 
 
 
  com.ebay.ep.poc.spark.reporting.process.model.dw.SpsLevelMetricSum is
  auto-generated from an Avro schema using the avro-generate-sources Maven plugin.
 
 
 package com.ebay.ep.poc.spark.reporting.process.model.dw;  
 
  @SuppressWarnings("all")
 
 @org.apache.avro.specific.AvroGenerated
 
 public class SpsLevelMetricSum extends 
 org.apache.avro.specific.SpecificRecordBase implements 
 org.apache.avro.specific.SpecificRecord {
 
 ...
 ...
 }
 
 Can anyone suggest how to fix this ?
 
 
 
 -- 
 Deepak
 


Re: Spark Job Failed - Class not serializable

2015-04-03 Thread Akhil Das
Because it is throwing serialization exceptions, and Kryo is a serializer for
your objects.

Thanks
Best Regards
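
A minimal sketch of that (the class name comes from this thread; everything
else is illustrative): switch the serializer to Kryo, which does not require
java.io.Serializable, and register the generated class.

    import org.apache.spark.SparkConf
    import com.ebay.ep.poc.spark.reporting.process.model.dw.SpsLevelMetricSum

    val conf = new SparkConf()
      .setAppName("sps-report")  // illustrative app name
      .set("spark.serializer", "org.apache.spark.serializer.KryoSerializer")
      .registerKryoClasses(Array(classOf[SpsLevelMetricSum]))  // Spark 1.2+

For Avro-generated classes, a Kryo serializer that delegates to Avro's own
binary encoding (as Frank Nothaft suggests below) is generally more robust than
Kryo's default field serializer.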

On Fri, Apr 3, 2015 at 5:37 PM, Deepak Jain deepuj...@gmail.com wrote:

 I meant that I did not have to use Kryo. Why will Kryo help fix this issue
 now?

 Sent from my iPhone

 On 03-Apr-2015, at 5:36 pm, Deepak Jain deepuj...@gmail.com wrote:

 I was able to write a record that extends SpecificRecord (Avro); that class
 was not auto-generated. Do we need to do something extra for auto-generated
 classes?

 Sent from my iPhone

 On 03-Apr-2015, at 5:06 pm, Akhil Das ak...@sigmoidanalytics.com wrote:

 This thread might give you some insights
 http://mail-archives.apache.org/mod_mbox/incubator-spark-user/201311.mbox/%3CCA+WVT8WXbEHac=N0GWxj-s9gqOkgG0VRL5B=ovjwexqm8ev...@mail.gmail.com%3E

 Thanks
 Best Regards

 On Fri, Apr 3, 2015 at 3:53 PM, ÐΞ€ρ@Ҝ (๏̯͡๏) deepuj...@gmail.com wrote:

 My Spark Job failed with


 15/04/03 03:15:36 INFO scheduler.DAGScheduler: Job 0 failed:
 saveAsNewAPIHadoopFile at AbstractInputHelper.scala:103, took 2.480175 s
 15/04/03 03:15:36 ERROR yarn.ApplicationMaster: User class threw
 exception: Job aborted due to stage failure: Task 0.0 in stage 2.0 (TID 0)
 had a not serializable result:
 com.ebay.ep.poc.spark.reporting.process.model.dw.SpsLevelMetricSum
 Serialization stack:
 - object not serializable (class:
 com.ebay.ep.poc.spark.reporting.process.model.dw.SpsLevelMetricSum, value:
 {userId: 0, spsPrgrmId: 0, spsSlrLevelCd: 0, spsSlrLevelSumStartDt:
 null, spsSlrLevelSumEndDt: null, currPsLvlId: null})
 - field (class: scala.Tuple2, name: _2, type: class java.lang.Object)
 - object (class scala.Tuple2, (0,{userId: 0, spsPrgrmId: 0,
 spsSlrLevelCd: 0, spsSlrLevelSumStartDt: null, spsSlrLevelSumEndDt:
 null, currPsLvlId: null}))
 org.apache.spark.SparkException: Job aborted due to stage failure: Task
 0.0 in stage 2.0 (TID 0) had a not serializable result:
 com.ebay.ep.poc.spark.reporting.process.model.dw.SpsLevelMetricSum
 Serialization stack:
 - object not serializable (class:
 com.ebay.ep.poc.spark.reporting.process.model.dw.SpsLevelMetricSum, value:
 {userId: 0, spsPrgrmId: 0, spsSlrLevelCd: 0, spsSlrLevelSumStartDt:
 null, spsSlrLevelSumEndDt: null, currPsLvlId: null})
 - field (class: scala.Tuple2, name: _2, type: class java.lang.Object)
 - object (class scala.Tuple2, (0,{userId: 0, spsPrgrmId: 0,
 spsSlrLevelCd: 0, spsSlrLevelSumStartDt: null, spsSlrLevelSumEndDt:
 null, currPsLvlId: null}))
 at org.apache.spark.scheduler.DAGScheduler.org
 $apache$spark$scheduler$DAGScheduler$$failJobAndIndependentStages(DAGScheduler.scala:1203)
 at
 org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1192)
 at
 org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1191)
 at
 scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59)
 at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:47)
 at
 org.apache.spark.scheduler.DAGScheduler.abortStage(DAGScheduler.scala:1191)
 at
 org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:693)
 at
 org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:693)
 at scala.Option.foreach(Option.scala:236)
 at
 org.apache.spark.scheduler.DAGScheduler.handleTaskSetFailed(DAGScheduler.scala:693)
 at
 org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:1393)
 at
 org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:1354)
 at org.apache.spark.util.EventLoop$$anon$1.run(EventLoop.scala:48)
 15/04/03 03:15:36 INFO yarn.ApplicationMaster: Final app status: FAILED,
 exitCode: 15, (reason: User class threw exception: Job aborted due to stage
 failure: Task 0.0 in stage 2.0 (TID 0) had a not serializable result:
 com.ebay.ep.poc.spark.reporting.process.model.dw.SpsLevelMetricSum
 Serialization stack:

 

 com.ebay.ep.poc.spark.reporting.process.model.dw.SpsLevelMetricSum is
 auto-generated from an Avro schema using the avro-generate-sources Maven plugin.


 package com.ebay.ep.poc.spark.reporting.process.model.dw;

 @SuppressWarnings("all")

 @org.apache.avro.specific.AvroGenerated

 public class SpsLevelMetricSum extends
 org.apache.avro.specific.SpecificRecordBase implements
 org.apache.avro.specific.SpecificRecord {
 ...
 ...
 }

 Can anyone suggest how to fix this ?



 --
 Deepak





Re: Spark Job Failed - Class not serializable

2015-04-03 Thread Frank Austin Nothaft
You'll definitely want to use a Kryo-based serializer for Avro. We have a
Kryo-based serializer that wraps Avro's efficient serializer here.

Frank Austin Nothaft
fnoth...@berkeley.edu
fnoth...@eecs.berkeley.edu
202-340-0466
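
A sketch of that approach, not the serializer Frank refers to: a Kryo
Serializer that delegates to Avro's SpecificDatumWriter/Reader, registered for
the generated class from this thread. Only the class and package names come
from the thread; everything else is illustrative.

    import java.io.ByteArrayOutputStream

    import com.esotericsoftware.kryo.{Kryo, Serializer}
    import com.esotericsoftware.kryo.io.{Input, Output}
    import org.apache.avro.io.{DecoderFactory, EncoderFactory}
    import org.apache.avro.specific.{SpecificDatumReader, SpecificDatumWriter, SpecificRecordBase}
    import org.apache.spark.serializer.KryoRegistrator
    import com.ebay.ep.poc.spark.reporting.process.model.dw.SpsLevelMetricSum

    // Kryo serializer that writes/reads a generated Avro record via Avro's
    // binary encoding instead of Kryo's default field serializer.
    class AvroSpecificSerializer[T <: SpecificRecordBase](clazz: Class[T])
      extends Serializer[T] {

      private val schema = clazz.newInstance().getSchema
      private val writer = new SpecificDatumWriter[T](schema)
      private val reader = new SpecificDatumReader[T](schema)

      override def write(kryo: Kryo, output: Output, record: T): Unit = {
        val buffer = new ByteArrayOutputStream()
        val encoder = EncoderFactory.get().binaryEncoder(buffer, null)
        writer.write(record, encoder)        // Avro binary encoding of the record
        encoder.flush()
        val bytes = buffer.toByteArray
        output.writeInt(bytes.length, true)  // length prefix so it can be read back
        output.writeBytes(bytes)
      }

      override def read(kryo: Kryo, input: Input, clazz: Class[T]): T = {
        val bytes = input.readBytes(input.readInt(true))
        val decoder = DecoderFactory.get().binaryDecoder(bytes, null)
        reader.read(null.asInstanceOf[T], decoder)
      }
    }

    // Wired in with spark.serializer=org.apache.spark.serializer.KryoSerializer
    // and spark.kryo.registrator set to this class's fully qualified name.
    class AvroKryoRegistrator extends KryoRegistrator {
      override def registerClasses(kryo: Kryo): Unit = {
        kryo.register(classOf[SpsLevelMetricSum],
          new AvroSpecificSerializer(classOf[SpsLevelMetricSum]))
      }
    }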

On Apr 3, 2015, at 5:41 AM, Akhil Das ak...@sigmoidanalytics.com wrote:

 Because it is throwing serialization exceptions, and Kryo is a serializer for
 your objects.
 
 Thanks
 Best Regards
 
 On Fri, Apr 3, 2015 at 5:37 PM, Deepak Jain deepuj...@gmail.com wrote:
 I meant that I did not have to use Kryo. Why will Kryo help fix this issue
 now?
 
 Sent from my iPhone
 
 On 03-Apr-2015, at 5:36 pm, Deepak Jain deepuj...@gmail.com wrote:
 
 I was able to write a record that extends SpecificRecord (Avro); that class
 was not auto-generated. Do we need to do something extra for auto-generated
 classes?
 
 Sent from my iPhone
 
 On 03-Apr-2015, at 5:06 pm, Akhil Das ak...@sigmoidanalytics.com wrote:
 
 This thread might give you some insights 
 http://mail-archives.apache.org/mod_mbox/incubator-spark-user/201311.mbox/%3CCA+WVT8WXbEHac=N0GWxj-s9gqOkgG0VRL5B=ovjwexqm8ev...@mail.gmail.com%3E
 
 Thanks
 Best Regards
 
 On Fri, Apr 3, 2015 at 3:53 PM, ÐΞ€ρ@Ҝ (๏̯͡๏) deepuj...@gmail.com wrote:
 My Spark Job failed with
 
 
 15/04/03 03:15:36 INFO scheduler.DAGScheduler: Job 0 failed: 
 saveAsNewAPIHadoopFile at AbstractInputHelper.scala:103, took 2.480175 s
 15/04/03 03:15:36 ERROR yarn.ApplicationMaster: User class threw exception: 
 Job aborted due to stage failure: Task 0.0 in stage 2.0 (TID 0) had a not 
 serializable result: 
 com.ebay.ep.poc.spark.reporting.process.model.dw.SpsLevelMetricSum
 Serialization stack:
 - object not serializable (class: 
 com.ebay.ep.poc.spark.reporting.process.model.dw.SpsLevelMetricSum, value: 
 {userId: 0, spsPrgrmId: 0, spsSlrLevelCd: 0, spsSlrLevelSumStartDt: 
 null, spsSlrLevelSumEndDt: null, currPsLvlId: null})
 - field (class: scala.Tuple2, name: _2, type: class java.lang.Object)
 - object (class scala.Tuple2, (0,{userId: 0, spsPrgrmId: 0, 
 spsSlrLevelCd: 0, spsSlrLevelSumStartDt: null, spsSlrLevelSumEndDt: 
 null, currPsLvlId: null}))
 org.apache.spark.SparkException: Job aborted due to stage failure: Task 0.0 
 in stage 2.0 (TID 0) had a not serializable result: 
 com.ebay.ep.poc.spark.reporting.process.model.dw.SpsLevelMetricSum
 Serialization stack:
 - object not serializable (class: 
 com.ebay.ep.poc.spark.reporting.process.model.dw.SpsLevelMetricSum, value: 
 {userId: 0, spsPrgrmId: 0, spsSlrLevelCd: 0, spsSlrLevelSumStartDt: 
 null, spsSlrLevelSumEndDt: null, currPsLvlId: null})
 - field (class: scala.Tuple2, name: _2, type: class java.lang.Object)
 - object (class scala.Tuple2, (0,{userId: 0, spsPrgrmId: 0, 
 spsSlrLevelCd: 0, spsSlrLevelSumStartDt: null, spsSlrLevelSumEndDt: 
 null, currPsLvlId: null}))
 at 
 org.apache.spark.scheduler.DAGScheduler.org$apache$spark$scheduler$DAGScheduler$$failJobAndIndependentStages(DAGScheduler.scala:1203)
 at 
 org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1192)
 at 
 org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1191)
 at 
 scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59)
 at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:47)
 at 
 org.apache.spark.scheduler.DAGScheduler.abortStage(DAGScheduler.scala:1191)
 at 
 org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:693)
 at 
 org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:693)
 at scala.Option.foreach(Option.scala:236)
 at 
 org.apache.spark.scheduler.DAGScheduler.handleTaskSetFailed(DAGScheduler.scala:693)
 at 
 org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:1393)
 at 
 org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:1354)
 at org.apache.spark.util.EventLoop$$anon$1.run(EventLoop.scala:48)
 15/04/03 03:15:36 INFO yarn.ApplicationMaster: Final app status: FAILED, 
 exitCode: 15, (reason: User class threw exception: Job aborted due to stage 
 failure: Task 0.0 in stage 2.0 (TID 0) had a not serializable result: 
 com.ebay.ep.poc.spark.reporting.process.model.dw.SpsLevelMetricSum
 Serialization stack:
 
 
 
 com.ebay.ep.poc.spark.reporting.process.model.dw.SpsLevelMetricSum is
 auto-generated from an Avro schema using the avro-generate-sources Maven plugin.
 
 
 package com.ebay.ep.poc.spark.reporting.process.model.dw;  
 
 @SuppressWarnings("all")
 
 @org.apache.avro.specific.AvroGenerated
 
 public class SpsLevelMetricSum extends 
 org.apache.avro.specific.SpecificRecordBase implements 
 org.apache.avro.specific.SpecificRecord {
 
 ...
 ...
 }
 
 Can anyone suggest how to fix this ?
 
 
 
 -- 
 Deepak