Typo in previous email, pardon me. Set "spark.driver.maxResultSize" to 1068 or higher.
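[Editor's note] For anyone landing on this thread with the same error: the setting above can go in code, in spark-defaults.conf, or as a `--conf` flag to spark-submit. A minimal sketch, assuming a Spark 1.x deployment (the value must exceed the total serialized result size reported in the error; `0` disables the limit entirely, at the risk of driver OOM):

```
# spark-defaults.conf (or: spark-submit --conf spark.driver.maxResultSize=1068m)
# 1068m is just above the 1067.3 MB of serialized results reported below
spark.driver.maxResultSize    1068m
```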
On Thu, Apr 9, 2015 at 8:57 AM, Ted Yu <yuzhih...@gmail.com> wrote:

> Please set "spark.kryoserializer.buffer.max.mb" to 1068 (or higher).
>
> Cheers
>
> On Thu, Apr 9, 2015 at 8:54 AM, ÐΞ€ρ@Ҝ (๏̯͡๏) <deepuj...@gmail.com> wrote:
>
>> Pressed send early.
>>
>> I had tried that with these settings:
>>
>>   buffersize=128 maxbuffersize=1024
>>
>>   val conf = new SparkConf()
>>     .setAppName(detail)
>>     .set("spark.serializer", "org.apache.spark.serializer.KryoSerializer")
>>     .set("spark.kryoserializer.buffer.mb", arguments.get("buffersize").get)
>>     .set("spark.kryoserializer.buffer.max.mb", arguments.get("maxbuffersize").get)
>>     .registerKryoClasses(Array(classOf[com.ebay.ep.poc.spark.reporting.process.model.dw.SpsLevelMetricSum]))
>>
>> On Thu, Apr 9, 2015 at 9:23 PM, ÐΞ€ρ@Ҝ (๏̯͡๏) <deepuj...@gmail.com> wrote:
>>
>>> Yes, I had tried that.
>>>
>>> Now I see this:
>>>
>>> 15/04/09 07:58:08 INFO scheduler.DAGScheduler: Job 0 failed: collect at VISummaryDataProvider.scala:38, took 275.334991 s
>>> 15/04/09 07:58:08 ERROR yarn.ApplicationMaster: User class threw exception: Job aborted due to stage failure: Total size of serialized results of 4 tasks (1067.3 MB) is bigger than spark.driver.maxResultSize (1024.0 MB)
>>> org.apache.spark.SparkException: Job aborted due to stage failure: Total size of serialized results of 4 tasks (1067.3 MB) is bigger than spark.driver.maxResultSize (1024.0 MB)
>>>     at org.apache.spark.scheduler.DAGScheduler.org$apache$spark$scheduler$DAGScheduler$$failJobAndIndependentStages(DAGScheduler.scala:1203)
>>>     at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1192)
>>>     at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1191)
>>>     at scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59)
>>>     at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:47)
>>>     at org.apache.spark.scheduler.DAGScheduler.abortStage(DAGScheduler.scala:1191)
>>>     at org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:693)
>>>     at org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:693)
>>>     at scala.Option.foreach(Option.scala:236)
>>>     at org.apache.spark.scheduler.DAGScheduler.handleTaskSetFailed(DAGScheduler.scala:693)
>>>     at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:1393)
>>>     at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:1354)
>>>     at org.apache.spark.util.EventLoop$$anon$1.run(EventLoop.scala:48)
>>> 15/04/09 07:58:08 INFO storage.BlockManagerInfo: Removed taskresult_4 on phxaishdc9dn0579.phx.ebay.com:42771 in memory (size: 273.5 MB, free: 6.2 GB)
>>> 15/04/09 07:58:08 INFO yarn.ApplicationMaster: Final app status: FAILED, exitCode: 15, (reason: User
>>>
>>> On Thu, Apr 9, 2015 at 8:18 PM, Ted Yu <yuzhih...@gmail.com> wrote:
>>>
>>>> Please take a look at
>>>> https://code.google.com/p/kryo/source/browse/trunk/src/com/esotericsoftware/kryo/io/Output.java?r=236
>>>> starting at line 27.
>>>>
>>>> In Spark, you can control the maxBufferSize with "spark.kryoserializer.buffer.max.mb".
>>>>
>>>> Cheers
>>>
>>> --
>>> Deepak
>>
>> --
>> Deepak
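
[Editor's note] Putting the thread together: two separate limits were hit in sequence — first the Kryo serialization buffer cap, then the driver-side result-size cap — so both must be raised. A sketch of the combined configuration, assuming the Spark 1.2/1.3 property names used in this thread (later Spark releases rename the Kryo keys to spark.kryoserializer.buffer / spark.kryoserializer.buffer.max and accept size strings such as "1g"; the numeric values here are illustrative, not tuned):

```
# spark-defaults.conf (or equivalent --conf flags to spark-submit)
spark.serializer                    org.apache.spark.serializer.KryoSerializer
spark.kryoserializer.buffer.mb      128     # initial per-task Kryo buffer, in MB
spark.kryoserializer.buffer.max.mb  1068    # ceiling for Kryo Output's maxBufferSize, in MB
spark.driver.maxResultSize          1068m   # must exceed the 1067.3 MB of collected results
```

Raising spark.driver.maxResultSize works around the error, but a collect() that pulls over a gigabyte to the driver is usually worth restructuring (e.g., aggregating on the executors first) rather than just enlarging the limit.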