Yeah, I just ran with 2g for spark.driver.maxResultSize and spark.kryoserializer.buffer.max.mb set to 1068.
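
For reference, that pair of settings as code (a minimal sketch using the Spark 1.x property names from this thread; the app name is a placeholder):

    import org.apache.spark.SparkConf

    val conf = new SparkConf()
      .setAppName("placeholder-app")
      // Cap on the total size of serialized task results the driver accepts
      // (defaults to 1g, which matches the 1024.0 MB in the error below).
      .set("spark.driver.maxResultSize", "2g")
      // Kryo's maximum serialization buffer, in MB (Spark 1.x name).
      .set("spark.kryoserializer.buffer.max.mb", "1068")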

I am trying to do a map-side join using a broadcast variable: it first
collects all the (key, value) data to the driver and then broadcasts it.
That collect step is what causes the error in this stage.
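
The pattern is roughly the sketch below (smallRdd, largeRdd, and the key/value types are illustrative, not the actual job):

    // Given sc: SparkContext and two pair RDDs, e.g. RDD[(String, Long)].
    // Small side: collect every (key, value) pair to the driver, then
    // ship the resulting map to all executors as a broadcast variable.
    val smallMap = sc.broadcast(smallRdd.collectAsMap())

    // Large side: join map-side by looking keys up in the broadcast map,
    // so no shuffle is needed.
    val joined = largeRdd.flatMap { case (k, v) =>
      smallMap.value.get(k).map(other => (k, (v, other)))
    }

It is the collect that fails: once the "small" side grows past a gigabyte, the driver rejects the task results because they exceed spark.driver.maxResultSize.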

On Thu, Apr 9, 2015 at 9:29 PM, Ted Yu <yuzhih...@gmail.com> wrote:

> Typo in previous email, pardon me.
>
> Set "spark.driver.maxResultSize" to 1068 or higher.
>
> On Thu, Apr 9, 2015 at 8:57 AM, Ted Yu <yuzhih...@gmail.com> wrote:
>
>> Please set "spark.kryoserializer.buffer.max.mb" to 1068 (or higher).
>>
>> Cheers
>>
>> On Thu, Apr 9, 2015 at 8:54 AM, ÐΞ€ρ@Ҝ (๏̯͡๏) <deepuj...@gmail.com>
>> wrote:
>>
>>> Pressed send too early.
>>>
>>> I had tried that with these settings:
>>>
>>>   buffersize=128, maxbuffersize=1024
>>>
>>>     val conf = new SparkConf()
>>>       .setAppName(detail)
>>>       .set("spark.serializer", "org.apache.spark.serializer.KryoSerializer")
>>>       .set("spark.kryoserializer.buffer.mb", arguments.get("buffersize").get)
>>>       .set("spark.kryoserializer.buffer.max.mb", arguments.get("maxbuffersize").get)
>>>       .registerKryoClasses(Array(classOf[com.ebay.ep.poc.spark.reporting.process.model.dw.SpsLevelMetricSum]))
>>>
>>>
>>> On Thu, Apr 9, 2015 at 9:23 PM, ÐΞ€ρ@Ҝ (๏̯͡๏) <deepuj...@gmail.com>
>>> wrote:
>>>
>>>> Yes, I had tried that.
>>>>
>>>> Now I see this:
>>>>
>>>> 15/04/09 07:58:08 INFO scheduler.DAGScheduler: Job 0 failed: collect at VISummaryDataProvider.scala:38, took 275.334991 s
>>>> 15/04/09 07:58:08 ERROR yarn.ApplicationMaster: User class threw exception: Job aborted due to stage failure: Total size of serialized results of 4 tasks (1067.3 MB) is bigger than spark.driver.maxResultSize (1024.0 MB)
>>>> org.apache.spark.SparkException: Job aborted due to stage failure: Total size of serialized results of 4 tasks (1067.3 MB) is bigger than spark.driver.maxResultSize (1024.0 MB)
>>>>   at org.apache.spark.scheduler.DAGScheduler.org$apache$spark$scheduler$DAGScheduler$$failJobAndIndependentStages(DAGScheduler.scala:1203)
>>>>   at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1192)
>>>>   at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1191)
>>>>   at scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59)
>>>>   at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:47)
>>>>   at org.apache.spark.scheduler.DAGScheduler.abortStage(DAGScheduler.scala:1191)
>>>>   at org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:693)
>>>>   at org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:693)
>>>>   at scala.Option.foreach(Option.scala:236)
>>>>   at org.apache.spark.scheduler.DAGScheduler.handleTaskSetFailed(DAGScheduler.scala:693)
>>>>   at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:1393)
>>>>   at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:1354)
>>>>   at org.apache.spark.util.EventLoop$$anon$1.run(EventLoop.scala:48)
>>>> 15/04/09 07:58:08 INFO storage.BlockManagerInfo: Removed taskresult_4 on phxaishdc9dn0579.phx.ebay.com:42771 in memory (size: 273.5 MB, free: 6.2 GB)
>>>> 15/04/09 07:58:08 INFO yarn.ApplicationMaster: Final app status: FAILED, exitCode: 15, (reason: User
>>>>
>>>> On Thu, Apr 9, 2015 at 8:18 PM, Ted Yu <yuzhih...@gmail.com> wrote:
>>>>
>>>>> Please take a look at
>>>>> https://code.google.com/p/kryo/source/browse/trunk/src/com/esotericsoftware/kryo/io/Output.java?r=236
>>>>> , starting line 27.
>>>>>
>>>>> In Spark, you can control the maxBufferSize
>>>>> with "spark.kryoserializer.buffer.max.mb"
>>>>>
>>>>> Cheers
>>>>>
>>>>
>>>>
>>>>
>>>> --
>>>> Deepak
>>>>
>>>>
>>>
>>>
>>> --
>>> Deepak
>>>
>>>
>>
>


-- 
Deepak
