I have a Spark job running on 2 TB of data that creates more than
30,000 partitions. The job fails with the error
"Map output statuses were 170415722 bytes which exceeds spark.akka.frameSize
52428800 bytes" (that particular message is from a 1 TB run).
However, when I increase spark.akka.frameSize to around 500 MB, the job hangs
and makes no further progress.
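
For reference, this is roughly how I am raising the frame size (a minimal
sketch; the app name is a placeholder, and my understanding is that
spark.akka.frameSize is interpreted in MB rather than bytes):

    import org.apache.spark.{SparkConf, SparkContext}

    // Sketch of the driver-side configuration used to raise the frame size.
    // spark.akka.frameSize takes a value in MB, so "500" corresponds to the
    // ~500 MB setting mentioned above. Master and memory settings are
    // supplied via spark-submit.
    val conf = new SparkConf()
      .setAppName("large-shuffle-job")     // placeholder name
      .set("spark.akka.frameSize", "500")  // MB, not bytes
    val sc = new SparkContext(conf)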

So, what is the ideal or maximum value I can assign to spark.akka.frameSize
so that I do not hit the map-output-statuses error for large volumes of
data?
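
My rough understanding is that the serialized map output statuses grow with
(number of map tasks) x (number of reduce partitions), on the order of one
byte per pair when CompressedMapStatus is used, so at ~30,000 partitions even
a 500 MB frame would not be enough. A back-of-the-envelope check (the
one-byte-per-entry figure is my assumption, not something I have verified in
the code):

    // Rough estimate of the serialized map output status size for a shuffle
    // with 30,000 map tasks and 30,000 reduce partitions, assuming about one
    // byte per (map task, reduce partition) block-size entry.
    object MapStatusEstimate {
      def main(args: Array[String]): Unit = {
        val mapTasks         = 30000L
        val reducePartitions = 30000L
        val bytesPerEntry    = 1L  // assumption: ~1 byte per compressed block size
        val totalMB = mapTasks * reducePartitions * bytesPerEntry / (1024 * 1024)
        println(s"~$totalMB MB of map output statuses")  // prints ~858 MB
      }
    }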

Is coalescing the data into a smaller number of partitions the only solution
to this problem? Is there a better approach than coalescing many intermediate
RDDs throughout the program?
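
For clarity, this is the kind of coalescing workaround I mean (a sketch only;
the input path and partition counts are placeholders, not tuned values):

    import org.apache.spark.{SparkConf, SparkContext}

    // Sketch: shrink a wide intermediate RDD to fewer partitions before the
    // wide shuffle so the map output status table stays smaller.
    val sc = new SparkContext(new SparkConf().setAppName("coalesce-sketch"))
    val intermediateRdd = sc.textFile("hdfs:///path/to/input")  // placeholder input
      .repartition(30000)                                       // mirrors my ~30,000 partitions
    val reduced = intermediateRdd.coalesce(4000, shuffle = false) // merge without a full shuffle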

My driver memory: 10 GB
Executor memory: 36 GB
Executor memory overhead: 3 GB
