"I'm using 50 servers, 35 executors per server, 140GB memory per server"
35 executors *per server* sounds odd to me. With 35 executors per server and 140GB per server, each executor gets only 4GB, and that 4GB is further divided into the shuffle and storage memory fractions. Assuming the default spark.storage.memoryFraction of 0.6, that leaves about 2.4GB of storage memory per executor, so if any partition (key group) exceeds 2.4GB you will hit an OOM. Maybe try fewer executors per server/node, each with more memory.

--
View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/run-reduceByKey-on-huge-data-in-spark-tp23546p23555.html
Sent from the Apache Spark User List mailing list archive at Nabble.com.
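For reference, the memory arithmetic from the reply above can be sketched as follows. This is just a back-of-the-envelope calculation, not Spark code; the 0.6 fraction is the spark.storage.memoryFraction default in Spark 1.x (the memory model changed in Spark 1.6+):

```python
# Back-of-the-envelope check of the per-executor memory estimate above.
servers = 50
executors_per_server = 35
memory_per_server_gb = 140

# Each executor's share of the server's memory.
mem_per_executor_gb = memory_per_server_gb / executors_per_server  # 4.0 GB

# Portion reserved for cached/storage data under the Spark 1.x memory model.
storage_fraction = 0.6  # spark.storage.memoryFraction default (Spark 1.x)
storage_mem_gb = mem_per_executor_gb * storage_fraction  # 2.4 GB

print(f"per executor: {mem_per_executor_gb} GB, storage: {storage_mem_gb} GB")
```

With fewer, fatter executors (say 5 per server), each would get 28GB, and the same fraction would leave roughly 16.8GB of storage memory, which is far less likely to be exceeded by a single key group during reduceByKey.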