You can resort to serialized storage (still in memory) of your RDDs - this sharply reduces GC pressure, since each cached partition is stored as one serialized byte array instead of many live Java objects the collector has to trace. With the OFF_HEAP storage level the data leaves the JVM heap entirely (in Spark 1.x it goes to Tachyon, the distributed in-memory file system Spark can use internally).
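As a minimal sketch (the RDD name myRdd and the class MyRecord are placeholders for your own), switching to serialized storage with Kryo looks like this:

    import org.apache.spark.SparkConf
    import org.apache.spark.storage.StorageLevel

    // Kryo produces much more compact serialized data than Java serialization.
    val conf = new SparkConf()
      .set("spark.serializer", "org.apache.spark.serializer.KryoSerializer")
      .registerKryoClasses(Array(classOf[MyRecord]))

    // Each cached partition becomes one big byte array instead of
    // millions of live objects the garbage collector must trace.
    val cached = myRdd.persist(StorageLevel.MEMORY_ONLY_SER)
    // Alternatively, StorageLevel.OFF_HEAP leaves the JVM heap entirely
    // (backed by Tachyon in Spark 1.x).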
Also review the object-oriented model of your RDD to see whether it consists of too many redundant objects and multiple levels of hierarchy - in high-performance computing and distributed cluster frameworks like Spark, some of the classic "OO patterns" are unnecessary overhead (a small sketch follows after the quoted message below).

From: Shuai Zheng [mailto:szheng.c...@gmail.com]
Sent: Thursday, April 23, 2015 6:14 PM
To: user@spark.apache.org
Subject: Slower performance when bigger memory?

Hi All,

I am running some benchmarks on r3.8xlarge instances. I have a cluster with one master (no executor on it) and one slave (r3.8xlarge). My job has 1000 tasks in stage 0.

An r3.8xlarge has 244 GB of memory and 32 cores. If I create 4 executors, each with 8 cores + 50 GB of memory, each task takes around 320-380 s. If I instead use one big executor with 32 cores and 200 GB of memory, each task takes 760-900 s.

Checking the log, it looks like minor GC takes much longer when using 200 GB of memory:

285.242: [GC [PSYoungGen: 29027310K->8646087K(31119872K)] 38810417K->19703013K(135977472K), 11.2509770 secs] [Times: user=38.95 sys=120.65, real=11.25 secs]

With 50 GB of memory, a minor GC takes less than 1 s.

I am trying to work out the best way to configure Spark. For a specific reason I am tempted to use a bigger memory on a single executor, if there is no significant performance penalty - but now it looks like there is? Does anyone have any idea?

Regards,

Shuai
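As promised above, a tiny sketch of what flattening the object model means (Customer and Address are hypothetical classes, not anything from Shuai's job):

    // Nested model: every element drags along several object headers and
    // pointers for the GC to trace.
    case class Address(city: String, zip: String)
    case class Customer(id: Long, name: String, address: Address)

    // Flatter model: one object per element, fields inlined.
    case class CustomerFlat(id: Long, name: String, city: String, zip: String)

Fewer, flatter objects per element directly shrink the young generation that the minor GC in the quoted log has to scan.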
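On the configuration itself: if one huge executor is required, the throughput collector's young generation (the ~30 GB PSYoungGen in the log above) is usually the pain point, and G1 tends to behave better on very large heaps. A hedged sketch, assuming Java 7u60 or later - the pause target is a starting value to tune, not a recommendation:

    import org.apache.spark.SparkConf

    val conf = new SparkConf()
      .set("spark.executor.memory", "200g")
      .set("spark.executor.cores", "32")
      // Swap the default throughput collector for G1 on the executor JVMs.
      .set("spark.executor.extraJavaOptions",
        "-XX:+UseG1GC -XX:MaxGCPauseMillis=200")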