Re: Slower performance when bigger memory?

2015-04-29 Thread Sven Krasser
On Mon, Apr 27, 2015 at 7:36 AM, Shuai Zheng szheng.c...@gmail.com wrote: Thanks. May I ask what your configuration is for more/smaller executors on r3.8xlarge, and how much memory you eventually decided to give one executor without impacting performance (for example, 64g?). We're

RE: Slower performance when bigger memory?

2015-04-27 Thread Shuai Zheng
PM To: Dean Wampler Cc: Shuai Zheng; user@spark.apache.org Subject: Re: Slower performance when bigger memory? FWIW, I ran into a similar issue on r3.8xlarge nodes and opted for more/smaller executors. Another observation was that one large executor results in less overall read throughput from

RE: Slower performance when bigger memory?

2015-04-24 Thread Evo Eftimov
You can resort to serialized storage (still in memory) of your RDDs - this will obviate the need for GC since the RDD elements are stored as serialized objects off the JVM heap (most likely in Tachyon, which is the distributed in-memory file system used by Spark internally). Also review the Object

Re: Slower performance when bigger memory?

2015-04-24 Thread Shawn Zheng
this is not about gc issue itself. The memory is On Friday, April 24, 2015, Evo Eftimov evo.efti...@isecc.com wrote: You can resort to Serialized storage (still in memory) of your RDDs – this will obviate the need for GC since the RDD elements are stored as serialized objects off the JVM heap

Re: Slower performance when bigger memory?

2015-04-24 Thread Sven Krasser
FWIW, I ran into a similar issue on r3.8xlarge nodes and opted for more/smaller executors. Another observation was that one large executor results in less overall read throughput from S3 (using Amazon's EMRFS implementation) in case that matters to your application. -Sven On Thu, Apr 23, 2015 at
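
To make the "more/smaller executors" idea concrete, here is a rough sizing sketch in plain Python. It is not the configuration used in this thread (the snippets do not state one). It assumes an r3.8xlarge exposes 32 vCPUs and 244 GiB of RAM, and the reserved amounts, the 64 GiB cap, and the names `Layout` and `executor_layouts` are all hypothetical choices for illustration. The memory figure is the total per executor (heap plus any overhead), not the heap alone.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Layout:
    executors_per_node: int
    cores_per_executor: int
    memory_gib_per_executor: int  # total per executor, heap plus overhead


def executor_layouts(node_mem_gib, node_cores,
                     reserved_mem_gib=16, reserved_cores=2,
                     max_mem_gib=64):
    """List ways to split one node into equally sized executors.

    reserved_* is what we leave for the OS and other daemons (an
    assumption, tune for your cluster). max_mem_gib is the per-executor
    cap, chosen here from the 64GB figure mentioned in the thread.
    """
    usable_mem = node_mem_gib - reserved_mem_gib
    usable_cores = node_cores - reserved_cores
    layouts = []
    for n in range(1, usable_cores + 1):
        if usable_cores % n != 0:
            continue  # keep only splits that divide the cores evenly
        mem = usable_mem // n
        if mem <= max_mem_gib:
            layouts.append(Layout(n, usable_cores // n, mem))
    return layouts


# Assumed r3.8xlarge shape: 244 GiB RAM, 32 vCPUs.
layouts = executor_layouts(node_mem_gib=244, node_cores=32)
for layout in layouts:
    print(layout)
```

The first entry is the layout with the fewest executors that still stays under the cap. Which split actually performs best depends on the workload, so treat the list as candidates to benchmark rather than a recommendation.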

Re: Slower performance when bigger memory?

2015-04-23 Thread Ted Yu
Shuai: Please take a look at: http://blog.takipi.com/garbage-collectors-serial-vs-parallel-vs-cms-vs-the-g1-and-whats-new-in-java-8/ On Apr 23, 2015, at 10:18 AM, Dean Wampler deanwamp...@gmail.com wrote: JVMs often have significant GC overhead with heaps bigger than 64GB. You might try