Re: setting heap space

2014-10-13 Thread Akhil Das
A few things to keep in mind (a sketch of these settings follows below):
- I believe driver memory should not exceed executor memory.
- Set spark.storage.memoryFraction (the default is 0.6).
- Set spark.rdd.compress (the default is false).
- Always specify the level of parallelism when doing a groupBy, reduceByKey, join, sortBy, etc.
- If you don't
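[Editor's note] A minimal PySpark sketch of the settings above, assuming the Spark 1.x configuration keys discussed in this thread; the app name, memory sizes, and partition count are illustrative:

    from pyspark import SparkConf, SparkContext

    conf = (SparkConf()
            .setAppName("heap-tuning")                    # illustrative name
            .set("spark.executor.memory", "10g")
            .set("spark.driver.memory", "10g")            # kept <= executor memory, per the advice above
            .set("spark.storage.memoryFraction", "0.6")   # fraction of heap for cached RDDs (default 0.6)
            .set("spark.rdd.compress", "true"))           # compress serialized RDD partitions (default false)
    sc = SparkContext(conf=conf)

    # Pass numPartitions explicitly so the shuffle doesn't fall back to a
    # default that is too small for the data.
    pairs = sc.parallelize([("a", 1), ("b", 2), ("a", 3)])
    counts = pairs.reduceByKey(lambda x, y: x + y, 200)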

Re: setting heap space

2014-10-13 Thread Chengi Liu
Hi Akhil, Thanks for the response.. Another query: do you know how to use the spark.executor.extraJavaOptions option? SparkConf.set("spark.executor.extraJavaOptions", what value should go in here)? I am trying to find an example but cannot seem to find one..

Re: setting heap space

2014-10-13 Thread Akhil Das
You can set it like this:

    sparkConf.set("spark.executor.extraJavaOptions",
      "-XX:+UseCompressedOops -XX:+UseConcMarkSweepGC -XX:+AggressiveOpts -XX:FreqInlineSize=300 -XX:MaxInlineSize=300")

Here's a benchmark example
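[Editor's note] A minimal PySpark equivalent of the Scala snippet above, a sketch assuming the same JVM flags are appropriate for your executors. Note that spark.executor.extraJavaOptions cannot be used to set the maximum heap size (-Xmx); executor heap is controlled by spark.executor.memory:

    from pyspark import SparkConf, SparkContext

    conf = (SparkConf()
            .set("spark.executor.extraJavaOptions",
                 "-XX:+UseCompressedOops -XX:+UseConcMarkSweepGC "
                 "-XX:+AggressiveOpts -XX:FreqInlineSize=300 -XX:MaxInlineSize=300"))
    sc = SparkContext(conf=conf)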

Re: setting heap space

2014-10-13 Thread Chengi Liu
Cool.. Thanks.. And one last final question.. conf = SparkConf().set(...).set(...) matrix = get_data(..) rdd = sc.parallelize(matrix) # heap error here... How and where do I set the storage level? It seems like conf is the wrong place to set this up, as I get this error:

Re: setting heap space

2014-10-13 Thread Akhil Das
Like this:

    import org.apache.spark.storage.StorageLevel
    val rdd = sc.parallelize(1 to 100).persist(StorageLevel.MEMORY_AND_DISK_SER)

Thanks Best Regards
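[Editor's note] For the PySpark side of the question, a minimal sketch assuming Spark 1.x, where pyspark.StorageLevel also exposes MEMORY_AND_DISK_SER (the range stands in for the asker's get_data helper):

    from pyspark import SparkContext, StorageLevel

    sc = SparkContext(appName="storage-level-example")
    matrix = range(100)  # stand-in for the asker's get_data(..)
    rdd = sc.parallelize(matrix).persist(StorageLevel.MEMORY_AND_DISK_SER)

The storage level is a per-RDD choice made at persist()/cache() time, not a SparkConf setting, which is why conf was the wrong place to set it.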

setting heap space

2014-10-12 Thread Chengi Liu
Hi, I am trying to use Spark but I am having a hard time configuring the SparkConf... My current conf is conf = SparkConf().set("spark.executor.memory", "10g").set("spark.akka.frameSize", 1).set("spark.driver.memory", "16g") but I still see the Java heap space error 14/10/12 09:54:50 ERROR Executor:
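[Editor's note] A minimal sketch of that configuration with the quoting restored and the driver memory capped at the executor memory, per the advice at the top of the thread. The frameSize value below is hypothetical, since the number in the original post appears truncated. Note also that in client mode spark.driver.memory must be set before the driver JVM starts (e.g. spark-submit --driver-memory 10g); setting it in SparkConf after launch may have no effect, which can explain a persistent heap error on the driver:

    from pyspark import SparkConf, SparkContext

    conf = (SparkConf()
            .set("spark.executor.memory", "10g")
            .set("spark.driver.memory", "10g")    # kept <= executor memory, per Akhil's advice
            .set("spark.akka.frameSize", "100"))  # hypothetical; the value in the original post is truncated
    sc = SparkContext(conf=conf)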