A few things to keep in mind:
- I believe driver memory should not exceed executor memory
- Tune spark.storage.memoryFraction (the default is 0.6)
- Consider enabling spark.rdd.compress (the default is false)
- Always specify the level of parallelism when doing a groupBy, reduceByKey,
join, sortBy etc. (see the sketch after this list)
- If you don't
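For illustration, a minimal PySpark sketch of those settings (the values are placeholders, not tuned recommendations):

from pyspark import SparkConf, SparkContext

# Shrink the cache share and compress serialized RDDs; both values are
# illustrative and should be tuned per workload.
conf = (SparkConf()
        .set("spark.storage.memoryFraction", "0.4")
        .set("spark.rdd.compress", "true"))
sc = SparkContext(conf=conf)

pairs = sc.parallelize([("a", 1), ("b", 2), ("a", 3)])
# Pass the number of partitions explicitly instead of relying on the default.
counts = pairs.reduceByKey(lambda x, y: x + y, 200)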
Hi Akhil,
Thanks for the response..
Another query... do you know how to use the spark.executor.extraJavaOptions
option?
SparkConf.set("spark.executor.extraJavaOptions", <what value should go in here>)?
I am trying to find an example but cannot seem to find one..
You can set it like this:
sparkConf.set("spark.executor.extraJavaOptions", "-XX:+UseCompressedOops
-XX:+UseConcMarkSweepGC -XX:+AggressiveOpts -XX:FreqInlineSize=300
-XX:MaxInlineSize=300")
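One caveat: heap size flags such as -Xmx cannot be passed through spark.executor.extraJavaOptions; the executor heap is controlled by spark.executor.memory. In PySpark, with explicit quoting, the equivalent would look roughly like this (the -XX flags shown are just a subset of the example above):

from pyspark import SparkConf

# Extra JVM flags for executors only; heap size (-Xmx) must come from
# spark.executor.memory instead of this property.
conf = SparkConf().set(
    "spark.executor.extraJavaOptions",
    "-XX:+UseCompressedOops -XX:+UseConcMarkSweepGC")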
Here's a benchmark example
Cool.. Thanks.. And one last question..
conf = SparkConf().set(...)
matrix = get_data(..)
rdd = sc.parallelize(matrix) # heap error here...
How and where do I set the storage level.. conf seems like the wrong place
to set this up..?? as I get this error:
Like this:
import org.apache.spark.storage.StorageLevel
val rdd = sc.parallelize(1 to 100).persist(StorageLevel.MEMORY_AND_DISK_SER)
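Since the question came from PySpark, a rough Python equivalent is sketched below; note that the storage level is set on the RDD itself, not in SparkConf:

from pyspark import SparkContext, StorageLevel

sc = SparkContext()
# persist() is called on the RDD; SparkConf has no storage-level setting.
rdd = sc.parallelize(range(100)).persist(StorageLevel.MEMORY_AND_DISK)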
Thanks
Best Regards
Hi,
I am trying to use Spark, but I am having a hard time configuring the
SparkConf...
My current conf is
conf = SparkConf().set("spark.executor.memory", "10g") \
    .set("spark.akka.frameSize", "1").set("spark.driver.memory", "16g")
but I still see the Java heap size error:
14/10/12 09:54:50 ERROR Executor:
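(A hedged aside that may explain the symptom: in client mode, spark.driver.memory cannot be set through SparkConf inside the application, because the driver JVM has already started by that point; it has to be supplied at launch time, e.g. via spark-submit. The script name below is hypothetical:

spark-submit --driver-memory 16g --executor-memory 10g your_script.py

Also, since the heap error occurs at sc.parallelize(matrix), the whole matrix is first built and serialized on the driver, so driver memory is the limit being hit.)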