For a Spark job that only does shuffling (e.g. Spark SQL with joins,
group-bys, analytic functions, order-bys), but with no explicitly
persisted RDDs or DataFrames (there are no .cache() calls in the code),
what would be the lowest recommended setting for
spark.storage.memoryFraction?
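
To be concrete, the job is essentially of this shape (just a sketch; the
table names, column names and paths below are made up for illustration):

// Minimal sketch of the kind of job in question: pure Spark SQL shuffles,
// with no .cache()/.persist() anywhere. Names and paths are hypothetical.
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.sql.SQLContext

val sc = new SparkContext(new SparkConf().setAppName("shuffle-only-job"))
val sqlContext = new SQLContext(sc)

val orders = sqlContext.read.parquet("/data/orders")
val customers = sqlContext.read.parquet("/data/customers")

val report = orders
  .join(customers, orders("customerId") === customers("id")) // shuffle join
  .groupBy(customers("country"))                             // shuffle aggregation
  .count()
  .orderBy("count")                                          // shuffle sort

report.write.parquet("/data/report") // nothing is ever cached or persisted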

spark.storage.memoryFraction defaults to 0.6, which seems quite high for
a shuffle-only job, while spark.shuffle.memoryFraction defaults to 0.2
(this is on Spark 1.5.0).
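
For reference, here is my back-of-the-envelope arithmetic for a
hypothetical 10 GiB executor heap under those defaults, assuming the
usual safety fractions (spark.storage.safetyFraction = 0.9 and
spark.shuffle.safetyFraction = 0.8) still apply in 1.5.0 (please correct
me if that is computed differently now):

// Rough arithmetic for a hypothetical 10 GiB executor heap under the defaults.
val gib = 1024.0 * 1024 * 1024
val heapBytes = 10 * gib
val storageBytes = heapBytes * 0.6 * 0.9 // ~5.4 GiB reserved for caching this job never does
val shuffleBytes = heapBytes * 0.2 * 0.8 // ~1.6 GiB available for shuffle buffers
println(f"storage: ${storageBytes / gib}%.1f GiB, shuffle: ${shuffleBytes / gib}%.1f GiB")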

Can I set spark.storage.memoryFraction to something low like 0.1, or even
lower? And spark.shuffle.memoryFraction to something large like 0.9, or
perhaps even 1.0?
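
Concretely, I'm thinking of something along these lines (just a sketch of
the configuration I have in mind; whether these values are safe is exactly
what I'm asking):

// Sketch of the intended configuration; the exact values are the question.
import org.apache.spark.{SparkConf, SparkContext}

val conf = new SparkConf()
  .setAppName("shuffle-only-job")
  .set("spark.storage.memoryFraction", "0.1") // shrink the cache region, since nothing is cached
  .set("spark.shuffle.memoryFraction", "0.9") // give most of the heap to shuffle buffers
val sc = new SparkContext(conf)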


Thanks!
