For a Spark job that only does shuffling (e.g. Spark SQL with joins,
group-bys, analytic functions, order-bys), but with no explicitly
persisted RDDs or DataFrames (there are no .cache() calls in the code),
what would be the lowest recommended setting for
spark.storage.memoryFraction?
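
To be concrete, the job is essentially of this shape (just a sketch; the
table names, column names and paths below are made up for illustration):

// Minimal sketch of the kind of job in question: pure Spark SQL shuffles,
// with no .cache()/.persist() anywhere. Names and paths are hypothetical.
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.sql.SQLContext

val sc = new SparkContext(new SparkConf().setAppName("shuffle-only-job"))
val sqlContext = new SQLContext(sc)

val orders = sqlContext.read.parquet("/data/orders")
val customers = sqlContext.read.parquet("/data/customers")

val report = orders
  .join(customers, orders("customerId") === customers("id")) // shuffle join
  .groupBy(customers("country"))                             // shuffle aggregation
  .count()
  .orderBy("count")                                          // shuffle sort

report.write.parquet("/data/report") // nothing is ever cached or persisted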

spark.storage.memoryFraction defaults to 0.6, which seems quite high for
a shuffle-only job, while spark.shuffle.memoryFraction defaults to 0.2
(this is on Spark 1.5.0).
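
For reference, here is my back-of-the-envelope arithmetic for a
hypothetical 10 GiB executor heap under those defaults, assuming the
usual safety fractions (spark.storage.safetyFraction = 0.9 and
spark.shuffle.safetyFraction = 0.8) still apply in 1.5.0 (please correct
me if that is computed differently now):

// Rough arithmetic for a hypothetical 10 GiB executor heap under the defaults.
val gib = 1024.0 * 1024 * 1024
val heapBytes = 10 * gib
val storageBytes = heapBytes * 0.6 * 0.9 // ~5.4 GiB reserved for caching this job never does
val shuffleBytes = heapBytes * 0.2 * 0.8 // ~1.6 GiB available for shuffle buffers
println(f"storage: ${storageBytes / gib}%.1f GiB, shuffle: ${shuffleBytes / gib}%.1f GiB")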

Can I set spark.storage.memoryFraction to something low like 0.1, or even
lower? And spark.shuffle.memoryFraction to something large like 0.9, or
perhaps even 1.0?
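
Concretely, I'm thinking of something along these lines (just a sketch of
the configuration I have in mind; whether these values are safe is exactly
what I'm asking):

// Sketch of the intended configuration; the exact values are the question.
import org.apache.spark.{SparkConf, SparkContext}

val conf = new SparkConf()
  .setAppName("shuffle-only-job")
  .set("spark.storage.memoryFraction", "0.1") // shrink the cache region, since nothing is cached
  .set("spark.shuffle.memoryFraction", "0.9") // give most of the heap to shuffle buffers
val sc = new SparkContext(conf)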


Thanks!
