Hi everyone,

The possibility of in-memory shuffling was discussed back in 2015 in this pull request: https://github.com/apache/spark/pull/5403

The 2016 paper "Scaling Spark on HPC Systems" says that Spark still shuffles using disks. I would like to know:


What is the current state of in-memory shuffling?

Is it implemented in production?

Does the current shuffle still rely on disks?

Is it possible to do the shuffle entirely in RAM?


Regards,

Thomas

