Re: Spark In Memory Shuffle

Gourav Sengupta Wed, 17 Oct 2018 08:10:20 -0700

Super duper, I also need to try this out.

On Wed, Oct 17, 2018 at 2:39 PM onmstester onmstester
<onmstes...@zoho.com.invalid> wrote:


> Hi,
> I could not configure Spark for in-memory shuffle, so for now I just
> use a Linux memory-backed directory (tmpfs) as Spark's working directory,
> which keeps everything fast.
>
>
> ---- On Wed, 17 Oct 2018 16:41:32 +0330 thomas lavocat
> <thomas.lavo...@univ-grenoble-alpes.fr> wrote ----
>
> Hi everyone,
>
>
> The possibility of in-memory shuffling was discussed in 2015 in this pull
> request: https://github.com/apache/spark/pull/5403.
>
> In 2016, the paper "Scaling Spark on HPC Systems" said that Spark still
> shuffles using disks. I would like to know:
>
>
> What is the current state of in-memory shuffling?
>
> Is it implemented in production?
>
> Does the current shuffle still use disks to work?
>
> Is it possible to somehow do it in RAM only?
>
>
> Regards,
>
> Thomas
>
>
>
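
A minimal sketch of the tmpfs workaround described in the quoted message, assuming that "working directory" means Spark's scratch directory, which is set with the `spark.local.dir` property and holds shuffle map output files. The path `/dev/shm/spark-scratch` and the helper names `is_on_tmpfs` and `spark_submit_args` are hypothetical and do not come from the thread. Note that a cluster manager may override `spark.local.dir` through its own environment variables (for example `SPARK_LOCAL_DIRS` in standalone mode or `LOCAL_DIRS` on YARN), so this may only apply as written to simple deployments.

```python
# Assumption: /dev/shm is a tmpfs mount, as on most Linux distributions.
TMPFS_SCRATCH = "/dev/shm/spark-scratch"  # hypothetical scratch path


def is_on_tmpfs(path, mounts_text):
    """Return True if the deepest mount point containing `path` is tmpfs.

    `mounts_text` is the content of /proc/mounts, where each line has the
    form: device mount_point fs_type options dump pass.
    """
    best_len, best_type = -1, None
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 3:
            continue  # skip malformed lines
        mount_point, fs_type = fields[1], fields[2]
        prefix = mount_point.rstrip("/") + "/"
        if path == mount_point or path.startswith(prefix):
            # Prefer the longest matching mount point; on ties, the later
            # entry wins because it was mounted on top of the earlier one.
            if len(mount_point) >= best_len:
                best_len, best_type = len(mount_point), fs_type
    return best_type == "tmpfs"


def spark_submit_args(scratch_dir, app):
    """Build a spark-submit command line that points spark.local.dir at scratch_dir."""
    return ["spark-submit", "--conf", f"spark.local.dir={scratch_dir}", app]
```

On a Linux host, one might read `/proc/mounts`, call `is_on_tmpfs(TMPFS_SCRATCH, text)` to confirm the directory really is memory-backed, and only then launch the job with the arguments from `spark_submit_args`. Shuffle files written this way still go through the file system API; they simply land in RAM, and they count against the machine's memory.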
