So, with this in mind, and to help my understanding of Spark under the hood:

Is this statement correct: "When data needs to pass between multiple JVMs, a
shuffle will *always* hit disk"?

On Wed, Jun 10, 2015 at 10:11 AM, Josh Rosen <rosenvi...@gmail.com> wrote:

> There's a discussion of this at https://github.com/apache/spark/pull/5403
>
> On Wed, Jun 10, 2015 at 7:08 AM, Corey Nolet <cjno...@gmail.com> wrote:
>
>> Is it possible to configure Spark to do all of its shuffling FULLY in
>> memory (given that I have enough memory to store all the data)?
>
