Shuffle Spill (Disk) greater than Shuffle Spill (Memory)

2016-09-13 Thread prayag chandran
Hello!

In my Spark job, I see that Shuffle Spill (Disk) is greater than Shuffle
Spill (Memory). The spark.shuffle.compress parameter is left at its default
(true). I would expect the size on disk to be smaller, which isn't the case
here. I've been having some performance issues as well, and I suspect they
are related to this.
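
For reference, here is a minimal sketch (not from my actual job; the app
name is a placeholder) of how the two shuffle compression settings could be
pinned explicitly instead of left to defaults, and then read back to see
what the running job actually uses:

// Sketch only: set shuffle compression explicitly, then verify the
// effective values at runtime. Both keys are standard Spark settings.
import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder()
  .appName("shuffle-spill-check")                  // placeholder name
  .config("spark.shuffle.compress", "true")        // compress shuffle map outputs (default: true)
  .config("spark.shuffle.spill.compress", "true")  // compress data spilled during shuffles (default: true)
  .getOrCreate()

println(spark.conf.get("spark.shuffle.compress"))
println(spark.conf.get("spark.shuffle.spill.compress"))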

All memory configuration parameters are at their defaults. I'm running Spark 2.0.
Shuffle Spill (Memory): 712.0 MB
Shuffle Spill (Disk): 7.9 GB
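
To double-check that everything really is at its default, a small sketch
(assuming the `spark` session from above) that dumps every key explicitly
set for the job; keys left at their defaults do not appear in this list:

// Sketch only: list all explicitly set configuration entries. Any memory-
// or shuffle-related key showing up here is NOT at its default value.
spark.sparkContext.getConf.getAll
  .sortBy(_._1)
  .foreach { case (key, value) => println(s"$key = $value") }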

To my surprise, I also see the following for some tasks:
Shuffle Spill (Memory): 0.0 B
Shuffle Spill (Disk): 77.5 MB

I would appreciate it if anyone could explain this behavior.

-Prayag


Re: subscribe

2016-01-03 Thread prayag chandran
You should email users-subscr...@kafka.apache.org if you are trying to
subscribe.

On 3 January 2016 at 11:52, Rajdeep Dua wrote:
