Re: Why groupBy is slow?

2015-02-18 Thread shahab
Thanks, Francois, for the comment and the useful link. I understand the problem better now.

Best,
Shahab

On Wed, Feb 18, 2015 at 10:36 AM, francois.garil...@typesafe.com wrote:
> In a nutshell: because it’s moving all of your data, compared to other operations (e.g. reduce) that summarize it in

Re: Why groupBy is slow?

2015-02-18 Thread francois . garillot
In a nutshell: because it moves all of your data, whereas other operations (e.g. reduce) summarize it in one form or another before moving it. For the longer answer:
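
To make the difference concrete, here is a minimal pure-Python sketch that counts how many records would have to move between partitions in each case. It is a toy simulation under stated assumptions, not Spark code: the helper names (`partition`, `shuffle_group`, `shuffle_reduce`), the round-robin partitioning, and the sample data are all hypothetical. On Spark pair RDDs, the corresponding operations would be `groupByKey` versus `reduceByKey`.

```python
from collections import defaultdict

def partition(records, num_partitions):
    """Spread records round-robin over partitions (a stand-in for workers)."""
    parts = [[] for _ in range(num_partitions)]
    for i, rec in enumerate(records):
        parts[i % num_partitions].append(rec)
    return parts

def shuffle_group(parts):
    """groupBy-style: every (key, value) record is moved unchanged, then grouped."""
    shuffled = [rec for part in parts for rec in part]
    groups = defaultdict(list)
    for key, value in shuffled:
        groups[key].append(value)
    sums = {key: sum(values) for key, values in groups.items()}
    return sums, len(shuffled)

def shuffle_reduce(parts):
    """reduce-style: each partition sums per key locally before anything moves."""
    shuffled = []
    for part in parts:
        local = defaultdict(int)
        for key, value in part:
            local[key] += value
        shuffled.extend(local.items())  # one record per key per partition
    sums = defaultdict(int)
    for key, value in shuffled:
        sums[key] += value
    return dict(sums), len(shuffled)

# Hypothetical data: 1,000 records over 3 keys, spread across 4 partitions.
records = [("k%d" % (i % 3), 1) for i in range(1000)]
parts = partition(records, 4)

group_sums, group_moved = shuffle_group(parts)
reduce_sums, reduce_moved = shuffle_reduce(parts)

print(group_moved, reduce_moved)  # prints "1000 12"
```

Both paths produce the same per-key sums, but the grouping path moves all 1,000 records while the pre-aggregating path moves only 12 (3 keys times 4 partitions). The gap grows with the number of records per key.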

Why groupBy is slow?

2015-02-18 Thread shahab
Hi,

Based on what I could see in the Spark UI, I noticed that the groupBy transformation is quite slow (it takes much longer than other operations). Is there any reason why groupBy is slow?

shahab