Re: Long Shuffle Read Blocked Time

2017-04-20 Thread Pradeep Gollakota
Hi All,

It appears that the bottleneck in my job was the EBS volumes. Very high i/o
wait times across the cluster. I was only using 1 volume. Increasing to 4
made it faster.

Thanks,
Pradeep

On Thu, Apr 20, 2017 at 3:12 PM, Pradeep Gollakota 
wrote:

> Hi All,
>
> I have a simple ETL job that reads some data, shuffles it and writes it
> back out. This is running on AWS EMR 5.4.0 using Spark 2.1.0.
>
> After Stage 0 completes and the job starts Stage 1, I see a huge slowdown
> in the job. The CPU usage is low on the cluster, as is the network I/O.
> From the Spark Stats, I see large values for the Shuffle Read Blocked Time.
> As an example, one of my tasks completed in 18 minutes, but spent 15
> minutes waiting for remote reads.
>
> I'm not sure why the shuffle is so slow. Are there things I can do to
> increase the performance of the shuffle?
>
> Thanks,
> Pradeep
>


Long Shuffle Read Blocked Time

2017-04-20 Thread Pradeep Gollakota
Hi All,

I have a simple ETL job that reads some data, shuffles it and writes it
back out. This is running on AWS EMR 5.4.0 using Spark 2.1.0.

After Stage 0 completes and the job starts Stage 1, I see a huge slowdown
in the job. The CPU usage is low on the cluster, as is the network I/O.
>From the Spark Stats, I see large values for the Shuffle Read Blocked Time.
As an example, one of my tasks completed in 18 minutes, but spent 15
minutes waiting for remote reads.

I'm not sure why the shuffle is so slow. Are there things I can do to
increase the performance of the shuffle?

Thanks,
Pradeep