Fwd: Spark on EMR suddenly stalling

2017-12-29 Thread Jeroen Miller
Hello,

Just a quick update, as I have not made much progress yet.

On 28 Dec 2017, at 21:09, Gourav Sengupta wrote:
> can you try to then use the EMR version 5.10 instead or EMR version 5.11
> instead?

Same issue with EMR 5.11.0. Task 0 in one stage never finishes.

Fwd: Spark on EMR suddenly stalling

2017-12-28 Thread Jeroen Miller
On 28 Dec 2017, at 19:25, Patrick Alwell wrote:
> You are using groupByKey() have you thought of an alternative like
> aggregateByKey() or combineByKey() to reduce shuffling?

I am aware of this indeed. I do have a groupByKey() that is difficult to avoid, but the