Thanks, I'll try using EMR 5.17.0.

For anyone trying to debug this problem in the future: in other jobs that hung in the same manner, the thread dump didn't have any blocked threads, so the blocked thread from my original message might be a red herring.
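In case it helps anyone checking their own dumps, here is a minimal sketch of how one might list the blocked threads in a dump. It assumes the dump has been saved as plain text in the usual jstack layout (a quoted thread name on a header line, followed by a `java.lang.Thread.State:` line). The function names and the sample dump are made up for illustration and are not taken from my actual job.

```python
import re
from collections import Counter

# Matches the state line jstack prints under each thread header, e.g.
#    java.lang.Thread.State: BLOCKED (on object monitor)
STATE_RE = re.compile(r"^\s*java\.lang\.Thread\.State:\s+(\w+)")
# Matches a thread header line and captures the quoted thread name.
HEADER_RE = re.compile(r'^"([^"]+)"')


def thread_states(dump_text):
    """Return (thread_name, state) pairs found in a jstack-style dump."""
    results = []
    current = None
    for line in dump_text.splitlines():
        header = HEADER_RE.match(line)
        if header:
            current = header.group(1)
            continue
        state = STATE_RE.match(line)
        if state and current is not None:
            results.append((current, state.group(1)))
            current = None
    return results


def blocked_threads(dump_text):
    """Return the names of threads whose state is BLOCKED."""
    return [name for name, state in thread_states(dump_text) if state == "BLOCKED"]


# Hypothetical sample dump, invented for this example.
SAMPLE_DUMP = '''\
"Executor task launch worker for task 1" #50 daemon prio=5 os_prio=0 tid=0x1 nid=0x2 waiting for monitor entry [0x3]
   java.lang.Thread.State: BLOCKED (on object monitor)
	at example.Hypothetical.method(Hypothetical.java:1)

"dispatcher-event-loop-0" #20 daemon prio=5 os_prio=0 tid=0x4 nid=0x5 waiting on condition [0x6]
   java.lang.Thread.State: WAITING (parking)
'''

if __name__ == "__main__":
    # Summary of thread states, then the names of any blocked threads.
    print(Counter(state for _, state in thread_states(SAMPLE_DUMP)))
    print(blocked_threads(SAMPLE_DUMP))
```

Running this over dumps from several hung runs makes it easier to see whether a blocked thread shows up every time or only occasionally.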

On Wed, Nov 28, 2018 at 4:34 PM Christopher Petrino <christopher.petr...@gmail.com> wrote:

> I ran into problems using 5.19 so I referred to 5.17 and it resolved my
> issues.
>
> On Wed, Nov 28, 2018 at 2:48 AM Conrad Lee <con...@parsely.com> wrote:
>
>> Hello Vadim,
>>
>> Interesting. I've only been running this job at scale for a couple weeks
>> so I can't say whether this is related to recent EMR changes.
>>
>> Much of the EMR-specific code for spark has to do with writing files to
>> s3. In this case I'm writing files to the cluster's HDFS though so my
>> sense is that this is a spark issue, not an EMR (but I'm not sure).
>>
>> Conrad
>>
>> On Tue, Nov 27, 2018 at 5:21 PM Vadim Semenov <va...@datadoghq.com>
>> wrote:
>>
>>> Hey Conrad,
>>>
>>> has it started happening recently?
>>>
>>> We recently started having some sporadic problems with drivers on EMR
>>> when it gets stuck, up until two weeks ago everything was fine.
>>> We're trying to figure out with the EMR team where the issue is coming
>>> from.
>>>
>>> On Tue, Nov 27, 2018 at 6:29 AM Conrad Lee <con...@parsely.com> wrote:
>>> >
>>> > Dear spark community,
>>> >
>>> > I'm running spark 2.3.2 on EMR 5.19.0. I've got a job that's hanging
>>> > in the final stage--the job usually works, but I see this hanging
>>> > behavior in about one out of 50 runs.
>>> >
>>> > The second-to-last stage sorts the dataframe, and the final stage
>>> > writes the dataframe to HDFS.
>>> >
>>> > Here you can see the executor logs, which indicate that it has
>>> > finished processing the task.
>>> >
>>> > Here you can see the thread dump from the executor that's hanging.
>>> > Here's the text of the blocked thread.
>>> >
>>> > I tried to work around this problem by enabling speculation, but
>>> > speculative execution never takes place. I don't know why.
>>> >
>>> > Can anyone here help me?
>>> >
>>> > Thanks,
>>> > Conrad
>>>
>>> --
>>> Sent from my iPhone