Dear spark community, I'm running spark 2.3.2 on EMR 5.19.0. I've got a job that's hanging in the final stage--the job usually works, but I see this hanging behavior in about one out of 50 runs.
The second-to-last stage sorts the dataframe, and the final stage writes the dataframe to HDFS. Here <https://pastebin.com/mpN3Nqp4>you can see the executor logs, which indicate that it has finished processing the task. Here <https://snag.gy/IyhgQ4.jpg> you can see the thread dump from the executor that's hanging. Here's <https://pastebin.com/ZK5qKXKx>the text of the blocked thread. I tried to work around this problem by enabling speculation, but speculative execution never takes place. I don't know why. Can anyone here help me? Thanks, Conrad