Re: Job hangs in blocked task in final parquet write stage

2018-12-11 Thread Conrad Lee
, Dec 4, 2018 at 9:45 AM Conrad Lee wrote:
> Yeah, probably increasing the memory or increasing the number of output
> partitions would help. However increasing memory available to each
> executor would add expense. I want to keep the number of partitions low so
> that each parquet fi

Re: Job hangs in blocked task in final parquet write stage

2018-12-04 Thread Conrad Lee
On Mon, Dec 3, 2018 at 2:48 AM Conrad Lee wrote:
>
>> Thanks for the thoughts. While the beginning of the job deals with lots
>> of files in the first stage, they're first coalesced down into just a few
>> thousand partitions. The part of the job that's failing is the redu

Re: Job hangs in blocked task in final parquet write stage

2018-11-29 Thread Conrad Lee
com> wrote:
> I ran into problems using 5.19 so I referred to 5.17 and it resolved my
> issues.
>
> On Wed, Nov 28, 2018 at 2:48 AM Conrad Lee wrote:
>
>> Hello Vadim,
>>
>> Interesting. I've only been running this job at scale for a couple weeks
>> so I ca

Re: Job hangs in blocked task in final parquet write stage

2018-11-27 Thread Conrad Lee
ntil two weeks ago everything was fine.
> We're trying to figure out with the EMR team where the issue is coming
> from.
> On Tue, Nov 27, 2018 at 6:29 AM Conrad Lee wrote:
> >
> > Dear spark community,
> >
> > I'm running spark 2.3.2 on EMR 5.19.0. I've got a job t

Re: Job hangs in blocked task in final parquet write stage

2018-11-27 Thread Conrad Lee
Dear spark community,

I'm running spark 2.3.2 on EMR 5.19.0. I've got a job that's hanging in the final stage--the job usually works, but I see this hanging behavior in about one out of 50 runs. The second-to-last stage sorts the dataframe, and the final stage writes the dataframe to HDFS.
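
For readers following the thread, here is a rough sketch of the shape of job being described (a sort followed by a parquet write) and of the partition-count trade-off raised in the replies. It is not the original poster's code and it does not diagnose the hang. The helper names `output_partitions` and `sort_and_write`, the `sort_col` and `out_path` parameters, and the idea of sizing partitions from a target file size are all assumptions made for illustration. The only Spark calls used are `spark.conf.set`, `DataFrame.sort`, and `DataFrameWriter.parquet`.

```python
import math


def output_partitions(total_bytes, target_file_bytes):
    """Estimate how many output partitions give files of roughly target_file_bytes.

    Hypothetical helper: both sizes are assumptions supplied by the caller.
    """
    if total_bytes < 0 or target_file_bytes <= 0:
        raise ValueError("sizes must be non-negative and the target must be positive")
    # Round up so the average file size stays at or below the target;
    # always plan for at least one output file.
    return max(1, math.ceil(total_bytes / target_file_bytes))


def sort_and_write(spark, df, sort_col, out_path, num_partitions):
    """Sort df by sort_col and write it as parquet (hypothetical helper).

    spark is an existing SparkSession and df a DataFrame created from it.
    """
    # The shuffle behind a global sort uses spark.sql.shuffle.partitions,
    # so this setting roughly caps the number of parquet files written.
    spark.conf.set("spark.sql.shuffle.partitions", str(num_partitions))
    df.sort(sort_col).write.parquet(out_path)
```

For example, with an assumed 10 GiB of output and a 512 MiB target per file, `output_partitions(10 * 1024**3, 512 * 1024**2)` returns 20. Fewer partitions mean larger files but more data per task, which is the memory-versus-file-size tension mentioned in the replies.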