Re: Spark Executor OOMs when writing Parquet

2020-01-17 Thread Chris Teoh
[...] wrote:
>
> Yes, mostly memory spills though (36.9 TiB memory, 895 GiB disk). I was
> under the impression that memory spill is OK?
>
> (If you're wondering, this is EMR).
>
> --
> *From:* Chris Teoh
> *Sent:* January 17, 2020 10:30 AM

Re: Spark Executor OOMs when writing Parquet

2020-01-17 Thread Arwin Tio
To: Arwin Tio
Cc: user @spark
Subject: Re: Spark Executor OOMs when writing Parquet

You also have disk spill, which is a performance hit. Try multiplying the number of partitions by about 20x - 40x and see if you can eliminate shuffle spill.

On Fri, 17 Jan 2020, 10:37 pm Arwin Tio, mailto:arwin
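
For anyone following along, the "20x - 40x" suggestion above could be sketched roughly as below. This is only an illustration, not the code from the job in this thread: the helper name `scaled_partition_counts`, the starting count of 200 (Spark's default for `spark.sql.shuffle.partitions`), and the names `spark`, `df` and `output_path` in the comments are all assumptions.

```python
def scaled_partition_counts(current, factors=(20, 40)):
    """Return candidate partition counts: the current count times each factor."""
    if current <= 0:
        raise ValueError("current partition count must be positive")
    return [current * f for f in factors]


# Assumed starting point: 200, the default value of spark.sql.shuffle.partitions.
# The real job in this thread may use a different number.
candidates = scaled_partition_counts(200)
print(candidates)

# With a live SparkSession `spark` and a DataFrame `df` (both assumed here),
# one candidate could be tried like this, then the shuffle spill columns in
# the Spark UI checked again:
#   spark.conf.set("spark.sql.shuffle.partitions", candidates[0])
#   df.repartition(candidates[0]).write.parquet(output_path)
```

More partitions means each task handles a smaller slice of the shuffle, so it is more likely to fit in executor memory without spilling. The trade-off is more tasks and, if the repartitioned frame is written directly, more output files.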