Re: OOM Error

2019-09-07 Thread Ankit Khettry
Sure folks, will try later today! Best Regards Ankit Khettry On Sat, 7 Sep, 2019, 6:56 PM Sunil Kalra, wrote: > Ankit > > Can you try reducing the number of cores or increasing memory? With the > configuration below, each core is getting ~3.5 GB. Otherwise your data > is skewed, and one

Re: OOM Error

2019-09-07 Thread Sunil Kalra
Ankit, can you try reducing the number of cores or increasing memory? With the configuration below, each core is getting ~3.5 GB. Otherwise your data is skewed, and one of the cores is getting too much data for a given key. spark.executor.cores 6 spark.executor.memory 36g On Sat, Sep 7, 2019 at 6:35
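
For illustration, the ~3.5 GB figure is consistent with the default spark.memory.fraction of 0.6: 36g x 0.6 / 6 cores is roughly 3.6 GB of unified memory per concurrent task. A minimal sketch of the suggested adjustment, assuming the job builds its own SparkSession; the values here are illustrative, not the actual configuration from the thread:

  import org.apache.spark.sql.SparkSession

  val spark = SparkSession.builder()
    .appName("window-heavy-batch-job")
    // Fewer concurrent tasks per executor means more memory per task:
    // roughly 36g * 0.6 / cores of unified memory, so going from 6 cores
    // to 4 raises the per-task share from ~3.6 GB to ~5.4 GB.
    .config("spark.executor.cores", "4")
    .config("spark.executor.memory", "36g")
    .getOrCreate()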

Re: OOM Error

2019-09-07 Thread Chris Teoh
It says you have 3811 tasks in earlier stages and you're going down to 2001 partitions; that would make it more memory intensive. I'm guessing the default Spark shuffle partition count of 200 was in effect, so that would have failed. Go for a higher number, maybe even higher than 3811. What was your shuffle write from
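
A minimal sketch of raising the shuffle partition count above the earlier stage's 3811 tasks; the value 4000 is an assumption for illustration:

  // Set before the wide (groupBy / window) transformations run.
  spark.conf.set("spark.sql.shuffle.partitions", "4000")
  // Equivalent at submit time:
  //   spark-submit --conf spark.sql.shuffle.partitions=4000 ...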

Re: OOM Error

2019-09-07 Thread Chris Teoh
You can try that; consider processing each partition separately if your data is heavily skewed when you partition it. On Sat, 7 Sep 2019, 7:19 pm Ankit Khettry, wrote: > Thanks Chris > > Going to try it soon, maybe by setting spark.sql.shuffle.partitions to > 2001. Also, I was wondering if it would
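
Key salting is one common way to spread a heavily skewed key across tasks; it is not necessarily what was meant by processing partitions separately, and the DataFrame and column names below are made up for illustration:

  import org.apache.spark.sql.functions.{col, rand, sum}

  // Hypothetical DataFrame `df` with a skewed grouping column `user_id`.
  // A random salt splits one hot key across several shuffle partitions;
  // aggregate on (user_id, salt) first, then merge the partial sums.
  val salted  = df.withColumn("salt", (rand() * 16).cast("int"))
  val partial = salted.groupBy(col("user_id"), col("salt"))
    .agg(sum(col("amount")).as("partial_sum"))
  val result  = partial.groupBy(col("user_id"))
    .agg(sum(col("partial_sum")).as("total"))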

Re: OOM Error

2019-09-07 Thread Ankit Khettry
Thanks Chris. Going to try it soon, maybe by setting spark.sql.shuffle.partitions to 2001. Also, I was wondering if it would help to repartition the data by the fields I am using in the group by and window operations? Best Regards Ankit Khettry On Sat, 7 Sep, 2019, 1:05 PM Chris Teoh, wrote: > Hi
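
A minimal sketch of that repartitioning idea; the column names are assumptions, not from the thread:

  import org.apache.spark.sql.functions.col

  // Repartition by the same columns used in the groupBy / window spec so that
  // rows for a given key are colocated before the wide operation.
  val repartitioned = df.repartition(2001, col("key1"), col("key2"))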

Re: OOM Error

2019-09-07 Thread Chris Teoh
Hi Ankit, Without looking at the Spark UI and the stages/DAG, I'm guessing you're running on the default number of Spark shuffle partitions. If you're seeing a lot of shuffle spill, you likely have to increase the number of shuffle partitions to accommodate the huge shuffle size. I hope that helps
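
A rough worked example of why more partitions ease the memory pressure, assuming the shuffle is on the order of the ~900 GiB input and roughly evenly distributed:

  900 GiB / 200 partitions  ~ 4.5  GiB per shuffle task
  900 GiB / 2001 partitions ~ 0.45 GiB per shuffle task

The smaller per-task share is far easier to fit into each task's slice of executor memory.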

Re: OOM Error

2019-09-07 Thread Ankit Khettry
Nope, it's a batch job. Best Regards Ankit Khettry On Sat, 7 Sep, 2019, 6:52 AM Upasana Sharma, <028upasana...@gmail.com> wrote: > Is it a streaming job? > > On Sat, Sep 7, 2019, 5:04 AM Ankit Khettry > wrote: > >> I have a Spark job that consists of a large number of Window operations >> and

Re: OOM Error

2019-09-06 Thread Upasana Sharma
Is it a streaming job? On Sat, Sep 7, 2019, 5:04 AM Ankit Khettry wrote: > I have a Spark job that consists of a large number of Window operations > and hence involves large shuffles. I have roughly 900 GiBs of data, > although I am using a large enough cluster (10 * m5.4xlarge instances). I >
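
For context, a rough capacity check, assuming the standard m5.4xlarge spec of 16 vCPUs and 64 GiB of RAM per instance:

  10 instances x 64 GiB  = 640 GiB of total RAM (less than the ~900 GiB of data)
  10 instances x 16 vCPU = 160 cores

So the dataset cannot be held in memory at once, which is why the advice in the thread focuses on keeping the per-task memory footprint small.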

Re: OOM error with GMMs on 4GB dataset

2015-05-06 Thread Xiangrui Meng
Did you set `--driver-memory` with spark-submit? -Xiangrui On Mon, May 4, 2015 at 5:16 PM, Vinay Muttineni vmuttin...@ebay.com wrote: Hi, I am training a GMM with 10 Gaussians on a 4 GB dataset (720,000 * 760). The Spark (1.3.1) job is allocated 120 executors with 6 GB each and the driver also
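
Driver memory has to be set at launch time (configuring it from inside an already-running driver has no effect in client mode). A minimal sketch of such an invocation; the 8g value, class name, and jar name are assumptions:

  spark-submit \
    --driver-memory 8g \
    --num-executors 120 \
    --executor-memory 6g \
    --class com.example.TrainGMM \
    train-gmm.jar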

Re: OOM error

2015-02-17 Thread Harshvardhan Chauhan
Thanks for the pointer, it led me to http://spark.apache.org/docs/1.2.0/tuning.html; increasing parallelism resolved the issue. On Mon, Feb 16, 2015 at 11:57 PM, Akhil Das ak...@sigmoidanalytics.com wrote: Increase your executor memory. Also, you can play around with increasing the number of
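
A sketch of the kind of parallelism change the tuning guide describes, using the RDD-era APIs of that Spark version; the value 1000 is an assumption:

  import org.apache.spark.{SparkConf, SparkContext}

  // Raise the default number of partitions used by RDD shuffle operations.
  val conf = new SparkConf()
    .setAppName("my-app")
    .set("spark.default.parallelism", "1000")
  val sc = new SparkContext(conf)

  // Or pass an explicit partition count to an individual wide operation:
  // rdd.reduceByKey(_ + _, 1000)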

Re: OOM error

2015-02-16 Thread Akhil Das
Increase your executor memory. Also, you can play around with increasing the number of partitions/parallelism etc. Thanks Best Regards On Tue, Feb 17, 2015 at 3:39 AM, Harshvardhan Chauhan ha...@gumgum.com wrote: Hi All, I need some help with Out Of Memory errors in my application. I am