Sure folks, will try later today!
Best Regards
Ankit Khettry
On Sat, 7 Sep, 2019, 6:56 PM Sunil Kalra, wrote:
Ankit
Can you try reducing the number of cores or increasing memory? With the
configuration below, each of your cores is getting only ~3.5 GB. Otherwise,
your data is skewed and one of the cores is getting too much data for a
single key.
spark.executor.cores 6
spark.executor.memory 36g
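For reference, here is a back-of-the-envelope check of that ~3.5 GB figure. It assumes Spark's defaults of 300 MB reserved memory and spark.memory.fraction = 0.6; a rough sketch, not an exact accounting:

```python
# Rough per-core share of the unified (execution + storage) memory pool,
# assuming Spark defaults: 300 MB reserved, spark.memory.fraction = 0.6.
executor_memory_mb = 36 * 1024   # spark.executor.memory 36g
executor_cores = 6               # spark.executor.cores 6

unified_mb = (executor_memory_mb - 300) * 0.6
per_core_gb = unified_mb / executor_cores / 1024
print(round(per_core_gb, 2))  # -> 3.57
```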
On Sat, Sep 7, 2019 at 6:35
It says you have 3811 tasks in earlier stages and you're going down to 2001
partitions; that would make it more memory intensive. I'm guessing the
default Spark shuffle partitions setting was 200, so that would have failed.
Go for a higher number, maybe even higher than 3811. What was your shuffle
write from
You can try that. Also consider processing each partition separately if
your data is heavily skewed when you partition it.
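One common way to handle a heavily skewed key separately is key salting: split the hot key into several sub-keys, aggregate partially, then combine. Here is a minimal plain-Python sketch of the idea on toy data (in Spark this would be two aggregation stages; the bucket count is arbitrary):

```python
import random
from collections import defaultdict

SALT_BUCKETS = 4  # hypothetical fan-out for the hot key

# Toy skewed dataset: one key carries most of the rows.
rows = [("hot_key", 1)] * 10 + [("rare_key", 1)] * 2

# Stage 1: append a random salt to the key, so in Spark the hot key's
# rows would spread across several shuffle partitions.
partial = defaultdict(int)
for key, value in rows:
    salted = (key, random.randrange(SALT_BUCKETS))
    partial[salted] += value

# Stage 2: strip the salt and combine the partial aggregates.
final = defaultdict(int)
for (key, _salt), value in partial.items():
    final[key] += value

print(dict(final))  # {'hot_key': 10, 'rare_key': 2}
```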
Thanks Chris
Going to try it soon by setting spark.sql.shuffle.partitions to 2001.
Also, I was wondering: would it help if I repartitioned the data by the
fields I am using in the group-by and window operations?
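For what it's worth, the idea behind repartitioning by those fields is that hash partitioning co-locates all rows with the same key, so the subsequent group-by/window over that key avoids another full shuffle. A toy plain-Python illustration (not Spark code; the key names are made up):

```python
# Plain-Python toy (not Spark code): hash partitioning by the group-by
# key puts every row with the same key into the same partition.
num_partitions = 2001  # mirrors the shuffle-partition count above

def partition_for(key):
    return hash(key) % num_partitions

rows = [("user_a", 1), ("user_b", 2), ("user_a", 3)]
partitions = {}
for key, value in rows:
    partitions.setdefault(partition_for(key), []).append((key, value))

# Both "user_a" rows land in the same partition.
```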
Best Regards
Ankit Khettry
On Sat, 7 Sep, 2019, 1:05 PM Chris Teoh, wrote:
Hi Ankit,
Without looking at the Spark UI and the stages/DAG, I'm guessing you're
running on the default number of Spark shuffle partitions.
If you're seeing a lot of shuffle spill, you likely have to increase the
number of shuffle partitions to accommodate the huge shuffle size.
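For example, such a setting could go in spark-defaults.conf or be passed via --conf (the 4000 here is purely illustrative; pick a value based on your actual shuffle size):

```properties
spark.sql.shuffle.partitions   4000
```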
I hope that helps.
Nope, it's a batch job.
Best Regards
Ankit Khettry
On Sat, 7 Sep, 2019, 6:52 AM Upasana Sharma, <028upasana...@gmail.com>
wrote:
Is it a streaming job?
On Sat, Sep 7, 2019, 5:04 AM Ankit Khettry wrote:
> I have a Spark job that consists of a large number of Window operations
> and hence involves large shuffles. I have roughly 900 GiBs of data,
> although I am using a large enough cluster (10 * m5.4xlarge instances). I
Did you set `--driver-memory` with spark-submit? -Xiangrui
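For anyone following along, the flag goes on the submit command itself; the 8g below is an arbitrary illustration and the remaining arguments are elided:

```shell
spark-submit \
  --driver-memory 8g \
  <your usual arguments>
```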
On Mon, May 4, 2015 at 5:16 PM, Vinay Muttineni vmuttin...@ebay.com wrote:
Hi, I am training a GMM with 10 Gaussians on a 4 GB dataset (720,000 x 760).
The Spark (1.3.1) job is allocated 120 executors with 6 GB each, and the
driver also
Thanks for the pointer. It led me to
http://spark.apache.org/docs/1.2.0/tuning.html; increasing parallelism
resolved the issue.
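For reference, the parallelism knob from that tuning guide is spark.default.parallelism, which controls the default partition count for RDD shuffle operations (the value below is illustrative; the guide suggests roughly 2-3 tasks per CPU core in the cluster):

```properties
spark.default.parallelism   400
```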
On Mon, Feb 16, 2015 at 11:57 PM, Akhil Das ak...@sigmoidanalytics.com
wrote:
Increase your executor memory. Also, you can play around with increasing
the number of partitions/parallelism, etc.
Thanks
Best Regards
On Tue, Feb 17, 2015 at 3:39 AM, Harshvardhan Chauhan ha...@gumgum.com
wrote:
Hi All,
I need some help with Out Of Memory errors in my application. I am