In short, I don't think there is such a possibility. However, there is the
option of shutting down Spark gracefully with a checkpoint directory enabled.
That way you can re-submit the modified code, which will pick up the batch ID
from where it left off, assuming the topic is the same. See the
th
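A minimal sketch of this approach with Structured Streaming, assuming a Kafka source; the broker, topic, and S3 paths are hypothetical:

```scala
import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder.appName("graceful-restart").getOrCreate()

val stream = spark.readStream
  .format("kafka")
  .option("kafka.bootstrap.servers", "broker:9092") // hypothetical broker
  .option("subscribe", "events")                    // hypothetical topic
  .load()

val query = stream.writeStream
  .format("parquet")
  .option("path", "s3://bucket/out")                 // hypothetical output path
  .option("checkpointLocation", "s3://bucket/chk")   // offsets and batch IDs live here
  .start()

// Stopping the query and re-submitting modified code with the SAME
// checkpointLocation resumes from the last committed batch, as long as
// the source topic is unchanged.
sys.addShutdownHook { query.stop() }
query.awaitTermination()
```

The key point is that resumption comes from the checkpoint directory, not from the application code, so the code between runs can change as long as it stays compatible with the checkpointed state.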
Hi Gourav,
Tried increasing the number of shuffle partitions and allocating more
executor memory. Neither worked.
Regards
From: Gourav Sengupta
Date: Thursday, March 3, 2022 at 2:24 AM
To: Anil Dasari
Cc: Yang,Jie(INF), user@spark.apache.org
Subject: Re: {EXT} Re: Spark Parquet write OOM
Hi,
I do not th
Hi,
Is there any way I can add/delete actions/jobs dynamically in a running
Spark Streaming job?
I will call an API and execute only the actions configured in the system.
E.g., in the first batch, suppose there are 5 actions in the Spark
application.
Now suppose some configuration is changed and on
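One common way to get this effect without restarting the job is to re-read the configuration inside `foreachBatch` on every micro-batch. A sketch, assuming a Kafka source; `fetchEnabledActions()` is a hypothetical stand-in for the configuration API mentioned above, and all paths/names are illustrative:

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}

val spark = SparkSession.builder.appName("dynamic-actions").getOrCreate()

// Hypothetical: replace with a real call to your configuration API.
def fetchEnabledActions(): Set[String] = Set("writeParquet")

val stream = spark.readStream
  .format("kafka")
  .option("kafka.bootstrap.servers", "broker:9092")
  .option("subscribe", "events")
  .load()

stream.writeStream
  .foreachBatch { (batch: DataFrame, batchId: Long) =>
    // Configuration is re-fetched every micro-batch, so actions can be
    // enabled or disabled while the query keeps running.
    val enabled = fetchEnabledActions()
    if (enabled("writeParquet"))
      batch.write.mode("append").parquet("s3://bucket/out") // hypothetical path
    if (enabled("publishCounts"))
      batch.groupBy("key").count().show()
  }
  .start()
  .awaitTermination()
```

The query itself (source, trigger, checkpoint) stays fixed; only the per-batch work varies, which is what makes this safe without a restart.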
Hi,
I do not think that you are doing anything particularly concerning here.
There is a setting in Spark that limits the number of records written out
at a time; you can try that. The other thing you can try is to ensure that
the number of partitions is higher (just like you su
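Both suggestions can be sketched as follows; the record-per-file setting referred to is presumably the `maxRecordsPerFile` writer option, and the partition count, paths, and input are illustrative:

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}

val spark = SparkSession.builder.appName("parquet-write").getOrCreate()
import spark.implicits._

// Stand-in for the real DataFrame being written.
val df: DataFrame = spark.range(0, 10000000L).toDF("id")

df.repartition(400)                       // more partitions -> smaller, lighter write tasks
  .write
  .option("maxRecordsPerFile", 1000000)   // caps records written per output file
  .mode("overwrite")
  .parquet("s3://bucket/out")             // hypothetical path
```

Note that `maxRecordsPerFile` mainly controls output file size; the memory relief during the write typically comes from the higher partition count reducing the data each task buffers.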
Answers in the context. Thanks.
From: Gourav Sengupta
Date: Thursday, March 3, 2022 at 12:13 AM
To: Anil Dasari
Cc: Yang,Jie(INF), user@spark.apache.org
Subject: Re: {EXT} Re: Spark Parquet write OOM
Hi Anil,
I was trying to work out things for a while yesterday, but may need your kind
help
Hi Anil,
I was trying to work out things for a while yesterday, but may need your
kind help.
Can you please share the code for the following steps?
- Create DF from Hive (from step #c)
- Deduplicate Spark DF by primary key
- Write DF to S3 in Parquet format
- Write metadata to S3
Regards,
Gourav
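The four steps asked about above can be sketched as follows; the table name, primary-key column, S3 paths, and metadata shape are all assumptions, not the actual code being requested:

```scala
import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder
  .appName("dedup-write")
  .enableHiveSupport()
  .getOrCreate()
import spark.implicits._

// 1. Create DF from Hive (hypothetical table name)
val df = spark.table("db.events")

// 2. Deduplicate by primary key (hypothetical column name)
val deduped = df.dropDuplicates("primary_key")

// 3. Write DF to S3 in Parquet format
deduped.write.mode("overwrite").parquet("s3://bucket/events_dedup")

// 4. Write metadata to S3 (row count + timestamp as a trivial example)
Seq((deduped.count(), java.time.Instant.now.toString))
  .toDF("row_count", "written_at")
  .write.mode("overwrite").json("s3://bucket/events_dedup/_meta")
```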