How to calculate the spark.kryoserializer.buffer.max?

2023-03-26 Thread Arthur Li
Hello all,

The data is generated by vendors, and on some days its size is very large and 
overflows the default value of spark.kryoserializer.buffer.max. 
How can I calculate a suitable spark.kryoserializer.buffer.max as the data size 
changes, before the exception is raised at runtime?

I would appreciate any suggestions.

BR.
Arthur Li
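
One possible starting point, as a rough sketch rather than an official formula: 
Spark documents that spark.kryoserializer.buffer.max (default 64m) must be larger 
than any single object being serialized and must stay below 2048m, so what matters 
is the size of the largest object, not the total data size. The helper names 
(kryo_buffer_max_mib, kryo_conf) and the 2x headroom factor below are assumptions 
made for illustration; the largest-object size has to be estimated separately, 
for example from a sample of the day's data, before the job is submitted.

```python
import math

MIB = 1024 * 1024
KRYO_HARD_LIMIT_MIB = 2047   # Spark requires buffer.max to be below 2048m
KRYO_DEFAULT_MAX_MIB = 64    # Spark's default for spark.kryoserializer.buffer.max


def kryo_buffer_max_mib(largest_object_bytes, headroom=2.0):
    """Hypothetical helper: pick a buffer.max value (in MiB) for the largest object.

    `headroom` is an assumed safety factor, not a value recommended by Spark.
    """
    needed = math.ceil(largest_object_bytes * headroom / MIB)
    if needed > KRYO_HARD_LIMIT_MIB:
        # No legal setting can hold this object; the record itself must shrink.
        raise ValueError("object too large for a single Kryo buffer")
    # Never go below Spark's default.
    return max(KRYO_DEFAULT_MAX_MIB, needed)


def kryo_conf(largest_object_bytes):
    """Return the setting as a dict that can be passed on at submit time."""
    size = kryo_buffer_max_mib(largest_object_bytes)
    return {"spark.kryoserializer.buffer.max": f"{size}m"}
```

The resulting value could then be passed with --conf when the application is 
submitted. If the estimate exceeds the 2047m limit, raising the setting no longer 
helps and the oversized records would need to be split or restructured.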

-
To unsubscribe e-mail: user-unsubscr...@spark.apache.org



How to estimate the executor memory size according to the data size

2021-12-23 Thread Arthur Li
Dear experts,

Recently I have hit OOM issues in my demo jobs, which consume data from a Hive 
database. I know I can increase the executor memory size to eliminate the OOM 
error, but I don't know how to assess how much executor memory is needed, or how 
to adapt the executor memory size automatically to the data size.

Any suggestions would be appreciated.
Arthur Li
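
A back-of-the-envelope sketch of such an assessment, under stated assumptions: 
Spark reserves about 300 MiB of each executor heap and, by default, gives 
spark.memory.fraction = 0.6 of the remainder to execution and storage. The 
sketch assumes that each concurrently running task needs roughly one partition 
in deserialized form, and that data grows by an assumed factor (expansion) when 
loaded into memory. The function name and the expansion value of 3 are 
hypothetical; real jobs with wide shuffles, caching, or skewed partitions can 
need considerably more.

```python
import math

MIB = 1024 * 1024
RESERVED_MIB = 300        # heap memory Spark sets aside on every executor
MEMORY_FRACTION = 0.6     # default value of spark.memory.fraction


def executor_memory_mib(input_bytes, num_partitions, cores_per_executor,
                        expansion=3.0):
    """Hypothetical heuristic for spark.executor.memory, in MiB.

    `expansion` is an assumed on-disk to in-memory growth factor.
    """
    # Average on-disk size of one partition.
    partition_mib = input_bytes / num_partitions / MIB
    # One in-memory partition per concurrently running task.
    working_set_mib = partition_mib * expansion * cores_per_executor
    # Scale up so the working set fits in the unified memory region.
    return math.ceil(working_set_mib / MEMORY_FRACTION + RESERVED_MIB)
```

Because the estimate depends on the partition size rather than the total input 
size, increasing the number of partitions is often an alternative to increasing 
executor memory.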




Question about relationship between number of files and initial tasks(partitions)

2019-04-03 Thread Arthur Li
Hi Sparkers,

I noticed that in my Spark application, the number of tasks in the first
stage is equal to the number of files read by the application (at least for
Avro) if the number of CPU cores is less than the number of files. However,
if there are more CPU cores than files, it is usually equal to the default
parallelism. Why does it behave like this? Would this require a lot of
resources from the driver? Is there any way to decrease the number of tasks
(partitions) in the first stage without merging the files before loading?

Thanks,
Arthur
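
For reference, a simplified model of how the Spark SQL file source (DataFrame
reads) appears to size its input partitions. This is a sketch based on the
documented settings spark.sql.files.maxPartitionBytes (default 128 MB) and
spark.sql.files.openCostInBytes (default 4 MB); it assumes splittable files,
does not cover the RDD-based Hadoop input APIs, and may not reproduce the exact
behavior described above. The function names are hypothetical.

```python
MIB = 1024 * 1024


def max_split_bytes(file_sizes, default_parallelism,
                    max_partition_bytes=128 * MIB, open_cost=4 * MIB):
    """Target split size: bounded by the two settings and the bytes per core."""
    total = sum(size + open_cost for size in file_sizes)
    bytes_per_core = total // default_parallelism
    return min(max_partition_bytes, max(open_cost, bytes_per_core))


def estimate_partitions(file_sizes, default_parallelism,
                        max_partition_bytes=128 * MIB, open_cost=4 * MIB):
    """Estimate the number of first-stage tasks for a set of file sizes."""
    split = max_split_bytes(file_sizes, default_parallelism,
                            max_partition_bytes, open_cost)
    # Cut every file (assumed splittable) into chunks of at most `split` bytes.
    chunks = []
    for size in file_sizes:
        offset = 0
        while offset < size:
            chunks.append(min(split, size - offset))
            offset += split
    # Largest chunks first, then pack them greedily into partitions.
    chunks.sort(reverse=True)
    partitions, current, started = 0, 0, False
    for chunk in chunks:
        if started and current + chunk > split:
            partitions += 1          # close the current partition
            current = 0
        current += chunk + open_cost  # each chunk also pays the open cost
        started = True
    return partitions + (1 if started else 0)
```

In this model the task count depends on the file sizes, the default
parallelism, and the two settings, so raising spark.sql.files.maxPartitionBytes
or spark.sql.files.openCostInBytes, or calling coalesce after the read, are
possible ways to get fewer first-stage tasks without merging the files.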
