Re: How to estimate the executor memory size according to the data

2021-12-23 Thread Gourav Sengupta
Hi,

Just trying to understand:
1. Are you using JDBC to read the data from Hive?
2. Or are you reading the data directly from S3 and using the Hive Metastore
in Spark only to look up where the table is stored and its metadata?

Regards,
Gourav Sengupta

On Thu, Dec 23, 2021 at 2:13 PM Arthur Li wrote:

> Dear experts,
>
> Recently I have been hitting OOM errors in my demo jobs, which consume data
> from a Hive database. I know I can increase the executor memory size to
> eliminate the OOM error, but I don't know how to assess how much executor
> memory a job needs, or how to adapt the executor memory size automatically
> to the data size.
>
> Any suggestions would be appreciated.
> Arthur Li
>
> -
> To unsubscribe e-mail: user-unsubscr...@spark.apache.org
>
>


RE: How to estimate the executor memory size according to the data

2021-12-23 Thread Luca Canali
Hi Arthur,

If you are using Spark 3.x, you can use the executor metrics for memory
instrumentation.
The metrics are available in the Web UI; see
https://spark.apache.org/docs/latest/web-ui.html#stage-detail (search for
"Peak execution memory").
Executor memory metrics are also available through the REST API and the Spark
metrics system; see https://spark.apache.org/docs/latest/monitoring.html
Further information on the topic is available at
https://db-blog.web.cern.ch/blog/luca-canali/2020-08-spark3-memory-monitoring
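
As a rough illustration of the REST API route, the sketch below reads the
executor list of a running application and reports each executor's peak JVM
heap in MiB. It assumes the executors endpoint returns a peakMemoryMetrics
object with a JVMHeapMemory field, as described in the monitoring
documentation. The base URL, the helper names fetch_executors and
peak_memory_mib, and the sample values are made up for illustration and are
not part of Spark.

```python
import json
from urllib.request import urlopen


def fetch_executors(base_url, app_id):
    """Fetch the executor list of one application from the Spark REST API.

    base_url is something like "http://localhost:4040" for a running driver
    (hypothetical host and port; adjust to your deployment).
    """
    url = f"{base_url}/api/v1/applications/{app_id}/executors"
    with urlopen(url) as resp:
        return json.load(resp)


def peak_memory_mib(executors, metric="JVMHeapMemory"):
    """Map executor id -> peak value of `metric`, converted to MiB.

    Entries that carry no peakMemoryMetrics, or that lack the requested
    metric, are skipped.
    """
    peaks = {}
    for ex in executors:
        metrics = ex.get("peakMemoryMetrics") or {}
        if metric in metrics:
            peaks[ex["id"]] = metrics[metric] / (1024 * 1024)
    return peaks


# Made-up sample shaped like the REST response, for illustration only.
sample = [
    {"id": "driver", "peakMemoryMetrics": {"JVMHeapMemory": 536870912}},
    {"id": "1", "peakMemoryMetrics": {"JVMHeapMemory": 3221225472}},
    {"id": "2"},  # no metrics reported for this executor
]
print(peak_memory_mib(sample))
```

Against a live application you would call
peak_memory_mib(fetch_executors(base_url, app_id)) and compare the peaks with
the configured spark.executor.memory to judge how much headroom the job has.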
  
Best,
Luca




How to estimate the executor memory size according to the data

2021-12-23 Thread Arthur Li
Dear experts,

Recently I have been hitting OOM errors in my demo jobs, which consume data
from a Hive database. I know I can increase the executor memory size to
eliminate the OOM error, but I don't know how to assess how much executor
memory a job needs, or how to adapt the executor memory size automatically to
the data size.

Any suggestions would be appreciated.
Arthur Li
