Re: How to estimate the executor memory size according by the data
Hi,

just trying to understand:

1. Are you using JDBC to consume data from Hive?
2. Or are you reading the data directly from S3 and using the Hive Metastore in Spark only to find out where the table is stored and what its metadata is?

Regards,
Gourav Sengupta

On Thu, Dec 23, 2021 at 2:13 PM Arthur Li wrote:

> Dear experts,
>
> Recently I have been hitting OOM errors in my demo jobs, which consume
> data from a Hive database. I know I can increase the executor memory
> size to eliminate the OOM error, but I don't know how to assess how much
> executor memory is needed, or how to adapt the executor memory size
> automatically to the data size.
>
> Any suggestions would be appreciated.
>
> Arthur Li
RE: How to estimate the executor memory size according by the data
Hi Arthur,

If you are using Spark 3.x, you can use executor metrics for memory instrumentation.

Metrics are available in the Web UI, see https://spark.apache.org/docs/latest/web-ui.html#stage-detail (search for Peak execution memory).

Executor memory metrics are also available in the REST API and in the Spark metrics system, see https://spark.apache.org/docs/latest/monitoring.html

Further information on the topic is available at https://db-blog.web.cern.ch/blog/luca-canali/2020-08-spark3-memory-monitoring

Best,
Luca

-----Original Message-----
From: Arthur Li
Sent: Thursday, December 23, 2021 15:11
To: user@spark.apache.org
Subject: How to estimate the executor memory size according by the data

Dear experts,

Recently I have been hitting OOM errors in my demo jobs, which consume data from a Hive database. I know I can increase the executor memory size to eliminate the OOM error, but I don't know how to assess how much executor memory is needed, or how to adapt the executor memory size automatically to the data size.

Any suggestions would be appreciated.

Arthur Li
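As a rough illustration of the REST API route mentioned above, here is a minimal Python sketch that reads per-executor peak JVM heap usage. It assumes the `/api/v1/applications/[app-id]/executors` endpoint described on the monitoring page, and it assumes each executor entry carries a `peakMemoryMetrics` map with a `JVMHeapMemory` value in bytes; please check the field names against the output of your own Spark version. The function names, the UI address and the application id are hypothetical placeholders.

```python
import json
from urllib.request import urlopen


def fetch_executors(ui_url, app_id):
    """Fetch the executor list from the Spark REST API.

    ui_url and app_id are placeholders (assumptions), e.g. the UI of a
    running application at http://localhost:4040 and its application id.
    """
    url = f"{ui_url}/api/v1/applications/{app_id}/executors"
    with urlopen(url) as resp:
        return json.load(resp)


def peak_jvm_heap_mib(executors):
    """Map executor id -> peak JVM heap in MiB.

    Assumes each entry may contain a "peakMemoryMetrics" map with a
    "JVMHeapMemory" value in bytes; entries without it are skipped.
    """
    peaks = {}
    for ex in executors:
        metrics = ex.get("peakMemoryMetrics") or {}
        if "JVMHeapMemory" in metrics:
            peaks[ex["id"]] = metrics["JVMHeapMemory"] / (1024 * 1024)
    return peaks


# Example use against a running application (not executed here):
#   executors = fetch_executors("http://localhost:4040", "<your-app-id>")
#   for executor_id, mib in peak_jvm_heap_mib(executors).items():
#       print(executor_id, round(mib), "MiB")
```

Comparing these peaks with the configured `spark.executor.memory` across a few runs with different input sizes gives a first, empirical idea of how much headroom the job has. It does not by itself predict the memory needed for a new data size.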
How to estimate the executor memory size according by the data
Dear experts,

Recently I have been hitting OOM errors in my demo jobs, which consume data from a Hive database. I know I can increase the executor memory size to eliminate the OOM error, but I don't know how to assess how much executor memory is needed, or how to adapt the executor memory size automatically to the data size.

Any suggestions would be appreciated.

Arthur Li

---------------------------------------------------------------------
To unsubscribe e-mail: user-unsubscr...@spark.apache.org