[ https://issues.apache.org/jira/browse/SPARK-41053?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Gengliang Wang updated SPARK-41053:
-----------------------------------
    Summary: Better Spark UI scalability and Driver stability for large applications  (was: Support disk-based KV store in Spark live UI)

> Better Spark UI scalability and Driver stability for large applications
> -----------------------------------------------------------------------
>
>                 Key: SPARK-41053
>                 URL: https://issues.apache.org/jira/browse/SPARK-41053
>             Project: Spark
>          Issue Type: Umbrella
>          Components: Spark Core, Web UI
>    Affects Versions: 3.4.0
>            Reporter: Gengliang Wang
>            Priority: Major
>
> The current architecture of the Spark live UI and the Spark history server (SHS) is too simple to serve large clusters and heavy workloads:
> * Spark stores all the live UI data in memory. This data can grow to a few GBs and threatens the driver's stability (OOM).
> * At most 1000 queries can be stored, and this limit cannot simply be raised under the current architecture. Memory profiling shows that storing one query execution detail takes about 800 KB, while storing one task takes about 0.3 KB. So for 1000 SQL queries with 1000 × 2000 tasks, the memory usage for query execution and task data alone is about 1.4 GB. The Spark UI also stores UI data for jobs/stages/executors, so storing 10k queries may take more than 14 GB.
> * The SHS has to parse JSON-format event logs on initial start. Uncompressed event logs can be as large as a few GBs, and parsing can be quite slow; some users reported waiting more than half an hour.
>
> The proposal is to:
> # Store all the live UI data in a local RocksDB instance with protobuf serialization.
> # Use the live UI's RocksDB files directly in the SHS.
> # If the RocksDB files are unavailable to the SHS, write event logs in protobuf for faster replay.
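
The memory estimate in the issue can be checked with a quick back-of-envelope calculation using only the figures it gives (800 KB per query execution, 0.3 KB per task, 2000 tasks per query):

```python
# Back-of-envelope check of the memory estimate in the issue description.
PER_QUERY_KB = 800        # one SQL query execution detail
PER_TASK_KB = 0.3         # one task
TASKS_PER_QUERY = 2_000

queries = 1_000
tasks = queries * TASKS_PER_QUERY

total_kb = queries * PER_QUERY_KB + tasks * PER_TASK_KB
total_gb = total_kb / 1_000_000   # decimal units, as in the issue
print(f"{total_gb:.1f} GB")       # 1.4 GB for 1000 queries, so ~14 GB for 10k
```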
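
The shape of the proposed design (serialize each UI entity and keep it in a disk-backed key-value store rather than driver memory) can be sketched in miniature. This is a stand-in only: the proposal uses RocksDB with protobuf serialization, while the sketch below substitutes Python's stdlib `dbm` and `json`, and the `sql/<id>` key scheme and record fields are illustrative, not Spark's actual layout:

```python
# Stand-in sketch: UI entities serialized into a disk-backed KV store,
# so the driver heap no longer holds them. dbm/json stand in for
# RocksDB/protobuf; the key scheme and record fields are hypothetical.
import dbm
import json
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "ui-store")
with dbm.open(path, "c") as store:
    # Write path: serialize the entity and key it by type + id.
    execution = {"executionId": 42, "status": "COMPLETED", "numTasks": 2000}
    store[b"sql/42"] = json.dumps(execution).encode()

    # Read path: a UI page deserializes the record on demand.
    loaded = json.loads(store[b"sql/42"].decode())
    print(loaded["status"])  # COMPLETED
```

Because the store lives on local disk in a portable format, the same files can later be opened by another process, which is what lets the SHS reuse the live UI's RocksDB files directly instead of replaying event logs.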
> -- This message was sent by Atlassian Jira (v8.20.10#820010)