Dejiu Lu created SPARK-55963:
--------------------------------
Summary: Optimize snapshot traversal in ExecutorPodsAllocator
Key: SPARK-55963
URL: https://issues.apache.org/jira/browse/SPARK-55963
Project: Spark
Issue Type: Improvement
Components: Kubernetes
Affects Versions: 4.1.1
Reporter: Dejiu Lu
Optimize the snapshot preprocessing in `ExecutorPodsAllocator.onNewSnapshots()`.
Currently, to collect known executor IDs and PVC names, the code calls
`flatMap` + `distinct` across all snapshots. Since each snapshot is built
incrementally via `withUpdate()`, consecutive snapshots overlap heavily, so the
same executor entries are traversed and deduplicated repeatedly.
We can merge all snapshots into a single aggregated map up front, which
deduplicates by executor ID naturally. PVC names are then extracted from this
deduplicated map.
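
A minimal sketch of the idea, not the actual patch. `PodState`, `Snapshot`,
and the helper names below are hypothetical, simplified stand-ins for Spark's
internal snapshot types. The sketch assumes each snapshot exposes a map keyed
by executor ID and that a later snapshot's entry should replace an earlier one
for the same ID.

```scala
// Hypothetical, simplified stand-ins for Spark's internal types (assumption).
final case class PodState(pvcNames: Seq[String])
final case class Snapshot(executorPods: Map[Long, PodState])

object SnapshotAggregation {
  // Approach described as current: flatten every snapshot, then deduplicate.
  def idsViaDistinct(snapshots: Seq[Snapshot]): Seq[Long] =
    snapshots.flatMap(_.executorPods.keys).distinct

  // Proposed approach: fold all snapshots into one map keyed by executor ID.
  // Keys deduplicate naturally; entries from later snapshots overwrite earlier ones.
  def aggregate(snapshots: Seq[Snapshot]): Map[Long, PodState] =
    snapshots.foldLeft(Map.empty[Long, PodState]) { (acc, snapshot) =>
      acc ++ snapshot.executorPods
    }

  // PVC names are read from the already-deduplicated map.
  def pvcNames(aggregated: Map[Long, PodState]): Set[String] =
    aggregated.values.flatMap(_.pvcNames).toSet
}
```

With this shape, the executor IDs are simply `aggregate(snapshots).keySet`, and
the PVC names are computed once per executor rather than once per snapshot
entry.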