Dejiu Lu created SPARK-55963:
--------------------------------

             Summary: Optimize snapshot traversal in ExecutorPodsAllocator
                 Key: SPARK-55963
                 URL: https://issues.apache.org/jira/browse/SPARK-55963
             Project: Spark
          Issue Type: Improvement
          Components: Kubernetes
    Affects Versions: 4.1.1
            Reporter: Dejiu Lu


Optimize the snapshot preprocessing in `ExecutorPodsAllocator.onNewSnapshots()`.

Currently, to collect known executor IDs and PVC names, the code calls
`flatMap` + `distinct` across all snapshots. Since each snapshot is built
incrementally from the previous one via `withUpdate()`, consecutive snapshots
overlap heavily, so most of that traversal is redundant work.

We can merge all snapshots into a single aggregated map up front, which 
deduplicates by executor ID naturally. PVC names are then extracted from this 
deduplicated map. 
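
The idea can be sketched as follows. This is a minimal illustration, not the
actual patch: `PodState`, `Snapshot`, and `SnapshotMergeSketch` are
hypothetical, simplified stand-ins for Spark's internal snapshot types, and PVC
names are modelled as a plain `Seq[String]` instead of being read from the pod
spec.

```scala
// Hypothetical, simplified stand-ins for Spark's internal types (illustration only).
final case class PodState(execId: Long, pvcNames: Seq[String])

final case class Snapshot(executorPods: Map[Long, PodState]) {
  // Each new snapshot copies the previous one and applies a single update,
  // so consecutive snapshots share almost all of their entries.
  def withUpdate(pod: PodState): Snapshot =
    Snapshot(executorPods + (pod.execId -> pod))
}

object SnapshotMergeSketch {
  // Approach described as current: collect keys from every snapshot, then deduplicate.
  def idsViaFlatMap(snapshots: Seq[Snapshot]): Seq[Long] =
    snapshots.flatMap(_.executorPods.keys).distinct

  // Proposed approach: fold all snapshots into one map keyed by executor ID.
  // For a repeated ID, the entry from the later snapshot replaces the earlier one.
  def merge(snapshots: Seq[Snapshot]): Map[Long, PodState] =
    snapshots.foldLeft(Map.empty[Long, PodState]) { (acc, s) =>
      acc ++ s.executorPods
    }

  // PVC names are extracted once per distinct executor, from the merged map.
  def pvcNames(merged: Map[Long, PodState]): Set[String] =
    merged.values.flatMap(_.pvcNames).toSet

  def main(args: Array[String]): Unit = {
    val s1 = Snapshot(Map.empty).withUpdate(PodState(1L, Seq("pvc-1")))
    val s2 = s1.withUpdate(PodState(2L, Seq("pvc-2")))
    val s3 = s2.withUpdate(PodState(3L, Seq("pvc-3")))
    val snapshots = Seq(s1, s2, s3)

    val merged = merge(snapshots)
    // Both approaches agree on the set of known executor IDs.
    assert(merged.keySet == idsViaFlatMap(snapshots).toSet)
    println(merged.keySet.toSeq.sorted.mkString(","))    // prints 1,2,3
    println(pvcNames(merged).toSeq.sorted.mkString(",")) // prints pvc-1,pvc-2,pvc-3
  }
}
```

In this sketch the merge still visits each snapshot entry once, but the
per-pod PVC extraction runs once per distinct executor ID rather than once per
occurrence across all snapshots, and no separate `distinct` pass is needed.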




