Copilot commented on code in PR #3662:
URL: https://github.com/apache/celeborn/pull/3662#discussion_r3072507383
##########
client/src/main/scala/org/apache/celeborn/client/LifecycleManager.scala:
##########
@@ -2047,6 +2050,9 @@ class LifecycleManager(val appUniqueId: String, val conf:
CelebornConf) extends
.asScala
.keys
.filter(!commitManager.isStageEnd(_))
+ .filter(celebornShuffleIdToAppShuffleIdMap.contains(_))
+ .map(celebornShuffleIdToAppShuffleIdMap.get(_))
Review Comment:
`ConcurrentHashMap.contains(...)` checks for a *value* (it is a legacy method
equivalent to `containsValue`), not a key. Here we need to filter by whether the
Celeborn shuffleId exists as a key in `celebornShuffleIdToAppShuffleIdMap`;
otherwise quota cancellation will frequently skip active shuffles. Use
`containsKey` (or a single `get` + null/Option check) before mapping to the
appShuffleId.
```suggestion
.flatMap(shuffleId =>
Option(celebornShuffleIdToAppShuffleIdMap.get(shuffleId)))
```
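For illustration, here is a minimal standalone sketch of the difference between `contains` and `containsKey` on a `java.util.concurrent.ConcurrentHashMap`. The object name `ContainsVsContainsKey`, the helper `appShuffleIdsFor`, and the sample ids are hypothetical and not taken from the PR; only the map method names come from the JDK.

```scala
import java.util.concurrent.ConcurrentHashMap

// Hypothetical demo object; not part of LifecycleManager.
object ContainsVsContainsKey {

  // Sample mapping: Celeborn shuffleId -> app shuffleId (made-up values).
  def sampleMap(): ConcurrentHashMap[Int, Int] = {
    val m = new ConcurrentHashMap[Int, Int]()
    m.put(10, 1)
    m.put(11, 2)
    m
  }

  // Keep only the Celeborn shuffle ids that are present as KEYS,
  // then translate them to app shuffle ids.
  def appShuffleIdsFor(
      celebornIds: Seq[Int],
      m: ConcurrentHashMap[Int, Int]): Seq[Int] =
    celebornIds
      .filter(id => m.containsKey(id))
      .map(id => m.get(id))

  def main(args: Array[String]): Unit = {
    val m = sampleMap()
    // contains(x) searches the values, so a key is not found this way.
    println(m.contains(10))    // false: 10 is a key, not a value
    println(m.contains(1))     // true: 1 is a value
    // containsKey(x) is the key lookup the filter needs.
    println(m.containsKey(10)) // true
    println(appShuffleIdsFor(Seq(10, 11, 12), m)) // List(1, 2)
  }
}
```

Note that `containsKey` followed by `get` is two separate lookups, so an entry removed in between would not be caught; that is one reason a single `get` with a presence check may be preferable in concurrent code.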
##########
client/src/main/scala/org/apache/celeborn/client/LifecycleManager.scala:
##########
@@ -109,6 +109,7 @@ class LifecycleManager(val appUniqueId: String, val conf:
CelebornConf) extends
private val shuffleIdMapping = JavaUtils.newConcurrentHashMap[
Int,
scala.collection.mutable.LinkedHashMap[String, (Int, Boolean)]]()
+ private val celebornShuffleIdToAppShuffleIdMap =
JavaUtils.newConcurrentHashMap[Int, Int]()
private val shuffleIdGenerator = new AtomicInteger(0)
Review Comment:
`celebornShuffleIdToAppShuffleIdMap` is only ever added to, and it is not
cleared when shuffles expire/unregister (e.g., `removeExpiredShuffle` removes
`shuffleAllocatedWorkers`, `latestPartitionLocation`, etc. but not this map).
In long-running drivers this can grow without bound; consider removing the
mapping when a shuffle is expired/removed (and/or when unregistering app
shuffles).
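As a rough sketch of the cleanup being proposed, assuming a hypothetical `ShuffleIdRegistry` wrapper (the class and its method names are illustrative and do not exist in Celeborn), the mapping could be dropped in the same path that expires a shuffle:

```scala
import java.util.concurrent.ConcurrentHashMap

// Hypothetical stand-in for the id bookkeeping; names are illustrative only.
class ShuffleIdRegistry {
  private val celebornShuffleIdToAppShuffleId = new ConcurrentHashMap[Int, Int]()

  // Record the mapping when a Celeborn shuffle id is created.
  def register(celebornShuffleId: Int, appShuffleId: Int): Unit = {
    celebornShuffleIdToAppShuffleId.put(celebornShuffleId, appShuffleId)
  }

  // Call this from whichever path expires or unregisters a shuffle,
  // so the map does not keep entries for shuffles that are gone.
  def onShuffleExpired(celebornShuffleId: Int): Unit = {
    celebornShuffleIdToAppShuffleId.remove(celebornShuffleId)
  }

  // Number of mappings currently tracked.
  def size: Int = celebornShuffleIdToAppShuffleId.size()
}
```

Removing an id that was never registered is a no-op here, so the cleanup call can be made unconditionally from the expiry path.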
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.