Copilot commented on code in PR #3662:
URL: https://github.com/apache/celeborn/pull/3662#discussion_r3072507383


##########
client/src/main/scala/org/apache/celeborn/client/LifecycleManager.scala:
##########
@@ -2047,6 +2050,9 @@ class LifecycleManager(val appUniqueId: String, val conf: CelebornConf) extends
         .asScala
         .keys
         .filter(!commitManager.isStageEnd(_))
+        .filter(celebornShuffleIdToAppShuffleIdMap.contains(_))
+        .map(celebornShuffleIdToAppShuffleIdMap.get(_))

Review Comment:
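   `ConcurrentHashMap.contains(...)` checks for a *value* (it is a legacy alias of
`containsValue`, kept for `Hashtable` compatibility), not a key. Here we need to
filter by whether the Celeborn shuffleId exists as a key in
`celebornShuffleIdToAppShuffleIdMap`; otherwise quota cancellation can skip
active shuffles whenever the Celeborn shuffleId does not also appear as a value.
Use `containsKey` before mapping to the appShuffleId. A single `get` wrapped in
`Option` is not a safe substitute here: the map is declared as `[Int, Int]`, so
in Scala the `null` returned for a missing key is unboxed to `0` and
`Option(...)` never yields `None`.
   ```suggestion
           .filter(celebornShuffleIdToAppShuffleIdMap.containsKey(_))
           .map(celebornShuffleIdToAppShuffleIdMap.get(_))
   ```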
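
   For illustration only, a minimal standalone sketch of the difference between
`contains` and `containsKey` on a `ConcurrentHashMap[Int, Int]`. The object name,
the helper `toAppShuffleIds`, and the sample ids are hypothetical and not part of
the PR; the map is a stand-in for `celebornShuffleIdToAppShuffleIdMap`.

   ```scala
import java.util.concurrent.ConcurrentHashMap

object ContainsVsContainsKey {
  // Hypothetical stand-in for celebornShuffleIdToAppShuffleIdMap:
  // Celeborn shuffleId 7 maps to app shuffleId 100.
  def sampleMap(): ConcurrentHashMap[Int, Int] = {
    val m = new ConcurrentHashMap[Int, Int]()
    m.put(7, 100)
    m
  }

  // Keep only Celeborn shuffleIds that are keys in the map, then translate
  // them to app shuffleIds.
  def toAppShuffleIds(celebornIds: Seq[Int], m: ConcurrentHashMap[Int, Int]): Seq[Int] =
    celebornIds.filter(m.containsKey(_)).map(m.get(_))

  def main(args: Array[String]): Unit = {
    val m = sampleMap()
    println(m.contains(7))                 // false: 7 is a key, not a value
    println(m.contains(100))               // true: 100 is a value
    println(m.containsKey(7))              // true
    println(Option(m.get(8)))              // Some(0): null is unboxed to 0
    println(toAppShuffleIds(Seq(7, 8), m)) // List(100)
  }
}
   ```

   The `Some(0)` line is why the key check is done with `containsKey` rather
than a null check on the result of `get` for an `Int`-valued map.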



##########
client/src/main/scala/org/apache/celeborn/client/LifecycleManager.scala:
##########
@@ -109,6 +109,7 @@ class LifecycleManager(val appUniqueId: String, val conf: CelebornConf) extends
   private val shuffleIdMapping = JavaUtils.newConcurrentHashMap[
     Int,
     scala.collection.mutable.LinkedHashMap[String, (Int, Boolean)]]()
+  private val celebornShuffleIdToAppShuffleIdMap = JavaUtils.newConcurrentHashMap[Int, Int]()
   private val shuffleIdGenerator = new AtomicInteger(0)

Review Comment:
   `celebornShuffleIdToAppShuffleIdMap` is only ever added to, and it is not 
cleared when shuffles expire/unregister (e.g., `removeExpiredShuffle` removes 
`shuffleAllocatedWorkers`, `latestPartitionLocation`, etc. but not this map). 
In long-running drivers this can grow without bound; consider removing the 
mapping when a shuffle is expired/removed (and/or when unregistering app 
shuffles).
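
   As a rough sketch of the kind of cleanup meant here (not the actual
`LifecycleManager` code): the class `ShuffleIdRegistry` and its methods
`register`, `onShuffleExpired`, and `onAppShuffleUnregistered` are hypothetical
names, assumed only for illustration. The idea is to drop the entry wherever the
other per-shuffle state is already released.

   ```scala
import java.util.concurrent.ConcurrentHashMap

// Hypothetical holder for the Celeborn shuffleId -> app shuffleId mapping.
class ShuffleIdRegistry {
  private val celebornToApp = new ConcurrentHashMap[Int, Int]()

  def register(celebornShuffleId: Int, appShuffleId: Int): Unit = {
    celebornToApp.put(celebornShuffleId, appShuffleId)
  }

  // Call from the same place that releases other per-shuffle state on expiry.
  def onShuffleExpired(celebornShuffleId: Int): Unit = {
    celebornToApp.remove(celebornShuffleId)
  }

  // Drop every Celeborn shuffleId that belongs to an unregistered app shuffle.
  def onAppShuffleUnregistered(appShuffleId: Int): Unit = {
    val it = celebornToApp.entrySet().iterator()
    while (it.hasNext) {
      if (it.next().getValue == appShuffleId) it.remove()
    }
  }

  def containsCelebornShuffle(celebornShuffleId: Int): Boolean =
    celebornToApp.containsKey(celebornShuffleId)

  def size: Int = celebornToApp.size()
}
   ```

   With either hook in place, the map's size tracks the number of live shuffles
instead of every shuffle the driver has ever registered.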



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

