jiangxb1987 commented on code in PR #40690:
URL: https://github.com/apache/spark/pull/40690#discussion_r1160971328


##########
core/src/main/scala/org/apache/spark/MapOutputTracker.scala:
##########
@@ -157,22 +164,29 @@ private class ShuffleStatus(
       invalidateSerializedMapOutputStatusCache()
     }
     mapStatuses(mapIndex) = status
+    mapIdToMapIndex(status.mapId) = mapIndex
   }
 
   /**
    * Update the map output location (e.g. during migration).
    */
   def updateMapOutput(mapId: Long, bmAddress: BlockManagerId): Unit = withWriteLock {
     try {
-      val mapStatusOpt = mapStatuses.find(x => x != null && x.mapId == mapId)
+      // OpenHashMap would return 0 if the key doesn't exist.
+      val mapIndex = if (mapIdToMapIndex.contains(mapId)) {
+        Some(mapIdToMapIndex(mapId))
+      } else {
+        None
+      }

Review Comment:
   The problem is that OpenHashMap casts a null value to the declared type 
internally. Thus, if the key doesn't exist, it casts `null` to Int and the 
result is 0. Without calling the `contains` function, we cannot tell whether 
the key doesn't exist or the value is really 0.
   
   I'll try to follow the first option to add a `get(key): Option[V]`.
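
   To illustrate the ambiguity, here is a minimal standalone sketch. It does 
not use Spark's OpenHashMap: `DefaultingMap` and `MissingKeyDemo` are 
hypothetical names, and the class only mimics the missing-key behavior 
described above, assuming the lookup falls back to `null.asInstanceOf[V]`. 
The `get` method shows roughly what an Option-returning accessor could look 
like, not what the final patch will contain.

```scala
import scala.collection.mutable

object MissingKeyDemo {
  // Hypothetical stand-in for illustration only; this is NOT Spark's OpenHashMap.
  final class DefaultingMap[V] {
    private val entries = mutable.HashMap.empty[Long, V]

    def update(key: Long, value: V): Unit = entries.update(key, value)

    def contains(key: Long): Boolean = entries.contains(key)

    // Mimics the behavior under discussion: a missing key yields null cast to V,
    // which is unboxed to 0 when V is Int.
    def apply(key: Long): V = entries.getOrElse(key, null.asInstanceOf[V])

    // Sketch of an Option-returning accessor: absence is explicit.
    def get(key: Long): Option[V] = entries.get(key)
  }

  def main(args: Array[String]): Unit = {
    val mapIdToMapIndex = new DefaultingMap[Int]
    mapIdToMapIndex(7L) = 0 // map id 7 really is stored at index 0

    // Both lookups give the same Int, so apply alone cannot tell them apart.
    println(mapIdToMapIndex(7L))
    println(mapIdToMapIndex(99L))

    // With get, the present and missing cases are distinguishable.
    println(mapIdToMapIndex.get(7L))
    println(mapIdToMapIndex.get(99L))
  }
}
```

   With an accessor like this, the `contains` check followed by `apply` in 
`updateMapOutput` could collapse into a single `get(mapId)` call.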



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

