jiangxb1987 commented on code in PR #40690: URL: https://github.com/apache/spark/pull/40690#discussion_r1160971328
##########
core/src/main/scala/org/apache/spark/MapOutputTracker.scala:
##########
@@ -157,22 +164,29 @@ private class ShuffleStatus(
       invalidateSerializedMapOutputStatusCache()
     }
     mapStatuses(mapIndex) = status
+    mapIdToMapIndex(status.mapId) = mapIndex
   }
 
   /**
    * Update the map output location (e.g. during migration).
    */
   def updateMapOutput(mapId: Long, bmAddress: BlockManagerId): Unit = withWriteLock {
     try {
-      val mapStatusOpt = mapStatuses.find(x => x != null && x.mapId == mapId)
+      // OpenHashMap would return 0 if the key doesn't exist.
+      val mapIndex = if (mapIdToMapIndex.contains(mapId)) {
+        Some(mapIdToMapIndex(mapId))
+      } else {
+        None
+      }

Review Comment:
The problem is that OpenHashMap internally casts a null value to the declared value type. So if the key doesn't exist, `null` is cast to Int and the result is 0. Without calling `contains`, we cannot tell whether the key is missing or the stored value really is 0. I'll try to follow the first option and add a `get(key): Option[V]`.
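
To illustrate the ambiguity described in the review comment, here is a minimal, self-contained sketch. `NullDefaultMap` is a hypothetical stand-in written for this example, not Spark's `OpenHashMap`; it only models a lookup that returns `null.asInstanceOf[V]` on a miss. The `get` method shows one assumed shape for the proposed `get(key): Option[V]`, and the real change may look different.

```scala
import scala.collection.mutable

// Hypothetical model of a map whose apply() yields null.asInstanceOf[V]
// for a missing key. This is NOT Spark's OpenHashMap.
final class NullDefaultMap[K, V] {
  private val underlying = mutable.HashMap.empty[K, V]

  def update(k: K, v: V): Unit = underlying(k) = v

  def contains(k: K): Boolean = underlying.contains(k)

  // On a miss, null is cast to V; when V is Int the caller observes 0.
  def apply(k: K): V = underlying.getOrElse(k, null.asInstanceOf[V])

  // Assumed shape of the proposed accessor: a miss is reported as None.
  def get(k: K): Option[V] = underlying.get(k)
}

object NullDefaultMapDemo {
  def main(args: Array[String]): Unit = {
    val mapIdToMapIndex = new NullDefaultMap[Long, Int]
    mapIdToMapIndex(42L) = 0 // a real entry whose value happens to be 0

    val hit: Int = mapIdToMapIndex(42L) // key present, stored value 0
    val miss: Int = mapIdToMapIndex(7L) // key absent, null unboxed to 0
    println(s"hit=$hit miss=$miss")     // prints "hit=0 miss=0"

    // With an Option-returning lookup the two cases are distinguishable.
    println(mapIdToMapIndex.get(42L))   // prints "Some(0)"
    println(mapIdToMapIndex.get(7L))    // prints "None"
  }
}
```

With `apply` alone, a present key mapped to index 0 and an absent key both read as 0, which is why the diff has to guard the lookup with `contains`. An `Option`-returning lookup would fold that check into a single call.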