Github user Ngone51 commented on a diff in the pull request:

    https://github.com/apache/spark/pull/21369#discussion_r189438190

    --- Diff: core/src/main/scala/org/apache/spark/util/collection/ExternalAppendOnlyMap.scala ---
    @@ -585,17 +592,15 @@ class ExternalAppendOnlyMap[K, V, C](
           } else {
             logInfo(s"Task ${context.taskAttemptId} force spilling in-memory map to disk and " +
               s"it will release ${org.apache.spark.util.Utils.bytesToString(getUsed())} memory")
    -        nextUpstream = spillMemoryIteratorToDisk(upstream)
    +        val nextUpstream = spillMemoryIteratorToDisk(upstream)
    +        assert(!upstream.hasNext)
             hasSpilled = true
    +        upstream = nextUpstream
    --- End diff --

    Does this change mean that we should reassign `upstream` **immediately** after the spill (which drops the reference to `currentMap`), because otherwise we may hit OOM, e.g. if `readNext()` is never called after the spill? Is that the real cause of the JIRA issue?
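
    To make the question concrete, here is a minimal sketch of the two behaviours I have in mind. It is not the actual `ExternalAppendOnlyMap` code: `SpillSketch`, `InMemoryIterator`, `spillToDisk`, `eagerSwap` and `holdsInMemoryData` are hypothetical names, and the "disk" is just an in-memory copy. It assumes the old code only swapped `upstream` on the next read.

```scala
// Illustrative sketch only: hypothetical names, not Spark's ExternalAppendOnlyMap.
object SpillSketch {
  // Stands in for the iterator over the in-memory map; it keeps `data` reachable.
  final class InMemoryIterator(val data: Array[Int]) extends Iterator[Int] {
    private var pos = 0
    def hasNext: Boolean = pos < data.length
    def next(): Int = { val v = data(pos); pos += 1; v }
  }

  // Stands in for spillMemoryIteratorToDisk: drains `it` and returns an iterator
  // over the "spilled" copy. A real implementation would write to a file.
  def spillToDisk(it: Iterator[Int]): Iterator[Int] = it.toVector.iterator

  final class SpillableIterator(initial: Iterator[Int], eagerSwap: Boolean)
      extends Iterator[Int] {
    private var upstream: Iterator[Int] = initial
    private var nextUpstream: Iterator[Int] = null

    def spill(): Unit = {
      val spilled = spillToDisk(upstream)
      assert(!upstream.hasNext) // the in-memory iterator has been fully drained
      if (eagerSwap) upstream = spilled // old iterator is no longer referenced here
      else nextUpstream = spilled       // old iterator stays referenced until the next read
    }

    // Deferred variant: the swap only happens when someone reads again.
    private def swapIfNeeded(): Unit =
      if (nextUpstream != null) { upstream = nextUpstream; nextUpstream = null }

    def hasNext: Boolean = { swapIfNeeded(); upstream.hasNext }
    def next(): Int = { swapIfNeeded(); upstream.next() }

    // Exposed only so the sketch can be inspected.
    def holdsInMemoryData: Boolean = upstream.isInstanceOf[InMemoryIterator]
  }
}
```

    With `eagerSwap = false`, calling `spill()` and then never reading again leaves `holdsInMemoryData` true, so the backing data cannot be collected. With `eagerSwap = true` the reference is dropped as soon as `spill()` returns.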