[ https://issues.apache.org/jira/browse/SPARK-17465?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15541946#comment-15541946 ]
Xing Shi commented on SPARK-17465:
----------------------------------

Resolved.

In every task, the method _currentUnrollMemory_ is called several times. It scans all the keys of _unrollMemoryMap_ and _pendingUnrollMemoryMap_, so its processing time is proportional to the size of those maps.

https://github.com/apache/spark/blob/v1.6.0/core/src/main/scala/org/apache/spark/storage/MemoryStore.scala#L540-L542

I checked the processing time of _currentUnrollMemory_: it accounts exactly for the increase in processing time observed before. Hope this helps anyone who sees a similar increase in processing time after upgrading Spark to 1.6.0 :)

> Inappropriate memory management in `org.apache.spark.storage.MemoryStore` may lead to memory leak
> --------------------------------------------------------------------------------------------------
>
>                  Key: SPARK-17465
>                  URL: https://issues.apache.org/jira/browse/SPARK-17465
>              Project: Spark
>           Issue Type: Bug
>           Components: Spark Core
>     Affects Versions: 1.6.0, 1.6.1, 1.6.2
>             Reporter: Xing Shi
>             Assignee: Xing Shi
>              Fix For: 1.6.3, 2.0.1, 2.1.0
>
>
> After updating Spark from 1.5.0 to 1.6.0, I found what appears to be a memory leak in my Spark Streaming application.
> Here is the head of the heap histogram of my application, which had been running for about 160 hours:
> {code:borderStyle=solid}
>  num     #instances         #bytes  class name
> ----------------------------------------------
>    1:         28094       71753976  [B
>    2:       1188086       28514064  java.lang.Long
>    3:       1183844       28412256  scala.collection.mutable.DefaultEntry
>    4:        102242       13098768  <methodKlass>
>    5:        102242       12421000  <constMethodKlass>
>    6:          8184        9199032  <constantPoolKlass>
>    7:            38        8391584  [Lscala.collection.mutable.HashEntry;
>    8:          8184        7514288  <instanceKlassKlass>
>    9:          6651        4874080  <constantPoolCacheKlass>
>   10:         37197        3438040  [C
>   11:          6423        2445640  <methodDataKlass>
>   12:          8773        1044808  java.lang.Class
>   13:         36869         884856  java.lang.String
>   14:         15715         848368  [[I
>   15:         13690         782808  [S
>   16:         18903         604896  java.util.concurrent.ConcurrentHashMap$HashEntry
>   17:            13         426192  [Lscala.concurrent.forkjoin.ForkJoinTask;
> {code}
> It shows that *scala.collection.mutable.DefaultEntry* and *java.lang.Long* have unexpectedly large numbers of instances. In fact, the numbers started growing when the streaming process began, and kept growing in proportion to the total number of tasks.
> After some further investigation, I found that the problem is caused by inappropriate memory management in the _releaseUnrollMemoryForThisTask_ and _unrollSafely_ methods of class [org.apache.spark.storage.MemoryStore|https://github.com/apache/spark/blob/branch-1.6/core/src/main/scala/org/apache/spark/storage/MemoryStore.scala].
> In Spark 1.6.x, a _releaseUnrollMemoryForThisTask_ operation is processed only when the parameter _memoryToRelease_ > 0:
> https://github.com/apache/spark/blob/branch-1.6/core/src/main/scala/org/apache/spark/storage/MemoryStore.scala#L530-L537
> But in fact, if a task successfully unrolls all its blocks in memory via the _unrollSafely_ method, the memory recorded in _unrollMemoryMap_ is set to zero:
> https://github.com/apache/spark/blob/branch-1.6/core/src/main/scala/org/apache/spark/storage/MemoryStore.scala#L322
> As a result, the memory recorded in _unrollMemoryMap_ is released, but its key is never removed from the hash map. The hash table therefore keeps growing as new tasks come in.
> Although the rate of growth is comparatively slow (about dozens of bytes per task), it can result in an OOM after weeks or months.
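
To make the failure mode described above concrete, below is a minimal, self-contained Scala sketch of the bookkeeping pattern, not the actual MemoryStore implementation: the object name, the method bodies, and the simulated task loop are simplified stand-ins that only borrow the names used in the issue. It illustrates how zero-valued per-task entries accumulate when the release path is skipped for _memoryToRelease_ == 0, why _currentUnrollMemory_ (which sums over every entry) slows down as the map grows, and one way a release path can drop empty keys (the actual fix shipped for this issue may differ in detail).

{code:borderStyle=solid}
import scala.collection.mutable

// Simplified sketch of the per-task unroll-memory bookkeeping pattern.
// Names mirror MemoryStore's fields/methods, but this is NOT the real code.
object UnrollBookkeepingSketch {

  // taskAttemptId -> unroll memory (bytes) currently reserved by that task
  private val unrollMemoryMap = mutable.HashMap[Long, Long]()

  // Called many times per task: sums over every entry, so its cost grows
  // with the number of keys kept in the map.
  def currentUnrollMemory: Long = unrollMemoryMap.values.sum

  def reserveUnrollMemoryForThisTask(taskId: Long, memory: Long): Unit =
    unrollMemoryMap(taskId) = unrollMemoryMap.getOrElse(taskId, 0L) + memory

  // Buggy pattern: when memoryToRelease is 0 -- which is exactly the case
  // after a successful unroll has already zeroed the entry -- the map is
  // never touched, so the stale key stays behind forever.
  def releaseUnrollMemoryForThisTaskBuggy(taskId: Long, memoryToRelease: Long): Unit = {
    if (memoryToRelease > 0) {
      unrollMemoryMap(taskId) = unrollMemoryMap(taskId) - memoryToRelease
      // the key is never removed, even when the remaining value reaches 0
    }
  }

  // Fixed pattern: drop the key once the task holds no unroll memory,
  // so the map size stays bounded by the number of running tasks.
  def releaseUnrollMemoryForThisTaskFixed(taskId: Long, memoryToRelease: Long): Unit = {
    unrollMemoryMap.get(taskId).foreach { held =>
      val remaining = held - math.min(memoryToRelease, held)
      if (remaining == 0L) unrollMemoryMap.remove(taskId)
      else unrollMemoryMap(taskId) = remaining
    }
  }

  def main(args: Array[String]): Unit = {
    // Simulate many short tasks: reserve memory, "successfully unroll"
    // (entry set to 0, as unrollSafely effectively does), then release 0.
    (1L to 100000L).foreach { taskId =>
      reserveUnrollMemoryForThisTask(taskId, 1024L)
      unrollMemoryMap(taskId) = 0L
      releaseUnrollMemoryForThisTaskBuggy(taskId, 0L)
    }
    println(s"stale entries left behind: ${unrollMemoryMap.size}")           // 100000
    println(s"currentUnrollMemory still scans all of them: $currentUnrollMemory")
  }
}
{code}

Running the main method leaves 100,000 zero-valued entries behind with the buggy variant, which matches the growth pattern visible in the *DefaultEntry* and *java.lang.Long* counts of the histogram above; swapping in releaseUnrollMemoryForThisTaskFixed keeps the map empty after each simulated task.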