[ 
https://issues.apache.org/jira/browse/SPARK-26525?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

liupengcheng updated SPARK-26525:
---------------------------------
    Description: 
Currently, Spark does not release a ShuffleBlockFetcherIterator until the whole 
task finishes.

Under some conditions, this behaves like a memory leak.

An example is Shuffle -> map -> Coalesce(shuffle = false). Each 
ShuffleBlockFetcherIterator holds some metadata about MapStatus 
(blocksByAddress), and each ShuffleMapTask keeps n (at most the number of 
shuffle partitions) ShuffleBlockFetcherIterators alive because they are 
referenced by the onCompleteCallbacks of the TaskContext. In some cases this 
can take a huge amount of memory, and that memory is not released until the 
task finishes.

In fact, we can release a ShuffleBlockFetcherIterator as soon as it has been 
consumed.
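
The idea of releasing an iterator as soon as it has been consumed can be 
illustrated with a small, self-contained sketch. This is not Spark code and not 
the proposed patch: TaskContextSketch, ReleasingIterator and all method names 
below are hypothetical stand-ins, written in Python only to show the pattern. 
The wrapper runs its cleanup once when the underlying iterator is exhausted, 
while the task-completion callback stays registered as a safety net and becomes 
a no-op.

```python
from typing import Callable, Iterable, List


class TaskContextSketch:
    """Hypothetical stand-in for a task context that runs callbacks at task end."""

    def __init__(self) -> None:
        self.on_complete_callbacks: List[Callable[[], None]] = []

    def add_completion_listener(self, callback: Callable[[], None]) -> None:
        self.on_complete_callbacks.append(callback)

    def mark_task_completed(self) -> None:
        # Run every registered callback, then forget them.
        for callback in self.on_complete_callbacks:
            callback()
        self.on_complete_callbacks.clear()


class ReleasingIterator:
    """Wraps an iterable and calls `on_release` once, as soon as it is exhausted."""

    def __init__(self, source: Iterable, on_release: Callable[[], None]) -> None:
        self._source = iter(source)
        self._on_release = on_release
        self._released = False

    def __iter__(self) -> "ReleasingIterator":
        return self

    def __next__(self):
        try:
            return next(self._source)
        except StopIteration:
            # Fully consumed: free resources now instead of at task end.
            self.release()
            raise

    def release(self) -> None:
        # Idempotent, so the task-completion callback can still call it safely.
        if not self._released:
            self._released = True
            self._source = iter(())  # drop the reference to the underlying data
            self._on_release()


released = []
context = TaskContextSketch()
records = ReleasingIterator([1, 2, 3], lambda: released.append("released"))
context.add_completion_listener(records.release)

consumed = list(records)       # cleanup runs here, when the iterator is exhausted
context.mark_task_completed()  # the late callback does nothing more
```

In this sketch the context still holds a reference to the small wrapper object 
until the task completes, but the wrapper has already dropped its reference to 
the underlying data, which is the part that would be expensive to keep.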

  was:
Currently, Spark does not release a ShuffleBlockFetcherIterator until the whole 
task finishes.

Under some conditions, this behaves like a memory leak.

An example is Shuffle -> map -> Coalesce(shuffle = false). Each ShuffleMapTask 
keeps n (at most the number of shuffle partitions) 
ShuffleBlockFetcherIterators alive because they are referenced by the 
onCompleteCallbacks of the TaskContext, and each ShuffleBlockFetcherIterator 
holds some metadata about MapStatus (blocksByAddress). In some cases this can 
take a huge amount of memory, and that memory is not released until the task 
finishes.

In fact, we can release a ShuffleBlockFetcherIterator as soon as it has been 
consumed.


> Fast release memory of ShuffleBlockFetcherIterator
> --------------------------------------------------
>
>                 Key: SPARK-26525
>                 URL: https://issues.apache.org/jira/browse/SPARK-26525
>             Project: Spark
>          Issue Type: Improvement
>          Components: Shuffle
>    Affects Versions: 2.3.2
>            Reporter: liupengcheng
>            Priority: Major


