[ https://issues.apache.org/jira/browse/SPARK-8591?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14627517#comment-14627517 ]
Dibyendu Bhattacharya commented on SPARK-8591:
----------------------------------------------

This will be closed as Won't Fix, as suggested by [~tdas] in the PR.

> Block failed to unroll to memory should not be replicated for MEMORY_ONLY_2 StorageLevel
> ----------------------------------------------------------------------------------------
>
>                 Key: SPARK-8591
>                 URL: https://issues.apache.org/jira/browse/SPARK-8591
>             Project: Spark
>          Issue Type: Bug
>          Components: Streaming
>    Affects Versions: 1.4.0
>            Reporter: Dibyendu Bhattacharya
>
> A block that failed to unroll to memory, for which an iterator and size 0
> were returned, should not be replicated to a peer node, because
> putBlockStatus comes back as StorageLevel.NONE and the BlockStatus is not
> reported to the Master.
> The primary issue is that, for StorageLevel MEMORY_ONLY_2, if the
> BlockManager fails to unroll the block to memory and the local store fails,
> the BlockManager still replicates the same block to a remote peer. In the
> Spark Streaming case, the Receiver gets the PutResult from the local
> BlockManager, and if the block failed to store locally, ReceivedBlockHandler
> throws a SparkException back to the Receiver even though the BlockManager
> successfully replicated the block to the remote peer. This wastes memory on
> the remote peer, because that block can never be used in Streaming jobs. If
> the Receiver fails to store the block, it can retry, and every failed retry
> (to store locally) may add another unused block on the remote peer. With
> high-volume receivers that retry multiple times, this may lead to many
> unwanted blocks.
> The fix proposed here is to stop replicating the block if the local store
> has failed.
> This fix prevents the scenario described above and does not affect RDD
> partition replication (during cache or persist), because the RDD
> CacheManager unrolls to memory first before attempting to store in local
> memory, so it can never happen that the block unroll succeeds but the store
> to local memory fails.
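
The behavior described in the issue can be illustrated with a small toy model. This is only a sketch under stated assumptions, not Spark's BlockManager code: ToyBlockStore, put_replicated, retry_store and the skip_replication_on_local_failure flag are hypothetical names invented for this example, and it assumes each receiver retry uses a fresh block id. It compares how many unused blocks end up on the peer with and without the proposed check.

```python
from dataclasses import dataclass, field


@dataclass
class ToyBlockStore:
    """Hypothetical stand-in for one node's memory store (not a Spark class)."""
    capacity: int
    blocks: dict = field(default_factory=dict)

    def used(self) -> int:
        return sum(len(data) for data in self.blocks.values())

    def try_put(self, block_id: str, data: bytes) -> bool:
        # Models "unroll to memory": succeeds only if the block fits.
        if self.used() + len(data) > self.capacity:
            return False
        self.blocks[block_id] = data
        return True


def put_replicated(local, peer, block_id, data, skip_replication_on_local_failure):
    """Store locally, then replicate to one peer (a MEMORY_ONLY_2-like level).

    Returns True only if the local store succeeded, mirroring how the
    receiver in the description treats a local failure as an error.
    """
    stored_locally = local.try_put(block_id, data)
    # Proposed check: replicate only when the local store succeeded.
    if stored_locally or not skip_replication_on_local_failure:
        peer.try_put(block_id, data)
    return stored_locally


def retry_store(local, peer, data, attempts, skip_replication_on_local_failure):
    """Receiver-style retry loop; assumes a fresh block id per attempt."""
    for attempt in range(attempts):
        block_id = f"block-{attempt}"
        if put_replicated(local, peer, block_id, data,
                          skip_replication_on_local_failure):
            return True
    return False


data = b"x" * 100

# Reported behavior: local store is too small, yet the peer keeps every copy.
local, peer = ToyBlockStore(capacity=50), ToyBlockStore(capacity=1000)
retry_store(local, peer, data, attempts=3, skip_replication_on_local_failure=False)
print(len(peer.blocks))  # 3 unused blocks on the peer

# Proposed behavior: no replication after a failed local store.
local, peer = ToyBlockStore(capacity=50), ToyBlockStore(capacity=1000)
retry_store(local, peer, data, attempts=3, skip_replication_on_local_failure=True)
print(len(peer.blocks))  # 0
```

In this model the number of orphaned peer blocks grows with the number of failed retries unless the check is in place, which is the memory waste the description refers to.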