jin xing created SPARK-24143:
--------------------------------

             Summary: filter empty blocks when convert mapstatus to (blockId, 
size) pair
                 Key: SPARK-24143
                 URL: https://issues.apache.org/jira/browse/SPARK-24143
             Project: Spark
          Issue Type: New Feature
          Components: Spark Core
    Affects Versions: 2.3.0
            Reporter: jin xing


In current code(MapOutputTracker.convertMapStatuses), mapstatus are converted 
to (blockId, size) pair for all blocks -- no matter the block is empty or not, 
which result in OOM when there are lots of consecutive empty blocks, especially 
when adaptive execution is enabled.

(blockId, size) pair is only used in ShuffleBlockFetcherIterator to control 
shuffle-read and only non-empty block request is sent. Can we just filter out 
the empty blocks in  MapOutputTracker.convertMapStatuses and save memory?



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org

Reply via email to