Saisai Shao created SPARK-2926:
----------------------------------

             Summary: Add MR-style (merge-sort) SortShuffleReader for 
sort-based shuffle
                 Key: SPARK-2926
                 URL: https://issues.apache.org/jira/browse/SPARK-2926
             Project: Spark
          Issue Type: Improvement
          Components: Shuffle
    Affects Versions: 1.1.0
            Reporter: Saisai Shao


Spark has already integrated sort-based shuffle write, which greatly 
improves IO performance and reduces memory consumption when the number of 
reducers is very large. On the reducer side, however, it still uses the 
hash-based shuffle reader implementation, which in some situations ignores 
the ordering of the map output data.

Here we propose an MR-style, sort-merge shuffle reader for sort-based 
shuffle to further improve its performance.
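
To illustrate the idea only (this is not the proposed Spark code), here is a 
minimal Python sketch of a merge-sort style reader. It assumes each map output 
arrives as an iterator of (key, value) pairs already sorted by key. The names 
merge_sorted_map_outputs and reduce_by_key_sorted are hypothetical and do not 
exist in Spark.

```python
import heapq
from itertools import groupby
from operator import itemgetter


def merge_sorted_map_outputs(map_outputs):
    """K-way merge of key-sorted (key, value) streams using a min-heap.

    Assumption: every stream in map_outputs is already sorted by key,
    as a sort-based shuffle writer would produce.
    """
    iterators = [iter(output) for output in map_outputs]
    heap = []
    # Seed the heap with the first record of each non-empty stream.
    for idx, it in enumerate(iterators):
        first = next(it, None)
        if first is not None:
            # idx breaks ties between equal keys, so values are never compared.
            heapq.heappush(heap, (first[0], idx, first[1]))
    while heap:
        key, idx, value = heapq.heappop(heap)
        yield key, value
        # Refill from the stream that just produced the smallest record.
        nxt = next(iterators[idx], None)
        if nxt is not None:
            heapq.heappush(heap, (nxt[0], idx, nxt[1]))


def reduce_by_key_sorted(map_outputs, combine):
    """Aggregate values per key in one streaming pass over the merged data.

    Because equal keys are adjacent after the merge, no hash map of all
    keys is needed on the reducer side.
    """
    merged = merge_sorted_map_outputs(map_outputs)
    for key, group in groupby(merged, key=itemgetter(0)):
        values = [v for _, v in group]
        result = values[0]
        for v in values[1:]:
            result = combine(result, v)
        yield key, result


if __name__ == "__main__":
    out_a = [("a", 1), ("c", 3)]
    out_b = [("b", 2), ("c", 4), ("d", 5)]
    print(list(merge_sorted_map_outputs([out_a, out_b])))
    print(list(reduce_by_key_sorted([out_a, out_b], lambda x, y: x + y)))
```

The heap holds at most one record per map output at a time, so memory use in 
this sketch grows with the number of map outputs rather than with the number 
of distinct keys.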

Work-in-progress code and a performance test report will be posted later, 
once some unit test bugs are fixed.

Any comments would be greatly appreciated. 
Thanks a lot.



--
This message was sent by Atlassian JIRA
(v6.2#6252)
