[ 
https://issues.apache.org/jira/browse/SPARK-2044?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14025580#comment-14025580
 ] 

Matei Zaharia commented on SPARK-2044:
--------------------------------------

Hey Weihua,

I'll look into the sorting flag; I initially envisioned that the shuffle 
manager would just tell the calling code whether the data is sorted (otherwise 
it sorts it by itself), but maybe it does make sense to push sorting into the 
interface.

For the ranges on ShuffleReader, I think you misunderstood my meaning slightly. 
I don't *want* the reduction code (e.g. combineByKey or groupByKey) to even 
know that map tasks are running at different times. It should simply request 
its range of reduce partitions once, and then the shuffle *implementation* 
should see which maps are ready and start pulling from those. Note also that 
the partition range there is for reduce partitions (e.g. our job has 100 reduce 
partitions and we ask for partitions 2-5 because we decided to have just one 
reduce task for those). It's not for map IDs.

> Pluggable interface for shuffles
> --------------------------------
>
>                 Key: SPARK-2044
>                 URL: https://issues.apache.org/jira/browse/SPARK-2044
>             Project: Spark
>          Issue Type: Improvement
>          Components: Shuffle, Spark Core
>            Reporter: Matei Zaharia
>            Assignee: Matei Zaharia
>         Attachments: Pluggableshuffleproposal.pdf
>
>
> Given that a lot of the current activity in Spark Core is in shuffles, I 
> wanted to propose factoring out shuffle implementations in a way that will 
> make experimentation easier. Ideally we will converge on one implementation, 
> but for a while, this could also be used to have several implementations 
> coexist. I'm suggesting this because I aware of at least three efforts to 
> look at shuffle (from Yahoo!, Intel and Databricks). Some of the things 
> people are investigating are:
> * Push-based shuffle where data moves directly from mappers to reducers
> * Sorting-based instead of hash-based shuffle, to create fewer files (helps a 
> lot with file handles and memory usage on large shuffles)
> * External spilling within a key
> * Changing the level of parallelism or even algorithm for downstream stages 
> at runtime based on statistics of the map output (this is a thing we had 
> prototyped in the Shark research project but never merged in core)
> I've attached a design doc with a proposed interface. It's not too crazy 
> because the interface between shuffles and the rest of the code is already 
> pretty narrow (just some iterators for reading data and a writer interface 
> for writing it). Bigger changes will be needed in the interaction with 
> DAGScheduler and BlockManager for some of the ideas above, but we can handle 
> those separately, and this interface will allow us to experiment with some 
> short-term stuff sooner.
> If things go well I'd also like to send a sort-based shuffle implementation 
> for 1.1, but we'll see how the timing on that works out.



--
This message was sent by Atlassian JIRA
(v6.2#6252)

Reply via email to