There's a JIRA tracking this here:
https://issues.apache.org/jira/browse/SPARK-2387
On Mon, Feb 2, 2015 at 9:48 PM, Xuelin Cao xuelincao2...@gmail.com wrote:
In Hadoop MR, there is an option, *mapred.reduce.slowstart.completed.maps*,
which can be used to start the reduce stage once X% of the mappers have
completed. This lets the data shuffle run in parallel with the map
phase.
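
To make the threshold concrete, here is a rough sketch of how a slow-start fraction translates into a number of completed map tasks. It is not Hadoop's actual code: the function name is made up for illustration, and rounding up to a whole task is an assumption about how the fraction is applied.

```python
import math


def maps_needed_before_reducers(total_maps: int, slowstart_fraction: float) -> int:
    """Return how many map tasks must finish before reducers may be scheduled.

    Hypothetical helper: `slowstart_fraction` plays the role of
    mapred.reduce.slowstart.completed.maps (a value between 0.0 and 1.0).
    Rounding up to a whole task is an assumption made for this sketch.
    """
    if total_maps < 0:
        raise ValueError("total_maps must be non-negative")
    if not 0.0 <= slowstart_fraction <= 1.0:
        raise ValueError("slowstart_fraction must be between 0.0 and 1.0")
    return math.ceil(slowstart_fraction * total_maps)


if __name__ == "__main__":
    # A fraction of 1.0 means reducers wait for every mapper (no overlap);
    # a smaller fraction lets the shuffle overlap with the remaining maps.
    for fraction in (0.0, 0.5, 1.0):
        needed = maps_needed_before_reducers(200, fraction)
        print(f"fraction={fraction}: reducers may start after {needed} maps")
```

With the fraction set to 1.0 the shuffle cannot overlap with the map phase at all, which is effectively the "turned off" case.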
In a large multi-tenant cluster, this option is usually turned off. But in
some cases, turning it on could speed up some high-priority jobs.
Will Spark provide a similar option?