It can be set in an individual application.
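
For illustration, here is a minimal sketch of setting it per application through `SparkConf`. It assumes Spark 0.9.1 or later is on the classpath. The object name, the app name, and the `local[2]` master are placeholders for a local trial, not something taken from this thread.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.SparkContext._ // pair-RDD implicits (needed on older Spark releases)

object ConsolidatedShuffleExample {
  def main(args: Array[String]): Unit = {
    // Placeholder app name and master; point the master at your own cluster.
    val conf = new SparkConf()
      .setAppName("consolidated-shuffle-example")
      .setMaster("local[2]")
      // Per-application setting: only this job's shuffles are affected.
      .set("spark.shuffle.consolidateFiles", "true")

    val sc = new SparkContext(conf)
    try {
      // sortByKey triggers a shuffle, so the setting is exercised here.
      val sorted = sc.parallelize(Seq(3, 1, 2)).map(x => (x, x)).sortByKey()
      println(sorted.keys.collect().mkString(", "))
    } finally {
      sc.stop()
    }
  }
}
```

Since the property is read from the application's own configuration, nothing should need to change on the workers or at cluster startup.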

Consolidation had some issues on ext3, as mentioned there, though we might
enable it by default in the future because other optimizations have now made
it perform on par with the non-consolidated version. It also had some bugs in
0.9.0, so I’d suggest using at least 0.9.1.

Matei

On May 29, 2014, at 2:21 PM, Nathan Kronenfeld <nkronenf...@oculusinfo.com> 
wrote:

> Thanks, I missed that.
> 
> One thing that's still unclear to me, even looking at that, is - does this 
> parameter have to be set when starting up the cluster, on each of the 
> workers, or can it be set by an individual client job?
> 
> 
> On Fri, May 23, 2014 at 10:13 AM, Han JU <ju.han.fe...@gmail.com> wrote:
> Hi Nathan,
> 
> There's some explanation in the spark configuration section:
> 
> ```
> If set to "true", consolidates intermediate files created during a shuffle. 
> Creating fewer files can improve filesystem performance for shuffles with 
> large numbers of reduce tasks. It is recommended to set this to "true" when 
> using ext4 or xfs filesystems. On ext3, this option might degrade performance 
> on machines with many (>8) cores due to filesystem limitations.
> ```
> 
> 
> 2014-05-23 16:00 GMT+02:00 Nathan Kronenfeld <nkronenf...@oculusinfo.com>:
> 
> In trying to sort some largish datasets, we came across the 
> spark.shuffle.consolidateFiles property, and I found in the source code that 
> it is set, by default, to false, with a note to default it to true when the 
> feature is stable.
> 
> Does anyone know what is unstable about this? If we set it true, what 
> problems should we anticipate?
> 
> Thanks,
>             -Nathan Kronenfeld
> 
> 
> -- 
> Nathan Kronenfeld
> Senior Visualization Developer
> Oculus Info Inc
> 2 Berkeley Street, Suite 600,
> Toronto, Ontario M5A 4J5
> Phone:  +1-416-203-3003 x 238
> Email:  nkronenf...@oculusinfo.com
> 
> 
> 
> -- 
> JU Han
> 
> Data Engineer @ Botify.com
> 
> +33 0619608888
> 
