Effects of persist(XYZ_2)

2015-02-25 Thread Marius Soutier
Hi,

Just a quick question about calling persist with one of the _2 storage levels. 
Is the 2x replication only useful for fault tolerance, or will it also increase 
job speed by avoiding network transfers, assuming I’m doing joins or other 
shuffle operations?

Thanks


-
To unsubscribe, e-mail: user-unsubscr...@spark.apache.org
For additional commands, e-mail: user-h...@spark.apache.org



Re: Effects of persist(XYZ_2)

2015-02-25 Thread Sean Owen
If you mean, can both copies of the blocks be used for computations?
Yes, they can.
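
To make the point about both copies being usable a little more concrete, here is a minimal toy model in plain Python. It is not Spark's scheduler and uses no Spark API: the function name `local_fraction`, the uniform random placement of replicas, and the "busy worker" notion are all assumptions made for illustration. It only estimates how often a block has at least one replica on a free worker, and says nothing about shuffle traffic.

```python
import random


def local_fraction(num_workers, num_blocks, replication, busy_fraction, seed=0):
    """Estimate the fraction of blocks with at least one replica on a free worker.

    Toy assumptions (hypothetical, not Spark behaviour): replicas are placed on
    distinct workers chosen uniformly at random, and a fixed share of workers
    is busy at the moment each block's task is scheduled.
    """
    if not 1 <= replication <= num_workers:
        raise ValueError("replication must be between 1 and num_workers")
    rng = random.Random(seed)
    workers = list(range(num_workers))
    num_busy = int(num_workers * busy_fraction)
    local = 0
    for _ in range(num_blocks):
        busy = set(rng.sample(workers, num_busy))     # workers with no free slot
        replicas = rng.sample(workers, replication)   # distinct replica holders
        if any(w not in busy for w in replicas):
            local += 1                                # some copy is reachable locally
    return local / num_blocks


if __name__ == "__main__":
    # Compare one copy per block against two copies per block.
    for r in (1, 2):
        print(r, local_fraction(num_workers=10, num_blocks=5000,
                                replication=r, busy_fraction=0.5))
```

Under these assumptions, a second copy raises the chance that a task can read its block from the worker it runs on. Whether that shows up as a measurable speedup in a real job depends on the workload and cluster.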






Re: Effects of persist(XYZ_2)

2015-02-25 Thread Marius Soutier
Yes. Effectively, could it avoid network transfers? Or, put differently, would 
a hypothetical option like persist(MEMORY_ALL) improve job speed by caching an 
RDD on every worker?


