Thanks, Richard! I basically have two RDDs, A and B, and I need to compute a value for every pair (a, b) with a in A and b in B. My first thought was RDD.cartesian, but that involves an expensive shuffle.
Any alternatives? How about converting B to an array and broadcasting it to every node (assuming B is small enough to fit in memory on each node)?

On Tue, Aug 4, 2015 at 8:23 AM, Richard Marscher <rmarsc...@localytics.com> wrote:
> Yes it does, in fact it's probably going to be one of the more expensive
> shuffles you could trigger.
>
> On Mon, Aug 3, 2015 at 12:56 PM, Meihua Wu <rotationsymmetr...@gmail.com>
> wrote:
>>
>> Does RDD.cartesian involve shuffling?
>>
>> Thanks!
>>
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: user-unsubscr...@spark.apache.org
>> For additional commands, e-mail: user-h...@spark.apache.org
>>
>
> --
> Richard Marscher
> Software Engineer
> Localytics
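
For reference, the broadcast idea asked about at the top of this message could be sketched roughly as below in PySpark. This is only a sketch under assumptions, not something confirmed in this thread: it assumes B really is small enough to collect to the driver and hold in memory on every executor, and the names pair_with_all, broadcast_cartesian and f are hypothetical helpers made up for illustration. The Spark calls used are collect(), SparkContext.broadcast(), Broadcast.value and RDD.mapPartitions().

```python
from typing import Callable, Iterable, Iterator, Sequence, Tuple, TypeVar

A = TypeVar("A")
B = TypeVar("B")
R = TypeVar("R")


def pair_with_all(
    partition: Iterable[A],
    b_values: Sequence[B],
    f: Callable[[A, B], R],
) -> Iterator[Tuple[A, B, R]]:
    """Pair each element of one partition of A with every element of B.

    Hypothetical helper: plain Python, so it can be checked without Spark.
    """
    for a in partition:
        for b in b_values:
            yield (a, b, f(a, b))


def broadcast_cartesian(sc, rdd_a, rdd_b, f):
    """Compute f(a, b) for all pairs without calling RDD.cartesian.

    Assumption: rdd_b is small enough to collect to the driver and to
    keep in memory on every executor.
    """
    b_local = rdd_b.collect()        # bring all of B to the driver
    b_bc = sc.broadcast(b_local)     # ship one read-only copy to each executor
    # Each partition of A is combined locally with the broadcast copy of B.
    return rdd_a.mapPartitions(
        lambda part: pair_with_all(part, b_bc.value, f)
    )


# Hypothetical usage, given an existing SparkContext `sc`:
#   result = broadcast_cartesian(sc, rdd_a, rdd_b, lambda a, b: a * b)
#   result.take(5)
```

With this layout A is never repartitioned; the only data moved is the single copy of B sent to each executor, so whether it helps depends entirely on B staying small.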