Hi, I am new to Spark. The question below may seem foolish, but I would
really like some advice:
Loading data from disk to generate an RDD is very costly in my applications, so I
hope I can generate it once and cache it in memory, so that any other Spark
applications can refer to this RDD. Can this
Yep,
Regarding flatMap, an implicit parameter might work, as in Scala's
Future, for instance:
https://github.com/scala/scala/blob/master/src/library/scala/concurrent/Future.scala#L246
Dunno, still waiting for some insights from the team ^^
andy
On Wed, Mar 12, 2014 at 3:23 PM, Pascal Voitot
Hi,
If the join keys are skewed, is there a specific optimized join available
in Spark for such use cases?
I saw that a similar feature is supported in both Scalding and Hive, and I am
testing skewjoinWithSmaller on one of the skewed datasets...
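
For readers unfamiliar with what a skew-aware join does, here is a minimal plain-Python sketch of one common technique, key salting. It is not a Spark, Scalding, or Hive API, and it is not necessarily how skewjoinWithSmaller works internally. The function name salted_join and the parameters num_salts and seed are made up for illustration.

```python
import random
from collections import defaultdict


def salted_join(large, small, num_salts=4, seed=0):
    """Inner-join two lists of (key, value) pairs using key salting.

    Hypothetical helper for illustration only. Each record on the large
    (skewed) side gets a random salt in [0, num_salts); each record on
    the small side is replicated once per salt. A hot key is therefore
    spread over up to num_salts buckets instead of landing in one.
    """
    rng = random.Random(seed)

    # Replicate the small side: one copy of every record per salt value.
    small_buckets = defaultdict(list)
    for key, value in small:
        for salt in range(num_salts):
            small_buckets[(key, salt)].append(value)

    # Salt the large side and join against the matching bucket.
    bucket_sizes = defaultdict(int)
    joined = []
    for key, value in large:
        salted_key = (key, rng.randrange(num_salts))
        bucket_sizes[salted_key] += 1
        for other in small_buckets.get(salted_key, []):
            joined.append((key, (value, other)))

    # bucket_sizes shows how many large-side records each salted key got.
    return joined, dict(bucket_sizes)
```

The result contains the same pairs as an ordinary inner join; only the grouping of the work changes. In a distributed setting the analogous step would presumably be to key both datasets by (key, salt) before a regular join, at the cost of replicating the smaller side num_salts times.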