I notice that RDD.cartesian has a strange behavior with cached and
uncached data. More precisely, I have a set of data that I load with
objectFile
*val data: RDD[(Int,String,Array[Double])] = sc.objectFile(data)*
Then I split it into two sets depending on some criteria
*val part1 = data.filter(_._2 matches view1
I forgot to mention that I don't really use all of my data. Instead I use a
sample extracted with randomSample.
On Fri, Mar 28, 2014 at 10:58 AM, Jaonary Rabarisoa jaon...@gmail.com wrote:
Hi all,
I notice that RDD.cartesian has a strange behavior with cached and
uncached data. More