GitHub user khayyatzy opened a pull request:
https://github.com/apache/incubator-spark/pull/587
Adding RDD unique self cross product
Hi,
I am using Spark in a data analysis project and frequently require
the unique self cross product of a single RDD. Since I am using Spark's Java
API, I added a new function "selfCartesian" to JavaRDDLike.scala. I also modified
RDD.scala, which now calls "CartesianRDD2". "CartesianRDD2" has an
implementation similar to "CartesianRDD", except that it only returns elements
(a, b) if a.index <= b.index. I have been using this modification of Spark for a
couple of months and the function has always returned correct results.
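To illustrate the intended semantics, here is a minimal plain-Java sketch that does not use Spark. The class name "SelfCartesianSketch" and its list-based "selfCartesian" helper are hypothetical stand-ins rather than the code in this pull request, and the position in the list is assumed to play the role of a.index.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class SelfCartesianSketch {
    // Hypothetical helper: returns every pair (a, b) whose positions satisfy
    // index(a) <= index(b), so each unordered pair appears exactly once.
    static <T> List<List<T>> selfCartesian(List<T> items) {
        List<List<T>> pairs = new ArrayList<>();
        for (int i = 0; i < items.size(); i++) {
            // Starting j at i keeps (a, a) and skips the mirrored pair (b, a).
            for (int j = i; j < items.size(); j++) {
                pairs.add(Arrays.asList(items.get(i), items.get(j)));
            }
        }
        return pairs;
    }

    public static void main(String[] args) {
        List<List<String>> pairs = selfCartesian(Arrays.asList("x", "y", "z"));
        // prints [[x, x], [x, y], [x, z], [y, y], [y, z], [z, z]]
        System.out.println(pairs);
    }
}
```

For n elements this yields n(n+1)/2 pairs instead of the n^2 pairs of a full cartesian product.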
I hope this small new feature will be useful to other Spark users.
Regards,
Zuhair Khayyat
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/apache/incubator-spark master
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/incubator-spark/pull/587.patch
----
commit 82a80ef264ad15fa706eb566691470308b30f63a
Author: Zuhair Khayyat <[email protected]>
Date: 2014-02-12T12:46:05Z
Adding unique self cross product of in a single RDD
commit fb8ad2eee1c4ce175f3cf4227492bbc9f3502db3
Author: Zuhair Khayyat <[email protected]>
Date: 2014-02-12T12:54:11Z
changing ClassManifest to ClassTag in unique self product classes
commit ea02451bb11274f55be3706ea86e21e43e54fd35
Author: Zuhair Khayyat <[email protected]>
Date: 2014-02-12T12:59:16Z
adding import scala.reflect.ClassTag to CartesianRDD2.scala
commit 8f81706f374773aeea0d608b6baa9d2164c8f364
Author: Zuhair Khayyat <[email protected]>
Date: 2014-02-12T13:29:27Z
removing unwanted text from CartesianRDD2.scala
----