GitHub user khayyatzy commented on the pull request:
https://github.com/apache/incubator-spark/pull/587#issuecomment-34963700
Yes, it is the same as what you just described, but rdd.selfCartesian avoids
the extra work that approach requires.
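To illustrate the "extra work" point, here is a minimal sketch in plain Python (not Spark, and not the code from this pull request). It assumes the alternative being discussed is a full cross product followed by a filter, and that "unique" means each unordered pair of distinct elements appears once; both are assumptions, since neither is spelled out in this thread. The function names are hypothetical.

```python
from itertools import combinations, product


def full_then_filter(items):
    """Build the full n*n cross product, then keep one ordering of each pair."""
    indexed = list(enumerate(items))
    return [(a, b) for (i, a), (j, b) in product(indexed, repeat=2) if i < j]


def unique_self_product(items):
    """Generate each unordered pair exactly once, with no filtering step."""
    return list(combinations(items, 2))


rows = ["a", "b", "c", "d"]

# Both approaches yield the same pairs; only the amount of work differs.
assert full_then_filter(rows) == unique_self_product(rows)

print(len(list(product(rows, repeat=2))), "pairs generated by the full product")
print(len(unique_self_product(rows)), "pairs in the unique self product")
```

For n elements the first approach generates n*n pairs and discards more than half of them, while the second generates only n*(n-1)/2.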
GitHub user khayyatzy commented on the pull request:
https://github.com/apache/incubator-spark/pull/587#issuecomment-34959013
I am using rdd.selfCartesian for optimization purposes. I am using Spark
for a large data analytics project on relational data. My application sometimes
requires
GitHub user khayyatzy opened a pull request:
https://github.com/apache/incubator-spark/pull/587
Adding RDD unique self cross product
Hi,
I am using Spark in a data analysis project and I frequently require
the unique self cross product of a single RDD. Since I am