Hi,

On Mon, Jan 26, 2015 at 9:32 AM, Steve Nunez <snu...@hortonworks.com> wrote:
> I’ve got a list of points: List[(Float, Float)]) that represent (x,y)
> coordinate pairs and need to sum the distance. It’s easy enough to compute
> the distance:
>

Are you saying you want the distances for all combinations (N^2) of points?
That should be possible with rdd.cartesian():

  val points = sc.parallelize(List((1.0, 2.0), (3.0, 4.0), (5.0, 6.0)))
  points.cartesian(points).collect
  --> Array[((Double, Double), (Double, Double))] =
      Array(((1.0,2.0),(1.0,2.0)), ((1.0,2.0),(3.0,4.0)), ((1.0,2.0),(5.0,6.0)),
            ((3.0,4.0),(1.0,2.0)), ((3.0,4.0),(3.0,4.0)), ((3.0,4.0),(5.0,6.0)),
            ((5.0,6.0),(1.0,2.0)), ((5.0,6.0),(3.0,4.0)), ((5.0,6.0),(5.0,6.0)))

This is an expensive operation, though, since it generates all N^2 pairs
(including each point paired with itself).

Tobias
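
To go from the pairs to a summed distance, a minimal sketch might look like
the following. It assumes a spark-shell session (so `sc` is already defined),
that "distance" means Euclidean distance, and that each unordered pair should
be counted once. The names `distance` and `totalDistance` are illustrative
and do not come from the thread.

```scala
// Assumes spark-shell, where `sc` is the predefined SparkContext.
// `distance` and `totalDistance` are hypothetical names for illustration.

// Euclidean distance between two (x, y) points.
def distance(p: (Double, Double), q: (Double, Double)): Double =
  math.hypot(p._1 - q._1, p._2 - q._2)

val points = sc.parallelize(List((1.0, 2.0), (3.0, 4.0), (5.0, 6.0)))

// cartesian() yields all N^2 ordered pairs. Self-pairs contribute 0 and
// every unordered pair shows up twice, so halving the sum counts each once.
val totalDistance = points
  .cartesian(points)
  .map { case (p, q) => distance(p, q) }
  .sum() / 2.0
```

For the three sample points this comes to 8 * sqrt(2), roughly 11.31. If the
goal is instead the length of the path through the points in order, the
cartesian product is not needed at all.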