Hi,

On Mon, Jan 26, 2015 at 9:32 AM, Steve Nunez <snu...@hortonworks.com> wrote:

> I’ve got a list of points: List[(Float, Float)] that represent (x,y)
> coordinate pairs and need to sum the distance. It’s easy enough to compute
> the distance:
>

Are you saying you want all combinations (N^2) of distances? That should be
possible with rdd.cartesian():

val points = sc.parallelize(List((1.0, 2.0), (3.0, 4.0), (5.0, 6.0)))
points.cartesian(points).collect
--> Array[((Double, Double), (Double, Double))] =
Array(((1.0,2.0),(1.0,2.0)), ((1.0,2.0),(3.0,4.0)), ((1.0,2.0),(5.0,6.0)),
((3.0,4.0),(1.0,2.0)), ((3.0,4.0),(3.0,4.0)), ((3.0,4.0),(5.0,6.0)),
((5.0,6.0),(1.0,2.0)), ((5.0,6.0),(3.0,4.0)), ((5.0,6.0),(5.0,6.0)))

I guess this is a very expensive operation, though, since the number of
pairs grows as N^2.
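
If all-pairs really is what you want, summing the distances could look
roughly like the sketch below. It assumes a spark-shell session (so `sc`
already exists) and Euclidean distance; `dist`, `totalOrdered` and
`totalUnordered` are just names I made up for illustration.

```scala
// Needed on older Spark versions for sum() on an RDD[Double]; harmless otherwise.
import org.apache.spark.SparkContext._

// Assumes spark-shell, where `sc` is the provided SparkContext.
val points = sc.parallelize(List((1.0, 2.0), (3.0, 4.0), (5.0, 6.0)))

// Euclidean distance between two (x, y) points (hypothetical helper).
def dist(p: (Double, Double), q: (Double, Double)): Double =
  math.hypot(p._1 - q._1, p._2 - q._2)

// cartesian() yields every ordered pair: (p, p) as well as both (p, q) and (q, p).
val totalOrdered = points.cartesian(points).map { case (p, q) => dist(p, q) }.sum()

// Self-pairs contribute 0 and each unordered pair appears twice, so halve the total.
val totalUnordered = totalOrdered / 2
```

Note that this still materializes all N^2 pairs across the cluster; the
division by two only corrects the double counting, it does not save any work.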

Tobias
