Hi,
I would like to call a method on a JavaPairRDD from Scala, and I am not sure
how to write the function to pass to "map". I am using a third-party library
that uses Spark for geospatial computations, and it returns some of its
results through the Java API. I'd welcome a hint on how to write a function
for 'map' that JavaPairRDD will accept.
Here's the type of the result:
org.apache.spark.api.java.JavaPairRDD[com.vividsolutions.jts.geom.Polygon,java.util.HashSet[com.vividsolutions.jts.geom.Polygon]]
= org.apache.spark.api.java.JavaPairRDD
Normally I would write something like this:
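    import java.util.HashSet
    import scala.collection.JavaConverters._
    import com.vividsolutions.jts.geom.Polygon

    def calculate_intersection(polygon: Polygon, hashSet: HashSet[Polygon]) = {
      (polygon, hashSet.asScala.map(polygon.intersection(_).getArea))
    }
    javapairrdd.map(calculate_intersection)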
... but the compiler complains that it's not a Java Function.
My first thought was to implement the interface, i.e.:
    // Function2 takes three type parameters: both argument types and the result type.
    class PairRDDWrapper extends
        org.apache.spark.api.java.function.Function2[Polygon, HashSet[Polygon],
          (Polygon, scala.collection.mutable.Set[Double])] {
      override def call(polygon: Polygon, hashSet: HashSet[Polygon]):
          (Polygon, scala.collection.mutable.Set[Double]) = {
        (polygon, hashSet.asScala.map(polygon.intersection(_).getArea))
      }
    }
I am not sure, though, how to use it, or whether it makes any sense in the
first place. It should be simple; it's just that my Java / Scala is a
"little rusty".
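
For concreteness, here is a minimal sketch of two possible directions, not a confirmed recommendation. It assumes that "map" on a JavaPairRDD takes a single-argument Java Function over the (key, value) tuple rather than a Function2, and that the underlying Scala RDD is reachable through ".rdd". The names IntersectionAreas, IntersectionFunction, viaScalaRdd and viaJavaApi are made up for illustration. The sketch collects the areas into a List instead of a Set, because mapping a Set collapses equal areas into one entry.

```scala
import java.util.HashSet

import scala.collection.JavaConverters._

import com.vividsolutions.jts.geom.Polygon
import org.apache.spark.api.java.{JavaPairRDD, JavaRDD}
import org.apache.spark.api.java.function.{Function => JFunction}
import org.apache.spark.rdd.RDD

// Hypothetical helper object; none of these names come from the library.
object IntersectionAreas {
  type Areas = (Polygon, List[Double])

  // Area of the intersection of `polygon` with each candidate polygon.
  // toList keeps one area per candidate; mapping the Set directly would
  // drop duplicate area values.
  def calculateIntersection(polygon: Polygon, hashSet: HashSet[Polygon]): Areas =
    (polygon, hashSet.asScala.toList.map(polygon.intersection(_).getArea))

  // Option 1: unwrap to the Scala RDD and use an ordinary Scala closure.
  def viaScalaRdd(pairs: JavaPairRDD[Polygon, HashSet[Polygon]]): RDD[Areas] =
    pairs.rdd.map { case (polygon, hashSet) => calculateIntersection(polygon, hashSet) }

  // Option 2: stay on the Java API. The function receives the whole
  // (key, value) tuple as its single argument.
  class IntersectionFunction extends JFunction[(Polygon, HashSet[Polygon]), Areas] {
    override def call(pair: (Polygon, HashSet[Polygon])): Areas =
      calculateIntersection(pair._1, pair._2)
  }

  def viaJavaApi(pairs: JavaPairRDD[Polygon, HashSet[Polygon]]): JavaRDD[Areas] =
    pairs.map(new IntersectionFunction)
}
```

With the value from above, usage would look like IntersectionAreas.viaScalaRdd(javapairrdd).collect(). The first option seems the less verbose of the two if nothing downstream needs a Java RDD.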
Cheers,
Lucas