Hi, I would like to call a method on a JavaPairRDD from Scala and I am not sure how to write the function to pass to "map". I am using a third-party library that uses Spark for geospatial computations, and it returns some of its results through the Java API. I'd welcome a hint on how to write a function for "map" such that JavaPairRDD is happy.

Here's the signature:

    org.apache.spark.api.java.JavaPairRDD[com.vividsolutions.jts.geom.Polygon,java.util.HashSet[com.vividsolutions.jts.geom.Polygon]] = org.apache.spark.api.java.JavaPairRDD

Normally I would write something like this:

    def calculate_intersection(polygon: Polygon, hashSet: HashSet[Polygon]) = {
      (polygon, hashSet.asScala.map(polygon.intersection(_).getArea))
    }

    javapairrdd.map(calculate_intersection)

... but it complains that it's not a Java Function.

My first thought was to implement the interface, i.e.:

    class PairRDDWrapper extends org.apache.spark.api.java.function.Function2[Polygon, HashSet[Polygon]] {
      override def call(polygon: Polygon, hashSet: HashSet[Polygon]): (Polygon, scala.collection.mutable.Set[Double]) = {
        (polygon, hashSet.asScala.map(polygon.intersection(_).getArea))
      }
    }

I am not sure, though, how to use it, or whether it makes any sense in the first place.

This should be simple; it's just that my Java / Scala is "a little rusty".

Cheers,
Lucas
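
For reference, here is a minimal sketch of two directions that might work. It is not a tested solution for this particular library. It assumes a Scala 2.11/2.12 build (where `scala.collection.JavaConverters` is the usual import) and the JTS `Polygon` type from the signature above. The names `IntersectionAreas`, `areas`, `viaScalaRdd` and `viaJavaFunction` are made up for illustration. The first variant unwraps the Java wrapper with `.rdd` so that an ordinary Scala closure can be passed to `map`. The second stays on the Java API, where `JavaPairRDD.map` expects a one-argument `Function` over the `(key, value)` tuple rather than a `Function2`.

```scala
import java.util.{HashSet => JHashSet}

import scala.collection.JavaConverters._

import com.vividsolutions.jts.geom.Polygon
import org.apache.spark.api.java.{JavaPairRDD, JavaRDD}
import org.apache.spark.api.java.function.{Function => JFunction}
import org.apache.spark.rdd.RDD

// Hypothetical helper object; all names here are illustrative only.
object IntersectionAreas {

  // Area of the intersection of `polygon` with every polygon in `others`.
  // A Vector is used instead of a Set so that equal areas are not collapsed.
  def areas(polygon: Polygon, others: JHashSet[Polygon]): Vector[Double] =
    others.asScala.toVector.map(other => polygon.intersection(other).getArea)

  // Variant 1: unwrap to the underlying Scala RDD[(K, V)] and use a Scala closure.
  def viaScalaRdd(pairs: JavaPairRDD[Polygon, JHashSet[Polygon]]): RDD[(Polygon, Vector[Double])] =
    pairs.rdd.map { case (polygon, others) => (polygon, areas(polygon, others)) }

  // Variant 2: stay on the Java API. map takes a Function over the whole tuple.
  def viaJavaFunction(pairs: JavaPairRDD[Polygon, JHashSet[Polygon]]): JavaRDD[(Polygon, Vector[Double])] =
    pairs.map(new JFunction[(Polygon, JHashSet[Polygon]), (Polygon, Vector[Double])] {
      override def call(pair: (Polygon, JHashSet[Polygon])): (Polygon, Vector[Double]) =
        (pair._1, areas(pair._1, pair._2))
    })
}
```

With the `javapairrdd` value from the post, usage would look like `IntersectionAreas.viaScalaRdd(javapairrdd)`. The first variant is probably the more comfortable one from Scala, since the result is a plain `RDD` and no Java functional interface has to be implemented by hand.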