import org.apache.spark.SparkContext._
import org.apache.spark.rdd.RDD

import scala.reflect.ClassTag
def joinTest[K: ClassTag](rddA: RDD[(K, Int)], rddB: RDD[(K, Int)]): RDD[(K, Int)] = {
  rddA.join(rddB).map { case (k, (a, b)) => (k, a + b) }
}

On Tue, Apr 1, 2014 at 4:55 PM, Daniel Siegmann <daniel.siegm...@velos.io> wrote:

> When my tuple type includes a generic type parameter, the pair RDD
> functions aren't available. Take for example the following (a join on two
> RDDs, taking the sum of the values):
>
> def joinTest(rddA: RDD[(String, Int)], rddB: RDD[(String, Int)]): RDD[(String, Int)] = {
>   rddA.join(rddB).map { case (k, (a, b)) => (k, a + b) }
> }
>
> That works fine, but let's say I replace the type of the key with a generic
> type:
>
> def joinTest[K](rddA: RDD[(K, Int)], rddB: RDD[(K, Int)]): RDD[(K, Int)] = {
>   rddA.join(rddB).map { case (k, (a, b)) => (k, a + b) }
> }
>
> This latter function gets the compiler error "value join is not a member
> of org.apache.spark.rdd.RDD[(K, Int)]".
>
> The reason is probably obvious, but I don't have much Scala experience.
> Can anyone explain what I'm doing wrong?
>
> --
> Daniel Siegmann, Software Developer
> Velos
> Accelerating Machine Learning
>
> 440 NINTH AVENUE, 11TH FLOOR, NEW YORK, NY 10001
> E: daniel.siegm...@velos.io W: www.velos.io
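For context on why the `K: ClassTag` bound fixes this: `join` and the other pair-RDD operations live on `PairRDDFunctions`, and the implicit conversion from `RDD[(K, V)]` (brought into scope by `import SparkContext._`) requires `ClassTag` evidence for the type parameters. A bare type parameter `K` carries no such evidence, so the conversion does not apply and `join` appears to be missing. A minimal, Spark-free sketch of the same pattern (all names here are hypothetical stand-ins, not Spark's actual classes):

```scala
import scala.language.implicitConversions
import scala.reflect.ClassTag

object ClassTagDemo {
  // Stand-in for PairRDDFunctions: extra operations that need a ClassTag for K.
  class PairOps[K: ClassTag, V](pairs: Seq[(K, V)]) {
    def keysArray: Array[K] = pairs.map(_._1).toArray // building Array[K] needs ClassTag[K]
  }

  // Stand-in for the implicit conversion provided by `import SparkContext._`.
  implicit def toPairOps[K: ClassTag, V](pairs: Seq[(K, V)]): PairOps[K, V] =
    new PairOps(pairs)

  // Without the `K: ClassTag` context bound this would not compile:
  // the implicit conversion needs ClassTag[K] evidence, and a bare
  // type parameter provides none ("value keysArray is not a member ...").
  def keysOf[K: ClassTag](pairs: Seq[(K, Int)]): Array[K] = pairs.keysArray

  def main(args: Array[String]): Unit = {
    println(keysOf(Seq("a" -> 1, "b" -> 2)).mkString(",")) // prints a,b
  }
}
```

The context bound `[K: ClassTag]` is just shorthand for an extra implicit parameter `(implicit kt: ClassTag[K])`, which the compiler then threads through to the implicit conversion at the call site.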