I have an RDD that contains ids (Long).

subsetids

res22: org.apache.spark.rdd.RDD[Long]


I have another RDD of objects (MyObject), where one of the fields is an
id (Long).

allobjects

res23: org.apache.spark.rdd.RDD[MyObject] = MappedRDD[272]

Now I want to filter allobjects so that I keep only the objects whose ids
are present in my first RDD (i.e., subsetids).

Something like:

val subsetObjs = allobjects.filter( x => subsetids.contains(x.getId) )

However, RDD has no "contains" method (and one RDD cannot be referenced
inside another RDD's closure), so I'm looking for the most efficient way
to achieve this in Spark.
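
For illustration, here is a minimal sketch of two common ways this kind of filter is usually expressed. It is not a definitive answer, and it rests on assumptions: the MyObject case class below is a hypothetical stand-in for the real class (only getId is taken from the post), the names FilterByIds, filterWithBroadcast and filterWithJoin are made up for this example, and the local master is only there so the sketch runs on its own. The first variant assumes the id set is small enough to collect to the driver; the second assumes it is not.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.rdd.RDD
// On Spark versions before 1.3, keyBy(...).join(...) may also need:
// import org.apache.spark.SparkContext._

// Hypothetical stand-in for the real MyObject; only getId comes from the post.
case class MyObject(id: Long, payload: String) {
  def getId: Long = id
}

object FilterByIds {

  // Variant 1 (assumes subsetids fits in driver memory):
  // collect the ids into a Set, broadcast it, and filter against it.
  def filterWithBroadcast(sc: SparkContext,
                          allobjects: RDD[MyObject],
                          subsetids: RDD[Long]): RDD[MyObject] = {
    val idSet = sc.broadcast(subsetids.collect().toSet)
    allobjects.filter(x => idSet.value.contains(x.getId))
  }

  // Variant 2 (assumes subsetids is too large to collect):
  // key both RDDs by id and join. distinct() keeps repeated ids
  // from duplicating objects in the result.
  def filterWithJoin(allobjects: RDD[MyObject],
                     subsetids: RDD[Long]): RDD[MyObject] = {
    val keyedIds = subsetids.distinct().map(id => (id, ()))
    allobjects.keyBy(_.getId).join(keyedIds).map { case (_, (obj, _)) => obj }
  }

  def main(args: Array[String]): Unit = {
    // Local master is an assumption so that the sketch is runnable standalone.
    val conf = new SparkConf().setAppName("filter-by-ids").setMaster("local[*]")
    val sc = new SparkContext(conf)
    try {
      val allobjects = sc.parallelize(Seq(
        MyObject(1L, "a"), MyObject(2L, "b"), MyObject(3L, "c")))
      val subsetids = sc.parallelize(Seq(1L, 3L))

      val viaBroadcast = filterWithBroadcast(sc, allobjects, subsetids)
      val viaJoin = filterWithJoin(allobjects, subsetids)

      println(viaBroadcast.map(_.getId).collect().sorted.mkString(","))
      println(viaJoin.map(_.getId).collect().sorted.mkString(","))
    } finally {
      sc.stop()
    }
  }
}
```

The broadcast variant avoids a shuffle, so it tends to be the cheaper choice when the id set is small. The join variant shuffles both RDDs but does not require the ids to fit on the driver.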

Thanks.
