The operator you're looking for is .flatMap. It flattens nested results: a map over a single source element can return zero or more target elements, and flatMap concatenates all of them into one collection. I'm not very familiar with the Java API, but in Scala it would go like this (keeping the type annotations only as documentation):
def toBson(bean: ProductBean): BSONObject = { … }

val customerBeans: RDD[(Long, Seq[ProductBean])] = allBeans.groupBy(_.customerId)

val mongoObjects: RDD[BSONObject] = customerBeans.flatMap { case (id, beans) => beans.map(toBson) }

Hope this helps,
-adrian

From: Shams ul Haque
Date: Tuesday, October 27, 2015 at 12:50 PM
To: "user@spark.apache.org"
Subject: Separate all values from Iterable

Hi,

I have grouped all my customers in a JavaPairRDD<Long, Iterable<ProductBean>> by their customerId (of Long type), meaning every customerId has an Iterable of ProductBean. Now I want to save every ProductBean to the DB irrespective of its customerId. I got all the values using:

JavaRDD<Iterable<ProductBean>> values = custGroupRDD.values();

Now I want to convert this JavaRDD<Iterable<ProductBean>> to a JavaPairRDD<Object, BSONObject> so that I can save it to Mongo. Remember, every BSONObject is made from a single ProductBean.

I can't figure out which of Spark's transformations does this job; I think of it as separating all the values out of the Iterable. Please let me know how this is possible. A hint in Scala or Python is also OK.

Thanks,
Shams
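Since the question says a Python hint is also OK, here is a minimal sketch of the same flatten pattern on plain Python lists, with no Spark dependency. The to_bson function and the sample data are hypothetical stand-ins for the real ProductBean-to-BSONObject conversion:

```python
from itertools import chain

# Hypothetical stand-in for the ProductBean -> BSONObject conversion:
# a plain dict plays the role of the BSON document.
def to_bson(bean):
    return {"name": bean["name"]}

# Grouped data, shaped like the (customerId, iterable-of-beans) pairs
# produced by groupBy in the question.
cust_group = [
    (1, [{"name": "a"}, {"name": "b"}]),
    (2, [{"name": "c"}]),
]

# The flatMap pattern: map each bean to a document, then flatten the
# per-customer lists into one flat sequence. In PySpark the equivalent is:
#   custGroupRDD.flatMap(lambda kv: [to_bson(b) for b in kv[1]])
docs = list(chain.from_iterable(
    [to_bson(bean) for bean in beans] for _, beans in cust_group
))

print(docs)  # one document per ProductBean, customer grouping discarded
```

The key point is that the customer IDs are simply dropped during the flatten, which matches the "irrespective of customerId" requirement.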