Hi. To be honest, I don't really understand your problem statement :( but let's just talk about how .flatMap() works. Unlike .map(), which only allows a one-to-one transformation, .flatMap() allows 0, 1, or many outputs per item processed, but the function you pass must return them as a collection (a TraversableOnce in RDD.flatMap's signature), like a List for example. All of those collections are then merged (i.e. flattened) into a single RDD of the element type. Note, however, that an Array does not inherit from Seq. Scala's implicit conversions will usually wrap it for you where a collection is expected, but converting it explicitly to a Seq or something that inherits from AbstractSeq, like a List, makes the intent clear. See http://www.scala-lang.org/api/current/index.html#scala.collection.immutable.List vs. http://www.scala-lang.org/api/current/index.html#scala.Array
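To make the 0, 1, or many behaviour concrete, here is a minimal sketch. It uses a plain Scala List as a stand-in for the RDD so that it runs without a Spark setup; that substitution is my assumption, on the basis that flatMap on an RDD follows the same shape. The names FlatMapSketch and expand are made up for illustration.

```scala
object FlatMapSketch {
  // Hypothetical helper: returns 0, 1, or many outputs depending on the input.
  def expand(n: Int): List[Int] =
    if (n <= 0) Nil            // zero outputs: the item disappears
    else if (n == 1) List(1)   // exactly one output
    else List.fill(n)(n)       // many outputs: n copies of n

  def main(args: Array[String]): Unit = {
    // A plain List stands in for the elements of an RDD[Int] (assumption).
    val input = List(0, 1, 3)

    val mapped = input.map(expand)         // one List per input, still nested
    val flattened = input.flatMap(expand)  // all outputs merged into one List

    println(mapped)     // List(List(), List(1), List(3, 3, 3))
    println(flattened)  // List(1, 3, 3, 3)
  }
}
```

With .map() the result keeps one (possibly empty) list per input item, while .flatMap() merges them, so the item that produced no output simply vanishes from the result.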
For example, let's assume you have an RDD[Array[Int]] and you want all the Int values flattened into a single RDD[Int]. The code would be something like this:

    import org.apache.spark.rdd.RDD

    val intArraysRDD: RDD[Array[Int]] = ... // some code to get the arrays
    val flattenedIntRDD: RDD[Int] = intArraysRDD.flatMap { array =>
      var ret: List[Int] = Nil
      for (i <- array) {
        ret = i :: ret   // prepending builds the list in reverse order
      }
      ret.reverse        // restore the original element order
    }

This is an intentionally explicit version. A simpler one would be something like this:

    val flattenedIntRDD: RDD[Int] = intArraysRDD.flatMap(array => array.toList)

However, to understand your problem exactly, you need to explain better what the RDD you want to create should look like.

Regards,
Gylfi.