Hi, I have a piece of code in which the result of a groupByKey operation is as follows:
(2013-04, ArrayBuffer(s1, s2, s3, s1, s2, s4))

The first element is a String value representing a date, and the ArrayBuffer consists of (non-unique) strings. I want to extract the unique elements of the ArrayBuffer, so I am expecting the result to be:

(2013-04, ArrayBuffer(s1, s2, s3, s4))

I tried the following:

.groupByKey
.map(g => (g._1, g._2.distinct))

But I get the following compile error:

value distinct is not a member of Iterable[String]
[error] .map(g => (g._1, g._2.distinct))

I also tried g._2.distinct(), but got the same error. I looked at the Scala ArrayBuffer documentation and it supports distinct() and count() operations. I am using Spark 1.0.1 and Scala 2.10.4.

I would like to know how to extract the unique elements of the ArrayBuffer above.

thanks

--
View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/Extracting-unique-elements-of-an-ArrayBuffer-tp12320.html
Sent from the Apache Spark User List mailing list archive at Nabble.com.
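For reference, a minimal sketch (plain Scala collections, no Spark required) of why the error appears and one way around it. In Spark 1.0.x, groupByKey returns values typed as Iterable[V], and in Scala 2.10 distinct is defined on sequences rather than on Iterable, so materializing the values into a List (or a Set) first makes the call compile. The object name DistinctSketch is just for illustration:

```scala
object DistinctSketch {
  def main(args: Array[String]): Unit = {
    // Mimic one grouped pair as Spark's groupByKey would expose it:
    // the values are statically typed as Iterable[String].
    val grouped: (String, Iterable[String]) =
      ("2013-04", Seq("s1", "s2", "s3", "s1", "s2", "s4"))

    // `grouped._2.distinct` would not compile here, because
    // Iterable[String] has no `distinct` member in Scala 2.10.
    // Converting to a concrete sequence first works:
    val unique = (grouped._1, grouped._2.toList.distinct)

    println(unique) // (2013-04,List(s1, s2, s3, s4))
  }
}
```

In an actual Spark job the same conversion would presumably be applied inside the map, e.g. .map(g => (g._1, g._2.toList.distinct)); using .toSet instead of .toList.distinct is an alternative when the output order of the unique elements does not matter.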