Something like the following:

// Assuming data is a pair RDD of (id, name), e.g. RDD[(Int, String)]
val zeroValue = collection.mutable.Set[String]()
val aggregated = data.aggregateByKey(zeroValue)(
  (set, v) => set += v,                    // add a value to the per-partition set
  (setOne, setTwo) => setOne ++= setTwo)   // merge sets across partitions

On Tue, Jan 5, 2016 at 2:46 PM, Gavin Yue <yue.yuany...@gmail.com> wrote:
> Hey,
>
> For example, a table df with two columns
> id name
> 1 abc
> 1 bdf
> 2 ab
> 2 cd
>
> I want to group by the id and concat the string into array of string. like
> this
>
> id
> 1 [abc,bdf]
> 2 [ab, cd]
>
> How could I achieve this in dataframe? I stuck on df.groupBy("id"). ???
>
> Thanks
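
For the DataFrame form of the question quoted above, here is a minimal sketch rather than a definitive answer. It assumes Spark 2.0 or later (for SparkSession) and uses collect_list from org.apache.spark.sql.functions; the object name GroupNamesById, the helper groupNames, and the output column name "names" are made up for illustration. Note that, unlike the mutable Set in the RDD version, collect_list keeps duplicates (collect_set would drop them), and the order of names within each array is not guaranteed.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.sql.functions.collect_list

object GroupNamesById {
  // Groups rows by "id" and gathers each group's "name" values into an array column.
  // "names" is a hypothetical output column name chosen for this sketch.
  def groupNames(df: DataFrame): DataFrame =
    df.groupBy("id").agg(collect_list("name").as("names"))

  def main(args: Array[String]): Unit = {
    // Assumption: a local session is enough for trying this out; the app name is arbitrary.
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("group-names")
      .getOrCreate()
    import spark.implicits._

    // The sample table from the question.
    val df = Seq((1, "abc"), (1, "bdf"), (2, "ab"), (2, "cd")).toDF("id", "name")

    // Prints one row per id with an array of its names.
    groupNames(df).show()

    spark.stop()
  }
}
```

On older releases such as 1.6, collect_list may only be usable through a HiveContext, so treat the session setup above as an assumption to adapt.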