My friend suggested this approach:
val fil = sc.textFile("hdfs:///user/vijayc/data/test-spk.tx")
// Split each line into a (key, value) pair, group by key, keep the distinct values per key
val res = fil.map(l => l.split(",")).
  map(l => (l(0), l(1))).
  groupByKey.
  map(rd => (rd._1, rd._2.toList.distinct))
Another useful function is *collect_set*, a DataFrame aggregate function that gathers the distinct values of a column within each group.
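A rough sketch of the collect_set route is below, in case it helps. It assumes Spark 2.x or later (for SparkSession), a local master, and in-memory sample rows instead of the HDFS file. The object name, the helper distinctValuesPerKey, and the column names "key", "value" and "values" are made up for illustration.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.sql.functions.collect_set

object CollectSetSketch {
  // Hypothetical helper: one output row per key, holding the set of
  // distinct values seen for that key. Column names are assumptions.
  def distinctValuesPerKey(df: DataFrame): DataFrame =
    df.groupBy("key").agg(collect_set("value").as("values"))

  def main(args: Array[String]): Unit = {
    // Assumption: local mode, just for trying the example out.
    val spark = SparkSession.builder()
      .appName("collect-set-sketch")
      .master("local[*]")
      .getOrCreate()
    import spark.implicits._

    // Sample rows taken from the example in the question.
    val df = Seq(
      ("sel1", "test"),
      ("sel1", "test"),
      ("sel1", "ok"),
      ("sel2", "ok"),
      ("sel2", "test")
    ).toDF("key", "value")

    // Prints each key with an array of its distinct values.
    distinctValuesPerKey(df).show(false)

    spark.stop()
  }
}
```

Note that collect_set does not guarantee the order of the elements inside each array, so [test, ok] may come back as [ok, test].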
Thanks,
selvam R
On Tue, Aug 9, 2016
Example:
sel1 test
sel1 test
sel1 ok
sel2 ok
sel2 test
Expected result:
sel1, [test, ok]
sel2, [test, ok]
How can I achieve the above result using a Spark DataFrame?
Please advise.
--
Selvam Raman
"Shun bribery, hold your head high"