My friend suggested this approach:

val fil = sc.textFile("hdfs:///user/vijayc/data/test-spk.tx")

val res = fil
  .map(l => l.split(","))
  .map(l => (l(0), l(1)))
  .groupByKey
  .map(rd => (rd._1, rd._2.toList.distinct))


Another useful function is *collect_set* in the DataFrame API.
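
For reference, here is a rough sketch of how collect_set could be used for the example in the original question. It assumes Spark 2.0 or later (for SparkSession), and the column names "key" and "value", the object name, and the inline sample rows are placeholders I made up for illustration. They are not taken from the real file.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.collect_set

// Hypothetical standalone sketch; names below are illustrative assumptions.
object CollectSetSketch {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("collect-set-sketch")
      .master("local[*]") // local mode, only for trying this out
      .getOrCreate()
    import spark.implicits._

    // Sample rows copied from the example in the question.
    val df = Seq(
      ("sel1", "test"),
      ("sel1", "test"),
      ("sel1", "ok"),
      ("sel2", "ok"),
      ("sel2", "test")
    ).toDF("key", "value")

    // collect_set gathers the distinct values of each group into an array column.
    val result = df.groupBy("key").agg(collect_set("value").as("values"))

    result.show(false)
    spark.stop()
  }
}
```

Note that collect_set does not guarantee the order of the elements inside each array, so the values may not appear in the order they were read.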


Thanks,

selvam R

On Tue, Aug 9, 2016 at 4:19 PM, Selvam Raman <sel...@gmail.com> wrote:

> Example:
>
> sel1 test
> sel1 test
> sel1 ok
> sel2 ok
> sel2 test
>
>
> expected result:
>
> sel1, [test,ok]
> sel2, [test,ok]
>
> How can I achieve the above result using a Spark DataFrame?
>
> Please suggest an approach.
> --
> Selvam Raman
> "லஞ்சம் தவிர்த்து நெஞ்சம் நிமிர்த்து"
>



-- 
Selvam Raman
"லஞ்சம் தவிர்த்து நெஞ்சம் நிமிர்த்து"
