Hi,

Thank you all very much for your replies.
I was able to get it working with groupByKey. Here is my code:

    import au.com.bytecode.opencsv.CSVParser

    val data = sc.textFile("/data/data.csv")

    // Parse each CSV line into a (key, value) pair, creating one parser per partition
    def pLines(lines: Iterator[String]) = {
      val parser = new CSVParser()
      lines.map { l =>
        val vs = parser.parseLine(l)
        (vs(0), vs(1).toInt)
      }
    }

    val result = data.mapPartitions(pLines).groupByKey.collect

Thanks
Amit

On Fri, Dec 26, 2014 at 2:18 PM, Sean Owen <so...@cloudera.com> wrote:

> This does not appear to be what the asker wanted as this makes one big
> string. groupByKey is correct after parsing to key value pairs.
>
> On Dec 26, 2014 3:55 AM, "Somnath Pandeya" <somnath_pand...@infosys.com>
> wrote:
>
>> Hi,
>>
>> You can try reduceByKey also,
>>
>> Something like this
>>
>>     JavaPairRDD<String, String> ones = lines
>>         .mapToPair(new PairFunction<String, String, String>() {
>>             @Override
>>             public Tuple2<String, String> call(String s) {
>>                 String[] temp = s.split(",");
>>                 return new Tuple2<String, String>(temp[0], temp[1]);
>>             }
>>         });
>>
>>     JavaPairRDD<String, String> counts = ones
>>         .reduceByKey(new Function2<String, String, String>() {
>>             @Override
>>             public String call(String i1, String i2) {
>>                 return i1 + "," + i2;
>>             }
>>         });
>>
>> From: Tobias Pfeiffer [mailto:t...@preferred.jp]
>> Sent: Friday, December 26, 2014 6:35 AM
>> To: Amit Behera
>> Cc: u...@spark.incubator.apache.org
>> Subject: Re: unable to do group by with 1st column
>>
>> Hi,
>>
>> On Fri, Dec 26, 2014 at 5:22 AM, Amit Behera <amit.bd...@gmail.com>
>> wrote:
>>
>> How can I do it? Please help me to do.
>>
>> Have you considered using groupByKey?
>>
>> http://spark.apache.org/docs/latest/programming-guide.html#transformations
>>
>> Tobias
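
For anyone reading this thread later, the difference Sean points out between the two suggestions can be seen on a small local list. The following is only a rough sketch in plain Java collections, not Spark code: the class name GroupByFirstColumn, the helper methods groupValues and concatValues, and the sample rows are all hypothetical, and the naive split(",") assumes there are no quoted fields (which is what CSVParser handles in the real job).

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical local illustration; none of these names come from Spark.
class GroupByFirstColumn {

    // Analogous to groupByKey: every value for a key is kept in a list.
    static Map<String, List<Integer>> groupValues(List<String> rows) {
        Map<String, List<Integer>> grouped = new TreeMap<>();
        for (String row : rows) {
            String[] cols = row.split(","); // assumption: no quoted fields
            grouped.computeIfAbsent(cols[0], k -> new ArrayList<>())
                   .add(Integer.parseInt(cols[1].trim()));
        }
        return grouped;
    }

    // Analogous to reduceByKey with string concatenation: the values for
    // a key are merged into one comma-separated string.
    static Map<String, String> concatValues(List<String> rows) {
        Map<String, String> joined = new TreeMap<>();
        for (String row : rows) {
            String[] cols = row.split(","); // assumption: no quoted fields
            joined.merge(cols[0], cols[1].trim(), (a, b) -> a + "," + b);
        }
        return joined;
    }

    public static void main(String[] args) {
        List<String> rows = Arrays.asList("a,1", "b,2", "a,3");
        System.out.println(groupValues(rows));  // {a=[1, 3], b=[2]}
        System.out.println(concatValues(rows)); // {a=1,3, b=2}
    }
}
```

The grouped version keeps the values as separate integers that can still be summed or counted, while the concatenated version leaves a single string per key that would have to be split and parsed again.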