Hi,

Thank you all very much for your replies.

I was able to get it working with groupByKey.

Here is my code:

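import au.com.bytecode.opencsv.CSVParser

val data = sc.textFile("/data/data.csv")

// Parse each partition with one CSVParser instance instead of one per line.
def pLines(lines: Iterator[String]): Iterator[(String, Int)] = {
  val parser = new CSVParser()
  lines.map { l =>
    val vs = parser.parseLine(l)
    (vs(0), vs(1).toInt) // (first column as key, second column as Int value)
  }
}

// Group the values by the first column and collect the result on the driver.
val result = data.mapPartitions(pLines).groupByKey.collect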
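
For anyone comparing the two approaches discussed in this thread, here is a rough sketch of how the result shapes differ. It uses plain Scala collections rather than Spark, so it only approximates what groupByKey and a string-concatenating reduceByKey would return on an RDD. The object name `GroupingSketch`, the helper names `groupValues` and `concatValues`, and the sample rows are all made up for illustration.

```scala
object GroupingSketch {
  // Analogue of groupByKey: each key maps to the collection of its values.
  def groupValues(rows: Seq[(String, Int)]): Map[String, Seq[Int]] =
    rows.groupBy(_._1).map { case (k, kvs) => (k, kvs.map(_._2)) }

  // Analogue of reduceByKey with string concatenation:
  // each key maps to a single comma-joined string.
  def concatValues(rows: Seq[(String, Int)]): Map[String, String] =
    rows.groupBy(_._1).map { case (k, kvs) =>
      (k, kvs.map(_._2.toString).reduce(_ + "," + _))
    }

  def main(args: Array[String]): Unit = {
    // Hypothetical sample rows standing in for the parsed CSV (key, value) pairs.
    val rows = Seq(("a", 1), ("b", 2), ("a", 3))
    println(groupValues(rows)("a"))  // values stay separate Ints
    println(concatValues(rows)("a")) // values are merged into one String
  }
}
```

With the grouped form the values remain individual Ints that can be summed or counted later, whereas the concatenated form would have to be split and parsed again.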

Thanks
Amit

On Fri, Dec 26, 2014 at 2:18 PM, Sean Owen <so...@cloudera.com> wrote:

> This does not appear to be what the asker wanted, as this makes one big
> string. groupByKey is correct after parsing to key-value pairs.
> On Dec 26, 2014 3:55 AM, "Somnath Pandeya" <somnath_pand...@infosys.com>
> wrote:
>
>> Hi,
>>
>> You can also try reduceByKey,
>>
>> something like this:
>>
>> JavaPairRDD<String, String> ones = lines
>>     .mapToPair(new PairFunction<String, String, String>() {
>>         @Override
>>         public Tuple2<String, String> call(String s) {
>>             String[] temp = s.split(",");
>>             return new Tuple2<String, String>(temp[0], temp[1]);
>>         }
>>     });
>>
>> JavaPairRDD<String, String> counts = ones
>>     .reduceByKey(new Function2<String, String, String>() {
>>         @Override
>>         public String call(String i1, String i2) {
>>             return i1 + "," + i2;
>>         }
>>     });
>>
>>
>>
>> From: Tobias Pfeiffer [mailto:t...@preferred.jp]
>> Sent: Friday, December 26, 2014 6:35 AM
>> To: Amit Behera
>> Cc: u...@spark.incubator.apache.org
>> Subject: Re: unable to do group by with 1st column
>>
>>
>>
>> Hi,
>>
>>
>>
>> On Fri, Dec 26, 2014 at 5:22 AM, Amit Behera <amit.bd...@gmail.com>
>> wrote:
>>
>>  How can I do it? Please help me do it.
>>
>>
>>
>> Have you considered using groupByKey?
>>
>> http://spark.apache.org/docs/latest/programming-guide.html#transformations
>>
>>
>>
>> Tobias
>>
