Hi,

You can also try reduceByKey. Something like this:

    // Key each line on its first column and keep the second column as the value.
    JavaPairRDD<String, String> ones = lines.mapToPair(
        new PairFunction<String, String, String>() {
          @Override
          public Tuple2<String, String> call(String s) {
            String[] temp = s.split(",");
            return new Tuple2<String, String>(temp[0], temp[1]);
          }
        });
    // Merge all values that share a key into one comma-separated string.
    JavaPairRDD<String, String> counts = ones.reduceByKey(
        new Function2<String, String, String>() {
          @Override
          public String call(String i1, String i2) {
            return i1 + "," + i2;
          }
        });

From: Tobias Pfeiffer [mailto:t...@preferred.jp]
Sent: Friday, December 26, 2014 6:35 AM
To: Amit Behera
Cc: u...@spark.incubator.apache.org
Subject: Re: unable to do group by with 1st column

Hi,

On Fri, Dec 26, 2014 at 5:22 AM, Amit Behera <amit.bd...@gmail.com> wrote:
> How can I do it? Please help me to do.

Have you considered using groupByKey?
http://spark.apache.org/docs/latest/programming-guide.html#transformations

Tobias
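P.S. To see what the mapToPair + reduceByKey steps in my reply produce without starting Spark, here is a minimal plain-Java sketch. It is only a stand-in and not Spark code: the class name GroupByFirstColumn, the method groupByFirstColumn, and the sample lines are made up for illustration, and it assumes each input line has the form "key,value". Unlike the Spark snippet, it skips lines without a second column instead of failing on temp[1].

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class GroupByFirstColumn {

    // Plain-Java stand-in for mapToPair + reduceByKey: key on column 0,
    // then merge the column-1 values of equal keys with a comma.
    static Map<String, String> groupByFirstColumn(List<String> lines) {
        Map<String, String> grouped = new LinkedHashMap<>();
        for (String line : lines) {
            String[] temp = line.split(",");
            if (temp.length < 2) {
                continue; // skip lines that have no second column
            }
            // Same merge function as the Function2 in the reduceByKey call.
            grouped.merge(temp[0], temp[1], (i1, i2) -> i1 + "," + i2);
        }
        return grouped;
    }

    public static void main(String[] args) {
        // Hypothetical sample input.
        List<String> lines = Arrays.asList("a,1", "b,2", "a,3");
        System.out.println(groupByFirstColumn(lines)); // prints {a=1,3, b=2}
    }
}
```

Here the values are joined in input order. In Spark the order in which reduceByKey combines values is not guaranteed, so the joined string may come out in a different order.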