Distributed sorting using Hadoop

2011-11-26 Thread madhu_sushmi

Hi,
I need to implement distributed sorting using Hadoop. I am quite new to
Hadoop and I am getting confused. If I want to implement merge sort, what
should my map and reduce functions be doing? Should all the sorting happen
on the reduce side?

Please help. This is an urgent requirement. Please guide me.

Thanks,
Madhu
-- 
View this message in context: 
http://old.nabble.com/Distributed-sorting-using-Hadoop-tp32876786p32876786.html
Sent from the Hadoop core-user mailing list archive at Nabble.com.



Re: Distributed sorting using Hadoop

2011-11-26 Thread Prashant Sharma
Please see my mail on common-dev.

Also, please do not send the same mail to every mailing list; be patient
and give people time to reply.



Re: Distributed sorting using Hadoop

2011-11-29 Thread Chris Smith
Madhu,

Try working your way through the MapReduce tutorial here:
http://hadoop.apache.org/common/docs/r0.20.205.0/mapred_tutorial.html#Example%3A+WordCount+v1.0
That tutorial covers most of the concepts you need for a distributed
sort.
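
To make the sort question concrete: Hadoop's shuffle already sorts map
output by key within each reduce partition (sorted on the map side, merged
on the reduce side), so a common approach is an identity map and reduce
plus a partitioner that sends ordered key ranges to the reducers. Below is
a minimal pure-Python simulation of that idea. It does not use any Hadoop
API, and the function names and split points are made up for illustration.

```python
from bisect import bisect_right


def map_fn(record):
    # Identity map: the record itself becomes the key, with no value.
    yield record, None


def range_partition(key, split_points):
    # Hypothetical range partitioner: keys below split_points[0] go to
    # reducer 0, keys in [split_points[0], split_points[1]) to reducer 1, ...
    return bisect_right(split_points, key)


def distributed_sort(records, split_points):
    num_reducers = len(split_points) + 1
    partitions = [[] for _ in range(num_reducers)]

    # Map phase: route every emitted key to the reducer owning its range.
    for record in records:
        for key, value in map_fn(record):
            partitions[range_partition(key, split_points)].append((key, value))

    output = []
    for part in partitions:
        # Stand-in for the framework's sort/merge of keys during the shuffle.
        part.sort(key=lambda kv: kv[0])
        # Identity reduce: emit keys in the order they arrive.
        output.extend(key for key, _ in part)
    # Concatenating reducer outputs in partition order gives a total order.
    return output


if __name__ == "__main__":
    print(distributed_sort([42, 7, 19, 88, 3, 56, 7], split_points=[10, 50]))
```

In this sketch neither map nor reduce does any sorting work itself; the
global order comes from the range partitioning plus the per-partition sort.
Choosing split points that balance the reducers (for example by sampling
the input) is left out here.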

Search for the worf, "combiner", in the tutorial to understand about
combining results using the Mapper - to reduce cross cluster traffic.
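
As a rough illustration of the combiner idea, in the spirit of the
tutorial's WordCount example: a combiner applies a reduce-like aggregation
to each mapper's local output before the shuffle, so fewer records cross
the network. The sketch below is plain Python, not the Hadoop API, and all
names in it are hypothetical.

```python
from collections import Counter


def map_words(line):
    # Map: emit (word, 1) for every word in the line.
    return [(word, 1) for word in line.split()]


def combine(pairs):
    # Combiner: sum counts per word within a single mapper's output.
    totals = Counter()
    for word, count in pairs:
        totals[word] += count
    return sorted(totals.items())


def reduce_counts(pairs):
    # Reduce: sum counts per word across all mappers.
    totals = Counter()
    for word, count in pairs:
        totals[word] += count
    return dict(totals)


def run(splits, use_combiner):
    # Each element of `splits` is the list of lines seen by one mapper.
    shuffled = []
    for split in splits:
        pairs = [pair for line in split for pair in map_words(line)]
        if use_combiner:
            pairs = combine(pairs)
        shuffled.extend(pairs)  # records that would cross the network
    return reduce_counts(shuffled), len(shuffled)


if __name__ == "__main__":
    splits = [["a b a", "a"], ["b b"]]
    print(run(splits, use_combiner=False))
    print(run(splits, use_combiner=True))
```

Both runs produce the same counts; only the number of shuffled records
differs. A combiner helps with aggregations like this one. For a plain
sort there is nothing to aggregate, so it matters less there.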

Also work your way through several of the tutorials and videos on
working with Hadoop - Google is your friend here.

Another good source on the general algorithms is Jimmy Lin's book,
referenced on this page:
http://www.umiacs.umd.edu/~jimmylin/book.html

Regards,

Chris



Re: Distributed sorting using Hadoop

2011-11-29 Thread Alex Gauthier
worf. :)

Agreed. The tutorial is complete enough to get you started.

Good luck.

On Tue, Nov 29, 2011 at 7:59 AM, Chris Smith  wrote:

> Madhu,
>
> Try working your way through the MapReduce tutorial here:
>
> http://hadoop.apache.org/common/docs/r0.20.205.0/mapred_tutorial.html#Example%3A+WordCount+v1.0
>  that covers most of the concepts you require to do a distributed
> sort.
>
> Search for the *worf*, "combiner", in the tutorial to understand about
> combining results using the Mapper - to reduce cross cluster traffic.