Some information on Hadoop Sort

2010-02-19 Thread aa225
Hello,
  I was wondering if someone could give me some information on how Hadoop does
the sorting. From what I have read, there does not seem to be a map class or a
reduce class. Where and how is the sorting parallelized?


Best Regards from Buffalo

Abhishek Agrawal

SUNY- Buffalo
(716-435-7122)





Re: Some information on Hadoop Sort

2010-02-19 Thread Gang Luo
Hi,
The sorting is done by the MapReduce framework itself. On the map side, each
output record first goes into a sort buffer, where partitioning, sorting, and
combining (if a combiner is configured) happen. If necessary (for example, when
the output does not fit in the buffer), multi-phase sorting is done so that each
map task produces a single sorted result. On the reduce side, the data from
multiple map tasks is merged; since each map output is already sorted, only a
merge sort is needed here. The merge also takes multiple rounds if necessary.

-Gang



- Original Message -
From: "aa...@buffalo.edu"
To: common-user@hadoop.apache.org
Sent: 2010/2/19 (Fri) 2:25:50 PM
Subject: Some information on Hadoop Sort

Hello,
  I was wondering if someone could give me some information on how Hadoop does
the sorting. From what I have read, there does not seem to be a map class or a
reduce class. Where and how is the sorting parallelized?


Best Regards from Buffalo

Abhishek Agrawal

SUNY- Buffalo
(716-435-7122)




Re: Re: Some information on Hadoop Sort

2010-02-27 Thread aa225
Hi,
   Which file contains all this code? I am looking at the file TeraSort.java,
but it contains code for creating a trie. Similarly, the file MergeSort.java
contains code for a simple sequential merge sort. Where is this MapReduce code?


Best Regards from Buffalo

Abhishek Agrawal

SUNY- Buffalo
(716-435-7122)

On Fri 02/19/10  5:06 PM , Gang Luo lgpub...@yahoo.com.cn sent:
> Hi,
> The sorting is done by the MapReduce framework itself. On the map side, each
> output record first goes into a sort buffer, where partitioning, sorting, and
> combining (if a combiner is configured) happen. If necessary, multi-phase
> sorting is done so that each map task produces a single sorted result. On the
> reduce side, the data from multiple map tasks is merged; since each map
> output is already sorted, only a merge sort is needed here. The merge also
> takes multiple rounds if necessary.
> -Gang