RE: sort at reduce side

Srigurunath Chakravarthi Mon, 15 Feb 2010 02:12:41 -0800

>So, after shuffle at reduce side,  are the spills actually stored as map
>files?


Yes.

When a reducer has received fs.inmemory.size.mb worth of map output from 
multiple maps, it sorts the data and spills it to disk.
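
To make the buffer-then-spill step concrete, here is a minimal Python sketch. It is not Hadoop's actual code: the function name `shuffle_with_spills`, the size estimate based on `repr`, and the pickle file format are all assumptions made for illustration.

```python
import os
import pickle


def shuffle_with_spills(segments, buffer_limit_bytes, spill_dir):
    """Buffer fetched map-output segments; sort and spill when the buffer fills.

    segments: iterable of lists of (key, value) pairs, one list per map output.
    Returns the paths of the spill files, each holding one sorted run.
    """
    buffered = []
    buffered_bytes = 0
    spills = []

    def spill():
        nonlocal buffered, buffered_bytes
        if not buffered:
            return
        buffered.sort(key=lambda kv: kv[0])  # sort the buffered records by key
        path = os.path.join(spill_dir, f"spill-{len(spills)}.bin")
        with open(path, "wb") as f:
            pickle.dump(buffered, f)
        spills.append(path)
        buffered, buffered_bytes = [], 0

    for segment in segments:
        for kv in segment:
            buffered.append(kv)
            buffered_bytes += len(repr(kv))  # crude size estimate (assumption)
        # Check the limit after each fetched segment, not after each record.
        if buffered_bytes >= buffer_limit_bytes:
            spill()

    spill()  # flush whatever is left so all data ends up on disk
    return spills
```

The limit is checked once per fetched segment, so a spill can hold slightly more than the configured size; that is a simplification of this sketch.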

If the number of map output files spilled to disk exceeds io.sort.factor, 
additional merge step(s) are performed to bring the file count back within 
that limit.
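
The multi-pass merge can be sketched as follows, assuming each spill is an in-memory list of (key, value) pairs already sorted by key. The name `merge_passes` and the fixed grouping of adjacent runs are assumptions for illustration; Hadoop's own merger chooses which files to merge differently.

```python
import heapq


def merge_passes(runs, merge_factor):
    """Merge sorted runs in groups until at most merge_factor runs remain.

    runs: list of lists of (key, value) pairs, each list sorted by key.
    Returns (remaining_runs, number_of_passes).
    """
    if merge_factor < 2:
        raise ValueError("merge_factor must be at least 2")
    runs = [list(r) for r in runs]
    passes = 0
    while len(runs) > merge_factor:
        merged = []
        # Merge adjacent groups of up to merge_factor runs into one run each.
        for i in range(0, len(runs), merge_factor):
            group = runs[i:i + merge_factor]
            merged.append(list(heapq.merge(*group, key=lambda kv: kv[0])))
        runs = merged
        passes += 1
    return runs, passes
```

With 7 single-record runs and a merge factor of 3, one pass leaves 3 runs; with a merge factor of 2, two passes are needed (7 to 4 to 2).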

Once all map output data has been spilled to disk, the reduce task starts 
invoking the reduce function, passing it key-value pairs in sorted order as 
it reads them from the above files.
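
As a rough illustration of this last step, the sketch below streams a final merge over the sorted runs and calls a reduce function once per key. The names `run_reduce` and `reduce_fn` are hypothetical, and the runs are plain in-memory lists rather than files on disk.

```python
import heapq
from itertools import groupby
from operator import itemgetter


def run_reduce(sorted_runs, reduce_fn):
    """Merge sorted runs lazily and invoke reduce_fn(key, values) per key.

    sorted_runs: list of iterables of (key, value) pairs, each sorted by key.
    reduce_fn: callable taking a key and an iterator over that key's values.
    """
    # heapq.merge reads from every run in step, so records come out in
    # global key order without loading all of the data at once.
    merged = heapq.merge(*sorted_runs, key=itemgetter(0))
    results = []
    for key, group in groupby(merged, key=itemgetter(0)):
        values = (value for _, value in group)
        results.append(reduce_fn(key, values))
    return results
```

For example, `run_reduce([[("a", 1), ("c", 3)], [("a", 2), ("b", 5)]], lambda k, vs: (k, sum(vs)))` sums the values per key across both runs.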

Regards,
Sriguru

>-----Original Message-----
>From: Gang Luo [mailto:lgpub...@yahoo.com.cn]
>Sent: Thursday, February 04, 2010 1:58 AM
>To: common-user@hadoop.apache.org
>Subject: Re: sort at reduce side
>
>Thanks for the reply, Sriguru.
>So, after shuffle at reduce side,  are the spills actually stored as map
>files?
>
>I ask these questions because of the following observations. On a 16-node
>cluster, a map-side join takes three and a half minutes. A reduce-side join
>on nearly the same amount of data takes 8 minutes before the map phase
>completes. I am sure the computation (the map function) cannot account for
>such a difference, so the extra 4 minutes can only be spent on sorting at
>the map side for the reduce-side join. I also notice that the sort time at
>the reduce side is only 30 sec (I cannot access the online jobtracker; the
>30 sec is the time the reduce takes to go from 33% to 66% completion). The
>number of reduce tasks is much smaller than the number of map tasks, which
>means each reduce task sorts more data than each map task (I use a hash
>partitioner and the data is uniformly distributed). The only explanation I
>can come up with for the big difference between the sort at the map side
>and the sort at the reduce side is that the two sorts behave differently.
>
>Does anybody have ideas on why the map phase takes so much longer for a
>reduce-side join than for a map-side join, and why there is such a big
>difference between the sort at the map side and at the reduce side?
>
>P.S. I join a 7.5G file with a 100M file. The sort buffer at the reduce
>side is slightly larger than that at the map side.
>
>
>-Gang
>
>
>
>----- Original Message ----
>From: Srigurunath Chakravarthi <srig...@yahoo-inc.com>
>To: "common-user@hadoop.apache.org" <common-user@hadoop.apache.org>
>Sent: 2010/2/3 (Wed) 12:50:08 AM
>Subject: RE: sort at reduce side
>
>Hi Gang,
>
>>kept in a map file. If so, to sort the data efficiently, the reducer
>>only reads the index part of each spill (which is a map file) and sorts
>>the keys, instead of reading whole records from disk and sorting them.
>
>AFAIK, no. Reduces always fetch map output data, not indexes (even if
>the data is from the local node, where an index might be sufficient).
>
>Regards,
>Sriguru
>
>>-----Original Message-----
>>From: Gang Luo [mailto:lgpub...@yahoo.com.cn]
>>Sent: Wednesday, February 03, 2010 10:40 AM
>>To: common-user@hadoop.apache.org
>>Subject: sort at reduce side
>>
>>Hi all,
>>I want to know some more details about the sorting at the reduce side.
>>
>>The intermediate result generated at the map side is stored as a map file,
>>which actually consists of two sub-files: an index file and a data file.
>>The index file stores the keys and can point to the corresponding record
>>stored in the data file. What I think is that when the intermediate result
>>(even only part of it for each mapper) is shuffled to the reducer, it is still
>>kept in a map file. If so, to sort the data efficiently, the reducer
>>only reads the index part of each spill (which is a map file) and sorts
>>the keys, instead of reading whole records from disk and sorting them.
>>
>>Does the reducer actually behave as I expect?
>>
>>-Gang
