[jira] Updated: (HADOOP-816) Allow the sort benchmark to set a buffersize for the map buffer

Devaraj Das (JIRA) Tue, 12 Dec 2006 03:06:43 -0800

     [ http://issues.apache.org/jira/browse/HADOOP-816?page=all ]


Devaraj Das updated HADOOP-816:
-------------------------------

    Attachment: 816.patch

Added code in Sort.java that looks up the DFS block size of a file in the 
input path, infers the map buffer size from that block size, and sets the 
corresponding config in the JobConf. This works because the sort benchmark 
spawns maps that each process one DFS block's worth of data.
Added a new config variable, map.buffer.size.mb, in MapTask.java to control 
the map buffer size (currently the buffer size is controlled by io.sort.mb). 
Decided to make this independent because io.sort.mb also has some bearing on 
the sizes of the buffers created while reading/writing files, and I wanted to 
keep those buffer sizes separate from the map buffer size.
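
To illustrate the idea in the comment above, here is a minimal sketch in plain 
Java. It is not the code from 816.patch. The class and method names are 
hypothetical, a plain Map stands in for the JobConf, and the sizing rule 
(buffer = block size rounded up to a whole MB) is an assumption, since the 
comment does not say how the buffer size is inferred. Only the key names 
map.buffer.size.mb and io.sort.mb come from the comment.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper; not part of Hadoop and not taken from 816.patch.
final class MapBufferSizing {
    // Key names as given in the JIRA comment.
    static final String MAP_BUFFER_KEY = "map.buffer.size.mb";
    static final String IO_SORT_KEY = "io.sort.mb";

    private static final long BYTES_PER_MB = 1L << 20;

    // Assumed rule: size the map buffer to hold one DFS block,
    // rounded up to a whole number of megabytes (minimum 1 MB).
    static int inferMapBufferMb(long blockSizeBytes) {
        if (blockSizeBytes < 0) {
            throw new IllegalArgumentException("negative block size");
        }
        long mb = (blockSizeBytes + BYTES_PER_MB - 1) / BYTES_PER_MB;
        return (int) Math.max(1L, Math.min(mb, (long) Integer.MAX_VALUE));
    }

    // Benchmark side: store the inferred size in the job configuration.
    static void configure(Map<String, String> conf, long blockSizeBytes) {
        conf.put(MAP_BUFFER_KEY, Integer.toString(inferMapBufferMb(blockSizeBytes)));
    }

    // Map task side: read the map buffer size from its own key, so that
    // changing io.sort.mb does not change the map buffer size.
    // The fallback default is supplied by the caller (an assumption here).
    static int mapBufferMb(Map<String, String> conf, int defaultMb) {
        String value = conf.get(MAP_BUFFER_KEY);
        return value == null ? defaultMb : Integer.parseInt(value);
    }

    public static void main(String[] args) {
        Map<String, String> conf = new HashMap<>();
        conf.put(IO_SORT_KEY, "100");          // left untouched by the sizing logic
        configure(conf, 64L * 1024 * 1024);    // pretend the input file uses 64 MB blocks
        System.out.println(MAP_BUFFER_KEY + "=" + mapBufferMb(conf, 100));
        System.out.println(IO_SORT_KEY + "=" + conf.get(IO_SORT_KEY));
    }
}
```

In the real benchmark the block size would come from the DFS metadata of an 
input file rather than a hard-coded constant, and the value would be set on 
the job's JobConf before submission.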

> Allow the sort benchmark to set a buffersize for the map buffer
> ---------------------------------------------------------------
>
>                 Key: HADOOP-816
>                 URL: http://issues.apache.org/jira/browse/HADOOP-816
>             Project: Hadoop
>          Issue Type: Improvement
>          Components: mapred
>            Reporter: Devaraj Das
>         Assigned To: Devaraj Das
>         Attachments: 816.patch
>
>
> Discovered that framework merges are the hotspots where most time is spent in 
> the sort benchmark. With HADOOP-331, the Map phase could potentially do a 
> merge of the spills (this merge was not done pre-HADOOP-331), and then there 
> is one compulsory merge on each reduce. It may be good to avoid the merge in 
> the Map phase, if possible.

-- 
This message is automatically generated by JIRA.
-
If you think it was sent incorrectly contact one of the administrators: 
http://issues.apache.org/jira/secure/Administrators.jspa
-
For more information on JIRA, see: http://www.atlassian.com/software/jira
