[ http://issues.apache.org/jira/browse/HADOOP-816?page=comments#action_12457852 ]
            
Owen O'Malley commented on HADOOP-816:
--------------------------------------

Why did you introduce map.buffer.size.mb to replace io.sort.mb? Is it just a 
better name, or do you intend to use io.sort.mb as the buffer size in the 
reduce?

I'm not sure that making the sort adaptive to the block size is right. There 
are many reasonable configurations in which the JVM heap size for the tasks is 
smaller than the block size.

Since the defaults are currently:
  dfs block size: 64 MB
  map buffer:     100 MB

I think we are ok, although I guess it would be reasonable to print a LOG.warn 
if the block size is bigger than the buffer size, since that will lead to 
lower performance.
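
A minimal sketch of the warning check suggested above, not taken from 
816.patch. The class and method names are hypothetical, and it uses 
java.util.logging rather than Hadoop's own LOG so that it compiles on its own; 
in the framework the two sizes would come from the job configuration.

```java
import java.util.logging.Logger;

// Hypothetical helper; names are illustrative and not part of Hadoop.
class BufferSizeCheck {
    private static final Logger LOG =
        Logger.getLogger(BufferSizeCheck.class.getName());

    /** True when one DFS block is strictly larger than the map buffer. */
    public static boolean blockExceedsBuffer(long blockSizeBytes, int bufferMb) {
        // Widen to long before multiplying so large buffers do not overflow.
        long bufferBytes = (long) bufferMb * 1024 * 1024;
        return blockSizeBytes > bufferBytes;
    }

    /** Logs a warning if the block does not fit; returns whether it warned. */
    public static boolean warnIfBlockExceedsBuffer(long blockSizeBytes,
                                                   int bufferMb) {
        boolean exceeds = blockExceedsBuffer(blockSizeBytes, bufferMb);
        if (exceeds) {
            LOG.warning("DFS block size (" + blockSizeBytes + " bytes) is larger"
                + " than the map buffer (" + bufferMb + " MB); expect lower"
                + " performance");
        }
        return exceeds;
    }
}
```

With the current defaults (64 MB block, 100 MB buffer) the check stays silent; 
a 128 MB block with the same buffer would trigger the warning.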

> Allow the sort benchmark to set a buffersize for the map buffer
> ---------------------------------------------------------------
>
>                 Key: HADOOP-816
>                 URL: http://issues.apache.org/jira/browse/HADOOP-816
>             Project: Hadoop
>          Issue Type: Improvement
>          Components: mapred
>            Reporter: Devaraj Das
>         Assigned To: Devaraj Das
>         Attachments: 816.patch
>
>
> Discovered that framework merges are the hotspots where most time is spent in 
> the sort benchmark. With HADOOP-331, the Map phase could potentially do a 
> merge of the spills (this merge was not done pre-HADOOP-331), and then there 
> is one compulsory merge on each reduce. It may be good to avoid the merge in 
> the Map phase, if possible.

-- 
This message is automatically generated by JIRA.
-
If you think it was sent incorrectly contact one of the administrators: 
http://issues.apache.org/jira/secure/Administrators.jspa
-
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
