[jira] Commented: (HADOOP-4396) sort on 400 nodes is now slower than in 18

Jothi Padmanabhan (JIRA) Thu, 16 Oct 2008 22:27:10 -0700

    [ https://issues.apache.org/jira/browse/HADOOP-4396?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12640433#action_12640433 ]


Jothi Padmanabhan commented on HADOOP-4396:
-------------------------------------------

OK, this might be a non-issue after all.

All my tests have been run with mapred.reduce.parallel.copies=60 and 
tasktracker.http.threads=100. This does not appear to be the ideal 
configuration for the cluster: Runping let me know that he uses 
mapred.reduce.parallel.copies=30 and tasktracker.http.threads=50. With his 
configuration, sort took the same time as on 18, and gridmix completed in 
40+ minutes, which is a reasonable time.
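
For reference, a minimal sketch of how the two values Runping uses could be 
set in the site configuration. The choice of hadoop-site.xml as the file is 
an assumption about the deployment; the property names and values are the 
ones quoted above. tasktracker.http.threads is a TaskTracker-side setting, 
so it presumably needs to be present on every node and picked up at 
TaskTracker startup.

```xml
<?xml version="1.0"?>
<!-- Sketch only: assumes the overrides live in conf/hadoop-site.xml -->
<configuration>
  <!-- Number of map outputs each reducer fetches in parallel during shuffle -->
  <property>
    <name>mapred.reduce.parallel.copies</name>
    <value>30</value>
  </property>
  <!-- Number of HTTP worker threads each TaskTracker uses to serve map outputs -->
  <property>
    <name>tasktracker.http.threads</name>
    <value>50</value>
  </property>
</configuration>
```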

With mapred.reduce.parallel.copies=60 and tasktracker.http.threads=100, the 
load on the disks of the individual nodes is clearly fairly high towards the 
end of the map phase, because the reducers are pulling in data from many more 
maps in parallel and possibly shuffling it to disk. This seems to be causing 
the stragglers that we observed. However, slowing down the maps by having them 
write in small chunks seems to mitigate the problem somehow. We observed this 
both with the LocalFileSystem and when breaking the writes into chunks with 
the RawLocalFileSystem.

> sort on 400 nodes is now slower than in 18
> ------------------------------------------
>
>                 Key: HADOOP-4396
>                 URL: https://issues.apache.org/jira/browse/HADOOP-4396
>             Project: Hadoop Core
>          Issue Type: Bug
>          Components: mapred
>    Affects Versions: 0.19.0
>            Reporter: Jothi Padmanabhan
>            Assignee: Jothi Padmanabhan
>            Priority: Blocker
>             Fix For: 0.19.0
>
>         Attachments: 4396-v3.patch
>
>
> Sort on 400 nodes on Hadoop release 18 takes about 29 minutes, but with the 
> 19 branch it takes about 32 minutes. This behavior is consistent.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
