[ 
https://issues.apache.org/jira/browse/MAPREDUCE-712?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12727797#action_12727797
 ] 

Khaled Elmeleegy commented on MAPREDUCE-712:
--------------------------------------------

Well, top reports that ~170% CPU (~1.7 CPUs) is spent in the data node, which
makes sense, as it's receiving all these writes. The rest of the time is
distributed evenly among the tasks (maps). That part doesn't sound right: too
much fat.

One thing to add: with a replication factor of 3, the bottleneck shifts to
the network. No surprise there.
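
A rough sketch of why the bottleneck moves, assuming the writing task runs on
a node that also hosts a data node, so the first replica is written locally and
each additional replica crosses the network once. The function name and the
1 GiB figure are illustrative only and are not taken from the job.

```python
def replica_network_bytes(bytes_written: int, replication: int) -> int:
    """Bytes sent over the network for one write.

    Assumption: the first replica lands on the local data node, so only
    the remaining (replication - 1) copies travel over the network.
    """
    if replication < 1:
        raise ValueError("replication must be at least 1")
    return bytes_written * (replication - 1)


GIB = 1024 ** 3  # illustrative write size, not a measured value

for replication in (1, 3):
    sent = replica_network_bytes(GIB, replication)
    print(f"replication={replication}: {sent / GIB:.0f} GiB over the network per GiB written")
```

Under that assumption, a replication factor of 1 sends nothing over the
network, while a factor of 3 sends two bytes for every byte written, which
would fit the network becoming the limit.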






> TextWritter example is CPU bound!!
> ----------------------------------
>
>                 Key: MAPREDUCE-712
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-712
>             Project: Hadoop Map/Reduce
>          Issue Type: Improvement
>          Components: task
>    Affects Versions: 0.20.1, 0.21.0
>         Environment: ~200 nodes cluster
> Each node has the following configuration:
> Processors: 2 x Xeon L5420 2.50GHz (8 cores) - Harpertown C0, 64-bit,
> quad-core (8 CPUs)
> 4 Disks
> 16 GB RAM
> Linux 2.6
> Hadoop version: trunk
>            Reporter: Khaled Elmeleegy
>
> Running the RandomTextWriter example job (from the examples jar) pegs the
> machines' CPUs.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
