[ https://issues.apache.org/jira/browse/MAPREDUCE-712?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12727808#action_12727808 ]
Arun C Murthy commented on MAPREDUCE-712:
-----------------------------------------

bq. I wonder how much time you were spending in gc with such small heaps. That might explain the cpu load.

Agreed. You have 5G of data per map (50TB/100k maps), which results in a significant number of output Text objects being created in RandomTextWriter (a potential bug). Thus, we'd get a lot of data out of the profiles of the tasks...

> TextWritter example is CPU bound!!
> ----------------------------------
>
>                 Key: MAPREDUCE-712
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-712
>             Project: Hadoop Map/Reduce
>          Issue Type: Improvement
>          Components: task
>    Affects Versions: 0.20.1, 0.21.0
>         Environment: ~200-node cluster
> Each node has the following configuration:
> Processors: 2 x Xeon L5420 2.50GHz (8 cores) - Harpertown C0, 64-bit, quad-core (8 CPUs)
> 4 Disks
> 16 GB RAM
> Linux 2.6
> Hadoop version: trunk
>            Reporter: Khaled Elmeleegy
>
> Running the RandomTextWriter example job (from the examples jar) pegs the machines' CPUs.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.