This has been fixed in HADOOP-3940

On 9/4/08 6:36 PM, "Espen Amble Kolstad" <[EMAIL PROTECTED]> wrote:

> I have the same problem on our cluster.
>
> It seems the reducer tasks are using all CPU, long before there's
> anything to shuffle.
>
> I started a profile of the reduce task. I've attached the profiling
> output. It seems from the samples that ramManager.waitForDataToMerge()
> doesn't actually wait.
> Has anybody seen this behavior?
>
> Espen
>
> On Thursday 28 August 2008 06:11:42 wangxu wrote:
>> Hi all,
>> I am using hadoop-0.18.0-core.jar and nutch-2008-08-18_04-01-55.jar,
>> running Hadoop with one namenode and 4 slaves.
>> Attached is my hadoop-site.xml; I didn't change the file
>> hadoop-default.xml.
>>
>> When the data in segments is large, this kind of error occurs:
>>
>> java.io.IOException: Could not obtain block: blk_-2634319951074439134_1129 file=/user/root/crawl_debug/segments/20080825053518/content/part-00002/data
>>   at org.apache.hadoop.dfs.DFSClient$DFSInputStream.chooseDataNode(DFSClient.java:1462)
>>   at org.apache.hadoop.dfs.DFSClient$DFSInputStream.blockSeekTo(DFSClient.java:1312)
>>   at org.apache.hadoop.dfs.DFSClient$DFSInputStream.read(DFSClient.java:1417)
>>   at java.io.DataInputStream.readFully(DataInputStream.java:178)
>>   at org.apache.hadoop.io.DataOutputBuffer$Buffer.write(DataOutputBuffer.java:64)
>>   at org.apache.hadoop.io.DataOutputBuffer.write(DataOutputBuffer.java:102)
>>   at org.apache.hadoop.io.SequenceFile$Reader.readBuffer(SequenceFile.java:1646)
>>   at org.apache.hadoop.io.SequenceFile$Reader.seekToCurrentValue(SequenceFile.java:1712)
>>   at org.apache.hadoop.io.SequenceFile$Reader.getCurrentValue(SequenceFile.java:1787)
>>   at org.apache.hadoop.mapred.SequenceFileRecordReader.getCurrentValue(SequenceFileRecordReader.java:104)
>>   at org.apache.hadoop.mapred.SequenceFileRecordReader.next(SequenceFileRecordReader.java:79)
>>   at org.apache.hadoop.mapred.join.WrappedRecordReader.next(WrappedRecordReader.java:112)
>>   at org.apache.hadoop.mapred.join.WrappedRecordReader.accept(WrappedRecordReader.java:130)
>>   at org.apache.hadoop.mapred.join.CompositeRecordReader.fillJoinCollector(CompositeRecordReader.java:398)
>>   at org.apache.hadoop.mapred.join.JoinRecordReader.next(JoinRecordReader.java:56)
>>   at org.apache.hadoop.mapred.join.JoinRecordReader.next(JoinRecordReader.java:33)
>>   at org.apache.hadoop.mapred.MapTask$TrackedRecordReader.next(MapTask.java:165)
>>   at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:45)
>>   at org.apache.hadoop.mapred.MapTask.run(MapTask.java:227)
>>   at org.apache.hadoop.mapred.TaskTracker$Child.main(TaskTracker.java:2209)
>>
>> How can I correct this?
>> Thanks,
>> Xu
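For readers hitting the "reducers spinning on CPU" symptom above: the profile points at a wait method that returns without blocking. This is not Hadoop's actual code, but a minimal sketch of the guarded-wait pattern such a method is expected to follow — the predicate is re-checked in a loop around `wait()`, so the thread sleeps instead of spinning until a producer signals it. The class name, field names, and threshold are all illustrative assumptions.

```java
// Hypothetical sketch of a correct blocking wait-for-data, in the style
// of the ramManager.waitForDataToMerge() call discussed in the thread.
public class MergeWaiter {
    private int bytesBuffered = 0;      // data accumulated so far (illustrative)
    private final int threshold;        // bytes needed before a merge starts

    public MergeWaiter(int threshold) {
        this.threshold = threshold;
    }

    // Blocks (releasing the monitor) until enough data has arrived.
    // A version that skips the while-loop guard, or never calls wait(),
    // returns immediately and produces exactly the busy-CPU profile
    // reported above.
    public synchronized void waitForDataToMerge() throws InterruptedException {
        while (bytesBuffered < threshold) {
            wait();                     // sleep until notifyAll(); no spinning
        }
    }

    // Called by the producer side as shuffle data arrives.
    public synchronized void add(int bytes) {
        bytesBuffered += bytes;
        if (bytesBuffered >= threshold) {
            notifyAll();                // wake any threads blocked in wait()
        }
    }

    public synchronized int buffered() {
        return bytesBuffered;
    }
}
```

The upstream fix for the real bug is tracked in HADOOP-3940, as noted in the reply; the sketch only illustrates why a missing guard shows up as a hot thread in a profiler.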
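For the "Could not obtain block" IOException in the original post, a reasonable first diagnostic (assuming shell access to a node with the Hadoop client configured for this cluster) is to ask HDFS whether the file's blocks are actually present and replicated; the path below is taken from the stack trace:

```shell
# If fsck reports missing or corrupt blocks, the failure is a DataNode /
# replication problem, not a bug in the join code shown in the trace.
hadoop fsck /user/root/crawl_debug/segments/20080825053518/content/part-00002/data \
  -files -blocks -locations
```

Under-replicated or missing blocks on a 4-slave cluster often trace back to DataNodes that were down or out of disk space while the segment was written.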