Weird! This looks like a different problem, one that occurred while merging
the map outputs at the Reduce task. The copying stage went through fine. This
requires some more analysis.
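For illustration, the NPE at FSDataInputStream$Buffer.seek in the traces below is consistent with seeking on a stream whose internal buffer has already been released. The sketch below is plain Java, not Hadoop's actual code; the class and field names (BufferedSeekable, buffer) are hypothetical stand-ins used only to reproduce the symptom.

```java
// Minimal sketch (plain Java, no Hadoop) of the suspected failure mode:
// seeking on a stream whose underlying buffer was released (nulled on close)
// dereferences null and throws NullPointerException.
public class SeekNpeSketch {
    // Hypothetical stand-in for a buffered, seekable input stream wrapper.
    static class BufferedSeekable {
        private byte[] buffer;  // released (set to null) on close
        private int pos;

        BufferedSeekable(byte[] data) { this.buffer = data; }

        void seek(long newPos) {
            // Dereferencing the released buffer is what produces the NPE.
            if (newPos < 0 || newPos > buffer.length)
                throw new IllegalArgumentException("bad position");
            pos = (int) newPos;
        }

        void close() { buffer = null; }  // simulates premature release
    }

    public static void main(String[] args) {
        BufferedSeekable in = new BufferedSeekable(new byte[] {1, 2, 3});
        in.seek(1);  // fine while the buffer is live
        in.close();
        try {
            in.seek(0);  // buffer is null here
            System.out.println("no exception");
        } catch (NullPointerException e) {
            System.out.println("NullPointerException after close");
        }
    }
}
```

If the merge path can reach a reader whose buffer was released (for example, after a failed or retried copy), it would fail in exactly this way.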

> -----Original Message-----
> From: Mike Smith [mailto:[EMAIL PROTECTED]
> Sent: Thursday, March 01, 2007 3:44 AM
> To: [email protected]
> Subject: Re: some reducers stuck in copying stage
> 
> Devaraj,
> 
> After applying patch 1043 the copying problem is solved. But I am now
> getting new exceptions; the tasks do finish after being reassigned to
> another tasktracker, so the job gets done eventually. However, I never saw
> this exception before applying the patch (or could it be because of
> changing the back-off time to 5 sec?):
> 
> java.lang.NullPointerException
>     at org.apache.hadoop.fs.FSDataInputStream$Buffer.seek(FSDataInputStream.java:74)
>     at org.apache.hadoop.fs.FSDataInputStream.seek(FSDataInputStream.java:121)
>     at org.apache.hadoop.fs.ChecksumFileSystem$FSInputChecker.readBuffer(ChecksumFileSystem.java:217)
>     at org.apache.hadoop.fs.ChecksumFileSystem$FSInputChecker.read(ChecksumFileSystem.java:163)
>     at org.apache.hadoop.fs.FSDataInputStream$PositionCache.read(FSDataInputStream.java:41)
>     at java.io.BufferedInputStream.fill(BufferedInputStream.java:218)
>     at java.io.BufferedInputStream.read1(BufferedInputStream.java:258)
>     at java.io.BufferedInputStream.read(BufferedInputStream.java:317)
>     at java.io.DataInputStream.readFully(DataInputStream.java:178)
>     at java.io.DataInputStream.readFully(DataInputStream.java:152)
>     at org.apache.hadoop.io.SequenceFile$UncompressedBytes.reset(SequenceFile.java:427)
>     at org.apache.hadoop.io.SequenceFile$UncompressedBytes.access$700(SequenceFile.java:414)
>     at org.apache.hadoop.io.SequenceFile$Reader.nextRawValue(SequenceFile.java:1665)
>     at org.apache.hadoop.io.SequenceFile$Sorter$SegmentDescriptor.nextRawValue(SequenceFile.java:2579)
>     at org.apache.hadoop.io.SequenceFile$Sorter$MergeQueue.next(SequenceFile.java:2351)
>     at org.apache.hadoop.io.SequenceFile$Sorter.writeFile(SequenceFile.java:2226)
>     at org.apache.hadoop.io.SequenceFile$Sorter$MergeQueue.merge(SequenceFile.java:2442)
>     at org.apache.hadoop.io.SequenceFile$Sorter.merge(SequenceFile.java:2164)
>     at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:270)
>     at org.apache.hadoop.mapred.TaskTracker$Child.main(TaskTracker.java:1444)
> 
> (the same NullPointerException stack trace was reported a second time)
> 
> 
> 
> On 2/28/07, Mike Smith <[EMAIL PROTECTED]> wrote:
> >
> > Thanks Devaraj, patch 1042 seems to be committed already. Also, the
> > system never recovered, even after 1 min or 300 sec; it stayed stuck
> > there for hours. I will try patch 1043 and also decrease the back-off
> > time to see if those help.
> >
> >

