Task tracker should be restarted if its jetty http server cannot serve get-map-output files
-------------------------------------------------------------------------------------------

                 Key: HADOOP-1179
                 URL: https://issues.apache.org/jira/browse/HADOOP-1179
             Project: Hadoop
          Issue Type: Bug
            Reporter: Runping Qi



Due to some error (a memory leak?), the jetty http server throws an OutOfMemoryError when serving get-map-output requests:

2007-03-28 12:28:06,642 WARN org.apache.hadoop.mapred.TaskRunner: task_0334_r_000379_0 Intermediate Merge of the inmemory files threw an exception: java.lang.OutOfMemoryError
        at java.io.FileOutputStream.writeBytes(Native Method)
        at java.io.FileOutputStream.write(FileOutputStream.java:260)
        at org.apache.hadoop.fs.RawLocalFileSystem$LocalFSFileOutputStream.write(RawLocalFileSystem.java:166)
        at org.apache.hadoop.fs.FSDataOutputStream$PositionCache.write(FSDataOutputStream.java:38)
        at java.io.BufferedOutputStream.flushBuffer(BufferedOutputStream.java:65)
        at java.io.BufferedOutputStream.write(BufferedOutputStream.java:109)
        at java.io.DataOutputStream.write(DataOutputStream.java:90)
        at org.apache.hadoop.fs.ChecksumFileSystem$FSOutputSummer.write(ChecksumFileSystem.java:391)
        at org.apache.hadoop.fs.FSDataOutputStream$PositionCache.write(FSDataOutputStream.java:38)
        at java.io.BufferedOutputStream.flushBuffer(BufferedOutputStream.java:65)
        at java.io.BufferedOutputStream.write(BufferedOutputStream.java:109)
        at java.io.DataOutputStream.write(DataOutputStream.java:90)
        at org.apache.hadoop.io.SequenceFile$CompressedBytes.writeCompressedBytes(SequenceFile.java:492)
        at org.apache.hadoop.io.SequenceFile$RecordCompressWriter.appendRaw(SequenceFile.java:903)
        at org.apache.hadoop.io.SequenceFile$Sorter.writeFile(SequenceFile.java:2227)
        at org.apache.hadoop.mapred.ReduceTaskRunner$InMemFSMergeThread.run(ReduceTaskRunner.java:838)

In this case, the task tracker cannot serve the map output files on that machine, rendering it useless.
Moreover, all the reduces depending on those map output files are simply stuck.
If the task tracker reported the failure to the job tracker, the map/reduce job could recover.
If the task tracker were restarted, it could rejoin the cluster as a new member.
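As a rough illustration of that proposal (not actual Hadoop code; MapOutputHandler, handleGetMapOutput, and reportFatalErrorToJobTracker are hypothetical names invented here), the handler serving get-map-output requests could catch the OutOfMemoryError, report the failure, and exit so that an external wrapper restarts the daemon and lets it re-register with the job tracker:

// Hypothetical sketch only; these classes and methods are not part of Hadoop.
import java.io.IOException;
import java.io.OutputStream;

public class MapOutputHandler {

    /** Serves one get-map-output request by writing the map output bytes. */
    public void handleGetMapOutput(OutputStream out, byte[] mapOutput) throws IOException {
        try {
            out.write(mapOutput);
            out.flush();
        } catch (OutOfMemoryError fatal) {
            // The http server can no longer serve map outputs reliably.
            // Report the failure so waiting reduces can be rescheduled,
            // then exit; an external wrapper is assumed to restart the
            // task tracker, which rejoins the cluster as a new member.
            reportFatalErrorToJobTracker(fatal);
            System.exit(1);
        }
    }

    private void reportFatalErrorToJobTracker(Throwable t) {
        // Placeholder: a real fix would use the tracker's heartbeat/RPC
        // channel to the job tracker rather than logging to stderr.
        System.err.println("Fatal error serving map output: " + t);
    }
}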



-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
