TaskTracker should be restarted if its Jetty HTTP server cannot serve get-map-output files
-------------------------------------------------------------------------------------------
Key: HADOOP-1179
URL: https://issues.apache.org/jira/browse/HADOOP-1179
Project: Hadoop
Issue Type: Bug
Reporter: Runping Qi
Due to some error (a memory leak?), the Jetty HTTP server throws an OutOfMemoryError
when serving get-map-output requests:
2007-03-28 12:28:06,642 WARN org.apache.hadoop.mapred.TaskRunner: task_0334_r_000379_0 Intermediate Merge of the inmemory files threw an exception: java.lang.OutOfMemoryError
        at java.io.FileOutputStream.writeBytes(Native Method)
        at java.io.FileOutputStream.write(FileOutputStream.java:260)
        at org.apache.hadoop.fs.RawLocalFileSystem$LocalFSFileOutputStream.write(RawLocalFileSystem.java:166)
        at org.apache.hadoop.fs.FSDataOutputStream$PositionCache.write(FSDataOutputStream.java:38)
        at java.io.BufferedOutputStream.flushBuffer(BufferedOutputStream.java:65)
        at java.io.BufferedOutputStream.write(BufferedOutputStream.java:109)
        at java.io.DataOutputStream.write(DataOutputStream.java:90)
        at org.apache.hadoop.fs.ChecksumFileSystem$FSOutputSummer.write(ChecksumFileSystem.java:391)
        at org.apache.hadoop.fs.FSDataOutputStream$PositionCache.write(FSDataOutputStream.java:38)
        at java.io.BufferedOutputStream.flushBuffer(BufferedOutputStream.java:65)
        at java.io.BufferedOutputStream.write(BufferedOutputStream.java:109)
        at java.io.DataOutputStream.write(DataOutputStream.java:90)
        at org.apache.hadoop.io.SequenceFile$CompressedBytes.writeCompressedBytes(SequenceFile.java:492)
        at org.apache.hadoop.io.SequenceFile$RecordCompressWriter.appendRaw(SequenceFile.java:903)
        at org.apache.hadoop.io.SequenceFile$Sorter.writeFile(SequenceFile.java:2227)
        at org.apache.hadoop.mapred.ReduceTaskRunner$InMemFSMergeThread.run(ReduceTaskRunner.java:838)
When this happens, the task tracker can no longer serve the map output files on
that machine, rendering the node useless. Moreover, all the reduces that depend
on those map output files are stuck waiting for them.
If the task tracker reported the failure to the job tracker, the map/reduce job
could recover.
If the task tracker were restarted, it could rejoin the cluster as a new
member.
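One way to implement the restart suggested above is for the task tracker to track consecutive failures of its map-output server and flag itself for restart once a threshold is crossed. The sketch below is purely illustrative: the class and method names (ServeFailureMonitor, record, shouldRestart) are hypothetical and not part of the actual Hadoop codebase, and a real fix would also need to handle OutOfMemoryError safely inside the Jetty servlet.

```java
// Hypothetical sketch: monitor consecutive get-map-output serve failures
// and signal that the task tracker should restart itself. Not a real
// Hadoop API; names are invented for illustration.
public class ServeFailureMonitor {
    private final int maxConsecutiveFailures;
    private int consecutiveFailures = 0;

    public ServeFailureMonitor(int maxConsecutiveFailures) {
        this.maxConsecutiveFailures = maxConsecutiveFailures;
    }

    /** Record the outcome of one get-map-output request. */
    public synchronized void record(boolean success) {
        if (success) {
            consecutiveFailures = 0;   // a healthy serve resets the count
        } else {
            consecutiveFailures++;     // e.g. OutOfMemoryError in the servlet
        }
    }

    /** True once failures exceed the threshold; caller exits so the
     *  tracker can rejoin the cluster as a fresh member. */
    public synchronized boolean shouldRestart() {
        return consecutiveFailures >= maxConsecutiveFailures;
    }

    public static void main(String[] args) {
        ServeFailureMonitor monitor = new ServeFailureMonitor(3);
        monitor.record(false);
        monitor.record(false);
        monitor.record(false);
        System.out.println(monitor.shouldRestart()); // prints "true"
    }
}
```

Resetting the counter on success keeps transient failures from forcing an unnecessary restart, while a sustained failure streak (as in the stack trace above) would still trip the threshold.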