The tasktracker requires "intermediate" space while performing the map and reduce functions. Many smaller files are produced during the map and reduce phases and are deleted when the processes finish. If you are using the DFS, more disk space is required than is actually used, since disk space is allocated in blocks.
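If the partition holding those intermediate files is small, one common workaround is to point the tasktracker's local directories at one or more larger disks. A minimal sketch for hadoop-site.xml, assuming the Hadoop 0.x-era `mapred.local.dir` property and placeholder paths:

```
<!-- hadoop-site.xml: a sketch, not a drop-in config.
     /bigdisk1 and /bigdisk2 are hypothetical mount points;
     substitute partitions with enough free space. -->
<property>
  <name>mapred.local.dir</name>
  <!-- Comma-separated list; intermediate map/reduce files
       are spread across all listed directories. -->
  <value>/bigdisk1/hadoop/mapred,/bigdisk2/hadoop/mapred</value>
</property>
```

Restart the tasktracker after changing this, and check the free space on those partitions (e.g. with `df -h`) before re-running the job.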

Dennis

[EMAIL PROTECTED] wrote:
I'm using Nutch v0.8 and have 3 computers.
One of my tasktrackers always goes down. This occurs during indexing (index crawl/indexes). The server with the crashed
tasktracker now has 53G of disk space free and only 11G used.
How can I solve this problem? Why does the tasktracker require so much free
space on the HDD?

Excerpt of the log with the error:

060613 151840 task_0083_r_000001_0 0.5% reduce > sort
060613 151841 task_0083_r_000001_0 0.5% reduce > sort
060613 151842 task_0083_r_000001_0 0.5% reduce > sort
060613 151843 task_0083_r_000001_0 0.5% reduce > sort
060613 151844 task_0083_r_000001_0 0.5% reduce > sort
060613 151845 task_0083_r_000001_0 0.5% reduce > sort
060613 151846 task_0083_r_000001_0 0.5% reduce > sort
060613 151847 task_0083_r_000001_0 0.5% reduce > sort
060613 151847 SEVERE FSError, exiting: java.io.IOException: No space left on device
060613 151847 task_0083_r_000001_0  SEVERE FSError from child
060613 151847 task_0083_r_000001_0 org.apache.hadoop.fs.FSError: java.io.IOException: No space left on device
060613 151847 task_0083_r_000001_0      at org.apache.hadoop.fs.LocalFileSystem$LocalFSFileOutputStream.write(LocalFileSyst
060613 151847 task_0083_r_000001_0      at org.apache.hadoop.fs.FSDataOutputStream$Summer.write(FSDataOutputStream.java:69)
060613 151847 task_0083_r_000001_0      at org.apache.hadoop.fs.FSDataOutputStream$PositionCache.write(FSDataOutputStream.j
060613 151847 task_0083_r_000001_0      at java.io.BufferedOutputStream.flushBuffer(BufferedOutputStream.java:65)
060613 151847 task_0083_r_000001_0      at java.io.BufferedOutputStream.flush(BufferedOutputStream.java:123)
060613 151847 task_0083_r_000001_0      at java.io.DataOutputStream.flush(DataOutputStream.java:106)
060613 151847 task_0083_r_000001_0      at java.io.FilterOutputStream.close(FilterOutputStream.java:140)
060613 151847 task_0083_r_000001_0      at org.apache.hadoop.io.SequenceFile$Sorter$SortPass.close(SequenceFile.java:598)
060613 151847 task_0083_r_000001_0      at org.apache.hadoop.io.SequenceFile$Sorter.sortPass(SequenceFile.java:533)
060613 151847 task_0083_r_000001_0      at org.apache.hadoop.io.SequenceFile$Sorter.sort(SequenceFile.java:519)
060613 151847 task_0083_r_000001_0      at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:316)
060613 151847 task_0083_r_000001_0      at org.apache.hadoop.mapred.TaskTracker$Chi
060613 151847 task_0083_r_000001_0 Caused by: java.io.IOException: No space left on device
060613 151847 task_0083_r_000001_0      at java.io.FileOutputStream.writeBytes(Native Method)
060613 151847 task_0083_r_000001_0      at java.io.FileOutputStream.write(FileOutputStream.java:260)
060613 151848 task_0083_r_000001_0      at org.apache.hadoop.fs.LocalFileSystem$LocalFSFileOutputStream.write(LocalFileSyst
060613 151848 task_0083_r_000001_0      ... 11 more
060613 151849 Server connection on port 50050 from 10.0.0.3: exiting
060613 151854 task_0083_m_000001_0 done; removing files.
060613 151855 task_0083_m_000003_0 done; removing files.


