Hello,

From time to time I get the following error:

Error initializing attempt_201008101445_0212_r_000002_0:
org.apache.hadoop.util.DiskChecker$DiskErrorException: Could not find any valid local directory for taskTracker/jobcache/job_201008101445_0212/job.xml
        at org.apache.hadoop.fs.LocalDirAllocator$AllocatorPerContext.getLocalPathForWrite(LocalDirAllocator.java:343)
        at org.apache.hadoop.fs.LocalDirAllocator.getLocalPathForWrite(LocalDirAllocator.java:124)
        at org.apache.hadoop.mapred.TaskTracker.localizeJob(TaskTracker.java:750)
        at org.apache.hadoop.mapred.TaskTracker.startNewTask(TaskTracker.java:1664)
        at org.apache.hadoop.mapred.TaskTracker.access$1200(TaskTracker.java:97)
        at org.apache.hadoop.mapred.TaskTracker$TaskLauncher.run(TaskTracker.java:1629)

If I restart the job without making any changes to the cluster or the
disks, it eventually succeeds. After a while the same error shows up
again for a different job. Restarting always works around it, but it
is very annoying.

I searched online and it seems this error is triggered when none of
the local disks has enough space. That should not be the case here,
as each node has 200 GB of free space.
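
In case it matters, here is the kind of quick sanity check I can run
on each node to confirm the local dirs exist, are writable, and are
not full. This is just a minimal sketch; the two paths below are
placeholders for whatever mapred.local.dir actually points at on my
nodes:

import java.io.File;

public class LocalDirCheck {
    public static void main(String[] args) {
        // Placeholder paths: substitute the entries from mapred.local.dir,
        // or pass the real directories as command-line arguments.
        String[] localDirs = args.length > 0
                ? args
                : new String[] {"/data/1/mapred/local", "/data/2/mapred/local"};
        for (String dir : localDirs) {
            File f = new File(dir);
            // A directory is only usable if it exists, is writable, and
            // has room for the job files the TaskTracker localizes.
            System.out.printf("%s exists=%b writable=%b freeGB=%.1f%n",
                    dir, f.exists(), f.canWrite(),
                    f.getUsableSpace() / (1024.0 * 1024 * 1024));
        }
    }
}

Running this against the real directories while jobs are failing
would at least show whether one of them intermittently stops being
writable or fills up.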

Is there anything else I can check besides the free space on the disks?

Thanks!
Rares
