[ https://issues.apache.org/jira/browse/HADOOP-370?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Bryan Pendleton updated HADOOP-370:
-----------------------------------
Attachment: fix.tasktracker.localdirs.patch.txt
Here's my currently deployed code fixing this bug. I may not get to work with
Hadoop clusters much in my next position, so, unfortunately, this is as-is,
with no test case. It is up to date and working against the 0.13.0 branch.
Without this patch, TaskTracker startup fails whenever mapred.local.dir lists a
directory that doesn't exist. This is still a pretty severe bug.
> TaskTracker startup fails if any mapred.local.dir entries don't exist
> ---------------------------------------------------------------------
>
> Key: HADOOP-370
> URL: https://issues.apache.org/jira/browse/HADOOP-370
> Project: Hadoop
> Issue Type: Bug
> Components: mapred
> Environment: ~30 node cluster, various size/number of disks, CPUs,
> memory
> Reporter: Bryan Pendleton
> Assignee: Owen O'Malley
> Attachments: fix-freespace-tasktracker-failure.txt,
> fix.tasktracker.localdirs.patch.txt
>
>
> This appears to have been introduced with the "check for enough free space"
> before startup.
> It's debatable how best to fix this bug. I will submit a patch which ignores
> directories for which the DF utility fails. This is letting me continue
> operation on my cluster (where the number of drives varies, so there are
> entries in mapred.local.dir for drives that aren't on all cluster nodes), but
> a cleaner solution is probably better. I'd lean towards "check for
> existence", and ignore the dir if it doesn't exist - but don't depend on DF to
> fail, since DF could fail for other reasons without meaning you're out of
> disk space. I argue that a TaskTracker should start up if *all* directories
> that *can be written to* in the list have enough space. Otherwise, a failed
> drive per cluster machine means no work ever gets done.
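
For illustration only, the "check for existence" policy argued for above could
be sketched roughly as follows. This is not the attached patch and not the
TaskTracker code: the class and method names (LocalDirCheck, usableDirs,
enoughSpaceForStartup) are hypothetical, and it uses java.io.File's
getUsableSpace() (Java 6 and later) in place of Hadoop's DF utility. Returning
false when no directory is usable at all is also an assumption, since the
description doesn't cover that case.

```java
import java.io.File;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of Hadoop: a sketch of the startup check
// described in the issue, under the assumptions stated above.
class LocalDirCheck {

    /** Returns only the configured entries that exist and can be written to. */
    static List<File> usableDirs(String[] configuredDirs) {
        List<File> usable = new ArrayList<>();
        for (String path : configuredDirs) {
            File dir = new File(path);
            // Skip entries for drives that are absent or read-only on this
            // node, rather than waiting for a free-space probe to fail.
            if (dir.isDirectory() && dir.canWrite()) {
                usable.add(dir);
            }
        }
        return usable;
    }

    /**
     * True if every usable directory has at least minSpaceBytes free.
     * Assumption: with no usable directory at all, startup is refused.
     */
    static boolean enoughSpaceForStartup(String[] configuredDirs, long minSpaceBytes) {
        List<File> usable = usableDirs(configuredDirs);
        if (usable.isEmpty()) {
            return false;
        }
        for (File dir : usable) {
            if (dir.getUsableSpace() < minSpaceBytes) {
                return false;
            }
        }
        return true;
    }
}
```

With a check shaped like this, a node missing one of the drives listed in
mapred.local.dir would still start, as long as the directories it does have
meet the free-space threshold.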
--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.