[ https://issues.apache.org/jira/browse/HADOOP-1612?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12516620 ]
Christian Kunz commented on HADOOP-1612:
----------------------------------------
I checked the logs of about 150 applications run with the July-25 nightly
build, which incorporates HADOOP-1576, while also ignoring _${taskid}
subdirectories in the output directory.
Outcome:
1) no loss of files (looks like this was fixed by HADOOP-1576, thank you ***)
2) no movement of undesired files
but:
3) listDirectory of the output directory still occasionally fails for up to
10 seconds after job completion (maybe a DFS issue?)
4) _${taskid} subdirectories are not completely cleaned up, even days after
job completion.
I think the issue should be reopened but not as a blocker.
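
As a possible client-side workaround for points 3) and 4), a caller could retry
the listing for a bounded time and skip the leftover _${taskid} subdirectories.
The following is only a minimal sketch under stated assumptions: the function
name list_output_with_retry, its parameters, and the use of os.listdir as a
stand-in for the DFS listing call are all hypothetical, and the 10-second
default simply mirrors the delay observed above. It is not how libhdfs or
mapred behave.

```python
import os
import time


def list_output_with_retry(path, timeout_s=10.0, interval_s=0.5, lister=os.listdir):
    """Return the visible entries of `path`, retrying transient listing failures.

    `lister` is a hypothetical stand-in for the real DFS listing call; it must
    take a path and return a list of entry names, raising OSError on failure.
    """
    deadline = time.monotonic() + timeout_s
    while True:
        try:
            entries = lister(path)
        except OSError:
            # Give up once the deadline has passed; otherwise wait and retry.
            if time.monotonic() >= deadline:
                raise
            time.sleep(interval_s)
            continue
        # Assumption: temporary per-task subdirectories are exactly the
        # entries whose names start with "_", so they are skipped here.
        return sorted(name for name in entries if not name.startswith("_"))
```

This only hides the symptoms from the caller; the delayed listing and the
leftover subdirectories would still need a fix on the Hadoop side.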
> listing of an output directory shortly after job completion fails
> -----------------------------------------------------------------
>
> Key: HADOOP-1612
> URL: https://issues.apache.org/jira/browse/HADOOP-1612
> Project: Hadoop
> Issue Type: Bug
> Components: mapred
> Affects Versions: 0.14.0
> Reporter: Christian Kunz
> Assignee: Arun C Murthy
> Priority: Blocker
> Fix For: 0.14.0
>
>
> Sometimes, after a job finishes and another application wants to rename DFS
> files created by that job, listing of the output directory containing the
> newly created files fails. File creation and directory listing are done via
> libhdfs, but it is unlikely that this makes any difference, so I am filing
> this under the mapred component.
> It might be a race condition: does the job complete before the files in the
> output directory are promoted?
--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.