[ https://issues.apache.org/jira/browse/HADOOP-2049?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Owen O'Malley resolved HADOOP-2049.
-----------------------------------
Resolution: Duplicate
Fix Version/s: 0.15.0
Assignee: Chris Douglas
This was fixed as part of HADOOP-2048.
> distcp does not fail if source directory has files with missing blocks
> ----------------------------------------------------------------------
>
> Key: HADOOP-2049
> URL: https://issues.apache.org/jira/browse/HADOOP-2049
> Project: Hadoop
> Issue Type: Bug
> Components: util
> Affects Versions: 0.15.0
> Environment: Nightly build: Oct 11, 2007.
> Reporter: Murtaza A. Basrai
> Assignee: Chris Douglas
> Priority: Critical
> Fix For: 0.15.0
>
>
> I copied a directory using distcp (to another directory on the same file
> system).
> There were 9 data blocks missing in the files in the source directory, which
> caused distcp to print messages like the following:
> ...
> 07/10/13 00:09:16 INFO mapred.JobClient: map 1% reduce 0%
> 07/10/13 00:09:16 INFO mapred.JobClient: Task Id : task_200710120717_0081_m_000020_0, Status : FAILED
> java.io.IOException: Could not obtain block: blk_6787282547149034655 file=/srcdir/file1
>     at org.apache.hadoop.dfs.DFSClient$DFSInputStream.chooseDataNode(DFSClient.java:1136)
>     at org.apache.hadoop.dfs.DFSClient$DFSInputStream.blockSeekTo(DFSClient.java:988)
>     at org.apache.hadoop.dfs.DFSClient$DFSInputStream.read(DFSClient.java:1094)
>     at java.io.DataInputStream.read(DataInputStream.java:83)
>     at org.apache.hadoop.util.CopyFiles$FSCopyFilesMapper.copy(CopyFiles.java:289)
>     at org.apache.hadoop.util.CopyFiles$FSCopyFilesMapper.map(CopyFiles.java:348)
>     at org.apache.hadoop.util.CopyFiles$FSCopyFilesMapper.map(CopyFiles.java:216)
>     at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:50)
>     at org.apache.hadoop.mapred.MapTask.run(MapTask.java:192)
>     at org.apache.hadoop.mapred.TaskTracker$Child.main(TaskTracker.java:1753)
> ...
> The corresponding tasks failed, but the retries were successful (all files
> with missing blocks in the source directory were copied as empty files in the
> target directory).
> I think that distcp should fail if it cannot successfully copy all the files
> (at least when no command-line options are given).
> This is critical for us as we intend to use distcp to copy databases from one
> dfs to another, and if silent failures can happen then we would have to
> monitor each distcp manually to ensure that it succeeded.
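
The manual check the reporter describes could be scripted. The following is only a minimal sketch of such a post-copy verification, not the fix applied in HADOOP-2048: it assumes both trees are reachable through java.nio.file (a stand-in for the DFS), and the class name CopyVerifier and method findMismatches are hypothetical. It walks the source tree and reports every file that is missing from the target or has a different length there, which would catch files copied as empty.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical helper; operates on local paths as a stand-in for a DFS.
public class CopyVerifier {

    /**
     * Returns the sorted relative paths of regular files under src that are
     * missing under dst or whose length differs from the source file.
     */
    static List<String> findMismatches(Path src, Path dst) throws IOException {
        List<Path> sourceFiles;
        try (Stream<Path> walk = Files.walk(src)) {
            sourceFiles = walk.filter(Files::isRegularFile)
                              .collect(Collectors.toList());
        }
        List<String> mismatches = new ArrayList<>();
        for (Path file : sourceFiles) {
            Path relative = src.relativize(file);
            Path copy = dst.resolve(relative);
            // A file copied as empty has length 0, so it differs from any
            // non-empty source file and is flagged here.
            if (!Files.isRegularFile(copy) || Files.size(copy) != Files.size(file)) {
                mismatches.add(relative.toString());
            }
        }
        Collections.sort(mismatches);
        return mismatches;
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 2) {
            System.err.println("usage: CopyVerifier <srcdir> <dstdir>");
            System.exit(2);
        }
        List<String> bad = findMismatches(Paths.get(args[0]), Paths.get(args[1]));
        for (String path : bad) {
            System.err.println("MISMATCH: " + path);
        }
        // Non-zero exit status so a calling script sees the failure.
        System.exit(bad.isEmpty() ? 0 : 1);
    }
}
```

Matching lengths do not prove matching contents, so a stricter variant would compare checksums. A zero-length source file copied correctly is not flagged, since both lengths are 0.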