[ 
https://issues.apache.org/jira/browse/HADOOP-2049?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Owen O'Malley resolved HADOOP-2049.
-----------------------------------

       Resolution: Duplicate
    Fix Version/s: 0.15.0
         Assignee: Chris Douglas

This was fixed as part of HADOOP-2048.

> distcp does not fail if source directory has files with missing blocks
> ----------------------------------------------------------------------
>
>                 Key: HADOOP-2049
>                 URL: https://issues.apache.org/jira/browse/HADOOP-2049
>             Project: Hadoop
>          Issue Type: Bug
>          Components: util
>    Affects Versions: 0.15.0
>         Environment: Nightly build: Oct 11, 2007.
>            Reporter: Murtaza A. Basrai
>            Assignee: Chris Douglas
>            Priority: Critical
>             Fix For: 0.15.0
>
>
> I copied a directory using distcp (to another directory on the same file 
> system).
> There were 9 data blocks missing in the files in the source directory, which 
> caused distcp to print messages like the following:
> ...
> 07/10/13 00:09:16 INFO mapred.JobClient:  map 1% reduce 0%
> 07/10/13 00:09:16 INFO mapred.JobClient: Task Id : task_200710120717_0081_m_000020_0, Status : FAILED
> java.io.IOException: Could not obtain block: blk_6787282547149034655 file=/srcdir/file1
>         at org.apache.hadoop.dfs.DFSClient$DFSInputStream.chooseDataNode(DFSClient.java:1136)
>         at org.apache.hadoop.dfs.DFSClient$DFSInputStream.blockSeekTo(DFSClient.java:988)
>         at org.apache.hadoop.dfs.DFSClient$DFSInputStream.read(DFSClient.java:1094)
>         at java.io.DataInputStream.read(DataInputStream.java:83)
>         at org.apache.hadoop.util.CopyFiles$FSCopyFilesMapper.copy(CopyFiles.java:289)
>         at org.apache.hadoop.util.CopyFiles$FSCopyFilesMapper.map(CopyFiles.java:348)
>         at org.apache.hadoop.util.CopyFiles$FSCopyFilesMapper.map(CopyFiles.java:216)
>         at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:50)
>         at org.apache.hadoop.mapred.MapTask.run(MapTask.java:192)
>         at org.apache.hadoop.mapred.TaskTracker$Child.main(TaskTracker.java:1753)
> ...
> The corresponding tasks failed, but the retries were successful (all files 
> with missing blocks in the source directory were copied as empty files in the 
> target directory).
> I think that distcp should fail if it cannot successfully copy all the files 
> (at least when no command-line options are given).
> This is critical for us as we intend to use distcp to copy databases from one 
> dfs to another, and if silent failures can happen then we would have to 
> monitor each distcp manually to ensure that it succeeded.
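
The manual monitoring described above could be approximated with a post-copy check that compares file lengths between the source and target trees. The following is only a minimal sketch under stated assumptions: it uses java.nio.file on a local filesystem as a stand-in for a DFS, and the names CopyVerifier and findMismatches are hypothetical, not part of Hadoop or distcp. A length comparison would catch the symptom reported here (files copied as empty), but it does not verify file contents.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical helper, not part of Hadoop: checks that every regular file
// under srcRoot has a counterpart of the same length under dstRoot.
public class CopyVerifier {

    /** Returns one message per source file whose copy is missing or differs in length. */
    public static List<String> findMismatches(Path srcRoot, Path dstRoot) throws IOException {
        List<Path> sources;
        // Collect all regular files below the source root in a stable order.
        try (Stream<Path> walk = Files.walk(srcRoot)) {
            sources = walk.filter(Files::isRegularFile).sorted().collect(Collectors.toList());
        }

        List<String> problems = new ArrayList<>();
        for (Path src : sources) {
            Path rel = srcRoot.relativize(src);
            Path dst = dstRoot.resolve(rel);
            if (!Files.isRegularFile(dst)) {
                problems.add("missing: " + rel);
            } else if (Files.size(src) != Files.size(dst)) {
                // An empty target for a non-empty source lands here.
                problems.add("length mismatch: " + rel
                        + " (" + Files.size(src) + " vs " + Files.size(dst) + " bytes)");
            }
        }
        return problems;
    }
}
```

A caller would treat a non-empty result as a failed copy, for example by printing the messages and exiting with a non-zero status.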

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
