[ 
https://issues.apache.org/jira/browse/HADOOP-13489?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15419668#comment-15419668
 ] 

Mingliang Liu commented on HADOOP-13489:
----------------------------------------

I'm not in favor of the change to the non-blocking case, which is the core of 
the patch. Checking the job state in {{non-blocking}} mode makes little, if 
any, sense to me. If the user runs in {{non-blocking}} (a.k.a. {{-async}}) 
mode, she should not expect the job to be completed right after its 
submission. By the way, {{blocking}} mode is the default. Instead, as a 
downstream user of DistCp, she should call {{DistCp#execute()}} directly and 
probe the job state manually. In [MAPREDUCE-6248], [~jingzhao] enabled users to 
get the MR job information for distcp. So calling {{DistCp#execute()}} instead 
of {{DistCp#run()}}, and then probing the job's final state, should be the 
right usage of {{non-blocking}} distcp, IMO.
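
To illustrate the probing I have in mind, here is a rough, self-contained 
sketch rather than actual DistCp code. {{JobStatusView}} is a hypothetical 
stand-in for the {{isComplete()}} and {{isSuccessful()}} methods of the 
{{Job}} returned by {{DistCp#execute()}} (the real methods declare 
{{IOException}}, which is left out here for brevity). The class name, the 
helper name and the non-zero exit codes are assumptions of mine as well.

```java
/** Sketch of probing an asynchronously submitted job; all names here are hypothetical. */
final class AsyncJobProbe {

    /**
     * Hypothetical stand-in for the two status methods a caller would use on
     * org.apache.hadoop.mapreduce.Job (the real ones declare IOException).
     */
    interface JobStatusView {
        boolean isComplete();
        boolean isSuccessful();
    }

    // Illustrative exit codes chosen for this sketch only.
    static final int SUCCESS = 0;
    static final int JOB_FAILED = 1;
    static final int TIMED_OUT = 2;
    static final int INTERRUPTED = 3;

    /**
     * Polls the job until it completes or maxPolls is exhausted, then maps
     * the final job state to an exit code.
     */
    static int awaitExitCode(JobStatusView job, long pollMillis, int maxPolls) {
        for (int i = 0; i < maxPolls; i++) {
            if (job.isComplete()) {
                // Only a completed job has a meaningful success flag.
                return job.isSuccessful() ? SUCCESS : JOB_FAILED;
            }
            try {
                Thread.sleep(pollMillis);
            } catch (InterruptedException e) {
                // Preserve the interrupt for the caller and stop polling.
                Thread.currentThread().interrupt();
                return INTERRUPTED;
            }
        }
        return TIMED_OUT;
    }
}
```

A downstream caller would adapt the {{Job}} handle from {{DistCp#execute()}} 
to such a view and decide for itself how long to wait, which keeps 
{{DistCp#run()}} free of any job-state check in {{non-blocking}} mode.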

And for the blocking case, as we agreed in the discussion above, it's not a 
problem at all.

> DistCp may incorrectly return success status when the underlying Job failed
> ---------------------------------------------------------------------------
>
>                 Key: HADOOP-13489
>                 URL: https://issues.apache.org/jira/browse/HADOOP-13489
>             Project: Hadoop Common
>          Issue Type: Bug
>            Reporter: Ted Yu
>            Assignee: Ted Yu
>         Attachments: HADOOP-13489.v1.patch, HADOOP-13489.v2.patch, 
> HADOOP-13489.v3.patch, TestIncrementalBackup-output.txt
>
>
> I was troubleshooting HBASE-14450 where at the end of BackupdistCp#execute(), 
> distcp job was marked unsuccessful (BackupdistCp is a wrapper of DistCp).
> Yet in IncrementalTableBackupProcedure#incrementalCopy(), the return value 
> from copyService.copy() was 0.
> Here is related code from DistCp:
> {code}
>     try {
>       execute();
>     } catch (InvalidInputException e) {
>       LOG.error("Invalid input: ", e);
>       return DistCpConstants.INVALID_ARGUMENT;
>     } catch (DuplicateFileException e) {
>       LOG.error("Duplicate files in input path: ", e);
>       return DistCpConstants.DUPLICATE_INPUT;
>     } catch (AclsNotSupportedException e) {
>       LOG.error("ACLs not supported on at least one file system: ", e);
>       return DistCpConstants.ACLS_NOT_SUPPORTED;
>     } catch (XAttrsNotSupportedException e) {
>       LOG.error("XAttrs not supported on at least one file system: ", e);
>       return DistCpConstants.XATTRS_NOT_SUPPORTED;
>     } catch (Exception e) {
>       LOG.error("Exception encountered ", e);
>       return DistCpConstants.UNKNOWN_ERROR;
>     }
>     return DistCpConstants.SUCCESS;
> {code}
> We don't check whether the Job returned by execute() was successful.
> Even if the Job fails, DistCpConstants.SUCCESS is returned.


