[ 
https://issues.apache.org/jira/browse/HDFS-7312?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14216375#comment-14216375
 ] 

Yongjun Zhang commented on HDFS-7312:
-------------------------------------

Hi Joseph,

Sorry, I have some more comments:

# DistCp is a sensitive area for change. I suggest adding more tests that 
combine different parameters. E.g., on top of 
testCopyDfsToDfsUpdateOverwrite(), testCopyDfsToDfsUpdateWithSkipCRC(), 
testCopyFromLocalToDfsWithStagingAreaInSrc(), 
testCopyFromLocalToDfsWithStagingAreaInDest() and testCopyDuplication(), add a 
variant test that uses -skiptmp.
# Can you describe the real cluster tests you have done?
# I suggest adding a note to the usage message for -skiptmp, as "NOTE 3". An 
example: "By default, distcp copies files to a temporary area and then renames 
them to the destination. Using the -skiptmp switch means that distcp writes to 
the destination directly. We recommend using it only when you really need to 
(such as to avoid copy/rename overhead in S3, where rename is not natively 
supported), because it may damage existing destination files if distcp fails 
for some reason."
# Remove the extra empty line at TestCopyFiles.java line 389.
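
To make the risk behind the suggested "NOTE 3" concrete, here is a minimal 
sketch of the two copy strategies. It is not DistCp code: it uses plain 
java.nio.file on a local filesystem, and the class name SkipTmpSketch, the 
method names, and the failMidway switch are all hypothetical, chosen only for 
illustration.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.Arrays;

// Illustration only: hypothetical names, local files, not the DistCp implementation.
public class SkipTmpSketch {

    // Default-style copy: write to a temporary sibling, then rename over the destination.
    static void copyViaTmp(byte[] data, Path dest, boolean failMidway) throws IOException {
        Path tmp = dest.resolveSibling(dest.getFileName() + ".tmp");
        try {
            writeMaybeFailing(data, tmp, failMidway);
            Files.move(tmp, dest, StandardCopyOption.REPLACE_EXISTING);
        } finally {
            // Clean up the temporary file if the write failed before the rename.
            Files.deleteIfExists(tmp);
        }
    }

    // -skiptmp-style copy: write straight to the destination, no rename step.
    static void copyDirect(byte[] data, Path dest, boolean failMidway) throws IOException {
        writeMaybeFailing(data, dest, failMidway);
    }

    // Writes all of data, or only the first half followed by a simulated failure.
    private static void writeMaybeFailing(byte[] data, Path target, boolean failMidway)
            throws IOException {
        int length = failMidway ? data.length / 2 : data.length;
        Files.write(target, Arrays.copyOf(data, length));
        if (failMidway) {
            throw new IOException("simulated copy failure");
        }
    }

    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("skiptmp-sketch");
        Path dest = dir.resolve("part-00000");
        byte[] oldData = "old".getBytes(StandardCharsets.UTF_8);
        byte[] newData = "new-content".getBytes(StandardCharsets.UTF_8);

        Files.write(dest, oldData);
        try {
            copyViaTmp(newData, dest, true);
        } catch (IOException expected) {
            // The failure happened while writing the temporary file.
        }
        System.out.println("via tmp, after failure: "
                + new String(Files.readAllBytes(dest), StandardCharsets.UTF_8));

        try {
            copyDirect(newData, dest, true);
        } catch (IOException expected) {
            // The failure happened while writing the destination itself.
        }
        System.out.println("direct, after failure:  "
                + new String(Files.readAllBytes(dest), StandardCharsets.UTF_8));
    }
}
```

In this sketch a failure in the tmp-then-rename path leaves the existing 
destination file untouched, while the same failure in the direct path leaves a 
truncated file in its place. That is the trade-off the usage note should warn 
about.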

Thanks.


> Update DistCp v1 to optionally not use tmp location
> ---------------------------------------------------
>
>                 Key: HDFS-7312
>                 URL: https://issues.apache.org/jira/browse/HDFS-7312
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>          Components: tools
>    Affects Versions: 2.5.1
>            Reporter: Joseph Prosser
>            Assignee: Joseph Prosser
>            Priority: Minor
>         Attachments: HDFS-7312.001.patch, HDFS-7312.002.patch, 
> HDFS-7312.003.patch, HDFS-7312.004.patch, HDFS-7312.005.patch, HDFS-7312.patch
>
>   Original Estimate: 72h
>  Remaining Estimate: 72h
>
> DistCp v1 currently copies files to a tmp location and then renames that to 
> the specified destination.  This can cause performance issues on filesystems 
> such as S3.  A -skiptmp flag will be added to bypass this step and copy 
> directly to the destination.  This feature mirrors a similar one added to 
> HBase ExportSnapshot 
> [HBASE-11119|https://issues.apache.org/jira/browse/HBASE-11119]



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
