[ https://issues.apache.org/jira/browse/HADOOP-15209?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16400911#comment-16400911 ]
Hudson commented on HADOOP-15209:
---------------------------------

SUCCESS: Integrated in Jenkins build Hadoop-trunk-Commit #13845 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/13845/])
HADOOP-15209. DistCp to eliminate needless deletion of files under (stevel: rev 1976e0066e9ae8852715fa69d8aea3769330e933)
* (edit) hadoop-tools/hadoop-aws/src/test/java/org/apache/hadoop/fs/contract/s3a/ITestS3AContractDistCp.java
* (add) hadoop-tools/hadoop-distcp/src/main/java/org/apache/hadoop/tools/mapred/DeletedDirTracker.java
* (edit) hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/fs/contract/ContractTestUtils.java
* (edit) hadoop-tools/hadoop-distcp/src/main/java/org/apache/hadoop/tools/mapred/CopyCommitter.java
* (edit) hadoop-tools/hadoop-distcp/src/test/java/org/apache/hadoop/tools/util/TestDistCpUtils.java
* (edit) hadoop-tools/hadoop-distcp/src/main/java/org/apache/hadoop/tools/OptionsParser.java
* (add) hadoop-tools/hadoop-distcp/src/test/java/org/apache/hadoop/tools/contract/TestLocalContractDistCp.java
* (edit) hadoop-tools/hadoop-distcp/src/test/java/org/apache/hadoop/tools/mapred/TestCopyCommitter.java
* (edit) hadoop-tools/hadoop-distcp/src/main/java/org/apache/hadoop/tools/util/DistCpUtils.java
* (edit) hadoop-tools/hadoop-distcp/src/test/resources/log4j.properties
* (add) hadoop-tools/hadoop-distcp/src/test/resources/contract/localfs.xml
* (edit) hadoop-tools/hadoop-distcp/src/main/java/org/apache/hadoop/tools/DistCpOptionSwitch.java
* (edit) hadoop-tools/hadoop-azure-datalake/pom.xml
* (add) hadoop-tools/hadoop-distcp/src/test/java/org/apache/hadoop/tools/mapred/TestDeletedDirTracker.java
* (add) hadoop-tools/hadoop-azure-datalake/src/test/java/org/apache/hadoop/fs/adl/live/TestAdlContractDistCpLive.java
* (edit) hadoop-tools/hadoop-distcp/src/main/java/org/apache/hadoop/tools/DistCpOptions.java
* (edit) hadoop-tools/hadoop-distcp/src/main/java/org/apache/hadoop/tools/CopyListing.java
* (edit) hadoop-tools/hadoop-distcp/src/main/java/org/apache/hadoop/tools/DistCpConstants.java
* (edit) hadoop-tools/hadoop-distcp/src/test/java/org/apache/hadoop/tools/contract/AbstractContractDistCpTest.java
* (edit) hadoop-tools/hadoop-distcp/src/main/java/org/apache/hadoop/tools/CopyListingFileStatus.java

> DistCp to eliminate needless deletion of files under already-deleted directories
> --------------------------------------------------------------------------------
>
>                 Key: HADOOP-15209
>                 URL: https://issues.apache.org/jira/browse/HADOOP-15209
>             Project: Hadoop Common
>          Issue Type: Improvement
>          Components: tools/distcp
>    Affects Versions: 2.9.0
>            Reporter: Steve Loughran
>            Assignee: Steve Loughran
>            Priority: Major
>             Fix For: 3.1.0
>
>         Attachments: HADOOP-15209-001.patch, HADOOP-15209-002.patch,
> HADOOP-15209-003.patch, HADOOP-15209-004.patch, HADOOP-15209-005.patch,
> HADOOP-15209-006.patch, HADOOP-15209-007.patch
>
>
> DistCp issues a delete(file) request even if the file is underneath an
> already-deleted directory. This generates needless load on filesystems and
> object stores and, if the store throttles deletes, can dramatically slow
> down the delete operation.
> If the DistCp delete operation builds a history of deleted directories,
> it will know when it does not need to issue those deletes.
> Care is needed to make sure that whatever structure is created does not
> overload the heap of the process.
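
The idea in the issue description, keeping a bounded history of deleted directories so that deletes for their children can be skipped, can be sketched roughly as follows. This is only an illustration, not the DeletedDirTracker class added by the patch: the class name DeletedDirHistory, its methods, and the use of plain string paths are all assumptions made for the example. It assumes normalized absolute paths with no trailing slash.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/**
 * Hypothetical sketch: remembers recently deleted directories so that
 * delete requests for entries beneath them can be skipped.
 * Not the DeletedDirTracker implementation from the patch.
 */
class DeletedDirHistory {
  private final Map<String, Boolean> deletedDirs;

  DeletedDirHistory(final int maxEntries) {
    // Access-ordered LinkedHashMap used as a small LRU cache, so the
    // history stays bounded and cannot overload the heap.
    this.deletedDirs = new LinkedHashMap<String, Boolean>(16, 0.75f, true) {
      @Override
      protected boolean removeEldestEntry(Map.Entry<String, Boolean> eldest) {
        return size() > maxEntries;
      }
    };
  }

  /**
   * Returns true if a delete request should be issued for this path,
   * false if an ancestor directory is already recorded as deleted.
   */
  boolean shouldDelete(String path, boolean isDirectory) {
    if (isUnderDeletedDir(path)) {
      return false;
    }
    if (isDirectory) {
      // The caller is expected to delete this directory recursively.
      deletedDirs.put(path, Boolean.TRUE);
    }
    return true;
  }

  /** Walks up the ancestors of the path, looking for a recorded directory. */
  private boolean isUnderDeletedDir(String path) {
    int slash = path.lastIndexOf('/');
    while (slash > 0) {
      String parent = path.substring(0, slash);
      if (deletedDirs.get(parent) != null) {
        return true;
      }
      slash = parent.lastIndexOf('/');
    }
    return false;
  }
}
```

With this sketch, once shouldDelete("/dest/dir1", true) has returned true, a later call for "/dest/dir1/file1" returns false and no request needs to be sent. If a directory is evicted from the bounded history, the only cost is a redundant delete for some of its children, so the size limit trades a little extra load for a predictable memory footprint.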