[ https://issues.apache.org/jira/browse/HADOOP-1540?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14535403#comment-14535403 ]
Zoran Dimitrijevic commented on HADOOP-1540:
--------------------------------------------

#5: We were experiencing performance issues for a large number of files only because of RPCs to either the namenode or S3. Filtering each file name locally using a small number of compiled regex or glob rules should not be a big deal, especially since it's optional. For example, the sorting of a big file list that we do now is much more expensive.

Thank you for your patch!

> distcp should support an exclude list
> -------------------------------------
>
>                 Key: HADOOP-1540
>                 URL: https://issues.apache.org/jira/browse/HADOOP-1540
>             Project: Hadoop Common
>          Issue Type: Improvement
>          Components: util
>    Affects Versions: 2.6.0
>            Reporter: Senthil Subramanian
>            Assignee: Rich Haase
>            Priority: Minor
>              Labels: BB2015-05-TBR, patch
>         Attachments: HADOOP-1540.003.patch, HADOOP-1540.004.patch, HADOOP-1540.005.patch, HADOOP-1540.006.patch
>
>
> There should be a way to ignore specific paths (e.g., those that have already
> been copied over under the current srcPath).

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)