[ 
https://issues.apache.org/jira/browse/HADOOP-1540?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14535309#comment-14535309
 ] 

Zoran Dimitrijevic commented on HADOOP-1540:
--------------------------------------------

1.   
hadoop-tools/hadoop-distcp/src/main/java/org/apache/hadoop/tools/DistCpOptions.java:564
 
 Minor: extra space in the comment.

2. 
hadoop-tools/hadoop-distcp/src/main/java/org/apache/hadoop/tools/OptionsParser.java
 Refactoring of the parsing logic should have been a separate patch; as is, this 
will be harder to cherry-pick to older branches. But since it is a good 
refactoring, and I am new to the Hadoop community, it's fine with me. 

3. 
hadoop-tools/hadoop-distcp/src/main/java/org/apache/hadoop/tools/OptionsParser.java:329
  Minor: space missing between - and 1  (-1 => - 1)

4. 
hadoop-tools/hadoop-distcp/src/main/java/org/apache/hadoop/tools/package-info.java
Is this really part of this patch? Again, I am new to the Hadoop community, so 
if it's OK to combine logically different changes, that's fine with me.

5. 
hadoop-tools/hadoop-distcp/src/main/java/org/apache/hadoop/tools/SimpleCopyFilter.java
It would be more useful if it matched glob expressions. Matching substrings is 
a very unusual way to filter a file list, and many users will be puzzled about 
what to do. I would suggest we extend this right now instead of submitting the 
patch as is. For example, *tmp would match filenames ending with tmp, not any 
file that happens to contain tmp. Likewise, in the unit test, the "test" filter 
matching /user/testing is not what I would expect.
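
To make the difference in point 5 concrete, here is a minimal sketch that 
contrasts the two behaviors. It is not taken from the patch: the class and 
method names are hypothetical, it uses the JDK's PathMatcher rather than any 
Hadoop API, and applying the glob to the final path component only is an 
assumption about what users would expect.

```java
import java.nio.file.FileSystems;
import java.nio.file.Path;
import java.nio.file.PathMatcher;
import java.nio.file.Paths;

/** Hypothetical helper, not part of the patch: contrasts the two matching styles. */
final class FilterMatchSketch {

  /** Substring semantics: the filter hits if it appears anywhere in the path. */
  static boolean substringMatches(String filter, String path) {
    return path.contains(filter);
  }

  /**
   * Glob semantics via the JDK's PathMatcher. Assumption: the glob is applied
   * to the last path component only, so "*tmp" means "name ends with tmp".
   */
  static boolean globMatches(String glob, String path) {
    PathMatcher matcher = FileSystems.getDefault().getPathMatcher("glob:" + glob);
    Path name = Paths.get(path).getFileName();
    // getFileName() returns null for a root path such as "/".
    return name != null && matcher.matches(name);
  }

  public static void main(String[] args) {
    // The case from the unit test: substring matching hits /user/testing.
    System.out.println(substringMatches("test", "/user/testing")); // true
    System.out.println(globMatches("test", "/user/testing"));      // false

    // "*tmp" as a glob only hits names that end with tmp.
    System.out.println(globMatches("*tmp", "/data/part-0.tmp"));   // true
    System.out.println(globMatches("*tmp", "/data/tmpfile.txt"));  // false
  }
}
```

With glob semantics, a user who does want the current "contains" behavior can 
still ask for it explicitly with a pattern such as *tmp*.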

Otherwise, looks good to me.
    

> distcp should support an exclude list
> -------------------------------------
>
>                 Key: HADOOP-1540
>                 URL: https://issues.apache.org/jira/browse/HADOOP-1540
>             Project: Hadoop Common
>          Issue Type: Improvement
>          Components: util
>    Affects Versions: 2.6.0
>            Reporter: Senthil Subramanian
>            Assignee: Rich Haase
>            Priority: Minor
>              Labels: BB2015-05-RFC, patch
>         Attachments: HADOOP-1540.003.patch, HADOOP-1540.004.patch, 
> HADOOP-1540.005.patch, HADOOP-1540.006.patch
>
>
> There should be a way to ignore specific paths (eg: those that have already 
> been copied over under the current srcPath). 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
