[ https://issues.apache.org/jira/browse/MAPREDUCE-5809?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13990222#comment-13990222 ]
Tsz Wo Nicholas Sze commented on MAPREDUCE-5809:
------------------------------------------------

> If we do that, then we'll lose the parallelism benefit we get from doing the
> RPC calls inside the MR tasks. ...

You are right that we'll lose the parallelism. However, we have to build the source listing anyway. If FileSystem.listStatus(..) also returned the ACL, then we would definitely put the ACL in the listing SequenceFile. (Question: why does listStatus(..) not return the ACL, and would it make sense to add it in the future?) As it stands, we need an additional getAclStatus(..) call.

If the two clusters are close to each other, calling getAclStatus(..) in parallel is probably faster. However, if the clusters are far apart (a common case), calling getAclStatus(..) from the destination cluster may incur a long round-trip time. It also takes more bandwidth, which is usually limited. Running the distcp command in the source cluster is probably better.

> I chose RuntimeException for consistency with the existing exceptions like
> CopyListing#DuplicateFileException and CopyListing#InvalidInputException. ...

I see. Let's keep extending RuntimeException for the moment. We could change all of them later.

> Enhance distcp to support preserving HDFS ACLs.
> -----------------------------------------------
>
>                 Key: MAPREDUCE-5809
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5809
>             Project: Hadoop Map/Reduce
>          Issue Type: Improvement
>          Components: distcp
>    Affects Versions: 2.4.0
>            Reporter: Chris Nauroth
>            Assignee: Chris Nauroth
>         Attachments: MAPREDUCE-5809.1.patch, MAPREDUCE-5809.2.patch, MAPREDUCE-5809.3.patch
>
>
> This issue tracks enhancing distcp to add a new command-line argument for
> preserving HDFS ACLs from the source at the copy destination.

--
This message was sent by Atlassian JIRA
(v6.2#6252)