[jira] [Updated] (MAPREDUCE-5809) Enhance distcp to support preserving HDFS ACLs.
[ https://issues.apache.org/jira/browse/MAPREDUCE-5809?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Chris Nauroth updated MAPREDUCE-5809:
-------------------------------------
       Resolution: Fixed
    Fix Version/s: 2.5.0
                   3.0.0
           Status: Resolved  (was: Patch Available)

I committed this to trunk and branch-2. Nicholas, thank you for the code review and the excellent suggestions.

> Enhance distcp to support preserving HDFS ACLs.
> -----------------------------------------------
>
>                 Key: MAPREDUCE-5809
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5809
>             Project: Hadoop Map/Reduce
>          Issue Type: Improvement
>          Components: distcp
>    Affects Versions: 2.4.0
>            Reporter: Chris Nauroth
>            Assignee: Chris Nauroth
>             Fix For: 3.0.0, 2.5.0
>
>         Attachments: MAPREDUCE-5809.1.patch, MAPREDUCE-5809.2.patch, MAPREDUCE-5809.3.patch, MAPREDUCE-5809.4.patch, MAPREDUCE-5809.5.patch
>
>
> This issue tracks enhancing distcp to add a new command-line argument for preserving HDFS ACLs from the source at the copy destination.

--
This message was sent by Atlassian JIRA
(v6.2#6252)
[jira] [Updated] (MAPREDUCE-5809) Enhance distcp to support preserving HDFS ACLs.
[ https://issues.apache.org/jira/browse/MAPREDUCE-5809?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Tsz Wo Nicholas Sze updated MAPREDUCE-5809:
-------------------------------------------
    Hadoop Flags: Reviewed

+1 patch looks good.
[jira] [Updated] (MAPREDUCE-5809) Enhance distcp to support preserving HDFS ACLs.
[ https://issues.apache.org/jira/browse/MAPREDUCE-5809?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Chris Nauroth updated MAPREDUCE-5809:
-------------------------------------
    Attachment: MAPREDUCE-5809.5.patch

Here is patch v5 incorporating the latest feedback from Nicholas and also making use of the ACL bit introduced in HDFS-6326 to cut down on unnecessary {{getAclStatus}} calls.
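The optimization mentioned in the v5 comment can be illustrated with a minimal sketch. This is not the code from the patch: `SketchStatus`, `SketchAclSource`, and `AclBitSketch` are hypothetical stand-ins that use no Hadoop classes, and the assumption is simply that each listed file carries a flag saying whether it has an extended ACL (in Hadoop this is, as far as I know, exposed through `FsPermission#getAclBit()`). The per-file ACL lookup is issued only when that flag is set.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in for a listed file; not Hadoop's FileStatus. */
final class SketchStatus {
    final String path;
    final boolean aclBit; // true when the file carries an extended ACL

    SketchStatus(String path, boolean aclBit) {
        this.path = path;
        this.aclBit = aclBit;
    }
}

/** Hypothetical stand-in for the per-file ACL lookup (assumed to be costly). */
interface SketchAclSource {
    List<String> getAclEntries(String path);
}

final class AclBitSketch {
    /** Looks up ACLs only for flagged files; returns the paths that were looked up. */
    static List<String> fetchAcls(List<SketchStatus> listing, SketchAclSource source) {
        List<String> fetched = new ArrayList<>();
        for (SketchStatus status : listing) {
            if (!status.aclBit) {
                continue; // no extended ACL, so the lookup is skipped
            }
            source.getAclEntries(status.path);
            fetched.add(status.path);
        }
        return fetched;
    }

    public static void main(String[] args) {
        List<SketchStatus> listing = new ArrayList<>();
        listing.add(new SketchStatus("/src/a", false));
        listing.add(new SketchStatus("/src/b", true));
        listing.add(new SketchStatus("/src/c", false));
        // The lambda plays the role of the remote lookup and returns no entries.
        List<String> fetched = fetchAcls(listing, path -> new ArrayList<>());
        System.out.println(fetched); // prints [/src/b]
    }
}
```

With three files listed and only one flagged, a single lookup is issued instead of three, which is the saving the comment describes.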
[jira] [Updated] (MAPREDUCE-5809) Enhance distcp to support preserving HDFS ACLs.
[ https://issues.apache.org/jira/browse/MAPREDUCE-5809?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Chris Nauroth updated MAPREDUCE-5809:
-------------------------------------
    Attachment: MAPREDUCE-5809.4.patch

I'm attaching patch v4. This implements Nicholas's suggestion to store the ACLs in the copy listing file for use by {{CopyCommitter}}. Additional changes to this patch will come after we resolve HDFS-6326.

bq. If we cannot change FileStatus for backwards-compatibility, how about add new FileSystem methods such as listStatusWithACL(..) which returns FileStatusWithACL?

I'd like to have a follow-up discussion about possibly adding batch APIs to improve distcp performance, and this is one of them that I had in mind. Can we please postpone this until later, since it's a performance optimization and not a correctness issue? The current patch is already large, and adding new API definitions and the RPC calls to back them would likely double it.
[jira] [Updated] (MAPREDUCE-5809) Enhance distcp to support preserving HDFS ACLs.
[ https://issues.apache.org/jira/browse/MAPREDUCE-5809?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Chris Nauroth updated MAPREDUCE-5809:
-------------------------------------
    Attachment: MAPREDUCE-5809.3.patch

Thanks, Akira. Here is patch version 3. I cleaned up the unused imports in {{DistCpUtils}}. However, I didn't see any unused imports in {{ScopedAclEntries}}.
[jira] [Updated] (MAPREDUCE-5809) Enhance distcp to support preserving HDFS ACLs.
[ https://issues.apache.org/jira/browse/MAPREDUCE-5809?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Chris Nauroth updated MAPREDUCE-5809:
-------------------------------------
    Attachment: MAPREDUCE-5809.2.patch

I'm attaching patch v2, which is just a minor rebase on current trunk.
[jira] [Updated] (MAPREDUCE-5809) Enhance distcp to support preserving HDFS ACLs.
[ https://issues.apache.org/jira/browse/MAPREDUCE-5809?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Chris Nauroth updated MAPREDUCE-5809:
-------------------------------------
    Attachment: MAPREDUCE-5809.1.patch

I think the test failure is unrelated. I'm reattaching the exact same patch to try another Jenkins run.
[jira] [Updated] (MAPREDUCE-5809) Enhance distcp to support preserving HDFS ACLs.
[ https://issues.apache.org/jira/browse/MAPREDUCE-5809?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Chris Nauroth updated MAPREDUCE-5809:
-------------------------------------
    Attachment:     (was: MAPREDUCE-5809.1.patch)
[jira] [Updated] (MAPREDUCE-5809) Enhance distcp to support preserving HDFS ACLs.
[ https://issues.apache.org/jira/browse/MAPREDUCE-5809?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Chris Nauroth updated MAPREDUCE-5809:
-------------------------------------
    Attachment: MAPREDUCE-5809.1.patch

The attached patch introduces the "distcp -pa" command-line argument for preserving ACLs. I found that it helped the implementation to reuse some code currently in HDFS, so I've done some refactoring. If this gets +1, then I'll split it into separate patches for HADOOP and HDFS.

# {{AclUtil}}: This is a new utility class containing methods that previously resided in HDFS or privately in Hadoop Common. It's annotated {{LimitedPrivate}} for HDFS and MapReduce. There is no new logic here, just moving the methods around.
# {{ScopedAclEntries}}: This whole class has moved from HDFS to Common with the addition of the {{LimitedPrivate}} annotation.
# {{AclCommands}}: The implementation of the ls shell command is simplified by using the new utility code.
# {{CopyListing}}: Checks if the source file system supports ACLs before attempting to run the distcp job (fail fast). Renamed {{checkForDuplicates}} to {{validateFinalListing}} since the validation is now more than just checking for duplicates.
# {{DistCp}}: Checks if the target file system supports ACLs.
# {{DistCpOptionSwitch}}: Documented the meaning of -pa and also made it clear which attributes get preserved when passing -p with no additional flags.
# {{DistCpUtils}}: Preserve ACLs if requested.
# {{TestDistCpWithAcls}}: New test suite that runs distcp with -pa and asserts that ACLs and permission bits are the same at the destination. Also tests the fail-fast behavior when ACLs are not enabled in the NameNode or ACLs are unimplemented in the file system.

The one thing we can't cover in our automated tests here is the case of attempting distcp with -pa where the NameNode is pre-2.4.0. I'll check that with a manual test.
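The fail-fast behavior described for {{CopyListing}} and {{DistCp}} can be sketched roughly as follows. This is an illustration under stated assumptions, not the patch itself: `SketchFs`, `FixedSketchFs`, `SketchAclsUnsupportedException`, and `FailFastSketch` are hypothetical names with no Hadoop dependency, and probing the root path with an ACL lookup is one plausible way to detect support rather than the confirmed mechanism. The idea is to test both ends before any copy work starts and turn any probe failure into a single clear error.

```java
/** Hypothetical stand-in for a file system client; not Hadoop's FileSystem. */
interface SketchFs {
    /** Assumed to throw when ACLs are unimplemented or disabled. */
    void getAclStatus(String path) throws Exception;

    String uri();
}

/** Simple fixture: ACL lookups either succeed or always fail. */
final class FixedSketchFs implements SketchFs {
    private final String uri;
    private final boolean aclsWork;

    FixedSketchFs(String uri, boolean aclsWork) {
        this.uri = uri;
        this.aclsWork = aclsWork;
    }

    @Override
    public void getAclStatus(String path) throws Exception {
        if (!aclsWork) {
            throw new UnsupportedOperationException("ACLs unavailable on " + uri);
        }
    }

    @Override
    public String uri() {
        return uri;
    }
}

/** Hypothetical error type raised before the job is launched. */
final class SketchAclsUnsupportedException extends RuntimeException {
    SketchAclsUnsupportedException(String message, Throwable cause) {
        super(message, cause);
    }
}

final class FailFastSketch {
    /** Probes the root path; any failure is reported as "ACLs not supported". */
    static void checkAclSupport(SketchFs fs) {
        try {
            fs.getAclStatus("/");
        } catch (Exception e) {
            throw new SketchAclsUnsupportedException(
                "ACLs not supported for file system: " + fs.uri(), e);
        }
    }

    /** Runs both checks up front, mirroring the source and target checks above. */
    static void validateBeforeCopy(SketchFs source, SketchFs target) {
        checkAclSupport(source);
        checkAclSupport(target);
    }

    public static void main(String[] args) {
        SketchFs source = new FixedSketchFs("hdfs://source", true);
        SketchFs target = new FixedSketchFs("file:///", false);
        try {
            validateBeforeCopy(source, target);
            System.out.println("both ends support ACLs");
        } catch (SketchAclsUnsupportedException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

Checking up front means an unsupported target is reported immediately instead of surfacing as a failure partway through a long-running copy.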
[jira] [Updated] (MAPREDUCE-5809) Enhance distcp to support preserving HDFS ACLs.
[ https://issues.apache.org/jira/browse/MAPREDUCE-5809?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Chris Nauroth updated MAPREDUCE-5809:
-------------------------------------
    Status: Patch Available  (was: In Progress)