[jira] [Commented] (HDFS-1605) Convert DFSInputStream synchronized sections to a ReadWrite lock
[ https://issues.apache.org/jira/browse/HDFS-1605?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13837462#comment-13837462 ]

Liang Xie commented on HDFS-1605:
---------------------------------

We observed this hotspot in our production cluster these days. Most of the threads waiting on the lock look like the following:

{noformat}
"IPC Server handler 27 on 12600" daemon prio=10 tid=0x7f82fc1e5750 nid=0x4d9b waiting for monitor entry [0x7f821fe78000]
   java.lang.Thread.State: BLOCKED (on object monitor)
        at org.apache.hadoop.hdfs.DFSInputStream.getFileLength(DFSInputStream.java:242)
        - waiting to lock 0x00044e20d238 (a org.apache.hadoop.hdfs.DFSInputStream)
        at org.apache.hadoop.hdfs.DFSInputStream.read(DFSInputStream.java:982)
        at org.apache.hadoop.fs.FSDataInputStream.read(FSDataInputStream.java:73)
        at org.apache.hadoop.hbase.io.hfile.HFileBlock$AbstractFSReader.readAtOffset(HFileBlock.java:1393)
{noformat}

and the lock holder is doing legitimate read work at that time.

> Convert DFSInputStream synchronized sections to a ReadWrite lock
> ----------------------------------------------------------------
>
> Key: HDFS-1605
> URL: https://issues.apache.org/jira/browse/HDFS-1605
> Project: Hadoop HDFS
> Issue Type: Improvement
> Components: hdfs-client
> Reporter: dhruba borthakur
> Assignee: dhruba borthakur
> Attachments: DFSClientRWlock.1.txt, DFSClientRWlock.3.txt
>
> HBase does concurrent preads from multiple threads to different blocks of the same HDFS file. Each of these pread calls invokes DFSInputStream.getFileLength() and DFSInputStream.getBlockAt(). These methods are synchronized, thus causing all the concurrent threads to serialize. It would help performance to convert this to a read/write lock.

--
This message was sent by Atlassian JIRA
(v6.1#6144)
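The conversion the issue proposes can be sketched in miniature: read-mostly accessors take a shared read lock so concurrent preads no longer serialize, while mutating methods take the exclusive write lock. This is an illustrative stand-in, not the actual DFSInputStream code; the class and field names here are hypothetical.

```java
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Hypothetical sketch of the locking change: getFileLength()-style accessors
// take the read lock (many readers at once), while state updates take the
// write lock (exclusive). Names are stand-ins, not the real HDFS fields.
class StreamState {
    private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();
    private long fileLength = 0;

    long getFileLength() {            // called concurrently by pread threads
        lock.readLock().lock();
        try {
            return fileLength;        // shared access: no serialization
        } finally {
            lock.readLock().unlock();
        }
    }

    void updateFileLength(long newLength) {  // rare metadata update
        lock.writeLock().lock();
        try {
            fileLength = newLength;   // exclusive access
        } finally {
            lock.writeLock().unlock();
        }
    }
}
```

With a plain `synchronized` method, every pread thread queues behind the monitor even though none of them mutate state; with the read/write lock, only writers block readers.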
[jira] [Commented] (HDFS-5536) Implement HTTP policy for Namenode and DataNode
[ https://issues.apache.org/jira/browse/HDFS-5536?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13837461#comment-13837461 ]

Hadoop QA commented on HDFS-5536:
---------------------------------

{color:red}-1 overall{color}. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12616702/HDFS-5536.007.patch
against trunk revision.

{color:green}+1 @author{color}. The patch does not contain any @author tags.
{color:green}+1 tests included{color}. The patch appears to include 4 new or modified test files.
{color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings.
{color:green}+1 javadoc{color}. The javadoc tool did not generate any warning messages.
{color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse.
{color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings.
{color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings.
{color:red}-1 core tests{color}. The patch failed these unit tests in hadoop-common-project/hadoop-common hadoop-hdfs-project/hadoop-hdfs:

    org.apache.hadoop.hdfs.web.TestHttpsFileSystem

{color:green}+1 contrib tests{color}. The patch passed contrib unit tests.

Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/5620//testReport/
Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/5620//console

This message is automatically generated.

> Implement HTTP policy for Namenode and DataNode
> -----------------------------------------------
>
> Key: HDFS-5536
> URL: https://issues.apache.org/jira/browse/HDFS-5536
> Project: Hadoop HDFS
> Issue Type: Sub-task
> Reporter: Haohui Mai
> Assignee: Haohui Mai
> Attachments: HDFS-5536.000.patch, HDFS-5536.001.patch, HDFS-5536.002.patch, HDFS-5536.003.patch, HDFS-5536.004.patch, HDFS-5536.005.patch, HDFS-5536.006.patch, HDFS-5536.007.patch, HDFS-5536.008.patch
>
> This jira implements the HTTP and HTTPS policy in the namenode and the datanode.

--
This message was sent by Atlassian JIRA
(v6.1#6144)
[jira] [Commented] (HDFS-2882) DN continues to start up, even if block pool fails to initialize
[ https://issues.apache.org/jira/browse/HDFS-2882?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13837477#comment-13837477 ]

Vinay commented on HDFS-2882:
-----------------------------

OK Colin. Could anyone else review the patch? Thanks.

> DN continues to start up, even if block pool fails to initialize
> ----------------------------------------------------------------
>
> Key: HDFS-2882
> URL: https://issues.apache.org/jira/browse/HDFS-2882
> Project: Hadoop HDFS
> Issue Type: Bug
> Components: datanode
> Affects Versions: 2.0.2-alpha
> Reporter: Todd Lipcon
> Assignee: Vinay
> Attachments: HDFS-2882.patch, HDFS-2882.patch, HDFS-2882.patch, HDFS-2882.patch, HDFS-2882.patch, HDFS-2882.patch, hdfs-2882.txt
>
> I started a DN on a machine that was completely out of space on one of its drives. I saw the following:
>
> {noformat}
> 2012-02-02 09:56:50,499 FATAL org.apache.hadoop.hdfs.server.datanode.DataNode: Initialization failed for block pool Block pool BP-448349972-172.29.5.192-1323816762969 (storage id DS-507718931-172.29.5.194-11072-1297842002148) service to styx01.sf.cloudera.com/172.29.5.192:8021
> java.io.IOException: Mkdirs failed to create /data/1/scratch/todd/styx-datadir/current/BP-448349972-172.29.5.192-1323816762969/tmp
>         at org.apache.hadoop.hdfs.server.datanode.FSDataset$BlockPoolSlice.init(FSDataset.java:335)
> {noformat}
>
> but the DN continued to run, spewing NPEs when it tried to do block reports, etc. This was on the HDFS-1623 branch but may affect trunk as well.

--
This message was sent by Atlassian JIRA
(v6.1#6144)
[jira] [Commented] (HDFS-3405) Checkpointing should use HTTP POST or PUT instead of GET-GET to send merged fsimages
[ https://issues.apache.org/jira/browse/HDFS-3405?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13837478#comment-13837478 ]

Vinay commented on HDFS-3405:
-----------------------------

Hi all, could someone take a look at the changes? Thanks in advance.

> Checkpointing should use HTTP POST or PUT instead of GET-GET to send merged fsimages
> ------------------------------------------------------------------------------------
>
> Key: HDFS-3405
> URL: https://issues.apache.org/jira/browse/HDFS-3405
> Project: Hadoop HDFS
> Issue Type: Improvement
> Affects Versions: 1.0.0, 3.0.0, 2.0.5-alpha
> Reporter: Aaron T. Myers
> Assignee: Vinay
> Attachments: HDFS-3405.patch, HDFS-3405.patch, HDFS-3405.patch, HDFS-3405.patch, HDFS-3405.patch, HDFS-3405.patch
>
> As Todd points out in [this comment|https://issues.apache.org/jira/browse/HDFS-3404?focusedCommentId=13272986&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13272986], the current scheme for a checkpointing daemon to upload a merged fsimage file to an NN is to issue an HTTP GET request to tell the target NN to issue another GET request back to the checkpointing daemon to retrieve the merged fsimage file. There's no fundamental reason the checkpointing daemon can't just use an HTTP POST or PUT to send back the merged fsimage file, rather than the double-GET scheme.

--
This message was sent by Atlassian JIRA
(v6.1#6144)
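The single-request alternative the issue describes can be sketched as follows. This is only an illustration of pushing a file with one HTTP PUT; the servlet path used in the usage example is hypothetical, not the actual NameNode image-transfer endpoint.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.net.HttpURLConnection;
import java.net.URL;

// Illustrative sketch only: a checkpointing daemon pushing a merged fsimage
// to the NN with a single HTTP PUT, instead of a GET that triggers a GET
// back. The endpoint URL is an assumption for the example.
class ImageUploader {
    static int putImage(URL imageServlet, InputStream fsimage) throws IOException {
        HttpURLConnection conn = (HttpURLConnection) imageServlet.openConnection();
        conn.setRequestMethod("PUT");
        conn.setDoOutput(true);                    // request body carries the image
        conn.setChunkedStreamingMode(64 * 1024);   // avoid buffering the whole file
        try (OutputStream out = conn.getOutputStream()) {
            byte[] buf = new byte[8192];
            int n;
            while ((n = fsimage.read(buf)) > 0) {
                out.write(buf, 0, n);
            }
        }
        return conn.getResponseCode();             // receiver acks the upload directly
    }
}
```

Compared with the double-GET scheme, the receiver never has to open a connection back to the uploader, which also simplifies firewall and authentication handling.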
[jira] [Commented] (HDFS-5570) Deprecate hftp / hsftp and replace them with webhdfs / swebhdfs
[ https://issues.apache.org/jira/browse/HDFS-5570?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13837485#comment-13837485 ]

Hadoop QA commented on HDFS-5570:
---------------------------------

{color:red}-1 overall{color}. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12616637/HDFS-5570.000.patch
against trunk revision.

{color:green}+1 @author{color}. The patch does not contain any @author tags.
{color:green}+1 tests included{color}. The patch appears to include 10 new or modified test files.
{color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings.
{color:green}+1 javadoc{color}. The javadoc tool did not generate any warning messages.
{color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse.
{color:red}-1 findbugs{color}. The patch appears to introduce 1 new Findbugs (version 1.3.9) warnings.
{color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings.
{color:red}-1 core tests{color}. The patch failed these unit tests in hadoop-common-project/hadoop-common hadoop-hdfs-project/hadoop-hdfs hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-hs hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient hadoop-tools/hadoop-distcp hadoop-tools/hadoop-extras:

    org.apache.hadoop.mapreduce.lib.input.TestFixedLengthInputFormat
    org.apache.hadoop.mapred.TestFixedLengthInputFormat
    org.apache.hadoop.mapreduce.security.TestJHSSecurity

The following test timeouts occurred in hadoop-common-project/hadoop-common hadoop-hdfs-project/hadoop-hdfs hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-hs hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient hadoop-tools/hadoop-distcp hadoop-tools/hadoop-extras:

    org.apache.hadoop.tools.TestDelegationTokenFetcher

{color:green}+1 contrib tests{color}. The patch passed contrib unit tests.

Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/5619//testReport/
Findbugs warnings: https://builds.apache.org/job/PreCommit-HDFS-Build/5619//artifact/trunk/patchprocess/newPatchFindbugsWarningshadoop-hdfs.html
Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/5619//console

This message is automatically generated.

> Deprecate hftp / hsftp and replace them with webhdfs / swebhdfs
> ---------------------------------------------------------------
>
> Key: HDFS-5570
> URL: https://issues.apache.org/jira/browse/HDFS-5570
> Project: Hadoop HDFS
> Issue Type: Bug
> Reporter: Haohui Mai
> Assignee: Haohui Mai
> Attachments: HDFS-5570.000.patch
>
> Currently hftp / hsftp only provide a strict subset of the functionality that webhdfs / swebhdfs offer. Notably, hftp / hsftp do not support writes and HA namenodes. Maintaining two pieces of code with similar functionality introduces unnecessary work. Webhdfs has been around since Hadoop 1.0, therefore moving forward with webhdfs does not seem to cause any significant migration issues. This jira proposes to deprecate hftp / hsftp in branch-2 and remove them in trunk.

--
This message was sent by Atlassian JIRA
(v6.1#6144)
[jira] [Created] (HDFS-5592) DIR* completeFile: /file is closed by DFSClient_ should be logged only for successful closure of the file.
Vinay created HDFS-5592:
----------------------------

Summary: DIR* completeFile: /file is closed by DFSClient_ should be logged only for successful closure of the file.
Key: HDFS-5592
URL: https://issues.apache.org/jira/browse/HDFS-5592
Project: Hadoop HDFS
Issue Type: Bug
Reporter: Vinay
Assignee: Vinay

The following log message in {{FSNamesystem#completeFile(..)}} should be logged only if the file is closed:

{code}
getEditLog().logSync();
NameNode.stateChangeLog.info("DIR* completeFile: " + src + " is closed by " + holder);
return success;
{code}

--
This message was sent by Atlassian JIRA
(v6.1#6144)
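The proposed fix amounts to making the log statement conditional on the result. A minimal sketch, with the surrounding FSNamesystem context stripped away and the names taken from the snippet in the report:

```java
// Minimal sketch of the proposed change: the "is closed by" message is
// emitted only when completeFile actually succeeded, instead of
// unconditionally. This is a stand-in, not the real FSNamesystem code.
class CompleteFileLogging {
    static String logLine(boolean success, String src, String holder) {
        if (success) {
            return "DIR* completeFile: " + src + " is closed by " + holder;
        }
        return null;  // an unsuccessful attempt logs nothing at this level
    }
}
```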
[jira] [Updated] (HDFS-5592) DIR* completeFile: /file is closed by DFSClient_ should be logged only for successful closure of the file.
[ https://issues.apache.org/jira/browse/HDFS-5592?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Vinay updated HDFS-5592:
------------------------

Attachment: HDFS-5592.patch

Attached the patch.

> DIR* completeFile: /file is closed by DFSClient_ should be logged only for successful closure of the file.
> ----------------------------------------------------------------------------------------------------------
>
> Key: HDFS-5592
> URL: https://issues.apache.org/jira/browse/HDFS-5592
> Project: Hadoop HDFS
> Issue Type: Bug
> Affects Versions: 3.0.0, 2.2.0
> Reporter: Vinay
> Assignee: Vinay
> Attachments: HDFS-5592.patch
>
> The following log message in {{FSNamesystem#completeFile(..)}} should be logged only if the file is closed:
>
> {code}
> getEditLog().logSync();
> NameNode.stateChangeLog.info("DIR* completeFile: " + src + " is closed by " + holder);
> return success;
> {code}

--
This message was sent by Atlassian JIRA
(v6.1#6144)
[jira] [Updated] (HDFS-5592) DIR* completeFile: /file is closed by DFSClient_ should be logged only for successful closure of the file.
[ https://issues.apache.org/jira/browse/HDFS-5592?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Vinay updated HDFS-5592:
------------------------

Affects Version/s: 3.0.0
                   2.2.0
           Status: Patch Available  (was: Open)

> DIR* completeFile: /file is closed by DFSClient_ should be logged only for successful closure of the file.
> ----------------------------------------------------------------------------------------------------------
>
> Key: HDFS-5592
> URL: https://issues.apache.org/jira/browse/HDFS-5592
> Project: Hadoop HDFS
> Issue Type: Bug
> Affects Versions: 2.2.0, 3.0.0
> Reporter: Vinay
> Assignee: Vinay
> Attachments: HDFS-5592.patch
>
> The following log message in {{FSNamesystem#completeFile(..)}} should be logged only if the file is closed:
>
> {code}
> getEditLog().logSync();
> NameNode.stateChangeLog.info("DIR* completeFile: " + src + " is closed by " + holder);
> return success;
> {code}

--
This message was sent by Atlassian JIRA
(v6.1#6144)
[jira] [Updated] (HDFS-5592) DIR* completeFile: /file is closed by DFSClient_ should be logged only for successful closure of the file.
[ https://issues.apache.org/jira/browse/HDFS-5592?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Vinay updated HDFS-5592:
------------------------

Affects Version/s: 2.3.0  (was: 2.2.0)

> DIR* completeFile: /file is closed by DFSClient_ should be logged only for successful closure of the file.
> ----------------------------------------------------------------------------------------------------------
>
> Key: HDFS-5592
> URL: https://issues.apache.org/jira/browse/HDFS-5592
> Project: Hadoop HDFS
> Issue Type: Bug
> Affects Versions: 3.0.0, 2.3.0
> Reporter: Vinay
> Assignee: Vinay
> Attachments: HDFS-5592.patch
>
> The following log message in {{FSNamesystem#completeFile(..)}} should be logged only if the file is closed:
>
> {code}
> getEditLog().logSync();
> NameNode.stateChangeLog.info("DIR* completeFile: " + src + " is closed by " + holder);
> return success;
> {code}

--
This message was sent by Atlassian JIRA
(v6.1#6144)
[jira] [Commented] (HDFS-5536) Implement HTTP policy for Namenode and DataNode
[ https://issues.apache.org/jira/browse/HDFS-5536?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13837528#comment-13837528 ]

Hadoop QA commented on HDFS-5536:
---------------------------------

{color:red}-1 overall{color}. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12616705/HDFS-5536.008.patch
against trunk revision.

{color:green}+1 @author{color}. The patch does not contain any @author tags.
{color:green}+1 tests included{color}. The patch appears to include 4 new or modified test files.
{color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings.
{color:green}+1 javadoc{color}. The javadoc tool did not generate any warning messages.
{color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse.
{color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings.
{color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings.
{color:red}-1 core tests{color}. The patch failed these unit tests in hadoop-common-project/hadoop-common hadoop-hdfs-project/hadoop-hdfs:

    org.apache.hadoop.hdfs.web.TestHttpsFileSystem

The test build failed in hadoop-common-project/hadoop-common.

{color:green}+1 contrib tests{color}. The patch passed contrib unit tests.

Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/5621//testReport/
Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/5621//console

This message is automatically generated.

> Implement HTTP policy for Namenode and DataNode
> -----------------------------------------------
>
> Key: HDFS-5536
> URL: https://issues.apache.org/jira/browse/HDFS-5536
> Project: Hadoop HDFS
> Issue Type: Sub-task
> Reporter: Haohui Mai
> Assignee: Haohui Mai
> Attachments: HDFS-5536.000.patch, HDFS-5536.001.patch, HDFS-5536.002.patch, HDFS-5536.003.patch, HDFS-5536.004.patch, HDFS-5536.005.patch, HDFS-5536.006.patch, HDFS-5536.007.patch, HDFS-5536.008.patch
>
> This jira implements the HTTP and HTTPS policy in the namenode and the datanode.

--
This message was sent by Atlassian JIRA
(v6.1#6144)
[jira] [Commented] (HDFS-5581) NameNodeFsck should use only one instance of BlockPlacementPolicy
[ https://issues.apache.org/jira/browse/HDFS-5581?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13837577#comment-13837577 ]

Hudson commented on HDFS-5581:
------------------------------

FAILURE: Integrated in Hadoop-Yarn-trunk #410 (See [https://builds.apache.org/job/Hadoop-Yarn-trunk/410/])

move HDFS-5581 to 2.3 (cmccabe: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1547094)
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt

HDFS-5581. NameNodeFsck should use only one instance of BlockPlacementPolicy (vinay via cmccabe) (cmccabe: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1547088)
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/NamenodeFsck.java

> NameNodeFsck should use only one instance of BlockPlacementPolicy
> -----------------------------------------------------------------
>
> Key: HDFS-5581
> URL: https://issues.apache.org/jira/browse/HDFS-5581
> Project: Hadoop HDFS
> Issue Type: Bug
> Components: namenode
> Reporter: Vinay
> Assignee: Vinay
> Fix For: 2.4.0
> Attachments: HDFS-5581.patch, HDFS-5581.patch
>
> While going through NameNodeFsck I found that the following code creates a new instance of BlockPlacementPolicy for every block:
>
> {code}
> // verify block placement policy
> BlockPlacementStatus blockPlacementStatus =
>     BlockPlacementPolicy.getInstance(conf, null, networktopology).
>         verifyBlockPlacement(path, lBlk, targetFileReplication);
> {code}
>
> It would be better to use the namenode's BPP itself instead of creating a new one.

--
This message was sent by Atlassian JIRA
(v6.1#6144)
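The hoisting the issue asks for is a standard refactoring: construct the policy once and reuse it across the per-block loop. A sketch with hypothetical stand-in types (not the real HDFS classes), assuming the policy object is safe to share across calls as the issue implies:

```java
import java.util.List;

// Sketch of reusing one placement-policy instance per fsck run instead of
// calling getInstance(...) once per block. The interface and method names
// here are stand-ins for illustration only.
class FsckSketch {
    interface PlacementPolicy {
        boolean verify(String path, String block, int replication);
    }

    static int countMisplaced(PlacementPolicy policy, String path,
                              List<String> blocks, int replication) {
        int misplaced = 0;
        for (String blk : blocks) {               // one shared policy instance
            if (!policy.verify(path, blk, replication)) {
                misplaced++;
            }
        }
        return misplaced;
    }
}
```

Besides avoiding repeated construction cost, reusing the instance means fsck judges placement with exactly the policy object the rest of the system configured, rather than a freshly built one per block.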
[jira] [Commented] (HDFS-5557) Write pipeline recovery for the last packet in the block may cause rejection of valid replicas
[ https://issues.apache.org/jira/browse/HDFS-5557?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13837576#comment-13837576 ]

Hudson commented on HDFS-5557:
------------------------------

FAILURE: Integrated in Hadoop-Yarn-trunk #410 (See [https://builds.apache.org/job/Hadoop-Yarn-trunk/410/])

HDFS-5557. Write pipeline recovery for the last packet in the block may cause rejection of valid replicas. Contributed by Kihwal Lee. (kihwal: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1547173)
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/DFSOutputStream.java
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockInfoUnderConstruction.java
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockManager.java
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestClientProtocolForPipelineRecovery.java
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/blockmanagement/TestReplicationPolicy.java

> Write pipeline recovery for the last packet in the block may cause rejection of valid replicas
> ----------------------------------------------------------------------------------------------
>
> Key: HDFS-5557
> URL: https://issues.apache.org/jira/browse/HDFS-5557
> Project: Hadoop HDFS
> Issue Type: Bug
> Affects Versions: 0.23.9, 2.4.0
> Reporter: Kihwal Lee
> Assignee: Kihwal Lee
> Priority: Critical
> Fix For: 3.0.0, 2.4.0, 0.23.10
> Attachments: HDFS-5557.patch, HDFS-5557.patch, HDFS-5557.patch, HDFS-5557.patch
>
> When a block is reported from a data node while the block is under construction (i.e. not committed or completed), BlockManager calls BlockInfoUnderConstruction.addReplicaIfNotPresent() to update the reported replica state. But BlockManager is calling it with the stored block, not the reported block. This causes the recorded replicas' gen stamp to be that of BlockInfoUnderConstruction itself, not the one from the reported replica.
>
> When a pipeline recovery is done for the last packet of a block, the incremental block reports with the new gen stamp may come before the client calls updatePipeline(). If this happens, these replicas will be incorrectly recorded with the old gen stamp and get removed later. The result is a close or addAdditionalBlock failure.
>
> If the last block is completed but the penultimate block is not because of this issue, the file won't be closed. If this file is not cleared but the client goes away, the lease manager will try to recover the lease/block, at which point it will crash. I will file a separate jira for this shortly.
>
> The worst case is rejecting all the good replicas and accepting a bad one. In this case, the block will get completed, but the data cannot be read until the next full block report containing one of the valid replicas is received.

--
This message was sent by Atlassian JIRA
(v6.1#6144)
[jira] [Commented] (HDFS-5560) Trash configuration log statements prints incorrect units
[ https://issues.apache.org/jira/browse/HDFS-5560?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13837579#comment-13837579 ]

Hudson commented on HDFS-5560:
------------------------------

FAILURE: Integrated in Hadoop-Yarn-trunk #410 (See [https://builds.apache.org/job/Hadoop-Yarn-trunk/410/])

HDFS-5560. Trash configuration log statements prints incorrect units. Contributed by Josh Elser. (wang: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1547266)
* /hadoop/common/trunk/hadoop-common-project/hadoop-common/CHANGES.txt
* /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/TrashPolicyDefault.java

> Trash configuration log statements prints incorrect units
> ---------------------------------------------------------
>
> Key: HDFS-5560
> URL: https://issues.apache.org/jira/browse/HDFS-5560
> Project: Hadoop HDFS
> Issue Type: Bug
> Affects Versions: 2.2.0
> Reporter: Josh Elser
> Assignee: Josh Elser
> Fix For: 2.3.0
> Attachments: HDFS-5560.patch
>
> I ran `hdfs dfs -expunge` on a 2.2.0 system, and noticed the following message printed out on the console:
>
> {noformat}
> $ hdfs dfs -expunge
> 13/11/23 22:12:17 INFO fs.TrashPolicyDefault: Namenode trash configuration: Deletion interval = 180 minutes, Emptier interval = 0 minutes.
> {noformat}
>
> The configuration for both the deletion interval and emptier interval is given in minutes, converted to milliseconds, and then logged as milliseconds but with a label of minutes. It looks like this was introduced in HDFS-4903.

--
This message was sent by Atlassian JIRA
(v6.1#6144)
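The bug class here is simple but common: a value converted for internal use (minutes to milliseconds) is then printed with the original unit's label. A minimal sketch of the corrected formatting, converting back before printing; the class and constant names are stand-ins, not the actual TrashPolicyDefault code:

```java
// Sketch of unit-consistent logging: the intervals are stored internally in
// milliseconds, so convert back to minutes before attaching the "minutes"
// label. Names are illustrative stand-ins.
class TrashLogSketch {
    static final long MSECS_PER_MINUTE = 60 * 1000;

    static String describe(long deletionIntervalMs, long emptierIntervalMs) {
        return "Namenode trash configuration: Deletion interval = "
                + (deletionIntervalMs / MSECS_PER_MINUTE)
                + " minutes, Emptier interval = "
                + (emptierIntervalMs / MSECS_PER_MINUTE) + " minutes.";
    }
}
```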
[jira] [Commented] (HDFS-5592) DIR* completeFile: /file is closed by DFSClient_ should be logged only for successful closure of the file.
[ https://issues.apache.org/jira/browse/HDFS-5592?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13837585#comment-13837585 ]

Uma Maheswara Rao G commented on HDFS-5592:
-------------------------------------------

+1

> DIR* completeFile: /file is closed by DFSClient_ should be logged only for successful closure of the file.
> ----------------------------------------------------------------------------------------------------------
>
> Key: HDFS-5592
> URL: https://issues.apache.org/jira/browse/HDFS-5592
> Project: Hadoop HDFS
> Issue Type: Bug
> Affects Versions: 3.0.0, 2.3.0
> Reporter: Vinay
> Assignee: Vinay
> Attachments: HDFS-5592.patch
>
> The following log message in {{FSNamesystem#completeFile(..)}} should be logged only if the file is closed:
>
> {code}
> getEditLog().logSync();
> NameNode.stateChangeLog.info("DIR* completeFile: " + src + " is closed by " + holder);
> return success;
> {code}

--
This message was sent by Atlassian JIRA
(v6.1#6144)
[jira] [Commented] (HDFS-5557) Write pipeline recovery for the last packet in the block may cause rejection of valid replicas
[ https://issues.apache.org/jira/browse/HDFS-5557?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13837598#comment-13837598 ]

Hudson commented on HDFS-5557:
------------------------------

FAILURE: Integrated in Hadoop-Hdfs-0.23-Build #809 (See [https://builds.apache.org/job/Hadoop-Hdfs-0.23-Build/809/])

svn merge -c 1547173 merging from trunk to branch-0.23 to fix: HDFS-5557. Write pipeline recovery for the last packet in the block may cause rejection of valid replicas. (kihwal: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1547181)
* /hadoop/common/branches/branch-0.23/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* /hadoop/common/branches/branch-0.23/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/DFSOutputStream.java
* /hadoop/common/branches/branch-0.23/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockInfoUnderConstruction.java
* /hadoop/common/branches/branch-0.23/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockManager.java
* /hadoop/common/branches/branch-0.23/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestClientProtocolForPipelineRecovery.java

> Write pipeline recovery for the last packet in the block may cause rejection of valid replicas
> ----------------------------------------------------------------------------------------------
>
> Key: HDFS-5557
> URL: https://issues.apache.org/jira/browse/HDFS-5557
> Project: Hadoop HDFS
> Issue Type: Bug
> Affects Versions: 0.23.9, 2.4.0
> Reporter: Kihwal Lee
> Assignee: Kihwal Lee
> Priority: Critical
> Fix For: 3.0.0, 2.4.0, 0.23.10
> Attachments: HDFS-5557.patch, HDFS-5557.patch, HDFS-5557.patch, HDFS-5557.patch
>
> When a block is reported from a data node while the block is under construction (i.e. not committed or completed), BlockManager calls BlockInfoUnderConstruction.addReplicaIfNotPresent() to update the reported replica state. But BlockManager is calling it with the stored block, not the reported block. This causes the recorded replicas' gen stamp to be that of BlockInfoUnderConstruction itself, not the one from the reported replica.
>
> When a pipeline recovery is done for the last packet of a block, the incremental block reports with the new gen stamp may come before the client calls updatePipeline(). If this happens, these replicas will be incorrectly recorded with the old gen stamp and get removed later. The result is a close or addAdditionalBlock failure.
>
> If the last block is completed but the penultimate block is not because of this issue, the file won't be closed. If this file is not cleared but the client goes away, the lease manager will try to recover the lease/block, at which point it will crash. I will file a separate jira for this shortly.
>
> The worst case is rejecting all the good replicas and accepting a bad one. In this case, the block will get completed, but the data cannot be read until the next full block report containing one of the valid replicas is received.

--
This message was sent by Atlassian JIRA
(v6.1#6144)
[jira] [Commented] (HDFS-5558) LeaseManager monitor thread can crash if the last block is complete but another block is not.
[ https://issues.apache.org/jira/browse/HDFS-5558?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13837599#comment-13837599 ]

Hudson commented on HDFS-5558:
------------------------------

FAILURE: Integrated in Hadoop-Hdfs-0.23-Build #809 (See [https://builds.apache.org/job/Hadoop-Hdfs-0.23-Build/809/])

HDFS-5558. LeaseManager monitor thread can crash if the last block is complete but another block is not. Contributed by Kihwal Lee. (kihwal: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1547197)
* /hadoop/common/branches/branch-0.23/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* /hadoop/common/branches/branch-0.23/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSNamesystem.java

> LeaseManager monitor thread can crash if the last block is complete but another block is not.
> ---------------------------------------------------------------------------------------------
>
> Key: HDFS-5558
> URL: https://issues.apache.org/jira/browse/HDFS-5558
> Project: Hadoop HDFS
> Issue Type: Bug
> Affects Versions: 0.23.9, 2.4.0
> Reporter: Kihwal Lee
> Assignee: Kihwal Lee
> Attachments: HDFS-5558.branch-023.patch, HDFS-5558.branch-023.patch, HDFS-5558.patch, HDFS-5558.patch
>
> As mentioned in HDFS-5557, if a file has its last and penultimate block not completed and the file is being closed, the last block may be completed but the penultimate one might not. If this condition lasts long and the file is abandoned, LeaseManager will try to recover the lease and the block. But {{internalReleaseLease()}} will fail with an invalid cast exception for this kind of file.

--
This message was sent by Atlassian JIRA
(v6.1#6144)
[jira] [Commented] (HDFS-5592) DIR* completeFile: /file is closed by DFSClient_ should be logged only for successful closure of the file.
[ https://issues.apache.org/jira/browse/HDFS-5592?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13837630#comment-13837630 ]

Hadoop QA commented on HDFS-5592:
---------------------------------

{color:red}-1 overall{color}. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12616735/HDFS-5592.patch
against trunk revision.

{color:green}+1 @author{color}. The patch does not contain any @author tags.
{color:red}-1 tests included{color}. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch.
{color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings.
{color:green}+1 javadoc{color}. The javadoc tool did not generate any warning messages.
{color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse.
{color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings.
{color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings.
{color:green}+1 core tests{color}. The patch passed unit tests in hadoop-hdfs-project/hadoop-hdfs.
{color:green}+1 contrib tests{color}. The patch passed contrib unit tests.

Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/5622//testReport/
Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/5622//console

This message is automatically generated.

> DIR* completeFile: /file is closed by DFSClient_ should be logged only for successful closure of the file.
> ----------------------------------------------------------------------------------------------------------
>
> Key: HDFS-5592
> URL: https://issues.apache.org/jira/browse/HDFS-5592
> Project: Hadoop HDFS
> Issue Type: Bug
> Affects Versions: 3.0.0, 2.3.0
> Reporter: Vinay
> Assignee: Vinay
> Attachments: HDFS-5592.patch
>
> The following log message in {{FSNamesystem#completeFile(..)}} should be logged only if the file is closed:
>
> {code}
> getEditLog().logSync();
> NameNode.stateChangeLog.info("DIR* completeFile: " + src + " is closed by " + holder);
> return success;
> {code}

--
This message was sent by Atlassian JIRA
(v6.1#6144)
[jira] [Commented] (HDFS-5560) Trash configuration log statements prints incorrect units
[ https://issues.apache.org/jira/browse/HDFS-5560?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13837676#comment-13837676 ]

Hudson commented on HDFS-5560:
------------------------------

FAILURE: Integrated in Hadoop-Mapreduce-trunk #1627 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1627/])

HDFS-5560. Trash configuration log statements prints incorrect units. Contributed by Josh Elser. (wang: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1547266)
* /hadoop/common/trunk/hadoop-common-project/hadoop-common/CHANGES.txt
* /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/TrashPolicyDefault.java

> Trash configuration log statements prints incorrect units
> ---------------------------------------------------------
>
> Key: HDFS-5560
> URL: https://issues.apache.org/jira/browse/HDFS-5560
> Project: Hadoop HDFS
> Issue Type: Bug
> Affects Versions: 2.2.0
> Reporter: Josh Elser
> Assignee: Josh Elser
> Fix For: 2.3.0
> Attachments: HDFS-5560.patch
>
> I ran `hdfs dfs -expunge` on a 2.2.0 system, and noticed the following message printed out on the console:
>
> {noformat}
> $ hdfs dfs -expunge
> 13/11/23 22:12:17 INFO fs.TrashPolicyDefault: Namenode trash configuration: Deletion interval = 180 minutes, Emptier interval = 0 minutes.
> {noformat}
>
> The configuration for both the deletion interval and emptier interval is given in minutes, converted to milliseconds, and then logged as milliseconds but with a label of minutes. It looks like this was introduced in HDFS-4903.

--
This message was sent by Atlassian JIRA
(v6.1#6144)
[jira] [Commented] (HDFS-5581) NameNodeFsck should use only one instance of BlockPlacementPolicy
[ https://issues.apache.org/jira/browse/HDFS-5581?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13837675#comment-13837675 ] Hudson commented on HDFS-5581: -- FAILURE: Integrated in Hadoop-Mapreduce-trunk #1627 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1627/]) move HDFS-5581 to 2.3 (cmccabe: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVNview=revrev=1547094) * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt HDFS-5581. NameNodeFsck should use only one instance of BlockPlacementPolicy (vinay via cmccabe) (cmccabe: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVNview=revrev=1547088) * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/NamenodeFsck.java NameNodeFsck should use only one instance of BlockPlacementPolicy - Key: HDFS-5581 URL: https://issues.apache.org/jira/browse/HDFS-5581 Project: Hadoop HDFS Issue Type: Bug Components: namenode Reporter: Vinay Assignee: Vinay Fix For: 2.4.0 Attachments: HDFS-5581.patch, HDFS-5581.patch While going through NameNodeFsck I found that following code creates the new instance of BlockPlacementPolicy for every block. {code} // verify block placement policy BlockPlacementStatus blockPlacementStatus = BlockPlacementPolicy.getInstance(conf, null, networktopology). verifyBlockPlacement(path, lBlk, targetFileReplication);{code} It would be better to use the namenode's BPP itself instead of creating a new one. -- This message was sent by Atlassian JIRA (v6.1#6144)
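The cost Vinay points out can be illustrated with a self-contained sketch (the `BlockPlacementPolicy` class below is a stand-in with the same shape, not the real Hadoop type): hoisting the instance out of the per-block loop is the whole fix.

```java
// Sketch of the HDFS-5581 change: create one policy instance and reuse it
// for every block, instead of calling the factory per block.
public class PolicyReuseSketch {
    static int instancesCreated = 0;

    // Stand-in for BlockPlacementPolicy.getInstance(...): the real factory
    // builds a new policy object (plus wiring) on every call.
    static class BlockPlacementPolicy {
        BlockPlacementPolicy() { instancesCreated++; }
        boolean verifyBlockPlacement(String path) { return true; }
    }

    // Before: fsck-style loop instantiating a policy per block.
    static int checkPerBlock(int blocks) {
        instancesCreated = 0;
        for (int i = 0; i < blocks; i++) {
            new BlockPlacementPolicy().verifyBlockPlacement("/file");
        }
        return instancesCreated;
    }

    // After: one shared instance, mirroring reuse of the namenode's own BPP.
    static int checkShared(int blocks) {
        instancesCreated = 0;
        BlockPlacementPolicy bpp = new BlockPlacementPolicy();
        for (int i = 0; i < blocks; i++) {
            bpp.verifyBlockPlacement("/file");
        }
        return instancesCreated;
    }

    public static void main(String[] args) {
        System.out.println(checkPerBlock(1000) + " vs " + checkShared(1000)); // 1000 vs 1
    }
}
```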
[jira] [Commented] (HDFS-5557) Write pipeline recovery for the last packet in the block may cause rejection of valid replicas
[ https://issues.apache.org/jira/browse/HDFS-5557?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13837674#comment-13837674 ] Hudson commented on HDFS-5557: -- FAILURE: Integrated in Hadoop-Mapreduce-trunk #1627 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1627/]) HDFS-5557. Write pipeline recovery for the last packet in the block may cause rejection of valid replicas. Contributed by Kihwal Lee. (kihwal: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVNview=revrev=1547173) * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/DFSOutputStream.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockInfoUnderConstruction.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockManager.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestClientProtocolForPipelineRecovery.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/blockmanagement/TestReplicationPolicy.java Write pipeline recovery for the last packet in the block may cause rejection of valid replicas -- Key: HDFS-5557 URL: https://issues.apache.org/jira/browse/HDFS-5557 Project: Hadoop HDFS Issue Type: Bug Affects Versions: 0.23.9, 2.4.0 Reporter: Kihwal Lee Assignee: Kihwal Lee Priority: Critical Fix For: 3.0.0, 2.4.0, 0.23.10 Attachments: HDFS-5557.patch, HDFS-5557.patch, HDFS-5557.patch, HDFS-5557.patch When a block is reported from a data node while the block is under construction (i.e. not committed or completed), BlockManager calls BlockInfoUnderConstruction.addReplicaIfNotPresent() to update the reported replica state. But BlockManager is calling it with the stored block, not reported block. 
This causes the recorded replicas' gen stamp to be that of the BlockInfoUnderConstruction itself, not the one from the reported replica. When a pipeline recovery is done for the last packet of a block, the incremental block reports with the new gen stamp may arrive before the client calls updatePipeline(). If this happens, these replicas will be incorrectly recorded with the old gen stamp and get removed later. The result is a close or addAdditionalBlock failure. If the last block is completed but the penultimate block is not because of this issue, the file won't be closed. If this file is not cleared but the client goes away, the lease manager will try to recover the lease/block, at which point it will crash. I will file a separate jira for this shortly. The worst case is rejecting all the good replicas and accepting a bad one. In that case, the block will get completed, but the data cannot be read until the next full block report containing one of the valid replicas is received. -- This message was sent by Atlassian JIRA (v6.1#6144)
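The stored-vs-reported distinction reduces to a toy sketch (`Block` here is a stand-in, not the Hadoop class, and the gen stamp values are illustrative): the bug records the NameNode's stale gen stamp; the fix keeps the one the datanode actually reported.

```java
// Sketch of the HDFS-5557 root cause: addReplicaIfNotPresent was effectively
// given the stored (under-construction) block rather than the reported one.
public class ReportedGenStampSketch {
    static class Block {
        final long genStamp;
        Block(long genStamp) { this.genStamp = genStamp; }
    }

    // Buggy form: the replica is recorded with the stored block's gen stamp,
    // discarding the gen stamp the datanode reported after pipeline recovery.
    static long recordBuggy(Block storedBlock, Block reportedBlock) {
        return storedBlock.genStamp;
    }

    // Fixed form: keep the reported replica's gen stamp, so a replica written
    // after pipeline recovery is not mistaken for a stale one and removed.
    static long recordFixed(Block storedBlock, Block reportedBlock) {
        return reportedBlock.genStamp;
    }

    public static void main(String[] args) {
        Block stored = new Block(1001);   // old gen stamp held by the NameNode
        Block reported = new Block(1002); // bumped by pipeline recovery
        System.out.println(recordBuggy(stored, reported) + " vs " + recordFixed(stored, reported));
    }
}
```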
[jira] [Commented] (HDFS-5581) NameNodeFsck should use only one instance of BlockPlacementPolicy
[ https://issues.apache.org/jira/browse/HDFS-5581?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13837689#comment-13837689 ] Hudson commented on HDFS-5581: -- SUCCESS: Integrated in Hadoop-Hdfs-trunk #1601 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk/1601/]) move HDFS-5581 to 2.3 (cmccabe: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVNview=revrev=1547094) * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt HDFS-5581. NameNodeFsck should use only one instance of BlockPlacementPolicy (vinay via cmccabe) (cmccabe: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVNview=revrev=1547088) * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/NamenodeFsck.java NameNodeFsck should use only one instance of BlockPlacementPolicy - Key: HDFS-5581 URL: https://issues.apache.org/jira/browse/HDFS-5581 Project: Hadoop HDFS Issue Type: Bug Components: namenode Reporter: Vinay Assignee: Vinay Fix For: 2.4.0 Attachments: HDFS-5581.patch, HDFS-5581.patch While going through NameNodeFsck I found that following code creates the new instance of BlockPlacementPolicy for every block. {code} // verify block placement policy BlockPlacementStatus blockPlacementStatus = BlockPlacementPolicy.getInstance(conf, null, networktopology). verifyBlockPlacement(path, lBlk, targetFileReplication);{code} It would be better to use the namenode's BPP itself instead of creating a new one. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (HDFS-5557) Write pipeline recovery for the last packet in the block may cause rejection of valid replicas
[ https://issues.apache.org/jira/browse/HDFS-5557?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13837688#comment-13837688 ] Hudson commented on HDFS-5557: -- SUCCESS: Integrated in Hadoop-Hdfs-trunk #1601 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk/1601/]) HDFS-5557. Write pipeline recovery for the last packet in the block may cause rejection of valid replicas. Contributed by Kihwal Lee. (kihwal: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVNview=revrev=1547173) * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/DFSOutputStream.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockInfoUnderConstruction.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockManager.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestClientProtocolForPipelineRecovery.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/blockmanagement/TestReplicationPolicy.java Write pipeline recovery for the last packet in the block may cause rejection of valid replicas -- Key: HDFS-5557 URL: https://issues.apache.org/jira/browse/HDFS-5557 Project: Hadoop HDFS Issue Type: Bug Affects Versions: 0.23.9, 2.4.0 Reporter: Kihwal Lee Assignee: Kihwal Lee Priority: Critical Fix For: 3.0.0, 2.4.0, 0.23.10 Attachments: HDFS-5557.patch, HDFS-5557.patch, HDFS-5557.patch, HDFS-5557.patch When a block is reported from a data node while the block is under construction (i.e. not committed or completed), BlockManager calls BlockInfoUnderConstruction.addReplicaIfNotPresent() to update the reported replica state. But BlockManager is calling it with the stored block, not reported block. 
This causes the recorded replicas' gen stamp to be that of the BlockInfoUnderConstruction itself, not the one from the reported replica. When a pipeline recovery is done for the last packet of a block, the incremental block reports with the new gen stamp may arrive before the client calls updatePipeline(). If this happens, these replicas will be incorrectly recorded with the old gen stamp and get removed later. The result is a close or addAdditionalBlock failure. If the last block is completed but the penultimate block is not because of this issue, the file won't be closed. If this file is not cleared but the client goes away, the lease manager will try to recover the lease/block, at which point it will crash. I will file a separate jira for this shortly. The worst case is rejecting all the good replicas and accepting a bad one. In that case, the block will get completed, but the data cannot be read until the next full block report containing one of the valid replicas is received. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (HDFS-5560) Trash configuration log statements prints incorrect units
[ https://issues.apache.org/jira/browse/HDFS-5560?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13837690#comment-13837690 ] Hudson commented on HDFS-5560: -- SUCCESS: Integrated in Hadoop-Hdfs-trunk #1601 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk/1601/]) HDFS-5560. Trash configuration log statements prints incorrect units. Contributed by Josh Elser. (wang: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVNview=revrev=1547266) * /hadoop/common/trunk/hadoop-common-project/hadoop-common/CHANGES.txt * /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/TrashPolicyDefault.java Trash configuration log statements prints incorrect units - Key: HDFS-5560 URL: https://issues.apache.org/jira/browse/HDFS-5560 Project: Hadoop HDFS Issue Type: Bug Affects Versions: 2.2.0 Reporter: Josh Elser Assignee: Josh Elser Fix For: 2.3.0 Attachments: HDFS-5560.patch I ran `hdfs dfs -expunge` on a 2.2.0 system, and noticed the following message printed out on the console: {noformat} $ hdfs dfs -expunge 13/11/23 22:12:17 INFO fs.TrashPolicyDefault: Namenode trash configuration: Deletion interval = 180 minutes, Emptier interval = 0 minutes. {noformat} The deletion interval and emptier interval are both configured in minutes, converted to milliseconds, and then logged as milliseconds but labeled as minutes. It looks like this was introduced in HDFS-4903. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Updated] (HDFS-5558) LeaseManager monitor thread can crash if the last block is complete but another block is not.
[ https://issues.apache.org/jira/browse/HDFS-5558?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kihwal Lee updated HDFS-5558: - Resolution: Fixed Fix Version/s: 0.23.10 2.4.0 3.0.0 Hadoop Flags: Reviewed Status: Resolved (was: Patch Available) Thanks for the reviews. I've committed this to trunk, branch-2 and branch-0.23. LeaseManager monitor thread can crash if the last block is complete but another block is not. - Key: HDFS-5558 URL: https://issues.apache.org/jira/browse/HDFS-5558 Project: Hadoop HDFS Issue Type: Bug Affects Versions: 0.23.9, 2.4.0 Reporter: Kihwal Lee Assignee: Kihwal Lee Fix For: 3.0.0, 2.4.0, 0.23.10 Attachments: HDFS-5558.branch-023.patch, HDFS-5558.branch-023.patch, HDFS-5558.patch, HDFS-5558.patch As mentioned in HDFS-5557, if a file has its last and penultimate block not completed and the file is being closed, the last block may be completed but the penultimate one might not. If this condition lasts long and the file is abandoned, LeaseManager will try to recover the lease and the block. But {{internalReleaseLease()}} will fail with invalid cast exception with this kind of file. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (HDFS-5558) LeaseManager monitor thread can crash if the last block is complete but another block is not.
[ https://issues.apache.org/jira/browse/HDFS-5558?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13837730#comment-13837730 ] Hudson commented on HDFS-5558: -- SUCCESS: Integrated in Hadoop-trunk-Commit #4819 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/4819/]) HDFS-5558. LeaseManager monitor thread can crash if the last block is complete but another block is not. Contributed by Kihwal Lee. (kihwal: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVNview=revrev=1547393) * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSNamesystem.java LeaseManager monitor thread can crash if the last block is complete but another block is not. - Key: HDFS-5558 URL: https://issues.apache.org/jira/browse/HDFS-5558 Project: Hadoop HDFS Issue Type: Bug Affects Versions: 0.23.9, 2.4.0 Reporter: Kihwal Lee Assignee: Kihwal Lee Fix For: 3.0.0, 2.4.0, 0.23.10 Attachments: HDFS-5558.branch-023.patch, HDFS-5558.branch-023.patch, HDFS-5558.patch, HDFS-5558.patch As mentioned in HDFS-5557, if a file has its last and penultimate block not completed and the file is being closed, the last block may be completed but the penultimate one might not. If this condition lasts long and the file is abandoned, LeaseManager will try to recover the lease and the block. But {{internalReleaseLease()}} will fail with invalid cast exception with this kind of file. -- This message was sent by Atlassian JIRA (v6.1#6144)
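The crash described here is a blind downcast of a block that is no longer under construction. A hedged, self-contained sketch of the kind of guard involved (stand-in classes with hypothetical methods, not the real FSNamesystem/BlockInfo code):

```java
// Sketch of the HDFS-5558 failure mode: check block state before downcasting,
// since the last block can be COMPLETE while the penultimate one is not.
public class LastBlockGuardSketch {
    static class BlockInfo {
        private final boolean complete;
        BlockInfo(boolean complete) { this.complete = complete; }
        boolean isComplete() { return complete; }
    }
    static class BlockInfoUnderConstruction extends BlockInfo {
        BlockInfoUnderConstruction() { super(false); }
        void initializeRecovery() { /* kick off block recovery */ }
    }

    // Guarded form: an unconditional cast of a completed block here would
    // throw ClassCastException and kill the LeaseManager monitor thread.
    static boolean tryRecover(BlockInfo lastBlock) {
        if (lastBlock.isComplete()) {
            return false; // nothing to recover for a completed block
        }
        BlockInfoUnderConstruction uc = (BlockInfoUnderConstruction) lastBlock; // safe here
        uc.initializeRecovery();
        return true;
    }

    public static void main(String[] args) {
        System.out.println(tryRecover(new BlockInfo(true)));              // false
        System.out.println(tryRecover(new BlockInfoUnderConstruction())); // true
    }
}
```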
[jira] [Resolved] (HDFS-5484) StorageType and State in DatanodeStorageInfo in NameNode is not accurate
[ https://issues.apache.org/jira/browse/HDFS-5484?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arpit Agarwal resolved HDFS-5484. - Resolution: Fixed Fix Version/s: Heterogeneous Storage (HDFS-2832) Hadoop Flags: Reviewed +1 for the updated patch. I committed it to branch HDFS-2832. I agree that the test for this will take some work, but we will need it once we start exposing Storage Types to applications. I will make a note in the test plan. Thanks Eric! StorageType and State in DatanodeStorageInfo in NameNode is not accurate Key: HDFS-5484 URL: https://issues.apache.org/jira/browse/HDFS-5484 Project: Hadoop HDFS Issue Type: Sub-task Components: datanode Affects Versions: Heterogeneous Storage (HDFS-2832) Reporter: Eric Sirianni Fix For: Heterogeneous Storage (HDFS-2832) Attachments: HDFS-5484-HDFS-2832--2.patch, HDFS-5484-HDFS-2832.patch The fields in DatanodeStorageInfo are updated from two distinct paths:
# block reports
# storage reports (via heartbeats)
The {{state}} and {{storageType}} fields are updated via the block report. However, as seen in the code below, these fields are populated from a dummy {{DatanodeStorage}} object constructed in the DataNode:
{code}
BPServiceActor.blockReport() {
  //...
  // Dummy DatanodeStorage object just for sending the block report.
  DatanodeStorage dnStorage = new DatanodeStorage(storageID);
  //...
}
{code}
The net effect is that the {{state}} and {{storageType}} fields are always the defaults of {{NORMAL}} and {{DISK}} in the NameNode. The recommended fix is to change {{FsDatasetSpi.getBlockReports()}} from:
{code}
public Map<String, BlockListAsLongs> getBlockReports(String bpid);
{code}
to:
{code}
public Map<DatanodeStorage, BlockListAsLongs> getBlockReports(String bpid);
{code}
thereby allowing {{BPServiceActor}} to send the real {{DatanodeStorage}} object with the block report. -- This message was sent by Atlassian JIRA (v6.1#6144)
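The signature change can be mimicked in a self-contained sketch (stand-in types; `long[]` stands in for `BlockListAsLongs`): keying the report by the full storage object, rather than a bare ID string, is what lets the receiver see the real state and storage type.

```java
import java.util.HashMap;
import java.util.Map;

// Sketch of the HDFS-5484 fix: report blocks keyed by a DatanodeStorage
// stand-in that carries state and type, not just a storage-ID string.
public class BlockReportKeySketch {
    enum State { NORMAL, READ_ONLY }
    enum StorageType { DISK, SSD }

    static class DatanodeStorage {
        final String storageID; final State state; final StorageType type;
        DatanodeStorage(String id, State state, StorageType type) {
            this.storageID = id; this.state = state; this.type = type;
        }
    }

    static Map<DatanodeStorage, long[]> getBlockReports() {
        Map<DatanodeStorage, long[]> reports = new HashMap<DatanodeStorage, long[]>();
        reports.put(new DatanodeStorage("DS-1", State.NORMAL, StorageType.SSD),
                    new long[] { 1L, 2L });
        return reports;
    }

    public static void main(String[] args) {
        for (DatanodeStorage s : getBlockReports().keySet()) {
            // The receiver no longer has to assume NORMAL/DISK defaults.
            System.out.println(s.storageID + " " + s.state + " " + s.type);
        }
    }
}
```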
[jira] [Commented] (HDFS-5453) Support fine grain locking in FSNamesystem
[ https://issues.apache.org/jira/browse/HDFS-5453?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13837875#comment-13837875 ] Daryn Sharp commented on HDFS-5453: --- [~sureshms] Sorry for the delay in response, I've been on vacation. The initial simplistic implementation, w/o any fsdir lock changes and with direct fsn access (no RPC), has a modest throughput improvement of ~2-15% depending on various read/write workloads of only listStatus/mkdir/delete, with an ideal scenario of low path contention in the namesystem. In practice, other subsystems unnecessarily write-locking the namesystem will probably negate most gains until they are addressed too. A gain is achieved if handler threads have passed through the fsn lock(s), resolved their path, checked permissions, and are blocked on the fsdir lock - as opposed to all read/write handlers being blocked on the global fsn lock during any write op. I don't have the numbers handy, but complete removal of the fsdir lock (not yet feasible due to the non-thread-safe data structures it protects) and desync of a few other methods such as UGI.getCurrentUser produced a multi-fold throughput improvement. At the moment, I only intend to lay the groundwork for larger changes. Support fine grain locking in FSNamesystem -- Key: HDFS-5453 URL: https://issues.apache.org/jira/browse/HDFS-5453 Project: Hadoop HDFS Issue Type: New Feature Components: namenode Affects Versions: 2.0.0-alpha, 3.0.0 Reporter: Daryn Sharp Assignee: Daryn Sharp The namesystem currently uses a coarse-grained lock to control access. This prevents concurrent writers in different branches of the tree, and prevents readers from accessing branches that writers aren't using. Features that introduce latency to namesystem operations, such as cold storage of inodes, will need fine-grained locking to avoid degrading the entire namesystem's throughput. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Updated] (HDFS-2832) Enable support for heterogeneous storages in HDFS
[ https://issues.apache.org/jira/browse/HDFS-2832?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arpit Agarwal updated HDFS-2832: Attachment: 20131203-HeterogeneousStorage-TestPlan.pdf Enable support for heterogeneous storages in HDFS - Key: HDFS-2832 URL: https://issues.apache.org/jira/browse/HDFS-2832 Project: Hadoop HDFS Issue Type: New Feature Affects Versions: 0.24.0 Reporter: Suresh Srinivas Assignee: Suresh Srinivas Attachments: 20130813-HeterogeneousStorage.pdf, 20131125-HeterogeneousStorage-TestPlan.pdf, 20131125-HeterogeneousStorage.pdf, 20131202-HeterogeneousStorage-TestPlan.pdf, 20131203-HeterogeneousStorage-TestPlan.pdf, H2832_20131107.patch, editsStored, h2832_20131023.patch, h2832_20131023b.patch, h2832_20131025.patch, h2832_20131028.patch, h2832_20131028b.patch, h2832_20131029.patch, h2832_20131103.patch, h2832_20131104.patch, h2832_20131105.patch, h2832_20131107b.patch, h2832_20131108.patch, h2832_20131110.patch, h2832_20131110b.patch, h2832_2013.patch, h2832_20131112.patch, h2832_20131112b.patch, h2832_20131114.patch, h2832_20131118.patch, h2832_20131119.patch, h2832_20131119b.patch, h2832_20131121.patch, h2832_20131122.patch, h2832_20131122b.patch, h2832_20131123.patch, h2832_20131124.patch, h2832_20131202.patch HDFS currently supports configuration where storages are a list of directories. Typically each of these directories correspond to a volume with its own file system. All these directories are homogeneous and therefore identified as a single storage at the namenode. I propose, change to the current model where Datanode * is a * storage, to Datanode * is a collection * of strorages. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (HDFS-2832) Enable support for heterogeneous storages in HDFS
[ https://issues.apache.org/jira/browse/HDFS-2832?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13837894#comment-13837894 ] Hadoop QA commented on HDFS-2832: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12616810/20131203-HeterogeneousStorage-TestPlan.pdf against trunk revision . {color:red}-1 patch{color}. The patch command could not apply the patch. Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/5623//console This message is automatically generated. Enable support for heterogeneous storages in HDFS - Key: HDFS-2832 URL: https://issues.apache.org/jira/browse/HDFS-2832 Project: Hadoop HDFS Issue Type: New Feature Affects Versions: 0.24.0 Reporter: Suresh Srinivas Assignee: Suresh Srinivas Attachments: 20130813-HeterogeneousStorage.pdf, 20131125-HeterogeneousStorage-TestPlan.pdf, 20131125-HeterogeneousStorage.pdf, 20131202-HeterogeneousStorage-TestPlan.pdf, 20131203-HeterogeneousStorage-TestPlan.pdf, H2832_20131107.patch, editsStored, h2832_20131023.patch, h2832_20131023b.patch, h2832_20131025.patch, h2832_20131028.patch, h2832_20131028b.patch, h2832_20131029.patch, h2832_20131103.patch, h2832_20131104.patch, h2832_20131105.patch, h2832_20131107b.patch, h2832_20131108.patch, h2832_20131110.patch, h2832_20131110b.patch, h2832_2013.patch, h2832_20131112.patch, h2832_20131112b.patch, h2832_20131114.patch, h2832_20131118.patch, h2832_20131119.patch, h2832_20131119b.patch, h2832_20131121.patch, h2832_20131122.patch, h2832_20131122b.patch, h2832_20131123.patch, h2832_20131124.patch, h2832_20131202.patch HDFS currently supports configuration where storages are a list of directories. Typically each of these directories correspond to a volume with its own file system. All these directories are homogeneous and therefore identified as a single storage at the namenode. 
I propose changing the current model, where a Datanode *is a* storage, to one where a Datanode *is a collection of* storages. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Updated] (HDFS-2832) Enable support for heterogeneous storages in HDFS
[ https://issues.apache.org/jira/browse/HDFS-2832?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arpit Agarwal updated HDFS-2832: Attachment: h2832_20131203.patch Updated merge patch to resolve recent conflicts in trunk. Enable support for heterogeneous storages in HDFS - Key: HDFS-2832 URL: https://issues.apache.org/jira/browse/HDFS-2832 Project: Hadoop HDFS Issue Type: New Feature Affects Versions: 0.24.0 Reporter: Suresh Srinivas Assignee: Suresh Srinivas Attachments: 20130813-HeterogeneousStorage.pdf, 20131125-HeterogeneousStorage-TestPlan.pdf, 20131125-HeterogeneousStorage.pdf, 20131202-HeterogeneousStorage-TestPlan.pdf, 20131203-HeterogeneousStorage-TestPlan.pdf, H2832_20131107.patch, editsStored, h2832_20131023.patch, h2832_20131023b.patch, h2832_20131025.patch, h2832_20131028.patch, h2832_20131028b.patch, h2832_20131029.patch, h2832_20131103.patch, h2832_20131104.patch, h2832_20131105.patch, h2832_20131107b.patch, h2832_20131108.patch, h2832_20131110.patch, h2832_20131110b.patch, h2832_2013.patch, h2832_20131112.patch, h2832_20131112b.patch, h2832_20131114.patch, h2832_20131118.patch, h2832_20131119.patch, h2832_20131119b.patch, h2832_20131121.patch, h2832_20131122.patch, h2832_20131122b.patch, h2832_20131123.patch, h2832_20131124.patch, h2832_20131202.patch, h2832_20131203.patch HDFS currently supports configuration where storages are a list of directories. Typically each of these directories correspond to a volume with its own file system. All these directories are homogeneous and therefore identified as a single storage at the namenode. I propose, change to the current model where Datanode * is a * storage, to Datanode * is a collection * of strorages. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (HDFS-5514) FSNamesystem's fsLock should allow custom implementation
[ https://issues.apache.org/jira/browse/HDFS-5514?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13837952#comment-13837952 ] Daryn Sharp commented on HDFS-5514: --- Yes, I meant coarse, nice catch. :) I think I can quickly explain the basic approach w/o a couple page doc, if not, let me know and I'll write something up. The reason for creating a separate class is to move towards providing a pluggable lock context for ops to use. The default will be coarse as it is today to not destabilize the NN. The general principle is changing the pattern of: {code} readLock(); try { ... } finally { readUnlock(); } {code} to something more like: {code} LockContext lockContext = fsLock.getLockContext(LockState.READ); try { ... lockContext.readLock(path); lockContext.writeLock(path); lockContext.writeLockParent(path); ... } finally { lockContext.unlock(); } {code} Use of the context is optional. The existing fsn readLock/writeLock methods will continue to exist to avoid changing anything but {{FSNamesystem}}. This also means not every fsn method needs to be converted immediately. For a coarse lock context, the initial lock state is applied to the global rw lock now wrapped by the {{FSNamesystemLock}} in this patch. The path locking methods are no-ops. Existing fsn readLock/writeLock methods will use the same global lock - which is how I localize the changes to the fsn. For the initial finer grain lock context, which I'll post in yet another future subtask, the context's initial lock state is effectively always a read lock on the global lock to prevent safemode/HA transitions. The path locking methods associate unique rw locks with inodes on a demand basis (the NN won't create or maintain a lock for every single inode). These locks are tracked in the context. However, these are only implementation details of a specific context for finer grain locking. Other implementations may do something completely different. 
I hope to use a lock-free scoreboard in the future to control/schedule which handlers are allowed to execute - but again just an implementation detail. The initial prototype I'm hoping to implement via the parent jira will not require lock changes to classes other than {{FSNamesystem}} as described above. I just need the {{FSNameSystemLock}} abstraction in this jira, and the addition of the aforementioned path locking apis I'll post in another jira. FSNamesystem's fsLock should allow custom implementation Key: HDFS-5514 URL: https://issues.apache.org/jira/browse/HDFS-5514 Project: Hadoop HDFS Issue Type: Sub-task Components: namenode Affects Versions: 2.0.0-alpha, 3.0.0 Reporter: Daryn Sharp Assignee: Daryn Sharp Attachments: HDFS-5514.patch Changing {{fsLock}} from a {{ReentrantReadWriteLock}} to an API compatible class that encapsulates the rwLock will allow for more sophisticated locking implementations such as fine grain locking. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (HDFS-5569) WebHDFS should support a deny/allow list for data access
[ https://issues.apache.org/jira/browse/HDFS-5569?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13837959#comment-13837959 ] Adam Faris commented on HDFS-5569: -- I'll attempt to answer the outstanding questions. Please let me know if I missed anything. Haohui Mai: What is authorization? Authorization is a function that specifies access rights to resources. http://en.wikipedia.org/wiki/Authorization For example, I have a US passport that has my name and photo and is not expired. I walk into a bank and hand my passport to the bank teller, who looks at the photo, recognizes my face, and verifies the date and watermarks. Everything checks out and I am now authenticated to the bank. I now ask the teller to withdraw $100 million from Doug Cutting's bank account. The bank teller checks to see if I have access and says I am not authorized to make the withdrawal. Kerberos only provides authentication, not authorization. In this example my passport is the TGT and the bank teller is WebHDFS. WebHDFS needs to have better authorization built into it. How about using a transparent proxy? Using nginx or traffic server is an interesting idea, but it's not a good solution. One needs to deploy the proxy software and configs to all nodes. Then how would the URL mappings work? One asks the namenode for file locations, and the 307 response would point to the wrong port on the datanode. What about troubleshooting GSSAPI errors? Is it the client or the proxy? Having personally supported the CDN at Yahoo!, I know firsthand the issues of troubleshooting web applications that use proxies. Alejandro Abdelnur: Reverse DNS lookup penalties? Assuming we are filtering by hostname and not IP networks, reverse DNS lookups being a blocker for this request is where we will have to agree to disagree. While theoretically true, in practice one more DNS query is not going to make a difference to an individual datanode.
Even if attempting to DDOS the cluster with client connections, there will be other problems before reverse lookup resolution becomes the blocker. Why not use HttpFS/Hoop? I'm unable to find references to HttpFS/Hoop in the 1.2.1 (stable) source tree, so it appears to be a 2.x feature? If HttpFS/Hoop is compatible with hadoop 1.2.x, it's going to have the above-mentioned proxy issues. Troubleshooting client requests is going to be more complicated, and configuration and deployment are going to be more complicated, as we now have to securely manage tomcat. Using a proxy comes with a lot of overhead and is not a good solution for this request. Alejandro's comment on using tomcat to support my request is almost spot on. But instead of tomcat supporting the access control feature, it should be jetty, as jetty offers the ability to block by source IP and is already included with Hadoop. This is why I opened this JIRA: WebHDFS needs to be updated to offer the ability to block or grant access by IP. Thanks. WebHDFS should support a deny/allow list for data access Key: HDFS-5569 URL: https://issues.apache.org/jira/browse/HDFS-5569 Project: Hadoop HDFS Issue Type: Improvement Components: webhdfs Reporter: Adam Faris Labels: features Currently we can't restrict what networks are allowed to transfer data using WebHDFS. Obviously we can use firewalls to block ports, but this can be complicated and problematic to maintain. Additionally, because all the jetty servlets run inside the same container, blocking access to jetty to prevent WebHDFS transfers also blocks the other servlets running inside that same jetty container. I am requesting a deny/allow feature be added to WebHDFS. This is already done with the Apache HTTPD server, and is what I'd like to see the deny/allow list modeled after. Thanks. -- This message was sent by Atlassian JIRA (v6.1#6144)
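Apache-httpd-style deny/allow semantics of the kind requested could be sketched as follows. This is a hypothetical helper, not an existing WebHDFS or Jetty API; deny rules win over allow rules, and an empty allow list means "allow anything not denied". Real IP matching would use CIDR masks rather than string prefixes.

```java
import java.util.Arrays;
import java.util.List;

// Sketch of deny/allow filtering for incoming WebHDFS requests.
public class WebHdfsIpFilterSketch {
    static boolean permitted(String remoteAddr, List<String> deny, List<String> allow) {
        for (String prefix : deny) {
            if (remoteAddr.startsWith(prefix)) return false; // deny rules checked first
        }
        if (allow.isEmpty()) return true; // no allow list: default-allow
        for (String prefix : allow) {
            if (remoteAddr.startsWith(prefix)) return true;
        }
        return false; // allow list present but no match
    }

    public static void main(String[] args) {
        List<String> deny = Arrays.asList("10.1.2.");
        List<String> allow = Arrays.asList("10.1.");
        System.out.println(permitted("10.1.3.4", deny, allow));     // in allowed range
        System.out.println(permitted("10.1.2.9", deny, allow));     // explicitly denied
        System.out.println(permitted("192.168.0.1", deny, allow));  // not in allow list
    }
}
```

A servlet filter in the datanode's jetty container could call a check like this against the request's remote address before serving data.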
[jira] [Commented] (HDFS-5514) FSNamesystem's fsLock should allow custom implementation
[ https://issues.apache.org/jira/browse/HDFS-5514?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13837962#comment-13837962 ] Daryn Sharp commented on HDFS-5514: --- [~cnauroth] Regarding #2, that's technically a pre-existing gap in test coverage. Actually, those two methods were recently added by Kihwal's getContentSummary change and don't appear to be used by anything else. I'll probably discard those methods when I adapt the content summary to work with a lock context, so there's likely not much value in adding transient tests, but I will certainly add them if you like. FSNamesystem's fsLock should allow custom implementation Key: HDFS-5514 URL: https://issues.apache.org/jira/browse/HDFS-5514 Project: Hadoop HDFS Issue Type: Sub-task Components: namenode Affects Versions: 2.0.0-alpha, 3.0.0 Reporter: Daryn Sharp Assignee: Daryn Sharp Attachments: HDFS-5514.patch Changing {{fsLock}} from a {{ReentrantReadWriteLock}} to an API compatible class that encapsulates the rwLock will allow for more sophisticated locking implementations such as fine grain locking. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (HDFS-5514) FSNamesystem's fsLock should allow custom implementation
[ https://issues.apache.org/jira/browse/HDFS-5514?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13837980#comment-13837980 ] Chris Nauroth commented on HDFS-5514: - Thanks, Daryn. On #2, I think verifying that the lock implementation supports reentrancy is more significant than the counts returned from {{getReadHoldCount}} and {{getWriteHoldCount}}. (i.e. Locking twice in a row is expected to succeed with no exception thrown.) I'll defer to you on whether or not you think that's a valuable test here. If not, then I think the only remaining thing is the {{coarseLock}} rename. FSNamesystem's fsLock should allow custom implementation Key: HDFS-5514 URL: https://issues.apache.org/jira/browse/HDFS-5514 Project: Hadoop HDFS Issue Type: Sub-task Components: namenode Affects Versions: 2.0.0-alpha, 3.0.0 Reporter: Daryn Sharp Assignee: Daryn Sharp Attachments: HDFS-5514.patch Changing {{fsLock}} from a {{ReentrantReadWriteLock}} to an API compatible class that encapsulates the rwLock will allow for more sophisticated locking implementations such as fine grain locking. -- This message was sent by Atlassian JIRA (v6.1#6144)
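The reentrancy behavior under discussion can be demonstrated directly against the JDK's {{ReentrantReadWriteLock}}; this is a minimal standalone sketch of the property being tested, not code from the patch:

```java
import java.util.concurrent.locks.ReentrantReadWriteLock;

public class ReentrancyCheck {
    // Returns {readHoldCount, writeHoldCount} observed after acquiring each
    // lock twice from the same thread; a reentrant lock must not block here.
    public static int[] holdCounts() {
        ReentrantReadWriteLock lock = new ReentrantReadWriteLock();
        lock.readLock().lock();
        lock.readLock().lock();       // reentrant re-acquire: succeeds, no exception
        int reads = lock.getReadHoldCount();
        lock.readLock().unlock();
        lock.readLock().unlock();

        lock.writeLock().lock();
        lock.writeLock().lock();      // reentrant write re-acquire
        int writes = lock.getWriteHoldCount();
        lock.writeLock().unlock();
        lock.writeLock().unlock();
        return new int[] { reads, writes };
    }

    public static void main(String[] args) {
        int[] c = holdCounts();
        System.out.println(c[0] + " " + c[1]);  // prints "2 2"
    }
}
```

Any wrapper class delegating to the rwLock should preserve these counts for its callers.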
[jira] [Commented] (HDFS-5514) FSNamesystem's fsLock should allow custom implementation
[ https://issues.apache.org/jira/browse/HDFS-5514?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13837991#comment-13837991 ] Daryn Sharp commented on HDFS-5514: --- I completely agree that reentrancy testing is critical for a new lock implementation. In this jira, there is no change in existing functionality other than the fsn lock becomes a new class that delegates to a lock. Although, the patch has gone stale and I need to update it to add the hold count methods to the fsnl. FSNamesystem's fsLock should allow custom implementation Key: HDFS-5514 URL: https://issues.apache.org/jira/browse/HDFS-5514 Project: Hadoop HDFS Issue Type: Sub-task Components: namenode Affects Versions: 2.0.0-alpha, 3.0.0 Reporter: Daryn Sharp Assignee: Daryn Sharp Attachments: HDFS-5514.patch Changing {{fsLock}} from a {{ReentrantReadWriteLock}} to an API compatible class that encapsulates the rwLock will allow for more sophisticated locking implementations such as fine grain locking. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Updated] (HDFS-5514) FSNamesystem's fsLock should allow custom implementation
[ https://issues.apache.org/jira/browse/HDFS-5514?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Daryn Sharp updated HDFS-5514: -- Attachment: HDFS-5514.patch I stand corrected; the patch already delegated those methods. I made the following changes per feedback: # Corrected misspelling of coarse # Enhanced hold count tests for reentrancy since they were simple to do - and I forgot my tests are already using the hold counts! # Removed unnecessary import in test file FSNamesystem's fsLock should allow custom implementation Key: HDFS-5514 URL: https://issues.apache.org/jira/browse/HDFS-5514 Project: Hadoop HDFS Issue Type: Sub-task Components: namenode Affects Versions: 2.0.0-alpha, 3.0.0 Reporter: Daryn Sharp Assignee: Daryn Sharp Attachments: HDFS-5514.patch, HDFS-5514.patch Changing {{fsLock}} from a {{ReentrantReadWriteLock}} to an API compatible class that encapsulates the rwLock will allow for more sophisticated locking implementations such as fine grain locking. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Updated] (HDFS-5514) FSNamesystem's fsLock should allow custom implementation
[ https://issues.apache.org/jira/browse/HDFS-5514?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris Nauroth updated HDFS-5514: Hadoop Flags: Reviewed +1 pending Jenkins. Thanks, Daryn! FSNamesystem's fsLock should allow custom implementation Key: HDFS-5514 URL: https://issues.apache.org/jira/browse/HDFS-5514 Project: Hadoop HDFS Issue Type: Sub-task Components: namenode Affects Versions: 2.0.0-alpha, 3.0.0 Reporter: Daryn Sharp Assignee: Daryn Sharp Attachments: HDFS-5514.patch, HDFS-5514.patch Changing {{fsLock}} from a {{ReentrantReadWriteLock}} to an API compatible class that encapsulates the rwLock will allow for more sophisticated locking implementations such as fine grain locking. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (HDFS-3405) Checkpointing should use HTTP POST or PUT instead of GET-GET to send merged fsimages
[ https://issues.apache.org/jira/browse/HDFS-3405?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13838014#comment-13838014 ] Haohui Mai commented on HDFS-3405: -- I'll take a more detailed look this week. After a quick skim, it seems that you're using Apache http client to make the put request -- this won't work if it is going through HTTPS channels, since it does not load the certificates. The recommended way is to open the connection through URLConnectionFactory and to change the connConfigurator. Here is the sketch:
{code}
class PutConnConfigurator implements ConnectionConfigurator {
  private final ConnectionConfigurator prev;

  PutConnConfigurator(ConnectionConfigurator prev) {
    this.prev = prev;
  }

  @Override
  public HttpURLConnection configure(HttpURLConnection conn) throws IOException {
    prev.configure(conn);
    conn.setRequestMethod("PUT");
    return conn;
  }
}

ConnectionConfigurator putConf = new PutConnConfigurator(factory.getConnConfigurator());
URLConnectionFactory newFactory = new URLConnectionFactory(putConf);
URLConnection conn = newFactory.openConnection(...);
{code}
Checkpointing should use HTTP POST or PUT instead of GET-GET to send merged fsimages Key: HDFS-3405 URL: https://issues.apache.org/jira/browse/HDFS-3405 Project: Hadoop HDFS Issue Type: Improvement Affects Versions: 1.0.0, 3.0.0, 2.0.5-alpha Reporter: Aaron T. Myers Assignee: Vinay Attachments: HDFS-3405.patch, HDFS-3405.patch, HDFS-3405.patch, HDFS-3405.patch, HDFS-3405.patch, HDFS-3405.patch As Todd points out in [this comment|https://issues.apache.org/jira/browse/HDFS-3404?focusedCommentId=13272986page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13272986], the current scheme for a checkpointing daemon to upload a merged fsimage file to an NN is to issue an HTTP GET request to tell the target NN to issue another GET request back to the checkpointing daemon to retrieve the merged fsimage file.
There's no fundamental reason the checkpointing daemon can't just use an HTTP POST or PUT to send back the merged fsimage file, rather than the double-GET scheme. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (HDFS-4983) Numeric usernames do not work with WebHDFS FS
[ https://issues.apache.org/jira/browse/HDFS-4983?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13838019#comment-13838019 ] Andrew Wang commented on HDFS-4983: --- Hey Yongjun, thanks for providing the patch. A few review comments, mostly nitty stuff, otherwise looks good. It's worth checking out http://blog.cloudera.com/blog/2013/05/how-to-configure-eclipse-for-hadoop-contributions/ to get the eclipse auto-formatter if you aren't using it yet, then these things are mostly done for you: {code} //set user pattern based on configuration file {code} Usually we put a space between the slashes and the comment. I see the comment below also lacks a space, you could add one there too if you like.
{code}
public static final String USER_PATTERN_KEY = "webhdfs.user.provider.user.pattern";
/** Default user name pattern value */
public static final String USER_PATTERN_DEFAULT = "^[A-Za-z_][A-Za-z0-9._-]*[$]?$";
{code}
We normally double indent wrapped lines. In the new proxy user test, I think we can chop out the not-superuser and permission stuff copied from the other test. Basically, doing any WebHDFS operation with a numeric proxy user should suffice (but please do verify!). Numeric usernames do not work with WebHDFS FS - Key: HDFS-4983 URL: https://issues.apache.org/jira/browse/HDFS-4983 Project: Hadoop HDFS Issue Type: Improvement Components: webhdfs Affects Versions: 2.0.0-alpha Reporter: Harsh J Assignee: Yongjun Zhang Labels: patch Attachments: HDFS-4983.001.patch Per the file hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/web/resources/UserParam.java, the DOMAIN pattern is set to: {{^[A-Za-z_][A-Za-z0-9._-]*[$]?$}}.
Given this, using a username such as 123 seems to fail for some reason (tried on insecure setup): {code} [123@host-1 ~]$ whoami 123 [123@host-1 ~]$ hadoop fs -fs webhdfs://host-2.domain.com -ls / -ls: Invalid value: 123 does not belong to the domain ^[A-Za-z_][A-Za-z0-9._-]*[$]?$ Usage: hadoop fs [generic options] -ls [-d] [-h] [-R] [path ...] {code} -- This message was sent by Atlassian JIRA (v6.1#6144)
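The rejection of the username 123 follows directly from the default pattern quoted in the issue description; a standalone demonstration with {{java.util.regex}} (the class and field names here are illustrative, not the WebHDFS source):

```java
import java.util.regex.Pattern;

public class UserPatternDemo {
    // The DOMAIN pattern quoted in the issue description.
    static final Pattern DOMAIN =
        Pattern.compile("^[A-Za-z_][A-Za-z0-9._-]*[$]?$");

    public static void main(String[] args) {
        System.out.println(DOMAIN.matcher("hdfs").matches()); // true
        System.out.println(DOMAIN.matcher("123").matches());  // false: first char must be a letter or '_'
    }
}
```

Making the pattern configurable (as the patch proposes via a `webhdfs.user.provider.user.pattern` key) lets deployments with numeric usernames relax the first-character restriction.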
[jira] [Updated] (HDFS-5582) hdfs getconf -excludeFile or -includeFile always failed
[ https://issues.apache.org/jira/browse/HDFS-5582?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Wang updated HDFS-5582: -- Assignee: sathish hdfs getconf -excludeFile or -includeFile always failed --- Key: HDFS-5582 URL: https://issues.apache.org/jira/browse/HDFS-5582 Project: Hadoop HDFS Issue Type: Bug Affects Versions: 2.1.0-beta, 2.2.0 Reporter: Henry Hung Assignee: sathish Priority: Minor Attachments: 1-HDFS-5582.patch In hadoop-2.2.0, if you execute getconf for exclude and include file, it will return this error message: {code} [hadoop@fphd1 hadoop-2.2.0]$ bin/hdfs getconf -excludeFile Configuration DFSConfigKeys.DFS_HOSTS_EXCLUDE is missing. [hadoop@fphd1 hadoop-2.2.0]$ bin/hdfs getconf -includeFile Configuration DFSConfigKeys.DFS_HOSTS is missing. {code} I found out the root cause is very simple, it’s because the source code of {{org/apache/hadoop/hdfs/tools/GetConf.java}} hard-coded it to the string literals {{"DFSConfigKeys.DFS_HOSTS"}} and {{"DFSConfigKeys.DFS_HOSTS_EXCLUDE"}}:
{code}
map.put(INCLUDE_FILE.getName().toLowerCase(),
    new CommandHandler("DFSConfigKeys.DFS_HOSTS"));
map.put(EXCLUDE_FILE.getName().toLowerCase(),
    new CommandHandler("DFSConfigKeys.DFS_HOSTS_EXCLUDE"));
{code}
A simple fix would be to remove the quotes:
{code}
map.put(INCLUDE_FILE.getName().toLowerCase(),
    new CommandHandler(DFSConfigKeys.DFS_HOSTS));
map.put(EXCLUDE_FILE.getName().toLowerCase(),
    new CommandHandler(DFSConfigKeys.DFS_HOSTS_EXCLUDE));
{code}
-- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Updated] (HDFS-4997) libhdfs doesn't return correct error codes in most cases
[ https://issues.apache.org/jira/browse/HDFS-4997?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Todd Lipcon updated HDFS-4997: -- Target Version/s: (was: ) Release Note: libhdfs now returns correct codes in errno. Previously, due to a bug, many functions set errno to 255 instead of the more specific error code. Hadoop Flags: Incompatible change,Reviewed libhdfs doesn't return correct error codes in most cases Key: HDFS-4997 URL: https://issues.apache.org/jira/browse/HDFS-4997 Project: Hadoop HDFS Issue Type: Bug Components: libhdfs Reporter: Colin Patrick McCabe Assignee: Colin Patrick McCabe Attachments: HDFS-4997.001.patch libhdfs has some code to translate Java exceptions into C error codes. Unfortunately, the exceptions are returned to us in dotted format, but the code is expecting them to be in slash-separated format. This results in most exceptions just leading to a generic error code. We should fix this and add a unit test to ensure this continues to work. -- This message was sent by Atlassian JIRA (v6.1#6144)
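The dotted-vs-slash mismatch described above is the usual gap between `Throwable.getClass().getName()` (dotted) and JNI-style internal class names (slash-separated). A minimal sketch of the normalization the fix implies; this is illustrative, not the actual libhdfs code:

```java
public class ExceptionNameFormat {
    // Throwable.getClass().getName() returns "java.io.FileNotFoundException",
    // while JNI internal names use "java/io/FileNotFoundException". A lookup
    // table keyed on one format silently misses names in the other, which is
    // how most exceptions fell through to the generic error code.
    public static String toSlashed(String dottedName) {
        return dottedName.replace('.', '/');
    }

    public static void main(String[] args) {
        System.out.println(toSlashed("java.io.FileNotFoundException"));
        // prints "java/io/FileNotFoundException"
    }
}
```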
[jira] [Created] (HDFS-5594) FileSystem API for ACLs.
Chris Nauroth created HDFS-5594: --- Summary: FileSystem API for ACLs. Key: HDFS-5594 URL: https://issues.apache.org/jira/browse/HDFS-5594 Project: Hadoop HDFS Issue Type: Sub-task Components: hdfs-client Reporter: Chris Nauroth Add new methods to {{FileSystem}} and {{FileContext}} for manipulating ACLs. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Created] (HDFS-5599) DistributedFileSystem: add support for recursive flag in ACL methods.
Chris Nauroth created HDFS-5599: --- Summary: DistributedFileSystem: add support for recursive flag in ACL methods. Key: HDFS-5599 URL: https://issues.apache.org/jira/browse/HDFS-5599 Project: Hadoop HDFS Issue Type: Sub-task Components: hdfs-client, namenode Reporter: Chris Nauroth Implement and test handling of recursive flag for all ACL methods in {{DistributedFileSystem}}. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Created] (HDFS-5598) DistributedFileSystem: implement removeDefaultAcl.
Chris Nauroth created HDFS-5598: --- Summary: DistributedFileSystem: implement removeDefaultAcl. Key: HDFS-5598 URL: https://issues.apache.org/jira/browse/HDFS-5598 Project: Hadoop HDFS Issue Type: Sub-task Components: hdfs-client, namenode Reporter: Chris Nauroth Implement and test {{removeDefaultAcl}}. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Created] (HDFS-5595) NameNode: implement AclManager as abstraction over INode ACL Map.
Chris Nauroth created HDFS-5595: --- Summary: NameNode: implement AclManager as abstraction over INode ACL Map. Key: HDFS-5595 URL: https://issues.apache.org/jira/browse/HDFS-5595 Project: Hadoop HDFS Issue Type: Sub-task Components: namenode Reporter: Chris Nauroth Complete an initial implementation of {{AclManager}} to enable further development tasks. This will be a basic implementation using the INode ACL Map to track associations between inodes and ACLs. This will not fully implement all of the optimizations discussed in the design doc. Further optimization work will be tracked in separate tasks. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Created] (HDFS-5597) DistributedFileSystem: implement modifyAclEntries, removeAclEntries and removeAcl.
Chris Nauroth created HDFS-5597: --- Summary: DistributedFileSystem: implement modifyAclEntries, removeAclEntries and removeAcl. Key: HDFS-5597 URL: https://issues.apache.org/jira/browse/HDFS-5597 Project: Hadoop HDFS Issue Type: Sub-task Components: hdfs-client, namenode Reporter: Chris Nauroth Implement and test {{modifyAclEntries}}, {{removeAclEntries}} and {{removeAcl}} in {{DistributedFileSystem}}. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Created] (HDFS-5596) DistributedFileSystem: implement getAcls and setAcl.
Chris Nauroth created HDFS-5596: --- Summary: DistributedFileSystem: implement getAcls and setAcl. Key: HDFS-5596 URL: https://issues.apache.org/jira/browse/HDFS-5596 Project: Hadoop HDFS Issue Type: Sub-task Components: hdfs-client, namenode Reporter: Chris Nauroth Implement and test {{getAcls}} and {{setAcl}} in {{DistributedFileSystem}}. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Created] (HDFS-5602) FsShell CLI: add setfacl flag for removal of default ACL entries.
Chris Nauroth created HDFS-5602: --- Summary: FsShell CLI: add setfacl flag for removal of default ACL entries. Key: HDFS-5602 URL: https://issues.apache.org/jira/browse/HDFS-5602 Project: Hadoop HDFS Issue Type: Sub-task Components: tools Reporter: Chris Nauroth Implement and test setfacl support for removal of just the default entries in an ACL. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Created] (HDFS-5606) libHDFS: implement hdfsRemoveDefaultAcl.
Chris Nauroth created HDFS-5606: --- Summary: libHDFS: implement hdfsRemoveDefaultAcl. Key: HDFS-5606 URL: https://issues.apache.org/jira/browse/HDFS-5606 Project: Hadoop HDFS Issue Type: Sub-task Components: libhdfs Reporter: Chris Nauroth Implement and test {{hdfsRemoveDefaultAcl}} in libHDFS. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Created] (HDFS-5601) FsShell CLI: add setfacl flags for ACL entry modification and removal.
Chris Nauroth created HDFS-5601: --- Summary: FsShell CLI: add setfacl flags for ACL entry modification and removal. Key: HDFS-5601 URL: https://issues.apache.org/jira/browse/HDFS-5601 Project: Hadoop HDFS Issue Type: Sub-task Components: tools Reporter: Chris Nauroth Implement and test setfacl support for flags that allow partial modification of an ACL and modification of specific ACL entries. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Created] (HDFS-5605) libHDFS: implement hdfsModifyAclEntries, hdfsRemoveAclEntries and hdfsRemoveAcl.
Chris Nauroth created HDFS-5605: --- Summary: libHDFS: implement hdfsModifyAclEntries, hdfsRemoveAclEntries and hdfsRemoveAcl. Key: HDFS-5605 URL: https://issues.apache.org/jira/browse/HDFS-5605 Project: Hadoop HDFS Issue Type: Sub-task Components: libhdfs Reporter: Chris Nauroth Implement and test {{hdfsModifyAclEntries}} and {{hdfsRemoveAclEntries}} in libHDFS. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Created] (HDFS-5604) libHDFS: implement hdfsGetAcls and hdfsSetAcl.
Chris Nauroth created HDFS-5604: --- Summary: libHDFS: implement hdfsGetAcls and hdfsSetAcl. Key: HDFS-5604 URL: https://issues.apache.org/jira/browse/HDFS-5604 Project: Hadoop HDFS Issue Type: Sub-task Components: libhdfs Reporter: Chris Nauroth Implement and test {{hdfsGetAcls}} and {{hdfsSetAcl}} in libHDFS. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Created] (HDFS-5603) FsShell CLI: add support for recursive flag in ACL commands.
Chris Nauroth created HDFS-5603: --- Summary: FsShell CLI: add support for recursive flag in ACL commands. Key: HDFS-5603 URL: https://issues.apache.org/jira/browse/HDFS-5603 Project: Hadoop HDFS Issue Type: Sub-task Components: tools Reporter: Chris Nauroth Implement and test handling of recursive flag for getfacl and setfacl. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Created] (HDFS-5600) FsShell CLI: add getfacl and setfacl with minimal support for getting and setting ACLs.
Chris Nauroth created HDFS-5600: --- Summary: FsShell CLI: add getfacl and setfacl with minimal support for getting and setting ACLs. Key: HDFS-5600 URL: https://issues.apache.org/jira/browse/HDFS-5600 Project: Hadoop HDFS Issue Type: Sub-task Components: tools Reporter: Chris Nauroth Implement and test FsShell CLI commands for getfacl and setfacl. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Created] (HDFS-5611) WebHDFS: add support for recursive flag in ACL operations.
Chris Nauroth created HDFS-5611: --- Summary: WebHDFS: add support for recursive flag in ACL operations. Key: HDFS-5611 URL: https://issues.apache.org/jira/browse/HDFS-5611 Project: Hadoop HDFS Issue Type: Sub-task Components: webhdfs Reporter: Chris Nauroth Implement and test handling of recursive flag for all ACL operations in WebHDFS. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Created] (HDFS-5607) libHDFS: add support for recursive flag in ACL functions.
Chris Nauroth created HDFS-5607: --- Summary: libHDFS: add support for recursive flag in ACL functions. Key: HDFS-5607 URL: https://issues.apache.org/jira/browse/HDFS-5607 Project: Hadoop HDFS Issue Type: Sub-task Components: libhdfs Reporter: Chris Nauroth Implement and test handling of recursive flag for all ACL functions in libHDFS. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Created] (HDFS-5610) WebHDFS: implement REMOVEDEFAULTACL.
Chris Nauroth created HDFS-5610: --- Summary: WebHDFS: implement REMOVEDEFAULTACL. Key: HDFS-5610 URL: https://issues.apache.org/jira/browse/HDFS-5610 Project: Hadoop HDFS Issue Type: Sub-task Components: webhdfs Reporter: Chris Nauroth Implement and test {{REMOVEDEFAULTACL}} in WebHDFS. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Created] (HDFS-5612) NameNode: change all permission checks to enforce ACLs in addition to permissions.
Chris Nauroth created HDFS-5612: --- Summary: NameNode: change all permission checks to enforce ACLs in addition to permissions. Key: HDFS-5612 URL: https://issues.apache.org/jira/browse/HDFS-5612 Project: Hadoop HDFS Issue Type: Sub-task Components: namenode Reporter: Chris Nauroth All {{NameNode}} code paths that enforce permissions must be updated so that they also enforce ACLs. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Created] (HDFS-5614) NameNode: implement handling of ACLs in combination with snapshots.
Chris Nauroth created HDFS-5614: --- Summary: NameNode: implement handling of ACLs in combination with snapshots. Key: HDFS-5614 URL: https://issues.apache.org/jira/browse/HDFS-5614 Project: Hadoop HDFS Issue Type: Sub-task Components: namenode Reporter: Chris Nauroth Within a snapshot, all ACLs are frozen at the moment that the snapshot was created. ACL changes in the parent of the snapshot are not applied to the snapshot. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Created] (HDFS-5613) NameNode: implement handling of ACLs in combination with symlinks.
Chris Nauroth created HDFS-5613: --- Summary: NameNode: implement handling of ACLs in combination with symlinks. Key: HDFS-5613 URL: https://issues.apache.org/jira/browse/HDFS-5613 Project: Hadoop HDFS Issue Type: Sub-task Components: namenode Reporter: Chris Nauroth A symlink in HDFS does not have an ACL of its own. Operations that modify the ACL of a symlink instead modify the target of the symlink. For operations that enforce ACLs, enforcement is on the target of the symlink. This is similar to existing handling of permissions for symlinks. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Created] (HDFS-5609) WebHDFS: implement MODIFYACLENTRIES, REMOVEACLENTRIES and REMOVEACL.
Chris Nauroth created HDFS-5609: --- Summary: WebHDFS: implement MODIFYACLENTRIES, REMOVEACLENTRIES and REMOVEACL. Key: HDFS-5609 URL: https://issues.apache.org/jira/browse/HDFS-5609 Project: Hadoop HDFS Issue Type: Sub-task Components: webhdfs Reporter: Chris Nauroth Implement and test {{MODIFYACLENTRIES}}, {{REMOVEACLENTRIES}} and {{REMOVEACL}} in WebHDFS. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Created] (HDFS-5608) WebHDFS: implement GETACLS and SETACL.
Chris Nauroth created HDFS-5608: --- Summary: WebHDFS: implement GETACLS and SETACL. Key: HDFS-5608 URL: https://issues.apache.org/jira/browse/HDFS-5608 Project: Hadoop HDFS Issue Type: Sub-task Components: webhdfs Reporter: Chris Nauroth Implement and test {{GETACLS}} and {{SETACL}} in WebHDFS. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Created] (HDFS-5621) NameNode: add indicator in web UI file system browser if a file has an ACL.
Chris Nauroth created HDFS-5621: --- Summary: NameNode: add indicator in web UI file system browser if a file has an ACL. Key: HDFS-5621 URL: https://issues.apache.org/jira/browse/HDFS-5621 Project: Hadoop HDFS Issue Type: Sub-task Components: namenode Reporter: Chris Nauroth Change the file system browser to append the '+' character to permissions of any file or directory that has an ACL. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Created] (HDFS-5615) NameNode: implement handling of ACLs in combination with sticky bit.
Chris Nauroth created HDFS-5615: --- Summary: NameNode: implement handling of ACLs in combination with sticky bit. Key: HDFS-5615 URL: https://issues.apache.org/jira/browse/HDFS-5615 Project: Hadoop HDFS Issue Type: Sub-task Components: namenode Reporter: Chris Nauroth The sticky bit must work in combination with ACLs, similar to how the sticky bit already works with permissions. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Created] (HDFS-5622) NameNode: change startup progress to track loading INode ACL Map.
Chris Nauroth created HDFS-5622: --- Summary: NameNode: change startup progress to track loading INode ACL Map. Key: HDFS-5622 URL: https://issues.apache.org/jira/browse/HDFS-5622 Project: Hadoop HDFS Issue Type: Sub-task Components: namenode Reporter: Chris Nauroth Define a new startup progress {{StepType}} for loading INode ACL Map entries and use it to track progress during {{Phase#LOADING_FSIMAGE}}. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Created] (HDFS-5619) NameNode: record ACL modifications to edit log.
Chris Nauroth created HDFS-5619: --- Summary: NameNode: record ACL modifications to edit log. Key: HDFS-5619 URL: https://issues.apache.org/jira/browse/HDFS-5619 Project: Hadoop HDFS Issue Type: Sub-task Components: namenode Reporter: Chris Nauroth Implement a new edit log opcode, {{OP_SET_ACL}}, which fully replaces the ACL of a specific inode. For ACL operations that perform partial modification of the ACL, the NameNode must merge the modifications with the existing ACL to produce the final resulting ACL and encode it into an {{OP_SET_ACL}}. -- This message was sent by Atlassian JIRA (v6.1#6144)
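The merge step described above (partial modifications folded into the existing ACL before encoding a single full-replacement {{OP_SET_ACL}}) can be sketched as follows. The class, method, and entry representation are hypothetical simplifications, not HDFS code; real ACL entries also carry scope and type:

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Illustrative sketch: entries keyed by name (e.g. "user:alice") mapping to a
// permission string. Merging replaces matching entries and appends new ones,
// yielding the complete ACL that OP_SET_ACL would record.
public class AclMerge {
    public static Map<String, String> merge(Map<String, String> existing,
                                            Map<String, String> modifications) {
        Map<String, String> result = new LinkedHashMap<>(existing);
        result.putAll(modifications);  // modified entries win; others carry over
        return result;
    }
}
```

Recording the fully merged ACL (rather than the delta) keeps edit log replay simple: each {{OP_SET_ACL}} is self-contained and idempotent.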
[jira] [Created] (HDFS-5618) NameNode: persist ACLs in fsimage.
Chris Nauroth created HDFS-5618: --- Summary: NameNode: persist ACLs in fsimage. Key: HDFS-5618 URL: https://issues.apache.org/jira/browse/HDFS-5618 Project: Hadoop HDFS Issue Type: Sub-task Components: namenode Reporter: Chris Nauroth Store ACLs in fsimage so that ACLs are retained across NameNode restarts. This requires encoding and saving the {{AclManager}} state as a new section of the fsimage, located after all existing sections (snapshot manager state, inodes, secret manager state, and cache manager state). -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Created] (HDFS-5617) NameNode: enforce maximum number of ACL entries.
Chris Nauroth created HDFS-5617: --- Summary: NameNode: enforce maximum number of ACL entries. Key: HDFS-5617 URL: https://issues.apache.org/jira/browse/HDFS-5617 Project: Hadoop HDFS Issue Type: Sub-task Components: namenode Reporter: Chris Nauroth The number of entries in a single ACL must be capped at 32. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Created] (HDFS-5620) NameNode: enhance AclManager to use Global ACL Set as a space optimization.
Chris Nauroth created HDFS-5620: --- Summary: NameNode: enhance AclManager to use Global ACL Set as a space optimization. Key: HDFS-5620 URL: https://issues.apache.org/jira/browse/HDFS-5620 Project: Hadoop HDFS Issue Type: Sub-task Components: namenode Reporter: Chris Nauroth The {{AclManager}} can maintain a Global ACL Set to store all distinct ACLs in use by the file system. All inodes that have the same ACL entries can share the same ACL instance. -- This message was sent by Atlassian JIRA (v6.1#6144)
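The Global ACL Set described above is an interning pattern: structurally equal ACLs collapse to one shared instance so that many inodes reference the same object. A hypothetical sketch of the idea (class and entry representation are illustrative, not HDFS code):

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Intern pool for ACLs: two lists with equal entries map to the same shared,
// immutable instance, so duplicate ACLs cost one object instead of thousands.
public class AclInterner {
    private final Map<List<String>, List<String>> pool = new HashMap<>();

    public List<String> intern(List<String> entries) {
        List<String> key = Collections.unmodifiableList(new ArrayList<>(entries));
        List<String> shared = pool.putIfAbsent(key, key);
        return shared != null ? shared : key;
    }
}
```

A production version would also need reference counting or weak references so unused ACLs can be reclaimed; that bookkeeping is omitted here.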
[jira] [Created] (HDFS-5616) NameNode: implement default ACL handling.
Chris Nauroth created HDFS-5616: --- Summary: NameNode: implement default ACL handling. Key: HDFS-5616 URL: https://issues.apache.org/jira/browse/HDFS-5616 Project: Hadoop HDFS Issue Type: Sub-task Components: namenode Reporter: Chris Nauroth Implement and test handling of default ACLs within NameNode. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Created] (HDFS-5623) NameNode: add tests for skipping ACL enforcement when permission checks are disabled, user is superuser or user is member of supergroup.
Chris Nauroth created HDFS-5623: --- Summary: NameNode: add tests for skipping ACL enforcement when permission checks are disabled, user is superuser or user is member of supergroup. Key: HDFS-5623 URL: https://issues.apache.org/jira/browse/HDFS-5623 Project: Hadoop HDFS Issue Type: Sub-task Components: namenode Reporter: Chris Nauroth The existing permission checks are skipped under the following conditions: * {{dfs.permissions.enabled}} is set to false. (There are several exceptions stated in the documentation.) * The user is the super-user. * The user is a member of the super-user group. Add tests verifying that ACL enforcement is also skipped for all of these cases. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Created] (HDFS-5624) Add tests for ACLs in combination with viewfs.
Chris Nauroth created HDFS-5624: --- Summary: Add tests for ACLs in combination with viewfs. Key: HDFS-5624 URL: https://issues.apache.org/jira/browse/HDFS-5624 Project: Hadoop HDFS Issue Type: Sub-task Components: hdfs-client Reporter: Chris Nauroth Add tests verifying that in a federated deployment, a viewfs wrapped over multiple federated NameNodes will dispatch the ACL operations to the correct NameNode. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Created] (HDFS-5625) Write end user documentation for HDFS ACLs.
Chris Nauroth created HDFS-5625: --- Summary: Write end user documentation for HDFS ACLs. Key: HDFS-5625 URL: https://issues.apache.org/jira/browse/HDFS-5625 Project: Hadoop HDFS Issue Type: Sub-task Components: documentation Reporter: Chris Nauroth * Update File System Shell documentation to describe getfacl and setfacl. * Update HDFS Permissions Guide to cover ACLs. Hyperlink to shell documentation for getfacl and setfacl. * If there is a large amount of new content, considering splitting it to a separate HDFS ACLs Guide and hyperlink as appropriate. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (HDFS-4685) Implementation of ACLs in HDFS
[ https://issues.apache.org/jira/browse/HDFS-4685?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13838092#comment-13838092 ] Chris Nauroth commented on HDFS-4685: - I've entered the sub-task break-down. My own initial focus is going to be HDFS-5594 and HDFS-5595. These two are pre-requisites for a lot of the other sub-tasks. After that, it will be easier for multiple people to work in parallel across the various interfaces and NameNode implementation details. Implementation of ACLs in HDFS -- Key: HDFS-4685 URL: https://issues.apache.org/jira/browse/HDFS-4685 Project: Hadoop HDFS Issue Type: New Feature Components: hdfs-client, namenode, security Affects Versions: 1.1.2 Reporter: Sachin Jose Assignee: Chris Nauroth Attachments: HDFS-ACLs-Design-1.pdf Currenly hdfs doesn't support Extended file ACL. In unix extended ACL can be achieved using getfacl and setfacl utilities. Is there anybody working on this feature ? -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Assigned] (HDFS-5594) FileSystem API for ACLs.
[ https://issues.apache.org/jira/browse/HDFS-5594?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris Nauroth reassigned HDFS-5594: --- Assignee: Chris Nauroth FileSystem API for ACLs. Key: HDFS-5594 URL: https://issues.apache.org/jira/browse/HDFS-5594 Project: Hadoop HDFS Issue Type: Sub-task Components: hdfs-client Reporter: Chris Nauroth Assignee: Chris Nauroth Add new methods to {{FileSystem}} and {{FileContext}} for manipulating ACLs. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Assigned] (HDFS-5595) NameNode: implement AclManager as abstraction over INode ACL Map.
[ https://issues.apache.org/jira/browse/HDFS-5595?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris Nauroth reassigned HDFS-5595: --- Assignee: Chris Nauroth NameNode: implement AclManager as abstraction over INode ACL Map. - Key: HDFS-5595 URL: https://issues.apache.org/jira/browse/HDFS-5595 Project: Hadoop HDFS Issue Type: Sub-task Components: namenode Reporter: Chris Nauroth Assignee: Chris Nauroth Complete an initial implementation of {{AclManager}} to enable further development tasks. This will be a basic implementation using the INode ACL Map to track associations between inodes and ACLs. This will not fully implement all of the optimizations discussed in the design doc. Further optimization work will be tracked in separate tasks. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (HDFS-2832) Enable support for heterogeneous storages in HDFS
[ https://issues.apache.org/jira/browse/HDFS-2832?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13838097#comment-13838097 ] Hadoop QA commented on HDFS-2832: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12616817/h2832_20131203.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 48 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. The javadoc tool did not generate any warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:red}-1 core tests{color}. The patch failed these unit tests in hadoop-hdfs-project/hadoop-hdfs: org.apache.hadoop.hdfs.tools.offlineEditsViewer.TestOfflineEditsViewer {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/5624//testReport/ Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/5624//console This message is automatically generated. 
Enable support for heterogeneous storages in HDFS - Key: HDFS-2832 URL: https://issues.apache.org/jira/browse/HDFS-2832 Project: Hadoop HDFS Issue Type: New Feature Affects Versions: 0.24.0 Reporter: Suresh Srinivas Assignee: Suresh Srinivas Attachments: 20130813-HeterogeneousStorage.pdf, 20131125-HeterogeneousStorage-TestPlan.pdf, 20131125-HeterogeneousStorage.pdf, 20131202-HeterogeneousStorage-TestPlan.pdf, 20131203-HeterogeneousStorage-TestPlan.pdf, H2832_20131107.patch, editsStored, h2832_20131023.patch, h2832_20131023b.patch, h2832_20131025.patch, h2832_20131028.patch, h2832_20131028b.patch, h2832_20131029.patch, h2832_20131103.patch, h2832_20131104.patch, h2832_20131105.patch, h2832_20131107b.patch, h2832_20131108.patch, h2832_20131110.patch, h2832_20131110b.patch, h2832_2013.patch, h2832_20131112.patch, h2832_20131112b.patch, h2832_20131114.patch, h2832_20131118.patch, h2832_20131119.patch, h2832_20131119b.patch, h2832_20131121.patch, h2832_20131122.patch, h2832_20131122b.patch, h2832_20131123.patch, h2832_20131124.patch, h2832_20131202.patch, h2832_20131203.patch HDFS currently supports a configuration where storages are a list of directories. Typically each of these directories corresponds to a volume with its own file system. All these directories are homogeneous and are therefore identified as a single storage at the namenode. I propose changing the current model, where a Datanode *is a* storage, to one where a Datanode *is a collection of* storages. -- This message was sent by Atlassian JIRA (v6.1#6144)
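The proposed model change can be sketched as a toy data model: one DataNode reporting a collection of storages, each with its own ID and storage type. All names here (DataNodeModel, Storage, StorageType) are illustrative stand-ins, not the classes in the HDFS-2832 patches, and the type values are assumptions.

```java
import java.util.List;

// Toy sketch of the proposed model: a DataNode *is a collection of*
// storages rather than *being* a single storage. Names are hypothetical.
public class DataNodeModel {
    enum StorageType { DISK, SSD }   // assumed example types

    record Storage(String storageId, StorageType type) {}

    record DataNode(String datanodeUuid, List<Storage> storages) {}

    public static void main(String[] args) {
        // One DataNode now reports two distinct storages to the namenode.
        DataNode dn = new DataNode("dn-1", List.of(
                new Storage("DS-1", StorageType.DISK),
                new Storage("DS-2", StorageType.SSD)));
        if (dn.storages().size() != 2) throw new AssertionError();
        System.out.println("ok");
    }
}
```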
[jira] [Commented] (HDFS-5569) WebHDFS should support a deny/allow list for data access
[ https://issues.apache.org/jira/browse/HDFS-5569?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13838105#comment-13838105 ] Colin Patrick McCabe commented on HDFS-5569: bq. Colin Patrick McCabe I'm not sure how adding a new IP address would by-pass this? As the client I can add whatever IP address I want but if it's not routable it won't work... You can set your IP address to be the same as someone else on the network. It may cause conflicts, but if your intent is to break system security, you probably don't care. WebHDFS should support a deny/allow list for data access Key: HDFS-5569 URL: https://issues.apache.org/jira/browse/HDFS-5569 Project: Hadoop HDFS Issue Type: Improvement Components: webhdfs Reporter: Adam Faris Labels: features Currently we can't restrict what networks are allowed to transfer data using WebHDFS. Obviously we can use firewalls to block ports, but this can be complicated and problematic to maintain. Additionally, because all the jetty servlets run inside the same container, blocking access to jetty to prevent WebHDFS transfers also blocks the other servlets running inside that same jetty container. I am requesting a deny/allow feature be added to WebHDFS. This is already done with the Apache HTTPD server, and is what I'd like to see the deny/allow list modeled after. Thanks. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (HDFS-5182) BlockReaderLocal must allow zero-copy reads only when the DN believes it's valid
[ https://issues.apache.org/jira/browse/HDFS-5182?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13838159#comment-13838159 ] Colin Patrick McCabe commented on HDFS-5182: So, previously we discussed a few different ways for the {{DataNode}} to notify the {{DFSClient}} about a change in the block's mlock status. One way (let's call this choice #1) was using a shared memory segment. This would take the form of a third file descriptor passed from the {{DataNode}} to the {{DFSClient}}. On Linux, this would simply be a 4kb file from the {{/dev/shm}} filesystem, which is a {{tmpfs}} filesystem. That filesystem is the best choice because it will not cause the file to be written back to disk every {{dirty_writeback_centisecs}}. However, on looking into this further, I found some issues with this method. There is no way for the {{DataNode}} to know when the {{DFSClient}} has closed the file descriptor for the shared memory area. We could add some kind of protocol for keeping the area alive by writing to an agreed-upon location, but that would add a fair amount of complexity, and might be triggered accidentally in the case of a garbage collection event on the {{DFSClient}} or {{DataNode}}. Another issue is that there is no way for the {{DataNode}} to revoke access to this shared memory segment. If the {{DFSClient}} wants to hold on to it forever, leaking memory, it can do that. This opens a hole. The client might not have UNIX permissions to grab space in {{/dev/shm}}, but through this mechanism it can consume an arbitrary amount of space there. The other way (let's call this choice #2) is for the client to keep open the domain socket it used to request the two file descriptors. If we can listen for messages sent on this socket, we can have a truly edge-triggered notification method. The messages can be as short as a single byte, since we have very simple message needs. 
This requires adding an epoll loop to handle these notifications without consuming a whole thread per socket. Regardless of whether we go with choice #1 or #2, there are some other things that need to be done. * Right now, we don't allow {{BlockReaderLocal}} instances to share file descriptors with each other. However, this would be advisable, to avoid creating 100 pipes/shm areas when someone re-opens the same file 100 times. Doing this is actually an easy change (I wrote and tested the patch already). * We need to revise {{FileInputStreamCache}} to store the communication method (pipe or shared memory area) which will be giving us notifications. This cache also needs to get support for dealing with mmap regions, and for BRL instances sharing FDs / mmaps. I have a patch which reworks this cache, but it's not quite done yet. * {{BlockReaderLocal}} needs to get support for switching back and forth between honoring checksums and not. I have a patch which substantially reworks BRL to add this capability, which I'm considering posting as a separate JIRA. BlockReaderLocal must allow zero-copy reads only when the DN believes it's valid - Key: HDFS-5182 URL: https://issues.apache.org/jira/browse/HDFS-5182 Project: Hadoop HDFS Issue Type: Sub-task Components: hdfs-client Affects Versions: 3.0.0 Reporter: Colin Patrick McCabe Assignee: Colin Patrick McCabe BlockReaderLocal must allow zero-copy reads only when the DN believes it's valid. This implies adding a new field to the response to REQUEST_SHORT_CIRCUIT_FDS. We also need some kind of heartbeat from the client to the DN, so that the DN can inform the client when the mapped region is no longer locked into memory. -- This message was sent by Atlassian JIRA (v6.1#6144)
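The single-byte, edge-triggered notification of choice #2 can be sketched in plain Java NIO. This is a minimal illustration, not the HDFS-5182 implementation: a {{Pipe}} stands in for the Unix domain socket, a {{Selector}} stands in for the proposed epoll loop (one thread watching many channels, rather than a thread per socket), and the {{RevocationWatcher}} and {{REVOKE_MLOCK}} names are invented for the example.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.Pipe;
import java.nio.channels.SelectionKey;
import java.nio.channels.Selector;

// Hypothetical sketch of choice #2: the "DataNode" side revokes access by
// writing one byte; a single Selector (standing in for the epoll loop)
// watches many channels without dedicating a thread to each.
public class RevocationWatcher {
    public static final byte REVOKE_MLOCK = 0x01;  // invented opcode

    // Drain one pending notification byte, or return -1 if none is ready.
    public static int poll(Selector selector) throws IOException {
        if (selector.selectNow() == 0) {
            return -1;
        }
        for (SelectionKey key : selector.selectedKeys()) {
            ByteBuffer buf = ByteBuffer.allocate(1);
            ((Pipe.SourceChannel) key.channel()).read(buf);
            buf.flip();
            selector.selectedKeys().clear();
            return buf.get();
        }
        return -1;
    }

    public static void main(String[] args) throws IOException {
        Pipe pipe = Pipe.open();                       // stand-in for the domain socket
        pipe.source().configureBlocking(false);
        Selector selector = Selector.open();
        pipe.source().register(selector, SelectionKey.OP_READ);

        // Nothing sent yet: the watcher sees no event.
        if (poll(selector) != -1) throw new AssertionError();

        // "DataNode" revokes: a single byte is the whole message.
        pipe.sink().write(ByteBuffer.wrap(new byte[]{REVOKE_MLOCK}));
        if (poll(selector) != REVOKE_MLOCK) throw new AssertionError();
        System.out.println("ok");
    }
}
```

Because the client must hold this channel open to keep its file descriptors, the DataNode also learns about client death for free: the channel closes.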
[jira] [Commented] (HDFS-5514) FSNamesystem's fsLock should allow custom implementation
[ https://issues.apache.org/jira/browse/HDFS-5514?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13838184#comment-13838184 ] Hadoop QA commented on HDFS-5514: - {color:green}+1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12616827/HDFS-5514.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 1 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. The javadoc tool did not generate any warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 core tests{color}. The patch passed unit tests in hadoop-hdfs-project/hadoop-hdfs. {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/5625//testReport/ Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/5625//console This message is automatically generated. FSNamesystem's fsLock should allow custom implementation Key: HDFS-5514 URL: https://issues.apache.org/jira/browse/HDFS-5514 Project: Hadoop HDFS Issue Type: Sub-task Components: namenode Affects Versions: 2.0.0-alpha, 3.0.0 Reporter: Daryn Sharp Assignee: Daryn Sharp Attachments: HDFS-5514.patch, HDFS-5514.patch Changing {{fsLock}} from a {{ReentrantReadWriteLock}} to an API compatible class that encapsulates the rwLock will allow for more sophisticated locking implementations such as fine grain locking. 
-- This message was sent by Atlassian JIRA (v6.1#6144)
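The encapsulation proposed in HDFS-5514 can be sketched as a thin wrapper that exposes the same acquire/release surface as the raw {{ReentrantReadWriteLock}}, leaving room for a subclass to substitute fine-grained locking later. This is a minimal sketch under that assumption; the class and method names are illustrative, not the committed patch.

```java
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Hypothetical sketch of HDFS-5514: wrap the namesystem's rwLock in an
// API-compatible class so more sophisticated locking can be swapped in.
public class FSNamesystemLock {
    private final ReentrantReadWriteLock coarseLock = new ReentrantReadWriteLock(true);

    public void readLock()    { coarseLock.readLock().lock(); }
    public void readUnlock()  { coarseLock.readLock().unlock(); }
    public void writeLock()   { coarseLock.writeLock().lock(); }
    public void writeUnlock() { coarseLock.writeLock().unlock(); }

    // Introspection used by assertion-style checks in the namesystem.
    public boolean hasWriteLock() {
        return coarseLock.isWriteLockedByCurrentThread();
    }
    public boolean hasReadLock() {
        return coarseLock.getReadHoldCount() > 0 || hasWriteLock();
    }

    public static void main(String[] args) {
        FSNamesystemLock lock = new FSNamesystemLock();
        lock.writeLock();
        if (!lock.hasWriteLock()) throw new AssertionError();
        lock.writeUnlock();
        lock.readLock();
        if (!lock.hasReadLock() || lock.hasWriteLock()) throw new AssertionError();
        lock.readUnlock();
        System.out.println("ok");
    }
}
```

Because callers see only this interface, a fine-grained implementation can override the four lock methods without touching FSNamesystem call sites.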
[jira] [Updated] (HDFS-5597) DistributedFileSystem: implement modifyAclEntries, removeAclEntries and removeAcl.
[ https://issues.apache.org/jira/browse/HDFS-5597?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris Nauroth updated HDFS-5597: Target Version/s: HDFS ACLs (HDFS-4685) Affects Version/s: HDFS ACLs (HDFS-4685) DistributedFileSystem: implement modifyAclEntries, removeAclEntries and removeAcl. -- Key: HDFS-5597 URL: https://issues.apache.org/jira/browse/HDFS-5597 Project: Hadoop HDFS Issue Type: Sub-task Components: hdfs-client, namenode Affects Versions: HDFS ACLs (HDFS-4685) Reporter: Chris Nauroth Implement and test {{modifyAclEntries}}, {{removeAclEntries}} and {{removeAcl}} in {{DistributedFileSystem}}. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Updated] (HDFS-5616) NameNode: implement default ACL handling.
[ https://issues.apache.org/jira/browse/HDFS-5616?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris Nauroth updated HDFS-5616: Target Version/s: HDFS ACLs (HDFS-4685) Affects Version/s: HDFS ACLs (HDFS-4685) NameNode: implement default ACL handling. - Key: HDFS-5616 URL: https://issues.apache.org/jira/browse/HDFS-5616 Project: Hadoop HDFS Issue Type: Sub-task Components: namenode Affects Versions: HDFS ACLs (HDFS-4685) Reporter: Chris Nauroth Implement and test handling of default ACLs within NameNode. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Updated] (HDFS-5602) FsShell CLI: add setfacl flag for removal of default ACL entries.
[ https://issues.apache.org/jira/browse/HDFS-5602?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris Nauroth updated HDFS-5602: Target Version/s: HDFS ACLs (HDFS-4685) Affects Version/s: HDFS ACLs (HDFS-4685) FsShell CLI: add setfacl flag for removal of default ACL entries. - Key: HDFS-5602 URL: https://issues.apache.org/jira/browse/HDFS-5602 Project: Hadoop HDFS Issue Type: Sub-task Components: tools Affects Versions: HDFS ACLs (HDFS-4685) Reporter: Chris Nauroth Implement and test setfacl support for removal of just the default entries in an ACL. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Updated] (HDFS-5601) FsShell CLI: add setfacl flags for ACL entry modification and removal.
[ https://issues.apache.org/jira/browse/HDFS-5601?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris Nauroth updated HDFS-5601: Target Version/s: HDFS ACLs (HDFS-4685) Affects Version/s: HDFS ACLs (HDFS-4685) FsShell CLI: add setfacl flags for ACL entry modification and removal. -- Key: HDFS-5601 URL: https://issues.apache.org/jira/browse/HDFS-5601 Project: Hadoop HDFS Issue Type: Sub-task Components: tools Affects Versions: HDFS ACLs (HDFS-4685) Reporter: Chris Nauroth Implement and test setfacl support for flags that allow partial modification of an ACL and modification of specific ACL entries. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Updated] (HDFS-5607) libHDFS: add support for recursive flag in ACL functions.
[ https://issues.apache.org/jira/browse/HDFS-5607?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris Nauroth updated HDFS-5607: Target Version/s: HDFS ACLs (HDFS-4685) Affects Version/s: HDFS ACLs (HDFS-4685) libHDFS: add support for recursive flag in ACL functions. - Key: HDFS-5607 URL: https://issues.apache.org/jira/browse/HDFS-5607 Project: Hadoop HDFS Issue Type: Sub-task Components: libhdfs Affects Versions: HDFS ACLs (HDFS-4685) Reporter: Chris Nauroth Implement and test handling of recursive flag for all ACL functions in libHDFS. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Updated] (HDFS-5598) DistributedFileSystem: implement removeDefaultAcl.
[ https://issues.apache.org/jira/browse/HDFS-5598?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris Nauroth updated HDFS-5598: Target Version/s: HDFS ACLs (HDFS-4685) Affects Version/s: HDFS ACLs (HDFS-4685) DistributedFileSystem: implement removeDefaultAcl. -- Key: HDFS-5598 URL: https://issues.apache.org/jira/browse/HDFS-5598 Project: Hadoop HDFS Issue Type: Sub-task Components: hdfs-client, namenode Affects Versions: HDFS ACLs (HDFS-4685) Reporter: Chris Nauroth Implement and test {{removeDefaultAcl}}. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Updated] (HDFS-5603) FsShell CLI: add support for recursive flag in ACL commands.
[ https://issues.apache.org/jira/browse/HDFS-5603?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris Nauroth updated HDFS-5603: Target Version/s: HDFS ACLs (HDFS-4685) Affects Version/s: HDFS ACLs (HDFS-4685) FsShell CLI: add support for recursive flag in ACL commands. Key: HDFS-5603 URL: https://issues.apache.org/jira/browse/HDFS-5603 Project: Hadoop HDFS Issue Type: Sub-task Components: tools Affects Versions: HDFS ACLs (HDFS-4685) Reporter: Chris Nauroth Implement and test handling of recursive flag for getfacl and setfacl. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Updated] (HDFS-5623) NameNode: add tests for skipping ACL enforcement when permission checks are disabled, user is superuser or user is member of supergroup.
[ https://issues.apache.org/jira/browse/HDFS-5623?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris Nauroth updated HDFS-5623: Target Version/s: HDFS ACLs (HDFS-4685) Affects Version/s: HDFS ACLs (HDFS-4685) NameNode: add tests for skipping ACL enforcement when permission checks are disabled, user is superuser or user is member of supergroup. Key: HDFS-5623 URL: https://issues.apache.org/jira/browse/HDFS-5623 Project: Hadoop HDFS Issue Type: Sub-task Components: namenode Affects Versions: HDFS ACLs (HDFS-4685) Reporter: Chris Nauroth The existing permission checks are skipped under the following conditions: * {{dfs.permissions.enabled}} is set to false. (There are several exceptions stated in the documentation.) * The user is the super-user. * The user is a member of the super-user group. Add tests verifying that ACL enforcement is also skipped for all of these cases. -- This message was sent by Atlassian JIRA (v6.1#6144)
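The three skip conditions listed above reduce to one predicate, sketched here for illustration. The class and parameter names are invented for the example and do not match the NameNode's internal checker.

```java
import java.util.Set;

// Hypothetical predicate mirroring the conditions under which HDFS skips
// permission checks, and hence should also skip ACL enforcement.
public class EnforcementCheck {
    public static boolean skipEnforcement(boolean permissionsEnabled,
                                          String user,
                                          Set<String> userGroups,
                                          String superUser,
                                          String superGroup) {
        return !permissionsEnabled              // dfs.permissions.enabled=false
            || user.equals(superUser)           // the super-user
            || userGroups.contains(superGroup); // member of the super-user group
    }

    public static void main(String[] args) {
        Set<String> groups = Set.of("hadoop");
        if (!skipEnforcement(false, "alice", groups, "hdfs", "supergroup"))
            throw new AssertionError();   // checks disabled entirely
        if (!skipEnforcement(true, "hdfs", groups, "hdfs", "supergroup"))
            throw new AssertionError();   // super-user
        if (skipEnforcement(true, "alice", groups, "hdfs", "supergroup"))
            throw new AssertionError();   // ordinary user: must be enforced
        System.out.println("ok");
    }
}
```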
[jira] [Updated] (HDFS-5608) WebHDFS: implement GETACLS and SETACL.
[ https://issues.apache.org/jira/browse/HDFS-5608?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris Nauroth updated HDFS-5608: Target Version/s: HDFS ACLs (HDFS-4685) Affects Version/s: HDFS ACLs (HDFS-4685) WebHDFS: implement GETACLS and SETACL. -- Key: HDFS-5608 URL: https://issues.apache.org/jira/browse/HDFS-5608 Project: Hadoop HDFS Issue Type: Sub-task Components: webhdfs Affects Versions: HDFS ACLs (HDFS-4685) Reporter: Chris Nauroth Implement and test {{GETACLS}} and {{SETACL}} in WebHDFS. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Updated] (HDFS-5619) NameNode: record ACL modifications to edit log.
[ https://issues.apache.org/jira/browse/HDFS-5619?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris Nauroth updated HDFS-5619: Target Version/s: HDFS ACLs (HDFS-4685) Affects Version/s: HDFS ACLs (HDFS-4685) NameNode: record ACL modifications to edit log. --- Key: HDFS-5619 URL: https://issues.apache.org/jira/browse/HDFS-5619 Project: Hadoop HDFS Issue Type: Sub-task Components: namenode Affects Versions: HDFS ACLs (HDFS-4685) Reporter: Chris Nauroth Implement a new edit log opcode, {{OP_SET_ACL}}, which fully replaces the ACL of a specific inode. For ACL operations that perform partial modification of the ACL, the NameNode must merge the modifications with the existing ACL to produce the final resulting ACL and encode it into an {{OP_SET_ACL}}. -- This message was sent by Atlassian JIRA (v6.1#6144)
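The merge step described above (folding a partial modification into the existing ACL before logging one full {{OP_SET_ACL}}) can be sketched with toy string entries. The real AclEntry type and merge rules are richer; this sketch assumes entries of the form "type:name:perm" keyed by "type:name", with modified keys replaced in place and new keys appended.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch of the OP_SET_ACL merge: the NameNode combines a
// partial modification with the existing ACL, then logs only the result.
public class AclMerge {
    // Entries are toy strings "type:name:perm"; the key is "type:name".
    public static List<String> merge(List<String> existing, List<String> mods) {
        Map<String, String> merged = new LinkedHashMap<>();
        for (String e : existing) merged.put(key(e), e);
        for (String m : mods)     merged.put(key(m), m);  // replace or append
        return new ArrayList<>(merged.values());
    }

    private static String key(String entry) {
        return entry.substring(0, entry.lastIndexOf(':'));
    }

    public static void main(String[] args) {
        List<String> acl = List.of("user::rw-", "user:bruce:rw-", "group::r--");
        List<String> out = merge(acl, List.of("user:bruce:r--", "mask::r--"));
        // bruce's entry is replaced in place; the mask entry is appended.
        if (!out.equals(List.of("user::rw-", "user:bruce:r--", "group::r--", "mask::r--")))
            throw new AssertionError(out.toString());
        System.out.println("ok");
    }
}
```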
[jira] [Updated] (HDFS-5595) NameNode: implement AclManager as abstraction over INode ACL Map.
[ https://issues.apache.org/jira/browse/HDFS-5595?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris Nauroth updated HDFS-5595: Target Version/s: HDFS ACLs (HDFS-4685) Affects Version/s: HDFS ACLs (HDFS-4685) NameNode: implement AclManager as abstraction over INode ACL Map. - Key: HDFS-5595 URL: https://issues.apache.org/jira/browse/HDFS-5595 Project: Hadoop HDFS Issue Type: Sub-task Components: namenode Affects Versions: HDFS ACLs (HDFS-4685) Reporter: Chris Nauroth Assignee: Chris Nauroth Complete an initial implementation of {{AclManager}} to enable further development tasks. This will be a basic implementation using the INode ACL Map to track associations between inodes and ACLs. This will not fully implement all of the optimizations discussed in the design doc. Further optimization work will be tracked in separate tasks. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Updated] (HDFS-5609) WebHDFS: implement MODIFYACLENTRIES, REMOVEACLENTRIES and REMOVEACL.
[ https://issues.apache.org/jira/browse/HDFS-5609?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris Nauroth updated HDFS-5609: Target Version/s: HDFS ACLs (HDFS-4685) Affects Version/s: HDFS ACLs (HDFS-4685) WebHDFS: implement MODIFYACLENTRIES, REMOVEACLENTRIES and REMOVEACL. Key: HDFS-5609 URL: https://issues.apache.org/jira/browse/HDFS-5609 Project: Hadoop HDFS Issue Type: Sub-task Components: webhdfs Affects Versions: HDFS ACLs (HDFS-4685) Reporter: Chris Nauroth Implement and test {{MODIFYACLENTRIES}}, {{REMOVEACLENTRIES}} and {{REMOVEACL}} in WebHDFS. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Updated] (HDFS-5613) NameNode: implement handling of ACLs in combination with symlinks.
[ https://issues.apache.org/jira/browse/HDFS-5613?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris Nauroth updated HDFS-5613: Target Version/s: HDFS ACLs (HDFS-4685) Affects Version/s: HDFS ACLs (HDFS-4685) NameNode: implement handling of ACLs in combination with symlinks. -- Key: HDFS-5613 URL: https://issues.apache.org/jira/browse/HDFS-5613 Project: Hadoop HDFS Issue Type: Sub-task Components: namenode Affects Versions: HDFS ACLs (HDFS-4685) Reporter: Chris Nauroth A symlink in HDFS does not have an ACL of its own. Operations that modify the ACL of a symlink instead modify the target of the symlink. For operations that enforce ACLs, enforcement is on the target of the symlink. This is similar to existing handling of permissions for symlinks. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Updated] (HDFS-5618) NameNode: persist ACLs in fsimage.
[ https://issues.apache.org/jira/browse/HDFS-5618?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris Nauroth updated HDFS-5618: Target Version/s: HDFS ACLs (HDFS-4685) Affects Version/s: HDFS ACLs (HDFS-4685) NameNode: persist ACLs in fsimage. -- Key: HDFS-5618 URL: https://issues.apache.org/jira/browse/HDFS-5618 Project: Hadoop HDFS Issue Type: Sub-task Components: namenode Affects Versions: HDFS ACLs (HDFS-4685) Reporter: Chris Nauroth Store ACLs in fsimage so that ACLs are retained across NameNode restarts. This requires encoding and saving the {{AclManager}} state as a new section of the fsimage, located after all existing sections (snapshot manager state, inodes, secret manager state, and cache manager state). -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Updated] (HDFS-5615) NameNode: implement handling of ACLs in combination with sticky bit.
[ https://issues.apache.org/jira/browse/HDFS-5615?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris Nauroth updated HDFS-5615: Target Version/s: HDFS ACLs (HDFS-4685) Affects Version/s: HDFS ACLs (HDFS-4685) NameNode: implement handling of ACLs in combination with sticky bit. Key: HDFS-5615 URL: https://issues.apache.org/jira/browse/HDFS-5615 Project: Hadoop HDFS Issue Type: Sub-task Components: namenode Affects Versions: HDFS ACLs (HDFS-4685) Reporter: Chris Nauroth The sticky bit must work in combination with ACLs, similar to how the sticky bit already works with permissions. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Updated] (HDFS-5600) FsShell CLI: add getfacl and setfacl with minimal support for getting and setting ACLs.
[ https://issues.apache.org/jira/browse/HDFS-5600?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris Nauroth updated HDFS-5600: Target Version/s: HDFS ACLs (HDFS-4685) Affects Version/s: HDFS ACLs (HDFS-4685) FsShell CLI: add getfacl and setfacl with minimal support for getting and setting ACLs. --- Key: HDFS-5600 URL: https://issues.apache.org/jira/browse/HDFS-5600 Project: Hadoop HDFS Issue Type: Sub-task Components: tools Affects Versions: HDFS ACLs (HDFS-4685) Reporter: Chris Nauroth Implement and test FsShell CLI commands for getfacl and setfacl. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Updated] (HDFS-5610) WebHDFS: implement REMOVEDEFAULTACL.
[ https://issues.apache.org/jira/browse/HDFS-5610?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris Nauroth updated HDFS-5610: Target Version/s: HDFS ACLs (HDFS-4685) Affects Version/s: HDFS ACLs (HDFS-4685) WebHDFS: implement REMOVEDEFAULTACL. Key: HDFS-5610 URL: https://issues.apache.org/jira/browse/HDFS-5610 Project: Hadoop HDFS Issue Type: Sub-task Components: webhdfs Affects Versions: HDFS ACLs (HDFS-4685) Reporter: Chris Nauroth Implement and test {{REMOVEDEFAULTACL}} in WebHDFS. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Updated] (HDFS-5620) NameNode: enhance AclManager to use Global ACL Set as a space optimization.
[ https://issues.apache.org/jira/browse/HDFS-5620?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris Nauroth updated HDFS-5620: Target Version/s: HDFS ACLs (HDFS-4685) Affects Version/s: HDFS ACLs (HDFS-4685) NameNode: enhance AclManager to use Global ACL Set as a space optimization. --- Key: HDFS-5620 URL: https://issues.apache.org/jira/browse/HDFS-5620 Project: Hadoop HDFS Issue Type: Sub-task Components: namenode Affects Versions: HDFS ACLs (HDFS-4685) Reporter: Chris Nauroth The {{AclManager}} can maintain a Global ACL Set to store all distinct ACLs in use by the file system. All inodes that have the same ACL entries can share the same ACL instance. -- This message was sent by Atlassian JIRA (v6.1#6144)
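The Global ACL Set is essentially interning: equal ACLs collapse to one shared canonical instance. A minimal sketch under that assumption, again with toy string entries and invented names rather than the real AclManager types:

```java
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical sketch of the Global ACL Set space optimization: inodes
// whose ACL entries are equal share a single stored instance.
public class GlobalAclSet {
    private final Map<List<String>, List<String>> interned = new ConcurrentHashMap<>();

    // Return the canonical instance for this list of entries.
    public List<String> intern(List<String> entries) {
        return interned.computeIfAbsent(List.copyOf(entries), k -> k);
    }

    public int size() { return interned.size(); }

    public static void main(String[] args) {
        GlobalAclSet set = new GlobalAclSet();
        List<String> a = set.intern(List.of("user:bruce:rw-", "group::r--"));
        List<String> b = set.intern(List.of("user:bruce:rw-", "group::r--"));
        if (a != b) throw new AssertionError();       // same shared instance
        if (set.size() != 1) throw new AssertionError();
        System.out.println("ok");
    }
}
```

A production version would also need reference counting so unreferenced ACLs can be evicted, but that bookkeeping is omitted here.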
[jira] [Updated] (HDFS-5612) NameNode: change all permission checks to enforce ACLs in addition to permissions.
[ https://issues.apache.org/jira/browse/HDFS-5612?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris Nauroth updated HDFS-5612: Target Version/s: HDFS ACLs (HDFS-4685) Affects Version/s: HDFS ACLs (HDFS-4685) NameNode: change all permission checks to enforce ACLs in addition to permissions. -- Key: HDFS-5612 URL: https://issues.apache.org/jira/browse/HDFS-5612 Project: Hadoop HDFS Issue Type: Sub-task Components: namenode Affects Versions: HDFS ACLs (HDFS-4685) Reporter: Chris Nauroth All {{NameNode}} code paths that enforce permissions must be updated so that they also enforce ACLs. -- This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Updated] (HDFS-5614) NameNode: implement handling of ACLs in combination with snapshots.
[ https://issues.apache.org/jira/browse/HDFS-5614?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris Nauroth updated HDFS-5614: Target Version/s: HDFS ACLs (HDFS-4685) Affects Version/s: HDFS ACLs (HDFS-4685) NameNode: implement handling of ACLs in combination with snapshots. --- Key: HDFS-5614 URL: https://issues.apache.org/jira/browse/HDFS-5614 Project: Hadoop HDFS Issue Type: Sub-task Components: namenode Affects Versions: HDFS ACLs (HDFS-4685) Reporter: Chris Nauroth Within a snapshot, all ACLs are frozen at the moment that the snapshot was created. ACL changes in the parent of the snapshot are not applied to the snapshot. -- This message was sent by Atlassian JIRA (v6.1#6144)