[jira] [Commented] (HDFS-6475) WebHdfs clients fail without retry because incorrect handling of StandbyException
[ https://issues.apache.org/jira/browse/HDFS-6475?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14043082#comment-14043082 ] Yongjun Zhang commented on HDFS-6475: - Thanks a lot [~atm]! Many thanks to [~daryn] and [~jingzhao] for the review and comments. I will follow up with the getTrueCause issue in HDFS-6588. WebHdfs clients fail without retry because incorrect handling of StandbyException - Key: HDFS-6475 URL: https://issues.apache.org/jira/browse/HDFS-6475 Project: Hadoop HDFS Issue Type: Bug Components: ha, webhdfs Affects Versions: 2.4.0 Reporter: Yongjun Zhang Assignee: Yongjun Zhang Fix For: 2.5.0 Attachments: HDFS-6475.001.patch, HDFS-6475.002.patch, HDFS-6475.003.patch, HDFS-6475.003.patch, HDFS-6475.004.patch, HDFS-6475.005.patch, HDFS-6475.006.patch, HDFS-6475.007.patch, HDFS-6475.008.patch, HDFS-6475.009.patch With WebHdfs clients connected to an HA HDFS service, the delegation token was previously initialized with the active NN. When a client tries to issue a request, the NNs it can contact are stored in a map returned by DFSUtil.getNNServiceRpcAddresses(conf), and the client contacts them in order, so the first NN it reaches is likely the standby NN. If the standby NN doesn't have the updated client credential, it throws a SecurityException that wraps a StandbyException. The client is expected to retry another NN, but due to the insufficient handling of the SecurityException mentioned above, it fails instead.
Example message: {code}
{RemoteException={message=Failed to obtain user group information: org.apache.hadoop.security.token.SecretManager$InvalidToken: StandbyException, javaClassName=java.lang.SecurityException, exception=SecurityException}}
org.apache.hadoop.ipc.RemoteException(java.lang.SecurityException): Failed to obtain user group information: org.apache.hadoop.security.token.SecretManager$InvalidToken: StandbyException
	at org.apache.hadoop.hdfs.web.JsonUtil.toRemoteException(JsonUtil.java:159)
	at org.apache.hadoop.hdfs.web.WebHdfsFileSystem.validateResponse(WebHdfsFileSystem.java:325)
	at org.apache.hadoop.hdfs.web.WebHdfsFileSystem.access$700(WebHdfsFileSystem.java:107)
	at org.apache.hadoop.hdfs.web.WebHdfsFileSystem$AbstractRunner.getResponse(WebHdfsFileSystem.java:635)
	at org.apache.hadoop.hdfs.web.WebHdfsFileSystem$AbstractRunner.run(WebHdfsFileSystem.java:542)
	at org.apache.hadoop.hdfs.web.WebHdfsFileSystem.run(WebHdfsFileSystem.java:431)
	at org.apache.hadoop.hdfs.web.WebHdfsFileSystem.getHdfsFileStatus(WebHdfsFileSystem.java:685)
	at org.apache.hadoop.hdfs.web.WebHdfsFileSystem.getFileStatus(WebHdfsFileSystem.java:696)
	at kclient1.kclient$1.run(kclient.java:64)
	at java.security.AccessController.doPrivileged(Native Method)
	at javax.security.auth.Subject.doAs(Subject.java:356)
	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1528)
	at kclient1.kclient.main(kclient.java:58)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:606)
	at org.apache.hadoop.util.RunJar.main(RunJar.java:212)
{code} -- This message was sent by Atlassian JIRA (v6.2#6252)
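To make the broken retry concrete, here is a minimal sketch of the kind of check the client side needs (a hypothetical helper for illustration, not the actual HDFS-6475 patch): a StandbyException that should trigger failover may survive only as text inside a wrapping SecurityException's message, so the retry decision has to look through the wrapper.

```java
// Hypothetical sketch (not the HDFS-6475 patch): walk an exception's
// cause chain and messages to decide whether a wrapped StandbyException
// should trigger a retry against another NameNode.
public class RetryDecision {
    static final String STANDBY = "StandbyException";

    // Returns true if the exception, or anything it wraps, is or
    // mentions a StandbyException -- the signal to fail over.
    public static boolean isRetriable(Throwable t) {
        for (Throwable cur = t; cur != null; cur = cur.getCause()) {
            if (cur.getClass().getSimpleName().equals(STANDBY)) {
                return true;
            }
            String msg = cur.getMessage();
            if (msg != null && msg.contains(STANDBY)) {
                return true;
            }
        }
        return false;
    }
}
```

A client loop that consults such a check after each failed request could move on to the next NN in the map instead of surfacing the SecurityException directly.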
[jira] [Commented] (HDFS-6595) Configure the maximum threads allowed for balancing on datanodes
[ https://issues.apache.org/jira/browse/HDFS-6595?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14043087#comment-14043087 ] Hadoop QA commented on HDFS-6595: - {color:green}+1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12652334/HDFS-6595.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 1 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 core tests{color}. The patch passed unit tests in hadoop-hdfs-project/hadoop-hdfs. {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/7228//testReport/ Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/7228//console This message is automatically generated. Configure the maximum threads allowed for balancing on datanodes Key: HDFS-6595 URL: https://issues.apache.org/jira/browse/HDFS-6595 Project: Hadoop HDFS Issue Type: Improvement Components: datanode Reporter: Benoy Antony Assignee: Benoy Antony Attachments: HDFS-6595.patch, HDFS-6595.patch Currently the datanode allows a max of 5 threads to be used for balancing. In some cases, it may make sense to use a different number of threads for moving blocks. -- This message was sent by Atlassian JIRA (v6.2#6252)
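As a sketch of what such a knob could look like in hdfs-site.xml — the property name below is an assumption chosen for illustration, patterned after existing datanode balancing settings, and is not necessarily the name the patch finally uses:

```xml
<!-- Hypothetical example: cap the number of threads a datanode may
     devote to balancer-driven block moves (the hard-coded default
     described above is 5). -->
<property>
  <name>dfs.datanode.balance.max.concurrent.moves</name>
  <value>10</value>
</property>
```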
[jira] [Commented] (HDFS-2856) Fix block protocol so that Datanodes don't require root or jsvc
[ https://issues.apache.org/jira/browse/HDFS-2856?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14043160#comment-14043160 ] Hadoop QA commented on HDFS-2856: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12652338/HDFS-2856.4.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 7 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:red}-1 core tests{color}. The patch failed these unit tests in hadoop-hdfs-project/hadoop-hdfs:
org.apache.hadoop.hdfs.server.namenode.ha.TestPipelinesFailover
org.apache.hadoop.hdfs.server.balancer.TestBalancerWithSaslDataTransfer
{color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/7229//testReport/ Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/7229//console This message is automatically generated. 
Fix block protocol so that Datanodes don't require root or jsvc --- Key: HDFS-2856 URL: https://issues.apache.org/jira/browse/HDFS-2856 Project: Hadoop HDFS Issue Type: Improvement Components: datanode, security Affects Versions: 3.0.0, 2.4.0 Reporter: Owen O'Malley Assignee: Chris Nauroth Attachments: Datanode-Security-Design.pdf, Datanode-Security-Design.pdf, Datanode-Security-Design.pdf, HDFS-2856-Test-Plan-1.pdf, HDFS-2856.1.patch, HDFS-2856.2.patch, HDFS-2856.3.patch, HDFS-2856.4.patch, HDFS-2856.prototype.patch Since we send the block tokens unencrypted to the datanode, we currently start the datanode as root using jsvc to get a secure (< 1024) port. If we have the datanode generate a nonce and send it on the connection, and the client sends an HMAC of the nonce back instead of the block token, it won't reveal any secrets. Thus, we wouldn't require a secure port and would not require root or jsvc. -- This message was sent by Atlassian JIRA (v6.2#6252)
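The challenge-response idea above can be sketched generically (an illustrative sketch of the technique, not the actual DataTransferProtocol change; class and method names are hypothetical): the datanode issues a random nonce, the client proves knowledge of the shared block-token secret by returning an HMAC of the nonce, and the secret itself never crosses the wire.

```java
import java.security.MessageDigest;
import java.security.SecureRandom;
import javax.crypto.Mac;
import javax.crypto.spec.SecretKeySpec;

// Hypothetical sketch of a nonce/HMAC challenge-response.
public class NonceChallenge {
    // Datanode side: generate a fresh random challenge per connection.
    public static byte[] newNonce() {
        byte[] nonce = new byte[16];
        new SecureRandom().nextBytes(nonce);
        return nonce;
    }

    // Client side: answer the challenge with an HMAC keyed by the
    // shared secret, instead of sending the secret itself.
    public static byte[] hmac(byte[] secret, byte[] nonce) {
        try {
            Mac mac = Mac.getInstance("HmacSHA256");
            mac.init(new SecretKeySpec(secret, "HmacSHA256"));
            return mac.doFinal(nonce);
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    // Datanode side: recompute and compare in constant time.
    public static boolean verify(byte[] secret, byte[] nonce, byte[] reply) {
        return MessageDigest.isEqual(hmac(secret, nonce), reply);
    }
}
```

Because only a keyed digest of the nonce is transmitted, an eavesdropper learns nothing useful, which is what removes the need for a privileged port.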
[jira] [Updated] (HDFS-6591) while loop is executed tens of thousands of times in Hedged Read
[ https://issues.apache.org/jira/browse/HDFS-6591?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Liang Xie updated HDFS-6591: Attachment: HDFS-6591.txt while loop is executed tens of thousands of times in Hedged Read -- Key: HDFS-6591 URL: https://issues.apache.org/jira/browse/HDFS-6591 Project: Hadoop HDFS Issue Type: Bug Components: hdfs-client Affects Versions: 2.4.0 Reporter: LiuLei Assignee: Liang Xie Attachments: HDFS-6591.txt, LoopTooManyTimesTestCase.patch I downloaded the hadoop-2.4.1-rc1 code from http://people.apache.org/~acmurthy/hadoop-2.4.1-rc1/ and tested Hedged Read. I found that the while loop in the hedgedFetchBlockByteRange method is executed tens of thousands of times. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HDFS-6591) while loop is executed tens of thousands of times in Hedged Read
[ https://issues.apache.org/jira/browse/HDFS-6591?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Liang Xie updated HDFS-6591: Attachment: (was: HDFS-6591.txt) while loop is executed tens of thousands of times in Hedged Read -- Key: HDFS-6591 URL: https://issues.apache.org/jira/browse/HDFS-6591 Project: Hadoop HDFS Issue Type: Bug Components: hdfs-client Affects Versions: 2.4.0 Reporter: LiuLei Assignee: Liang Xie Attachments: HDFS-6591.txt, LoopTooManyTimesTestCase.patch I downloaded the hadoop-2.4.1-rc1 code from http://people.apache.org/~acmurthy/hadoop-2.4.1-rc1/ and tested Hedged Read. I found that the while loop in the hedgedFetchBlockByteRange method is executed tens of thousands of times. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HDFS-6596) Improve InputStream when read spans two blocks
[ https://issues.apache.org/jira/browse/HDFS-6596?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zesheng Wu updated HDFS-6596: - Attachment: HDFS-6596.1.patch An initial patch. Since {{DataInputStream#readFully(byte[], int, int)}} is final and {{FSDataInputStream}} can't override it, we implement a readFully with ByteBuffer. Improve InputStream when read spans two blocks -- Key: HDFS-6596 URL: https://issues.apache.org/jira/browse/HDFS-6596 Project: Hadoop HDFS Issue Type: Improvement Components: hdfs-client Affects Versions: 2.4.0 Reporter: Zesheng Wu Assignee: Zesheng Wu Attachments: HDFS-6596.1.patch In the current implementation of DFSInputStream, read(buffer, offset, length) is implemented as follows: {code}
int realLen = (int) Math.min(len, (blockEnd - pos + 1L));
if (locatedBlocks.isLastBlockComplete()) {
  realLen = (int) Math.min(realLen, locatedBlocks.getFileLength());
}
int result = readBuffer(strategy, off, realLen, corruptedBlockMap);
{code} From the above code, we can conclude that read will return at most (blockEnd - pos + 1) bytes. As a result, when a read spans two blocks, the caller must call read() a second time to complete the request, and must wait a second time to acquire the DFSInputStream lock (read() is synchronized for DFSInputStream). For latency-sensitive applications, such as HBase, this becomes a latency pain point under heavy contention. So here we propose that read() should loop internally to do a best-effort read. The current implementation of pread (read(position, buffer, offset, length)) already loops internally to do a best-effort read, so we can refactor to support this on the normal read path. -- This message was sent by Atlassian JIRA (v6.2#6252)
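The internal loop proposed above can be sketched with a generic helper over java.io.InputStream (a hypothetical illustration of the looping idea, not the actual DFSInputStream change): keep calling read() until the requested length is satisfied or EOF is reached, so a request spanning a block boundary completes in one caller-visible call instead of two.

```java
import java.io.IOException;
import java.io.InputStream;

// Hypothetical best-effort read loop: repeatedly call read() until
// `len` bytes are gathered or the stream hits EOF.
public class BestEffortRead {
    public static int readFully(InputStream in, byte[] buf, int off, int len)
            throws IOException {
        int total = 0;
        while (total < len) {
            int n = in.read(buf, off + total, len - total);
            if (n < 0) {           // EOF: return whatever we got
                break;
            }
            total += n;
        }
        return total;
    }
}
```

In DFSInputStream the analogous loop would sit inside the synchronized read(), so the lock is acquired once per caller request rather than once per block.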
[jira] [Commented] (HDFS-6591) while loop is executed tens of thousands of times in Hedged Read
[ https://issues.apache.org/jira/browse/HDFS-6591?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14043285#comment-14043285 ] Liang Xie commented on HDFS-6591: - Retry. Debugging revealed a rare race: the CountDownLatch is counted down inside the Callable, but there's no guarantee that when countDown has happened, one of the tasks is done. See: http://stackoverflow.com/questions/9604713/future-isdone-returns-false-even-if-the-task-is-done . Really tricky... I rewrote the synchronization-related code in the latest patch, and all the TestPread cases passed in a shell loop :) while loop is executed tens of thousands of times in Hedged Read -- Key: HDFS-6591 URL: https://issues.apache.org/jira/browse/HDFS-6591 Project: Hadoop HDFS Issue Type: Bug Components: hdfs-client Affects Versions: 2.4.0 Reporter: LiuLei Assignee: Liang Xie Attachments: HDFS-6591.txt, LoopTooManyTimesTestCase.patch I downloaded the hadoop-2.4.1-rc1 code from http://people.apache.org/~acmurthy/hadoop-2.4.1-rc1/ and tested Hedged Read. I found that the while loop in the hedgedFetchBlockByteRange method is executed tens of thousands of times. -- This message was sent by Atlassian JIRA (v6.2#6252)
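The race described above can be sketched generically (a hypothetical illustration, not the actual patch): a CountDownLatch counted down *inside* a Callable fires before the executor marks the corresponding Future done, so a waiter can wake up, poll Future.isDone(), see false, and spin. A CompletionService sidesteps the race because take() only ever hands back Futures that are already complete.

```java
import java.util.concurrent.CompletionService;
import java.util.concurrent.ExecutorCompletionService;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Future;

// Hypothetical sketch: wait for the first hedged request to finish
// without polling isDone() on individual Futures.
public class HedgedWait {
    public static String firstResult(ExecutorService pool) {
        try {
            CompletionService<String> cs = new ExecutorCompletionService<>(pool);
            cs.submit(() -> "replica-1");   // hedged request to one replica
            cs.submit(() -> "replica-2");   // hedged request to another
            Future<String> done = cs.take(); // returns only completed Futures
            return done.get();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }
}
```

With this shape, the waiter is notified by the executor after the Future transitions to done, so the "countDown happened but isDone() is false" window never opens.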
[jira] [Updated] (HDFS-6591) while loop is executed tens of thousands of times in Hedged Read
[ https://issues.apache.org/jira/browse/HDFS-6591?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Liang Xie updated HDFS-6591: Attachment: (was: HDFS-6591.txt) while loop is executed tens of thousands of times in Hedged Read -- Key: HDFS-6591 URL: https://issues.apache.org/jira/browse/HDFS-6591 Project: Hadoop HDFS Issue Type: Bug Components: hdfs-client Affects Versions: 2.4.0 Reporter: LiuLei Assignee: Liang Xie Attachments: HDFS-6591.txt, LoopTooManyTimesTestCase.patch I downloaded the hadoop-2.4.1-rc1 code from http://people.apache.org/~acmurthy/hadoop-2.4.1-rc1/ and tested Hedged Read. I found that the while loop in the hedgedFetchBlockByteRange method is executed tens of thousands of times. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HDFS-6591) while loop is executed tens of thousands of times in Hedged Read
[ https://issues.apache.org/jira/browse/HDFS-6591?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Liang Xie updated HDFS-6591: Attachment: HDFS-6591.txt while loop is executed tens of thousands of times in Hedged Read -- Key: HDFS-6591 URL: https://issues.apache.org/jira/browse/HDFS-6591 Project: Hadoop HDFS Issue Type: Bug Components: hdfs-client Affects Versions: 2.4.0 Reporter: LiuLei Assignee: Liang Xie Attachments: HDFS-6591.txt, LoopTooManyTimesTestCase.patch I downloaded the hadoop-2.4.1-rc1 code from http://people.apache.org/~acmurthy/hadoop-2.4.1-rc1/ and tested Hedged Read. I found that the while loop in the hedgedFetchBlockByteRange method is executed tens of thousands of times. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6587) Bug in TestBPOfferService can cause test failure
[ https://issues.apache.org/jira/browse/HDFS-6587?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14043323#comment-14043323 ] Hudson commented on HDFS-6587: -- FAILURE: Integrated in Hadoop-Yarn-trunk #594 (See [https://builds.apache.org/job/Hadoop-Yarn-trunk/594/]) HDFS-6587. Fix a typo in message issued from explorer.js. Contributed by Yongjun Zhang. (wheat9: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1605184)
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/webapps/hdfs/explorer.js
Bug in TestBPOfferService can cause test failure Key: HDFS-6587 URL: https://issues.apache.org/jira/browse/HDFS-6587 Project: Hadoop HDFS Issue Type: Bug Components: test Affects Versions: 2.4.0 Reporter: Zhilei Xu Assignee: Zhilei Xu Fix For: 3.0.0, 2.5.0 Attachments: patch_TestBPOfferService.txt We need to fix a bug in TestBPOfferService#waitForBlockReceived that fails on trunk, e.g. in Build #1781. Details: in this test, the utility function waitForBlockReceived() has a bug: the parameter mockNN is never used; the hard-coded mockNN1 is used instead. This bug introduces nondeterministic test failures when testBasicFunctionality() calls ret = waitForBlockReceived(FAKE_BLOCK, mockNN2); and the call finishes before the actual interaction with mockNN2 happens. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6593) Move SnapshotDiffInfo out of INodeDirectorySnapshottable
[ https://issues.apache.org/jira/browse/HDFS-6593?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14043335#comment-14043335 ] Hudson commented on HDFS-6593: -- FAILURE: Integrated in Hadoop-Yarn-trunk #594 (See [https://builds.apache.org/job/Hadoop-Yarn-trunk/594/]) HDFS-6593. Move SnapshotDiffInfo out of INodeDirectorySnapshottable. Contributed by Jing Zhao. (wheat9: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1605169)
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/protocol/SnapshotDiffReport.java
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSNamesystem.java
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/snapshot/DirectoryWithSnapshotFeature.java
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/snapshot/INodeDirectorySnapshottable.java
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/snapshot/SnapshotDiffInfo.java
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/snapshot/SnapshotManager.java
Move SnapshotDiffInfo out of INodeDirectorySnapshottable Key: HDFS-6593 URL: https://issues.apache.org/jira/browse/HDFS-6593 Project: Hadoop HDFS Issue Type: Improvement Components: namenode, snapshots Reporter: Jing Zhao Assignee: Jing Zhao Priority: Minor Fix For: 2.5.0 Attachments: HDFS-6593.000.patch, HDFS-6593.001.patch, HDFS-6593.002.patch Per discussion in HDFS-4667, we can move SnapshotDiffInfo out of INodeDirectorySnapshottable as an individual class. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6486) Add user doc for XAttrs via WebHDFS.
[ https://issues.apache.org/jira/browse/HDFS-6486?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14043327#comment-14043327 ] Hudson commented on HDFS-6486: -- FAILURE: Integrated in Hadoop-Yarn-trunk #594 (See [https://builds.apache.org/job/Hadoop-Yarn-trunk/594/]) HDFS-6486. Add user doc for XAttrs via WebHDFS. Contributed by Yi Liu. (umamahesh: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1605062)
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/site/apt/WebHDFS.apt.vm
Add user doc for XAttrs via WebHDFS. Key: HDFS-6486 URL: https://issues.apache.org/jira/browse/HDFS-6486 Project: Hadoop HDFS Issue Type: Task Components: webhdfs Affects Versions: 3.0.0 Reporter: Yi Liu Assignee: Yi Liu Priority: Minor Fix For: 3.0.0, 2.5.0 Attachments: HDFS-6486.patch Add the user doc for XAttrs via WebHDFS. Set xattr: {code}
curl -i -X PUT 'http://HOST:PORT/webhdfs/v1/PATH?op=SETXATTR&xattr.name=XATTRNAME&xattr.value=XATTRVALUE&flag=FLAG'
{code} Remove xattr: {code}
curl -i -X PUT 'http://HOST:PORT/webhdfs/v1/PATH?op=REMOVEXATTR&xattr.name=XATTRNAME'
{code} Get an xattr: {code}
curl -i 'http://HOST:PORT/webhdfs/v1/PATH?op=GETXATTRS&xattr.name=XATTRNAME&encoding=ENCODING'
{code} Get multiple xattrs (XATTRNAME1, XATTRNAME2, XATTRNAME3): {code}
curl -i 'http://HOST:PORT/webhdfs/v1/PATH?op=GETXATTRS&xattr.name=XATTRNAME1&xattr.name=XATTRNAME2&xattr.name=XATTRNAME3&encoding=ENCODING'
{code} Get all xattrs: {code}
curl -i 'http://HOST:PORT/webhdfs/v1/PATH?op=GETXATTRS&encoding=ENCODING'
{code} List xattrs: {code}
curl -i 'http://HOST:PORT/webhdfs/v1/PATH?op=LISTXATTRS'
{code} -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6430) HTTPFS - Implement XAttr support
[ https://issues.apache.org/jira/browse/HDFS-6430?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14043337#comment-14043337 ] Hudson commented on HDFS-6430: -- FAILURE: Integrated in Hadoop-Yarn-trunk #594 (See [https://builds.apache.org/job/Hadoop-Yarn-trunk/594/]) HDFS-6430. HTTPFS - Implement XAttr support. (Yi Liu via tucu) (tucu: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1605118)
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs-httpfs/src/main/java/org/apache/hadoop/fs/http/client/HttpFSFileSystem.java
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs-httpfs/src/main/java/org/apache/hadoop/fs/http/client/HttpFSUtils.java
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs-httpfs/src/main/java/org/apache/hadoop/fs/http/server/FSOperations.java
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs-httpfs/src/main/java/org/apache/hadoop/fs/http/server/HttpFSParametersProvider.java
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs-httpfs/src/main/java/org/apache/hadoop/fs/http/server/HttpFSServer.java
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs-httpfs/src/main/java/org/apache/hadoop/lib/wsrs/EnumSetParam.java
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs-httpfs/src/main/java/org/apache/hadoop/lib/wsrs/Parameters.java
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs-httpfs/src/main/java/org/apache/hadoop/lib/wsrs/ParametersProvider.java
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs-httpfs/src/test/java/org/apache/hadoop/fs/http/client/BaseTestHttpFSWith.java
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs-httpfs/src/test/java/org/apache/hadoop/fs/http/server/TestHttpFSServer.java
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs-httpfs/src/test/java/org/apache/hadoop/fs/http/server/TestHttpFSServerNoXAttrs.java
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs-httpfs/src/test/java/org/apache/hadoop/test/TestHdfsHelper.java
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
HTTPFS - Implement XAttr support Key: HDFS-6430 URL: https://issues.apache.org/jira/browse/HDFS-6430 Project: Hadoop HDFS Issue Type: Task Affects Versions: 3.0.0 Reporter: Yi Liu Assignee: Yi Liu Fix For: 2.5.0 Attachments: HDFS-6430.1.patch, HDFS-6430.2.patch, HDFS-6430.3.patch, HDFS-6430.4.patch, HDFS-6430.5.patch, HDFS-6430.patch Add xattr support to HttpFS. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HDFS-6596) Improve InputStream when read spans two blocks
[ https://issues.apache.org/jira/browse/HDFS-6596?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zesheng Wu updated HDFS-6596: - Status: Patch Available (was: Open) Improve InputStream when read spans two blocks -- Key: HDFS-6596 URL: https://issues.apache.org/jira/browse/HDFS-6596 Project: Hadoop HDFS Issue Type: Improvement Components: hdfs-client Affects Versions: 2.4.0 Reporter: Zesheng Wu Assignee: Zesheng Wu Attachments: HDFS-6596.1.patch In the current implementation of DFSInputStream, read(buffer, offset, length) is implemented as follows: {code}
int realLen = (int) Math.min(len, (blockEnd - pos + 1L));
if (locatedBlocks.isLastBlockComplete()) {
  realLen = (int) Math.min(realLen, locatedBlocks.getFileLength());
}
int result = readBuffer(strategy, off, realLen, corruptedBlockMap);
{code} From the above code, we can conclude that read will return at most (blockEnd - pos + 1) bytes. As a result, when a read spans two blocks, the caller must call read() a second time to complete the request, and must wait a second time to acquire the DFSInputStream lock (read() is synchronized for DFSInputStream). For latency-sensitive applications, such as HBase, this becomes a latency pain point under heavy contention. So here we propose that read() should loop internally to do a best-effort read. The current implementation of pread (read(position, buffer, offset, length)) already loops internally to do a best-effort read, so we can refactor to support this on the normal read path. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6556) Refine XAttr permissions
[ https://issues.apache.org/jira/browse/HDFS-6556?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14043365#comment-14043365 ] Uma Maheswara Rao G commented on HDFS-6556: --- Thanks a lot, Yi, for the review. [~andrew.wang] or [~cnauroth], do you want to take a look? If either of you +1, I can go ahead and commit. Refine XAttr permissions Key: HDFS-6556 URL: https://issues.apache.org/jira/browse/HDFS-6556 Project: Hadoop HDFS Issue Type: Bug Components: namenode Affects Versions: 2.5.0 Reporter: Yi Liu Assignee: Uma Maheswara Rao G Attachments: RefinedPermissions-HDFS-6556-1.patch, RefinedPermissions-HDFS-6556.patch, refinedPermissions-HDFS-6556-2.patch After discussing with Uma, we should refine the permissions for setting {{user}} and {{trusted}} namespace xattrs. *1.* For {{user}} namespace xattrs: HDFS-6374 says setXAttr should require the user to be the owner of the file or directory, but we had a bit of a misunderstanding. It actually is: {quote} The access permissions for user attributes are defined by the file permission bits. Only regular files and directories can have extended attributes. For sticky directories, only the owner and privileged user can write attributes. {quote} We can refer to the Linux source code at http://lxr.free-electrons.com/source/fs/xattr.c?v=2.6.35 I also checked in Linux: access is controlled by the file permission bits for regular files and non-sticky directories. *2.* For the {{trusted}} namespace, we currently require the user to be both the owner and a superuser. Actually, superuser is enough. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6591) while loop is executed tens of thousands of times in Hedged Read
[ https://issues.apache.org/jira/browse/HDFS-6591?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14043411#comment-14043411 ] Hadoop QA commented on HDFS-6591: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12652381/HDFS-6591.txt against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 1 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:red}-1 core tests{color}. The patch failed these unit tests in hadoop-hdfs-project/hadoop-hdfs:
org.apache.hadoop.hdfs.server.namenode.TestCacheDirectives
{color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/7230//testReport/ Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/7230//console This message is automatically generated. while loop is executed tens of thousands of times in Hedged Read -- Key: HDFS-6591 URL: https://issues.apache.org/jira/browse/HDFS-6591 Project: Hadoop HDFS Issue Type: Bug Components: hdfs-client Affects Versions: 2.4.0 Reporter: LiuLei Assignee: Liang Xie Attachments: HDFS-6591.txt, LoopTooManyTimesTestCase.patch I downloaded the hadoop-2.4.1-rc1 code from http://people.apache.org/~acmurthy/hadoop-2.4.1-rc1/ and tested Hedged Read. I found that the while loop in the hedgedFetchBlockByteRange method is executed tens of thousands of times. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6591) while loop is executed tens of thousands of times in Hedged Read
[ https://issues.apache.org/jira/browse/HDFS-6591?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14043433#comment-14043433 ] Hadoop QA commented on HDFS-6591: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12652383/HDFS-6591.txt against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 1 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:red}-1 core tests{color}. The patch failed these unit tests in hadoop-hdfs-project/hadoop-hdfs:
org.apache.hadoop.hdfs.server.balancer.TestBalancer
{color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/7232//testReport/ Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/7232//console This message is automatically generated. while loop is executed tens of thousands of times in Hedged Read -- Key: HDFS-6591 URL: https://issues.apache.org/jira/browse/HDFS-6591 Project: Hadoop HDFS Issue Type: Bug Components: hdfs-client Affects Versions: 2.4.0 Reporter: LiuLei Assignee: Liang Xie Attachments: HDFS-6591.txt, LoopTooManyTimesTestCase.patch I downloaded the hadoop-2.4.1-rc1 code from http://people.apache.org/~acmurthy/hadoop-2.4.1-rc1/ and tested Hedged Read. I found that the while loop in the hedgedFetchBlockByteRange method is executed tens of thousands of times. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6591) while loop is executed tens of thousands of times in Hedged Read
[ https://issues.apache.org/jira/browse/HDFS-6591?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14043443#comment-14043443 ] Hadoop QA commented on HDFS-6591: - {color:green}+1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12652383/HDFS-6591.txt against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 1 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 core tests{color}. The patch passed unit tests in hadoop-hdfs-project/hadoop-hdfs. {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/7231//testReport/ Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/7231//console This message is automatically generated. while loop is executed tens of thousands of times in Hedged Read -- Key: HDFS-6591 URL: https://issues.apache.org/jira/browse/HDFS-6591 Project: Hadoop HDFS Issue Type: Bug Components: hdfs-client Affects Versions: 2.4.0 Reporter: LiuLei Assignee: Liang Xie Attachments: HDFS-6591.txt, LoopTooManyTimesTestCase.patch I downloaded the hadoop-2.4.1-rc1 code from http://people.apache.org/~acmurthy/hadoop-2.4.1-rc1/ and tested Hedged Read. I found that the while loop in the hedgedFetchBlockByteRange method is executed tens of thousands of times. 
-- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6475) WebHdfs clients fail without retry because incorrect handling of StandbyException
[ https://issues.apache.org/jira/browse/HDFS-6475?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14043493#comment-14043493 ] Hudson commented on HDFS-6475: -- SUCCESS: Integrated in Hadoop-Hdfs-trunk #1785 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk/1785/]) HDFS-6475. WebHdfs clients fail without retry because incorrect handling of StandbyException. Contributed by Yongjun Zhang. (atm: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1605217) * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/web/resources/ExceptionHandler.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/ha/TestDelegationTokensWithHA.java WebHdfs clients fail without retry because incorrect handling of StandbyException - Key: HDFS-6475 URL: https://issues.apache.org/jira/browse/HDFS-6475 Project: Hadoop HDFS Issue Type: Bug Components: ha, webhdfs Affects Versions: 2.4.0 Reporter: Yongjun Zhang Assignee: Yongjun Zhang Fix For: 2.5.0 Attachments: HDFS-6475.001.patch, HDFS-6475.002.patch, HDFS-6475.003.patch, HDFS-6475.003.patch, HDFS-6475.004.patch, HDFS-6475.005.patch, HDFS-6475.006.patch, HDFS-6475.007.patch, HDFS-6475.008.patch, HDFS-6475.009.patch With WebHdfs clients connected to an HA HDFS service, the delegation token is initially obtained from the active NN. When a client issues a request, the NNs to contact are stored in a map returned by DFSUtil.getNNServiceRpcAddresses(conf), and the client contacts them in order, so the first one it reaches is likely the standby NN. If the standby NN doesn't have the updated client credential, it throws a SecurityException that wraps a StandbyException. The client is expected to retry the other NN, but because of the insufficient handling of the SecurityException described above, it fails instead. 
Example message:
{code}
{RemoteException={message=Failed to obtain user group information: org.apache.hadoop.security.token.SecretManager$InvalidToken: StandbyException, javaClassName=java.lang.SecurityException, exception=SecurityException}}
org.apache.hadoop.ipc.RemoteException(java.lang.SecurityException): Failed to obtain user group information: org.apache.hadoop.security.token.SecretManager$InvalidToken: StandbyException
	at org.apache.hadoop.hdfs.web.JsonUtil.toRemoteException(JsonUtil.java:159)
	at org.apache.hadoop.hdfs.web.WebHdfsFileSystem.validateResponse(WebHdfsFileSystem.java:325)
	at org.apache.hadoop.hdfs.web.WebHdfsFileSystem.access$700(WebHdfsFileSystem.java:107)
	at org.apache.hadoop.hdfs.web.WebHdfsFileSystem$AbstractRunner.getResponse(WebHdfsFileSystem.java:635)
	at org.apache.hadoop.hdfs.web.WebHdfsFileSystem$AbstractRunner.run(WebHdfsFileSystem.java:542)
	at org.apache.hadoop.hdfs.web.WebHdfsFileSystem.run(WebHdfsFileSystem.java:431)
	at org.apache.hadoop.hdfs.web.WebHdfsFileSystem.getHdfsFileStatus(WebHdfsFileSystem.java:685)
	at org.apache.hadoop.hdfs.web.WebHdfsFileSystem.getFileStatus(WebHdfsFileSystem.java:696)
	at kclient1.kclient$1.run(kclient.java:64)
	at java.security.AccessController.doPrivileged(Native Method)
	at javax.security.auth.Subject.doAs(Subject.java:356)
	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1528)
	at kclient1.kclient.main(kclient.java:58)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:606)
	at org.apache.hadoop.util.RunJar.main(RunJar.java:212)
{code} -- This message was sent by Atlassian JIRA (v6.2#6252)
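The fix centers on how the server side unwraps nested causes before deciding whether the client should retry. As a rough illustration (not Hadoop's actual ExceptionHandler code; the StandbyException class below is a stand-in for org.apache.hadoop.ipc.StandbyException), walking the cause chain is enough to tell a retriable standby condition buried inside a SecurityException apart from a genuine security failure:

```java
// Hypothetical sketch, not Hadoop source: detect a StandbyException that
// has been wrapped (possibly several levels deep) inside another exception.
class CauseUnwrapSketch {
    // Stand-in for org.apache.hadoop.ipc.StandbyException.
    static class StandbyException extends Exception {}

    /** True if any cause in t's chain is a StandbyException. */
    static boolean isWrappedStandby(Throwable t) {
        for (Throwable c = t; c != null; c = c.getCause()) {
            if (c instanceof StandbyException) {
                return true;
            }
        }
        return false;
    }
}
```

A caller that sees isWrappedStandby() return true would map the error to a retriable response, so the WebHDFS client moves on to the other NN instead of failing outright.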
[jira] [Commented] (HDFS-6587) Bug in TestBPOfferService can cause test failure
[ https://issues.apache.org/jira/browse/HDFS-6587?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14043491#comment-14043491 ] Hudson commented on HDFS-6587: -- SUCCESS: Integrated in Hadoop-Hdfs-trunk #1785 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk/1785/]) HDFS-6587. Fix a typo in message issued from explorer.js. Contributed by Yongjun Zhang. (wheat9: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1605184) * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/webapps/hdfs/explorer.js Bug in TestBPOfferService can cause test failure Key: HDFS-6587 URL: https://issues.apache.org/jira/browse/HDFS-6587 Project: Hadoop HDFS Issue Type: Bug Components: test Affects Versions: 2.4.0 Reporter: Zhilei Xu Assignee: Zhilei Xu Fix For: 3.0.0, 2.5.0 Attachments: patch_TestBPOfferService.txt We need to fix a bug in TestBPOfferService#waitForBlockReceived that fails the trunk, e.g. in Build #1781. Details: in this test, the utility function waitForBlockReceived() has a bug: the parameter mockNN is never used; the hard-coded mockNN1 is used instead. This bug introduces nondeterministic test failures when testBasicFunctionality() calls ret = waitForBlockReceived(FAKE_BLOCK, mockNN2); and the call finishes before the actual interaction with mockNN2 happens. -- This message was sent by Atlassian JIRA (v6.2#6252)
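The bug pattern described above, a helper that accepts a mock as a parameter but checks a hard-coded field, can be reduced to a small sketch. The names below are illustrative stand-ins, not the actual TestBPOfferService code:

```java
// Hypothetical reduction of the bug: the buggy helper ignores its
// parameter and polls a hard-coded field, so a caller waiting on a
// different mock gets an answer about the wrong object.
class ParamShadowSketch {
    static class FakeNN { int blocksReceived; }

    static FakeNN mockNN1 = new FakeNN(); // hard-coded field, starts at 0

    // Buggy version: the nn parameter is never used.
    static boolean waitForBlockReceivedBuggy(FakeNN nn) {
        return mockNN1.blocksReceived > 0; // should check nn, not mockNN1
    }

    // Fixed version: checks the mock it was actually given.
    static boolean waitForBlockReceivedFixed(FakeNN nn) {
        return nn.blocksReceived > 0;
    }
}
```

With a second mock that has received a block, the buggy helper still reports false because it is silently looking at the first mock, which is exactly the nondeterministic failure mode described in the issue.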
[jira] [Commented] (HDFS-6486) Add user doc for XAttrs via WebHDFS.
[ https://issues.apache.org/jira/browse/HDFS-6486?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14043495#comment-14043495 ] Hudson commented on HDFS-6486: -- SUCCESS: Integrated in Hadoop-Hdfs-trunk #1785 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk/1785/]) HDFS-6486. Add user doc for XAttrs via WebHDFS. Contributed by Yi Liu. (umamahesh: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1605062) * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/site/apt/WebHDFS.apt.vm Add user doc for XAttrs via WebHDFS. Key: HDFS-6486 URL: https://issues.apache.org/jira/browse/HDFS-6486 Project: Hadoop HDFS Issue Type: Task Components: webhdfs Affects Versions: 3.0.0 Reporter: Yi Liu Assignee: Yi Liu Priority: Minor Fix For: 3.0.0, 2.5.0 Attachments: HDFS-6486.patch Add the user doc for XAttrs via WebHDFS. Set xattr:
{code}
curl -i -X PUT 'http://HOST:PORT/webhdfs/v1/PATH?op=SETXATTR&xattr.name=XATTRNAME&xattr.value=XATTRVALUE&flag=FLAG'
{code}
Remove xattr:
{code}
curl -i -X PUT 'http://HOST:PORT/webhdfs/v1/PATH?op=REMOVEXATTR&xattr.name=XATTRNAME'
{code}
Get an xattr:
{code}
curl -i 'http://HOST:PORT/webhdfs/v1/PATH?op=GETXATTRS&xattr.name=XATTRNAME&encoding=ENCODING'
{code}
Get multiple xattrs (XATTRNAME1, XATTRNAME2, XATTRNAME3):
{code}
curl -i 'http://HOST:PORT/webhdfs/v1/PATH?op=GETXATTRS&xattr.name=XATTRNAME1&xattr.name=XATTRNAME2&xattr.name=XATTRNAME3&encoding=ENCODING'
{code}
Get all xattrs:
{code}
curl -i 'http://HOST:PORT/webhdfs/v1/PATH?op=GETXATTRS&encoding=ENCODING'
{code}
List xattrs:
{code}
curl -i 'http://HOST:PORT/webhdfs/v1/PATH?op=LISTXATTRS'
{code}
-- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6430) HTTPFS - Implement XAttr support
[ https://issues.apache.org/jira/browse/HDFS-6430?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14043505#comment-14043505 ] Hudson commented on HDFS-6430: -- SUCCESS: Integrated in Hadoop-Hdfs-trunk #1785 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk/1785/]) HDFS-6430. HTTPFS - Implement XAttr support. (Yi Liu via tucu) (tucu: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVNview=revrev=1605118) * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs-httpfs/src/main/java/org/apache/hadoop/fs/http/client/HttpFSFileSystem.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs-httpfs/src/main/java/org/apache/hadoop/fs/http/client/HttpFSUtils.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs-httpfs/src/main/java/org/apache/hadoop/fs/http/server/FSOperations.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs-httpfs/src/main/java/org/apache/hadoop/fs/http/server/HttpFSParametersProvider.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs-httpfs/src/main/java/org/apache/hadoop/fs/http/server/HttpFSServer.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs-httpfs/src/main/java/org/apache/hadoop/lib/wsrs/EnumSetParam.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs-httpfs/src/main/java/org/apache/hadoop/lib/wsrs/Parameters.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs-httpfs/src/main/java/org/apache/hadoop/lib/wsrs/ParametersProvider.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs-httpfs/src/test/java/org/apache/hadoop/fs/http/client/BaseTestHttpFSWith.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs-httpfs/src/test/java/org/apache/hadoop/fs/http/server/TestHttpFSServer.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs-httpfs/src/test/java/org/apache/hadoop/fs/http/server/TestHttpFSServerNoXAttrs.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs-httpfs/src/test/java/org/apache/hadoop/test/TestHdfsHelper.java * 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt HTTPFS - Implement XAttr support Key: HDFS-6430 URL: https://issues.apache.org/jira/browse/HDFS-6430 Project: Hadoop HDFS Issue Type: Task Affects Versions: 3.0.0 Reporter: Yi Liu Assignee: Yi Liu Fix For: 2.5.0 Attachments: HDFS-6430.1.patch, HDFS-6430.2.patch, HDFS-6430.3.patch, HDFS-6430.4.patch, HDFS-6430.5.patch, HDFS-6430.patch Add xattr support to HttpFS. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6593) Move SnapshotDiffInfo out of INodeDirectorySnapshottable
[ https://issues.apache.org/jira/browse/HDFS-6593?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14043503#comment-14043503 ] Hudson commented on HDFS-6593: -- SUCCESS: Integrated in Hadoop-Hdfs-trunk #1785 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk/1785/]) HDFS-6593. Move SnapshotDiffInfo out of INodeDirectorySnapshottable. Contributed by Jing Zhao. (wheat9: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVNview=revrev=1605169) * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/protocol/SnapshotDiffReport.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSNamesystem.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/snapshot/DirectoryWithSnapshotFeature.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/snapshot/INodeDirectorySnapshottable.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/snapshot/SnapshotDiffInfo.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/snapshot/SnapshotManager.java Move SnapshotDiffInfo out of INodeDirectorySnapshottable Key: HDFS-6593 URL: https://issues.apache.org/jira/browse/HDFS-6593 Project: Hadoop HDFS Issue Type: Improvement Components: namenode, snapshots Reporter: Jing Zhao Assignee: Jing Zhao Priority: Minor Fix For: 2.5.0 Attachments: HDFS-6593.000.patch, HDFS-6593.001.patch, HDFS-6593.002.patch Per discussion in HDFS-4667, we can move SnapshotDiffInfo out of INodeDirectorySnapshottable as an individual class. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6596) Improve InputStream when read spans two blocks
[ https://issues.apache.org/jira/browse/HDFS-6596?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14043545#comment-14043545 ] Hadoop QA commented on HDFS-6596: - {color:green}+1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12652382/HDFS-6596.1.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 1 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 core tests{color}. The patch passed unit tests in hadoop-common-project/hadoop-common hadoop-hdfs-project/hadoop-hdfs. {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/7233//testReport/ Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/7233//console This message is automatically generated. 
Improve InputStream when read spans two blocks -- Key: HDFS-6596 URL: https://issues.apache.org/jira/browse/HDFS-6596 Project: Hadoop HDFS Issue Type: Improvement Components: hdfs-client Affects Versions: 2.4.0 Reporter: Zesheng Wu Assignee: Zesheng Wu Attachments: HDFS-6596.1.patch In the current implementation of DFSInputStream, read(buffer, offset, length) is implemented as follows:
{code}
int realLen = (int) Math.min(len, (blockEnd - pos + 1L));
if (locatedBlocks.isLastBlockComplete()) {
  realLen = (int) Math.min(realLen, locatedBlocks.getFileLength());
}
int result = readBuffer(strategy, off, realLen, corruptedBlockMap);
{code}
From the above code we can see that read will return at most (blockEnd - pos + 1) bytes. As a result, when a read spans two blocks, the caller must call read() a second time to complete the request, and must wait a second time to acquire the DFSInputStream lock (read() is synchronized on the DFSInputStream). For latency-sensitive applications such as HBase, this becomes a latency pain point under heavy contention. So here we propose looping internally in read() to do a best-effort read. The current implementation of pread (read(position, buffer, offset, length)) already loops internally to do a best-effort read, so we can refactor normal read to support this as well. -- This message was sent by Atlassian JIRA (v6.2#6252)
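The proposed best-effort read can be sketched as follows. This is a simplified stand-in, not DFSInputStream itself; BlockSource is a hypothetical interface modeling readBuffer()'s contract of never reading past the current block boundary:

```java
// Minimal sketch of the proposal: loop inside read() so a single call
// can span a block boundary instead of returning early at blockEnd.
class LoopingReadSketch {
    interface BlockSource {
        // Reads up to len bytes but never past the current block boundary;
        // returns the number of bytes read, or -1 at EOF.
        int readWithinBlock(byte[] buf, int off, int len);
    }

    /** Best-effort read: keep calling until len bytes are read or EOF. */
    static int readBestEffort(BlockSource src, byte[] buf, int off, int len) {
        int total = 0;
        while (total < len) {
            int n = src.readWithinBlock(buf, off + total, len - total);
            if (n < 0) {                       // hit EOF mid-request
                return total > 0 ? total : -1;
            }
            total += n;
        }
        return total;
    }
}
```

The payoff is that the lock around the whole loop is acquired once per caller request instead of once per block touched, which is the contention the issue describes for HBase-style workloads.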
[jira] [Commented] (HDFS-6527) Edit log corruption due to deferred INode removal
[ https://issues.apache.org/jira/browse/HDFS-6527?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14043586#comment-14043586 ] Kihwal Lee commented on HDFS-6527: -- [~jingzhao] You are right. Since it re-resolves inside the write lock, it will detect the deletion. I will revert it from 2.4.1. Edit log corruption due to deferred INode removal Key: HDFS-6527 URL: https://issues.apache.org/jira/browse/HDFS-6527 Project: Hadoop HDFS Issue Type: Bug Affects Versions: 2.4.0 Reporter: Kihwal Lee Assignee: Kihwal Lee Priority: Blocker Fix For: 2.4.1 Attachments: HDFS-6527.branch-2.4.patch, HDFS-6527.trunk.patch, HDFS-6527.v2.patch, HDFS-6527.v3.patch, HDFS-6527.v4.patch, HDFS-6527.v5.patch We have seen an SBN crash with the following error:
{panel}
\[Edit log tailer\] ERROR namenode.FSEditLogLoader: Encountered exception on operation AddBlockOp [path=/xxx, penultimateBlock=NULL, lastBlock=blk_111_111, RpcClientId=, RpcCallId=-2]
java.io.FileNotFoundException: File does not exist: /xxx
{panel}
This was caused by the deferred removal of deleted inodes from the inode map. Since getAdditionalBlock() acquires the FSN read lock and then the write lock, a deletion can happen in between. Because the deferred inode removal happens outside the FSN write lock, getAdditionalBlock() can get the deleted inode from the inode map with the FSN write lock held. This allows addition of a block to a deleted file. As a result, the edit log will contain OP_ADD, OP_DELETE, followed by OP_ADD_BLOCK. This cannot be replayed by the NN, so the NN doesn't start up or the SBN crashes. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Resolved] (HDFS-6527) Edit log corruption due to deferred INode removal
[ https://issues.apache.org/jira/browse/HDFS-6527?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kihwal Lee resolved HDFS-6527. -- Resolution: Fixed Edit log corruption due to deferred INode removal Key: HDFS-6527 URL: https://issues.apache.org/jira/browse/HDFS-6527 Project: Hadoop HDFS Issue Type: Bug Affects Versions: 2.4.0 Reporter: Kihwal Lee Assignee: Kihwal Lee Priority: Blocker Fix For: 3.0.0, 2.5.0 Attachments: HDFS-6527.branch-2.4.patch, HDFS-6527.trunk.patch, HDFS-6527.v2.patch, HDFS-6527.v3.patch, HDFS-6527.v4.patch, HDFS-6527.v5.patch We have seen an SBN crash with the following error:
{panel}
\[Edit log tailer\] ERROR namenode.FSEditLogLoader: Encountered exception on operation AddBlockOp [path=/xxx, penultimateBlock=NULL, lastBlock=blk_111_111, RpcClientId=, RpcCallId=-2]
java.io.FileNotFoundException: File does not exist: /xxx
{panel}
This was caused by the deferred removal of deleted inodes from the inode map. Since getAdditionalBlock() acquires the FSN read lock and then the write lock, a deletion can happen in between. Because the deferred inode removal happens outside the FSN write lock, getAdditionalBlock() can get the deleted inode from the inode map with the FSN write lock held. This allows addition of a block to a deleted file. As a result, the edit log will contain OP_ADD, OP_DELETE, followed by OP_ADD_BLOCK. This cannot be replayed by the NN, so the NN doesn't start up or the SBN crashes. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HDFS-6527) Edit log corruption due to deferred INode removal
[ https://issues.apache.org/jira/browse/HDFS-6527?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kihwal Lee updated HDFS-6527: - Fix Version/s: (was: 2.4.1) 2.5.0 3.0.0 Edit log corruption due to deferred INode removal Key: HDFS-6527 URL: https://issues.apache.org/jira/browse/HDFS-6527 Project: Hadoop HDFS Issue Type: Bug Affects Versions: 2.4.0 Reporter: Kihwal Lee Assignee: Kihwal Lee Priority: Blocker Fix For: 3.0.0, 2.5.0 Attachments: HDFS-6527.branch-2.4.patch, HDFS-6527.trunk.patch, HDFS-6527.v2.patch, HDFS-6527.v3.patch, HDFS-6527.v4.patch, HDFS-6527.v5.patch We have seen an SBN crash with the following error:
{panel}
\[Edit log tailer\] ERROR namenode.FSEditLogLoader: Encountered exception on operation AddBlockOp [path=/xxx, penultimateBlock=NULL, lastBlock=blk_111_111, RpcClientId=, RpcCallId=-2]
java.io.FileNotFoundException: File does not exist: /xxx
{panel}
This was caused by the deferred removal of deleted inodes from the inode map. Since getAdditionalBlock() acquires the FSN read lock and then the write lock, a deletion can happen in between. Because the deferred inode removal happens outside the FSN write lock, getAdditionalBlock() can get the deleted inode from the inode map with the FSN write lock held. This allows addition of a block to a deleted file. As a result, the edit log will contain OP_ADD, OP_DELETE, followed by OP_ADD_BLOCK. This cannot be replayed by the NN, so the NN doesn't start up or the SBN crashes. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6527) Edit log corruption due to deferred INode removal
[ https://issues.apache.org/jira/browse/HDFS-6527?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14043596#comment-14043596 ] Kihwal Lee commented on HDFS-6527: -- Reverted it from branch-2.4.1 and also updated the release note. Edit log corruption due to deferred INode removal Key: HDFS-6527 URL: https://issues.apache.org/jira/browse/HDFS-6527 Project: Hadoop HDFS Issue Type: Bug Affects Versions: 2.4.0 Reporter: Kihwal Lee Assignee: Kihwal Lee Priority: Blocker Fix For: 3.0.0, 2.5.0 Attachments: HDFS-6527.branch-2.4.patch, HDFS-6527.trunk.patch, HDFS-6527.v2.patch, HDFS-6527.v3.patch, HDFS-6527.v4.patch, HDFS-6527.v5.patch We have seen an SBN crash with the following error:
{panel}
\[Edit log tailer\] ERROR namenode.FSEditLogLoader: Encountered exception on operation AddBlockOp [path=/xxx, penultimateBlock=NULL, lastBlock=blk_111_111, RpcClientId=, RpcCallId=-2]
java.io.FileNotFoundException: File does not exist: /xxx
{panel}
This was caused by the deferred removal of deleted inodes from the inode map. Since getAdditionalBlock() acquires the FSN read lock and then the write lock, a deletion can happen in between. Because the deferred inode removal happens outside the FSN write lock, getAdditionalBlock() can get the deleted inode from the inode map with the FSN write lock held. This allows addition of a block to a deleted file. As a result, the edit log will contain OP_ADD, OP_DELETE, followed by OP_ADD_BLOCK. This cannot be replayed by the NN, so the NN doesn't start up or the SBN crashes. -- This message was sent by Atlassian JIRA (v6.2#6252)
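The essence of the fix discussed above, re-resolving the inode under the write lock before appending OP_ADD_BLOCK, can be sketched with a plain map standing in for FSDirectory's inode map. This is illustrative only: a HashMap replaces the real inode map, and Java's monitor lock replaces the FSN write lock; none of the names below are Hadoop's.

```java
// Hypothetical sketch of the guard: re-check that the inode still exists
// while holding the lock, so a file deleted between the read-lock and
// write-lock phases can no longer have a block appended to it.
import java.util.HashMap;
import java.util.Map;

class RecheckUnderLockSketch {
    final Map<Long, String> inodeMap = new HashMap<>(); // inode id -> path

    /** Refuses the block addition if the file was deleted concurrently. */
    synchronized boolean addBlockIfStillPresent(long inodeId) {
        if (!inodeMap.containsKey(inodeId)) {
            return false; // deleted between lock acquisitions: reject
        }
        // ...safe to log OP_ADD_BLOCK for inodeMap.get(inodeId) here...
        return true;
    }
}
```

Rejecting the stale request here is what prevents the OP_ADD, OP_DELETE, OP_ADD_BLOCK sequence from ever reaching the edit log.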
[jira] [Updated] (HDFS-6601) Issues in finalizing rolling upgrade when there is a layout version change
[ https://issues.apache.org/jira/browse/HDFS-6601?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kihwal Lee updated HDFS-6601: - Priority: Blocker (was: Major) Issues in finalizing rolling upgrade when there is a layout version change -- Key: HDFS-6601 URL: https://issues.apache.org/jira/browse/HDFS-6601 Project: Hadoop HDFS Issue Type: Bug Affects Versions: 2.5.0 Reporter: Kihwal Lee Assignee: Kihwal Lee Priority: Blocker Attachments: HDFS-6601.patch After HDFS-6545, we have noticed a couple of issues. - The storage dir's VERSION file is not properly updated. This becomes a problem when there is a layout version change. We can have the finalization do {{storage.writeAll()}} - {{OP_ROLLING_UPGRADE_FINALIZE}} cannot be replayed, once the corresponding {{OP_ROLLING_UPGRADE_START}} is consumed and a new fsimage is created (e.g. rollback image). On restart, NN terminates complaining it can't finalize something that it didn't start. We can make NN ignore {{OP_ROLLING_UPGRADE_FINALIZE}} if no rolling upgrade is in progress. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6597) Add a new option to NN upgrade to terminate the process after upgrade on NN is completed
[ https://issues.apache.org/jira/browse/HDFS-6597?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14043654#comment-14043654 ] Danilo Vunjak commented on HDFS-6597: - Hi guys, you have a point when saying -force is not the right name. I would pick -upgradeOnly as perhaps the best option. What is your opinion? [~jingzhao] Yes, the NN is included in all services. Datanodes need it up in order to upgrade themselves. Thanks, Danilo Add a new option to NN upgrade to terminate the process after upgrade on NN is completed Key: HDFS-6597 URL: https://issues.apache.org/jira/browse/HDFS-6597 Project: Hadoop HDFS Issue Type: Improvement Components: namenode Reporter: Danilo Vunjak Attachments: JIRA-HDFS-30.patch Currently, when the namenode is started for upgrade (the hadoop namenode -upgrade command), after finishing the metadata upgrade the namenode starts working normally and waits for datanodes to upgrade themselves and connect to the NN. We need an option for upgrading only the NN metadata, so that after the upgrade is finished on the NN, the process terminates. I have tested it by changing the file hdfs.server.namenode.NameNode.java, method public static NameNode createNameNode(String argv[], Configuration conf), adding the following case to the switch:
{code}
case UPGRADE: {
  DefaultMetricsSystem.initialize("NameNode");
  NameNode nameNode = new NameNode(conf);
  if (startOpt.getForceUpgrade()) {
    terminate(0);
    return null;
  }
  return nameNode;
}
{code}
This upgraded the metadata and terminated the process when finished; later, when all services were started, the upgrade of the datanodes finished successfully and the system ran. What I'm suggesting is to add a new startup parameter -force, so the namenode can be started like hadoop namenode -upgrade -force, indicating that we want to terminate the process after the NN metadata upgrade is finished. The old functionality should be preserved, so users can run hadoop namenode -upgrade in the same way and with the same behaviour as before. 
Thanks, Danilo -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6475) WebHdfs clients fail without retry because incorrect handling of StandbyException
[ https://issues.apache.org/jira/browse/HDFS-6475?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14043677#comment-14043677 ] Hudson commented on HDFS-6475: -- FAILURE: Integrated in Hadoop-Mapreduce-trunk #1812 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1812/]) HDFS-6475. WebHdfs clients fail without retry because incorrect handling of StandbyException. Contributed by Yongjun Zhang. (atm: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1605217) * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/web/resources/ExceptionHandler.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/ha/TestDelegationTokensWithHA.java WebHdfs clients fail without retry because incorrect handling of StandbyException - Key: HDFS-6475 URL: https://issues.apache.org/jira/browse/HDFS-6475 Project: Hadoop HDFS Issue Type: Bug Components: ha, webhdfs Affects Versions: 2.4.0 Reporter: Yongjun Zhang Assignee: Yongjun Zhang Fix For: 2.5.0 Attachments: HDFS-6475.001.patch, HDFS-6475.002.patch, HDFS-6475.003.patch, HDFS-6475.003.patch, HDFS-6475.004.patch, HDFS-6475.005.patch, HDFS-6475.006.patch, HDFS-6475.007.patch, HDFS-6475.008.patch, HDFS-6475.009.patch With WebHdfs clients connected to an HA HDFS service, the delegation token is initially obtained from the active NN. When a client issues a request, the NNs to contact are stored in a map returned by DFSUtil.getNNServiceRpcAddresses(conf), and the client contacts them in order, so the first one it reaches is likely the standby NN. If the standby NN doesn't have the updated client credential, it throws a SecurityException that wraps a StandbyException. The client is expected to retry the other NN, but because of the insufficient handling of the SecurityException described above, it fails instead. 
Example message:
{code}
{RemoteException={message=Failed to obtain user group information: org.apache.hadoop.security.token.SecretManager$InvalidToken: StandbyException, javaClassName=java.lang.SecurityException, exception=SecurityException}}
org.apache.hadoop.ipc.RemoteException(java.lang.SecurityException): Failed to obtain user group information: org.apache.hadoop.security.token.SecretManager$InvalidToken: StandbyException
	at org.apache.hadoop.hdfs.web.JsonUtil.toRemoteException(JsonUtil.java:159)
	at org.apache.hadoop.hdfs.web.WebHdfsFileSystem.validateResponse(WebHdfsFileSystem.java:325)
	at org.apache.hadoop.hdfs.web.WebHdfsFileSystem.access$700(WebHdfsFileSystem.java:107)
	at org.apache.hadoop.hdfs.web.WebHdfsFileSystem$AbstractRunner.getResponse(WebHdfsFileSystem.java:635)
	at org.apache.hadoop.hdfs.web.WebHdfsFileSystem$AbstractRunner.run(WebHdfsFileSystem.java:542)
	at org.apache.hadoop.hdfs.web.WebHdfsFileSystem.run(WebHdfsFileSystem.java:431)
	at org.apache.hadoop.hdfs.web.WebHdfsFileSystem.getHdfsFileStatus(WebHdfsFileSystem.java:685)
	at org.apache.hadoop.hdfs.web.WebHdfsFileSystem.getFileStatus(WebHdfsFileSystem.java:696)
	at kclient1.kclient$1.run(kclient.java:64)
	at java.security.AccessController.doPrivileged(Native Method)
	at javax.security.auth.Subject.doAs(Subject.java:356)
	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1528)
	at kclient1.kclient.main(kclient.java:58)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:606)
	at org.apache.hadoop.util.RunJar.main(RunJar.java:212)
{code} -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6430) HTTPFS - Implement XAttr support
[ https://issues.apache.org/jira/browse/HDFS-6430?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14043688#comment-14043688 ] Hudson commented on HDFS-6430: -- FAILURE: Integrated in Hadoop-Mapreduce-trunk #1812 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1812/]) HDFS-6430. HTTPFS - Implement XAttr support. (Yi Liu via tucu) (tucu: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVNview=revrev=1605118) * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs-httpfs/src/main/java/org/apache/hadoop/fs/http/client/HttpFSFileSystem.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs-httpfs/src/main/java/org/apache/hadoop/fs/http/client/HttpFSUtils.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs-httpfs/src/main/java/org/apache/hadoop/fs/http/server/FSOperations.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs-httpfs/src/main/java/org/apache/hadoop/fs/http/server/HttpFSParametersProvider.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs-httpfs/src/main/java/org/apache/hadoop/fs/http/server/HttpFSServer.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs-httpfs/src/main/java/org/apache/hadoop/lib/wsrs/EnumSetParam.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs-httpfs/src/main/java/org/apache/hadoop/lib/wsrs/Parameters.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs-httpfs/src/main/java/org/apache/hadoop/lib/wsrs/ParametersProvider.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs-httpfs/src/test/java/org/apache/hadoop/fs/http/client/BaseTestHttpFSWith.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs-httpfs/src/test/java/org/apache/hadoop/fs/http/server/TestHttpFSServer.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs-httpfs/src/test/java/org/apache/hadoop/fs/http/server/TestHttpFSServerNoXAttrs.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs-httpfs/src/test/java/org/apache/hadoop/test/TestHdfsHelper.java * 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt HTTPFS - Implement XAttr support Key: HDFS-6430 URL: https://issues.apache.org/jira/browse/HDFS-6430 Project: Hadoop HDFS Issue Type: Task Affects Versions: 3.0.0 Reporter: Yi Liu Assignee: Yi Liu Fix For: 2.5.0 Attachments: HDFS-6430.1.patch, HDFS-6430.2.patch, HDFS-6430.3.patch, HDFS-6430.4.patch, HDFS-6430.5.patch, HDFS-6430.patch Add xattr support to HttpFS. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6593) Move SnapshotDiffInfo out of INodeDirectorySnapshottable
[ https://issues.apache.org/jira/browse/HDFS-6593?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14043686#comment-14043686 ] Hudson commented on HDFS-6593: -- FAILURE: Integrated in Hadoop-Mapreduce-trunk #1812 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1812/]) HDFS-6593. Move SnapshotDiffInfo out of INodeDirectorySnapshottable. Contributed by Jing Zhao. (wheat9: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVNview=revrev=1605169) * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/protocol/SnapshotDiffReport.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSNamesystem.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/snapshot/DirectoryWithSnapshotFeature.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/snapshot/INodeDirectorySnapshottable.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/snapshot/SnapshotDiffInfo.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/snapshot/SnapshotManager.java Move SnapshotDiffInfo out of INodeDirectorySnapshottable Key: HDFS-6593 URL: https://issues.apache.org/jira/browse/HDFS-6593 Project: Hadoop HDFS Issue Type: Improvement Components: namenode, snapshots Reporter: Jing Zhao Assignee: Jing Zhao Priority: Minor Fix For: 2.5.0 Attachments: HDFS-6593.000.patch, HDFS-6593.001.patch, HDFS-6593.002.patch Per discussion in HDFS-4667, we can move SnapshotDiffInfo out of INodeDirectorySnapshottable as an individual class. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6587) Bug in TestBPOfferService can cause test failure
[ https://issues.apache.org/jira/browse/HDFS-6587?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14043675#comment-14043675 ] Hudson commented on HDFS-6587: -- FAILURE: Integrated in Hadoop-Mapreduce-trunk #1812 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1812/]) HDFS-6587. Fix a typo in message issued from explorer.js. Contributed by Yongjun Zhang. (wheat9: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1605184) * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/webapps/hdfs/explorer.js Bug in TestBPOfferService can cause test failure Key: HDFS-6587 URL: https://issues.apache.org/jira/browse/HDFS-6587 Project: Hadoop HDFS Issue Type: Bug Components: test Affects Versions: 2.4.0 Reporter: Zhilei Xu Assignee: Zhilei Xu Fix For: 3.0.0, 2.5.0 Attachments: patch_TestBPOfferService.txt We need to fix a bug in TestBPOfferService#waitForBlockReceived that fails the trunk, e.g. in Build #1781. Details: in this test, the utility function waitForBlockReceived() has a bug: the parameter mockNN is never used; the hard-coded mockNN1 is used instead. This bug introduces nondeterministic test failures when testBasicFunctionality() calls ret = waitForBlockReceived(FAKE_BLOCK, mockNN2); and the call finishes before the actual interaction with mockNN2 happens. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Created] (HDFS-6602) PendingDeletionBlocks on SBN keeps increasing
Kihwal Lee created HDFS-6602: Summary: PendingDeletionBlocks on SBN keeps increasing Key: HDFS-6602 URL: https://issues.apache.org/jira/browse/HDFS-6602 Project: Hadoop HDFS Issue Type: Bug Affects Versions: 2.5.0 Reporter: Kihwal Lee Priority: Critical PendingDeletionBlocks is from BlockManager.invalidateBlocks.numBlocks(). It means this data structure is populated, but IBR (incremental block reports) do not cause deleted blocks to be removed from it. As a result, the heap usage keeps increasing. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6602) PendingDeletionBlocks on SBN keeps increasing
[ https://issues.apache.org/jira/browse/HDFS-6602?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14043736#comment-14043736 ] Kihwal Lee commented on HDFS-6602: -- Since {{ReplicationMonitor}} is not running on SBN, {{invalidateBlocks}} is not consumed. Only when the SBN becomes active, it will be cleared. {{invalidateBlocks}} is populated during block report processing. I think no queue including {{invalidateBlocks}} should be populated in standby. PendingDeletionBlocks on SBN keeps increasing - Key: HDFS-6602 URL: https://issues.apache.org/jira/browse/HDFS-6602 Project: Hadoop HDFS Issue Type: Bug Affects Versions: 2.5.0 Reporter: Kihwal Lee Priority: Critical PendingDeletionBlocks is from BlockManager.invalidateBlocks.numBlocks(). It means this data structure is populated, but IBR (incremental block reports) do not cause deleted blocks to be removed from it. As a result, the heap usage keeps increasing. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Comment Edited] (HDFS-6602) PendingDeletionBlocks on SBN keeps increasing
[ https://issues.apache.org/jira/browse/HDFS-6602?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14043736#comment-14043736 ] Kihwal Lee edited comment on HDFS-6602 at 6/25/14 5:05 PM: --- Since {{ReplicationMonitor}} is not running or not generating any work on SBN, {{invalidateBlocks}} is not consumed. Only when SBN becomes active, it will be cleared. {{invalidateBlocks}} is populated during block report processing. I think no queues including {{invalidateBlocks}} should be populated in standby. was (Author: kihwal): Since {{ReplicationMonitor}} is not running on SBN, {{invalidateBlocks}} is not consumed. Only when the SBN becomes active, it will be cleared. {{invalidateBlocks}} is populated during block report processing. I think no queue including {{invalidateBlocks}} should be populated in standby. PendingDeletionBlocks on SBN keeps increasing - Key: HDFS-6602 URL: https://issues.apache.org/jira/browse/HDFS-6602 Project: Hadoop HDFS Issue Type: Bug Affects Versions: 2.5.0 Reporter: Kihwal Lee Priority: Critical PendingDeletionBlocks is from BlockManager.invalidateBlocks.numBlocks(). It means this data structure is populated, but IBR (incremental block reports) do not cause deleted blocks to be removed from it. As a result, the heap usage keeps increasing. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6556) Refine XAttr permissions
[ https://issues.apache.org/jira/browse/HDFS-6556?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14043753#comment-14043753 ] Chris Nauroth commented on HDFS-6556: - Hi, [~umamaheswararao]. The patch looks good. I have one minor suggestion. I see this code block is repeated in {{FSNamesystem#setXAttrInt}} and {{FSNamesystem#removeXAttr}}:
{code}
if (isPermissionEnabled && xAttr.getNameSpace() == XAttr.NameSpace.USER) {
  if (isStickyBitDirectory(src)) {
    if (!pc.isSuperUser()) {
      checkOwner(pc, src);
    }
  } else {
    checkPathAccess(pc, src, FsAction.WRITE);
  }
}
{code}
We could remove the {{isStickyBitDirectory}} method and instead add a method named something like {{checkXAttrChangeAccess}} that fully encapsulates all of the above logic. This would reduce code duplication. What do you think? Refine XAttr permissions Key: HDFS-6556 URL: https://issues.apache.org/jira/browse/HDFS-6556 Project: Hadoop HDFS Issue Type: Bug Components: namenode Affects Versions: 2.5.0 Reporter: Yi Liu Assignee: Uma Maheswara Rao G Attachments: RefinedPermissions-HDFS-6556-1.patch, RefinedPermissions-HDFS-6556.patch, refinedPermissions-HDFS-6556-2.patch After discussing with Uma, we should refine the permission checks for setting {{user}} and {{trusted}} namespace xattrs. *1.* For {{user}} namespace xattrs: HDFS-6374 says setXAttr should require the user to be the owner of the file or directory, but we had a bit of a misunderstanding. It actually is: {quote} The access permissions for user attributes are defined by the file permission bits. Only regular files and directories can have extended attributes. For sticky directories, only the owner and privileged user can write attributes. {quote} We can refer to the Linux source code at http://lxr.free-electrons.com/source/fs/xattr.c?v=2.6.35 I also checked in Linux; it's controlled by the file permission bits for regular files and directories (not sticky).
*2.* For the {{trusted}} namespace, we currently require the user to be both owner and superuser. Actually, superuser is enough. -- This message was sent by Atlassian JIRA (v6.2#6252)
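Chris's suggestion can be sketched as a single helper that encapsulates the repeated block. The following is an illustrative, self-contained model of the access decision only; the method and parameter names are hypothetical, and the real code would work against {{FSPermissionChecker}} and inode state rather than booleans:

```java
public class XAttrAccessSketch {
    /**
     * Consolidated check for changing a user-namespace xattr, modeling the
     * block duplicated in setXAttrInt/removeXAttr. Returns true if allowed.
     */
    static boolean checkXAttrChangeAccess(boolean permissionEnabled,
                                          boolean userNamespace,
                                          boolean stickyBitDirectory,
                                          boolean superUser,
                                          boolean owner,
                                          boolean writeAccess) {
        if (!permissionEnabled || !userNamespace) {
            return true;                    // no extra check applies here
        }
        if (stickyBitDirectory) {
            // Sticky parent dir: only the owner or a superuser may write attributes.
            return superUser || owner;
        }
        // Regular case: governed by the file's permission bits (WRITE access).
        return writeAccess;
    }

    public static void main(String[] args) {
        // Sticky dir: owner may change; a non-owner without superuser may not.
        assert checkXAttrChangeAccess(true, true, true, false, true, false);
        assert !checkXAttrChangeAccess(true, true, true, false, false, true);
        // Non-sticky: plain WRITE permission decides.
        assert checkXAttrChangeAccess(true, true, false, false, false, true);
        System.out.println("ok");
    }
}
```

The point of the refactoring is that both call sites reduce to one call, so a future change to the sticky-bit rule happens in exactly one place.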
[jira] [Commented] (HDFS-6134) Transparent data at rest encryption
[ https://issues.apache.org/jira/browse/HDFS-6134?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14043766#comment-14043766 ] Owen O'Malley commented on HDFS-6134: - {quote} I don’t see a previous -1 in any of the related JIRAs. {quote} I had consistently stated objections and some of them have been addressed, but the fundamentals have become clear through this jira. I am always hesitant to use a -1 and I certainly don't do so lightly. Through the discussion, my opinion is transparent encryption in HDFS is a *really* bad idea. Let's run through the case: The one claimed benefit of integrating encryption into HDFS is that the user doesn't need to change the URLs that they use. I believe this to be a *disadvantage* because it hides the fact that these files are encrypted. That said, a better approach if that is the desired goal is to create a *NEW* filter filesystem that the user can configure to respond to hdfs urls that does silent encryption. This imposes *NO* penalty on people who don't want encryption and does not require hacks to the FileSystem API. {quote} FileSystem will had a new create()/open() signature to support this, if you have access to the file but not the key, you can use the new signatures to copy files as per the usecase you are mentioning. {quote} This will break every backup application. Some of them, such as HAR and DistCp you can hack to handle HDFS as a special case, but this kind of special casing always comes back to haunt us as a project. Changing FileSystem API is a really bad idea and inducing more differences between the various implementations will create many more problems than you are trying to avoid. 
Transparent data at rest encryption --- Key: HDFS-6134 URL: https://issues.apache.org/jira/browse/HDFS-6134 Project: Hadoop HDFS Issue Type: New Feature Components: security Affects Versions: 2.3.0 Reporter: Alejandro Abdelnur Assignee: Alejandro Abdelnur Attachments: HDFSDataatRestEncryptionProposal_obsolete.pdf, HDFSEncryptionConceptualDesignProposal-2014-06-20.pdf Because of privacy and security regulations, for many industries, sensitive data at rest must be in encrypted form. For example: the healthcare industry (HIPAA regulations), the card payment industry (PCI DSS regulations) or the US government (FISMA regulations). This JIRA aims to provide a mechanism to encrypt HDFS data at rest that can be used transparently by any application accessing HDFS via Hadoop Filesystem Java API, Hadoop libhdfs C library, or WebHDFS REST API. The resulting implementation should be able to be used in compliance with different regulation requirements. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6134) Transparent data at rest encryption
[ https://issues.apache.org/jira/browse/HDFS-6134?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14043780#comment-14043780 ] Todd Lipcon commented on HDFS-6134: --- bq. The one claimed benefit of integrating encryption into HDFS is that the user doesn't need to change the URLs that they use. I believe this to be a disadvantage because it hides the fact that these files are encrypted This is the transparent part of the design, and it's billed as a positive feature in many products in the storage market. For example, from the NetApp Storage Encryption (NSE) [datasheet|http://www.jivesoftware.com/wp-content/uploads/2014/03/Datasheet-Encryption-at-rest.pdf]: {quote} While higher level SAN and NAS fabric encryption solutions provide more flexibility, they can also present a challenge to everyday operations. Data encrypted before it is sent to the storage module cannot be compressed, deduplicated, or scanned for viruses, and it might need to be decrypted before it can be replicated to a backup site or archived to tape. Contrast this with NSE, which transparently supports these NetApp® storage efficiency features. NSE can help you lower your overall storage costs, while preventing old data from being accessed if a drive is repurposed. {quote} The same advantages hold for HDFS -- if we add features such as transparent compression, it's crucial that the encryption be done _after_ compression. The other point that this datasheet makes is that transparent at-rest encryption acts as a backstop in case an administrator forgets to configure or misconfigures higher-level encryption. That is to say, users may still use encrypted file formats on top of HDFS using a scheme like you're proposing, but many regulations require that all data at rest is encrypted. Asking users to configure and use wrapper filesystems leaves it quite possible (even likely) that data will land on HDFS without being encrypted.
Transparent data at rest encryption --- Key: HDFS-6134 URL: https://issues.apache.org/jira/browse/HDFS-6134 Project: Hadoop HDFS Issue Type: New Feature Components: security Affects Versions: 2.3.0 Reporter: Alejandro Abdelnur Assignee: Alejandro Abdelnur Attachments: HDFSDataatRestEncryptionProposal_obsolete.pdf, HDFSEncryptionConceptualDesignProposal-2014-06-20.pdf Because of privacy and security regulations, for many industries, sensitive data at rest must be in encrypted form. For example: the healthcare industry (HIPAA regulations), the card payment industry (PCI DSS regulations) or the US government (FISMA regulations). This JIRA aims to provide a mechanism to encrypt HDFS data at rest that can be used transparently by any application accessing HDFS via Hadoop Filesystem Java API, Hadoop libhdfs C library, or WebHDFS REST API. The resulting implementation should be able to be used in compliance with different regulation requirements. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-5321) Clean up the HTTP-related configuration in HDFS
[ https://issues.apache.org/jira/browse/HDFS-5321?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14043779#comment-14043779 ] Haohui Mai commented on HDFS-5321: -- Hi [~atm], thanks for bringing this up. I understand your concerns on compatibility, but note that {{dfs.http.port}} and {{dfs.https.port}} are private configurations. They are not exposed in {{hdfs-default.xml}}. Since HDFS maintains no compatibility guarantees for private configurations, it should be okay to include this in minor releases. Clean up the HTTP-related configuration in HDFS --- Key: HDFS-5321 URL: https://issues.apache.org/jira/browse/HDFS-5321 Project: Hadoop HDFS Issue Type: Sub-task Reporter: Haohui Mai Assignee: Haohui Mai Fix For: 2.4.0 Attachments: HDFS-5321.000.patch, HDFS-5321.001.patch Currently there are multiple configuration keys that control the ports that the NameNode and DataNode listen to, and the default ports that the hftp/webhdfs clients are connecting to. Below is a quick summary of these configuration: || Keys || Description || | dfs.namenode.http-address | The address that the namenode http server binds to | | dfs.namenode.https-address | The address that the namenode https server binds to | | dfs.http.port | The default port that the hftp/webhdfs client use to connect to the remote server| | dfs.https.port | The default port that the hsftp client use to connect to the remote server| I propose to deprecate dfs.http.port and dfs.https.port to avoid potential confusions (e.g., HDFS-5316). Note that this removes no functionality, since the users can specify ports in hftp / webhdfs URLs when they need to connect to HDFS servers with non-default ports. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HDFS-2856) Fix block protocol so that Datanodes don't require root or jsvc
[ https://issues.apache.org/jira/browse/HDFS-2856?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris Nauroth updated HDFS-2856: Attachment: HDFS-2856.5.patch The test failures are unrelated. {{TestPipelinesFailover}} has been failing intermittently on other unrelated patches. {{TestBalancerWithSaslDataTransfer}} reruns tests from {{TestBalancer}} under a secure configuration, and {{TestBalancer}} has also experienced intermittent failures lately. However, reviewing logs from the test runs made me notice that {{MiniDFSCluster}} was printing a bogus warning about failure to bind to a privileged port, which isn't relevant when SASL is configured on DataTransferProtocol. This could cause confusion for people running the tests in the future, so I'd like to stop those log messages. I'm attaching patch v5 with a minor change in {{MiniDFSCluster}} to stifle the bogus log messages. Fix block protocol so that Datanodes don't require root or jsvc --- Key: HDFS-2856 URL: https://issues.apache.org/jira/browse/HDFS-2856 Project: Hadoop HDFS Issue Type: Improvement Components: datanode, security Affects Versions: 3.0.0, 2.4.0 Reporter: Owen O'Malley Assignee: Chris Nauroth Attachments: Datanode-Security-Design.pdf, Datanode-Security-Design.pdf, Datanode-Security-Design.pdf, HDFS-2856-Test-Plan-1.pdf, HDFS-2856.1.patch, HDFS-2856.2.patch, HDFS-2856.3.patch, HDFS-2856.4.patch, HDFS-2856.5.patch, HDFS-2856.prototype.patch Since we send the block tokens unencrypted to the datanode, we currently start the datanode as root using jsvc and get a secure (< 1024) port. If we have the datanode generate a nonce and send it on the connection, and the client sends an HMAC of the nonce back instead of the block token, it won't reveal any secrets. Thus, we wouldn't require a secure port and would not require root or jsvc. -- This message was sent by Atlassian JIRA (v6.2#6252)
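The nonce/HMAC exchange described in the issue can be sketched with the JDK's javax.crypto primitives. This is a hedged illustration of the idea only, not the actual DataTransferProtocol wire format: the server sends a fresh nonce, and the client proves possession of the shared secret by returning an HMAC of the nonce, so the secret itself never crosses the wire.

```java
import java.nio.charset.StandardCharsets;
import java.security.SecureRandom;
import java.util.Arrays;
import javax.crypto.Mac;
import javax.crypto.spec.SecretKeySpec;

public class NonceHmacSketch {
    // HMAC-SHA256 of the nonce under the shared secret (algorithm choice is
    // illustrative; the real protocol negotiates its own mechanism via SASL).
    static byte[] hmac(byte[] secret, byte[] nonce) throws Exception {
        Mac mac = Mac.getInstance("HmacSHA256");
        mac.init(new SecretKeySpec(secret, "HmacSHA256"));
        return mac.doFinal(nonce);
    }

    public static void main(String[] args) throws Exception {
        byte[] sharedSecret = "block-token-secret".getBytes(StandardCharsets.UTF_8);

        // Server side: generate a fresh random nonce per connection.
        byte[] nonce = new byte[16];
        new SecureRandom().nextBytes(nonce);

        // Client side: respond with HMAC(secret, nonce) instead of the secret.
        byte[] response = hmac(sharedSecret, nonce);

        // Server side: recompute and compare; a match proves key possession.
        assert Arrays.equals(response, hmac(sharedSecret, nonce));
        // A client holding the wrong secret fails the check.
        assert !Arrays.equals(response,
                hmac("wrong".getBytes(StandardCharsets.UTF_8), nonce));
        System.out.println("handshake ok");
    }
}
```

In production code the comparison should use a constant-time equality check (e.g. MessageDigest.isEqual) rather than Arrays.equals, to avoid timing side channels.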
[jira] [Commented] (HDFS-6387) HDFS CLI admin tool for creating deleting an encryption zone
[ https://issues.apache.org/jira/browse/HDFS-6387?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14043797#comment-14043797 ] Charles Lamb commented on HDFS-6387: Thanks for the review [~cmccabe]. The .004 patch fixes those two minor issues. I also noticed that {{CryptoAdmin.ListZonesCommand#getLongUsage()}} didn't need to create a {{TableListing}} so I removed that (effectively) dead code. HDFS CLI admin tool for creating deleting an encryption zone -- Key: HDFS-6387 URL: https://issues.apache.org/jira/browse/HDFS-6387 Project: Hadoop HDFS Issue Type: Sub-task Components: namenode, security Reporter: Alejandro Abdelnur Assignee: Charles Lamb Attachments: HDFS-6387.002.patch, HDFS-6387.003.patch, HDFS-6387.1.patch CLI admin tool to create/delete an encryption zone in HDFS. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Resolved] (HDFS-6387) HDFS CLI admin tool for creating deleting an encryption zone
[ https://issues.apache.org/jira/browse/HDFS-6387?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Charles Lamb resolved HDFS-6387. Resolution: Fixed Fix Version/s: fs-encryption (HADOOP-10150 and HDFS-6134) Committed to fs-encryption. HDFS CLI admin tool for creating deleting an encryption zone -- Key: HDFS-6387 URL: https://issues.apache.org/jira/browse/HDFS-6387 Project: Hadoop HDFS Issue Type: Sub-task Components: namenode, security Reporter: Alejandro Abdelnur Assignee: Charles Lamb Fix For: fs-encryption (HADOOP-10150 and HDFS-6134) Attachments: HDFS-6387.002.patch, HDFS-6387.003.patch, HDFS-6387.1.patch CLI admin tool to create/delete an encryption zone in HDFS. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6134) Transparent data at rest encryption
[ https://issues.apache.org/jira/browse/HDFS-6134?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14043839#comment-14043839 ] Owen O'Malley commented on HDFS-6134: - I'll also point out that I've provided a solution that doesn't change the HDFS core and still lets you use your hdfs urls with encryption... Finally, adding compression to the crypto file system would be a great addition and *still* not require any changes to HDFS or its API. Transparent data at rest encryption --- Key: HDFS-6134 URL: https://issues.apache.org/jira/browse/HDFS-6134 Project: Hadoop HDFS Issue Type: New Feature Components: security Affects Versions: 2.3.0 Reporter: Alejandro Abdelnur Assignee: Alejandro Abdelnur Attachments: HDFSDataatRestEncryptionProposal_obsolete.pdf, HDFSEncryptionConceptualDesignProposal-2014-06-20.pdf Because of privacy and security regulations, for many industries, sensitive data at rest must be in encrypted form. For example: the healthcare industry (HIPAA regulations), the card payment industry (PCI DSS regulations) or the US government (FISMA regulations). This JIRA aims to provide a mechanism to encrypt HDFS data at rest that can be used transparently by any application accessing HDFS via Hadoop Filesystem Java API, Hadoop libhdfs C library, or WebHDFS REST API. The resulting implementation should be able to be used in compliance with different regulation requirements. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6134) Transparent data at rest encryption
[ https://issues.apache.org/jira/browse/HDFS-6134?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14043836#comment-14043836 ] Owen O'Malley commented on HDFS-6134: - Todd, it is *still* transparent encryption if you use cfs:// instead of hdfs://. The important piece is that the application doesn't need to change to access the decrypted storage. My problem is that by refusing to layer the change over the storage layer, this jira is making many disruptive and unnecessary changes to the critical infrastructure and its API. NSE is whole-disk encryption and is equivalent to using dm-crypt to encrypt the block files. That level of encryption is always very transparent and is already available in HDFS without a code change. Aaron, I can't do a meeting tomorrow afternoon. How about tomorrow morning? Say 10am-noon? Transparent data at rest encryption --- Key: HDFS-6134 URL: https://issues.apache.org/jira/browse/HDFS-6134 Project: Hadoop HDFS Issue Type: New Feature Components: security Affects Versions: 2.3.0 Reporter: Alejandro Abdelnur Assignee: Alejandro Abdelnur Attachments: HDFSDataatRestEncryptionProposal_obsolete.pdf, HDFSEncryptionConceptualDesignProposal-2014-06-20.pdf Because of privacy and security regulations, for many industries, sensitive data at rest must be in encrypted form. For example: the healthcare industry (HIPAA regulations), the card payment industry (PCI DSS regulations) or the US government (FISMA regulations). This JIRA aims to provide a mechanism to encrypt HDFS data at rest that can be used transparently by any application accessing HDFS via Hadoop Filesystem Java API, Hadoop libhdfs C library, or WebHDFS REST API. The resulting implementation should be able to be used in compliance with different regulation requirements. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6134) Transparent data at rest encryption
[ https://issues.apache.org/jira/browse/HDFS-6134?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14043846#comment-14043846 ] Alejandro Abdelnur commented on HDFS-6134: -- bq. Todd, it is still transparent encryption if you use cfs:// instead of hdfs://. Owen, that is NOT transparent. Transparent data at rest encryption --- Key: HDFS-6134 URL: https://issues.apache.org/jira/browse/HDFS-6134 Project: Hadoop HDFS Issue Type: New Feature Components: security Affects Versions: 2.3.0 Reporter: Alejandro Abdelnur Assignee: Alejandro Abdelnur Attachments: HDFSDataatRestEncryptionProposal_obsolete.pdf, HDFSEncryptionConceptualDesignProposal-2014-06-20.pdf Because of privacy and security regulations, for many industries, sensitive data at rest must be in encrypted form. For example: the healthcare industry (HIPAA regulations), the card payment industry (PCI DSS regulations) or the US government (FISMA regulations). This JIRA aims to provide a mechanism to encrypt HDFS data at rest that can be used transparently by any application accessing HDFS via Hadoop Filesystem Java API, Hadoop libhdfs C library, or WebHDFS REST API. The resulting implementation should be able to be used in compliance with different regulation requirements. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6389) Rename restrictions for encryption zones
[ https://issues.apache.org/jira/browse/HDFS-6389?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14043880#comment-14043880 ] Colin Patrick McCabe commented on HDFS-6389: OK, I re-checked this and there are some tests posted, but just in a separate patch file. It looks reasonable, but let's put it all into one patch as per usual. Thanks. Rename restrictions for encryption zones Key: HDFS-6389 URL: https://issues.apache.org/jira/browse/HDFS-6389 Project: Hadoop HDFS Issue Type: Sub-task Components: namenode, security Reporter: Alejandro Abdelnur Assignee: Charles Lamb Attachments: HDFS-6389.001.patch, HDFS-6389.tests.patch Files and directories should not be moved in or out an encryption zone. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6134) Transparent data at rest encryption
[ https://issues.apache.org/jira/browse/HDFS-6134?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14043893#comment-14043893 ] Owen O'Malley commented on HDFS-6134: - {quote} Owen, that is NOT transparent. {quote} Transparent means that you shouldn't have to change your application code. Hacking HDFS to add encryption is transparent for one set of apps, but completely breaks others. Changing URLs requires no code changes to any apps. Transparent data at rest encryption --- Key: HDFS-6134 URL: https://issues.apache.org/jira/browse/HDFS-6134 Project: Hadoop HDFS Issue Type: New Feature Components: security Affects Versions: 2.3.0 Reporter: Alejandro Abdelnur Assignee: Alejandro Abdelnur Attachments: HDFSDataatRestEncryptionProposal_obsolete.pdf, HDFSEncryptionConceptualDesignProposal-2014-06-20.pdf Because of privacy and security regulations, for many industries, sensitive data at rest must be in encrypted form. For example: the healthcare industry (HIPAA regulations), the card payment industry (PCI DSS regulations) or the US government (FISMA regulations). This JIRA aims to provide a mechanism to encrypt HDFS data at rest that can be used transparently by any application accessing HDFS via Hadoop Filesystem Java API, Hadoop libhdfs C library, or WebHDFS REST API. The resulting implementation should be able to be used in compliance with different regulation requirements. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6389) Rename restrictions for encryption zones
[ https://issues.apache.org/jira/browse/HDFS-6389?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14043891#comment-14043891 ] Charles Lamb commented on HDFS-6389: bq. let's put it all into one patch as per usual. Yup. The only reason I made the exception this time is because the diffs for the tests were dependent on another non-committed patch (HDFS-6387). When I post the revised diffs, they'll of course be in one patch file. Rename restrictions for encryption zones Key: HDFS-6389 URL: https://issues.apache.org/jira/browse/HDFS-6389 Project: Hadoop HDFS Issue Type: Sub-task Components: namenode, security Reporter: Alejandro Abdelnur Assignee: Charles Lamb Attachments: HDFS-6389.001.patch, HDFS-6389.tests.patch Files and directories should not be moved in or out an encryption zone. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6569) OOB message can't be sent to the client when DataNode shuts down for upgrade
[ https://issues.apache.org/jira/browse/HDFS-6569?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14043915#comment-14043915 ] Brandon Li commented on HDFS-6569: -- The current code looks good logically, and it tries not to close streams before the OOB is sent. I think the problem is triggered by the NIO implementation. When the DataNode is shut down for restart, it interrupts all the DataXceiver threads. The NIO channels in NioInetPeer are bound to the threads doing the block receiving. If these threads are interrupted, the stream/channel is closed due to IO safety issues. So once a DataXceiver thread is interrupted, the OOB can rarely be sent before the NIO channel is closed automatically. One possible fix is to send the OOB message before interrupting the DataXceiver threads. Thoughts? OOB message can't be sent to the client when DataNode shuts down for upgrade Key: HDFS-6569 URL: https://issues.apache.org/jira/browse/HDFS-6569 Project: Hadoop HDFS Issue Type: Bug Components: datanode Affects Versions: 3.0.0, 2.4.0 Reporter: Brandon Li The socket is closed too early, before the OOB message can be sent to the client, which causes the write pipeline failure. -- This message was sent by Atlassian JIRA (v6.2#6252)
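The mechanism Brandon describes is standard NIO behavior: interrupting a thread blocked on an interruptible channel closes the channel and raises ClosedByInterruptException, so any OOB bytes must be written before the interrupt. A minimal stdlib-only demonstration of that behavior (this models the failure mode, not DataNode code):

```java
import java.nio.ByteBuffer;
import java.nio.channels.ClosedByInterruptException;
import java.nio.channels.Pipe;

public class InterruptClosesChannel {
    // Blocks a thread on a channel read, interrupts it, and reports what the
    // reader observed, modeling a DataXceiver interrupted during shutdown.
    static Class<?> interruptBlockedReader() throws Exception {
        Pipe pipe = Pipe.open();
        final Throwable[] seen = new Throwable[1];
        Thread reader = new Thread(() -> {
            try {
                pipe.source().read(ByteBuffer.allocate(64)); // blocks: no data
            } catch (Throwable t) {
                seen[0] = t;
            }
        });
        reader.start();
        Thread.sleep(200);   // let the reader block inside read()
        reader.interrupt();  // the shutdown path interrupts the thread...
        reader.join();
        return seen[0] == null ? null : seen[0].getClass();
    }

    public static void main(String[] args) throws Exception {
        // ...and the JVM closes the channel underneath it, so nothing more
        // can be sent on it. Hence: write the OOB response first, then interrupt.
        Class<?> observed = interruptBlockedReader();
        assert observed == ClosedByInterruptException.class;
        System.out.println("reader saw: " + observed.getSimpleName());
    }
}
```

This is why the proposed ordering (send OOB on each open stream, then interrupt the DataXceiver threads) works where the reverse ordering almost never does.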
[jira] [Updated] (HDFS-6595) Configure the maximum threads allowed for balancing on datanodes
[ https://issues.apache.org/jira/browse/HDFS-6595?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tsz Wo Nicholas Sze updated HDFS-6595: -- Component/s: balancer Priority: Minor (was: Major) Hadoop Flags: Reviewed +1 patch looks good. Configure the maximum threads allowed for balancing on datanodes Key: HDFS-6595 URL: https://issues.apache.org/jira/browse/HDFS-6595 Project: Hadoop HDFS Issue Type: Improvement Components: balancer, datanode Reporter: Benoy Antony Assignee: Benoy Antony Priority: Minor Attachments: HDFS-6595.patch, HDFS-6595.patch Currently the datanode allows a max of 5 threads to be used for balancing. In some cases, it may make sense to use a different number of threads for the purpose of moving blocks. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HDFS-6595) Configure the maximum threads allowed for balancing on datanodes
[ https://issues.apache.org/jira/browse/HDFS-6595?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tsz Wo Nicholas Sze updated HDFS-6595: -- Resolution: Fixed Fix Version/s: 2.5.0 Status: Resolved (was: Patch Available) I have committed this. Thanks, Benoy! Configure the maximum threads allowed for balancing on datanodes Key: HDFS-6595 URL: https://issues.apache.org/jira/browse/HDFS-6595 Project: Hadoop HDFS Issue Type: Improvement Components: balancer, datanode Reporter: Benoy Antony Assignee: Benoy Antony Priority: Minor Fix For: 2.5.0 Attachments: HDFS-6595.patch, HDFS-6595.patch Currently the datanode allows a max of 5 threads to be used for balancing. In some cases, it may make sense to use a different number of threads for the purpose of moving blocks. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6595) Configure the maximum threads allowed for balancing on datanodes
[ https://issues.apache.org/jira/browse/HDFS-6595?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14043953#comment-14043953 ] Hudson commented on HDFS-6595: -- SUCCESS: Integrated in Hadoop-trunk-Commit #5779 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/5779/]) HDFS-6595. Allow the maximum threads for balancing on datanodes to be configurable. Contributed by Benoy Antony (szetszwo: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1605565) * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/DFSConfigKeys.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/balancer/Balancer.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/DataXceiverServer.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/balancer/TestBalancer.java Configure the maximum threads allowed for balancing on datanodes Key: HDFS-6595 URL: https://issues.apache.org/jira/browse/HDFS-6595 Project: Hadoop HDFS Issue Type: Improvement Components: balancer, datanode Reporter: Benoy Antony Assignee: Benoy Antony Priority: Minor Fix For: 2.5.0 Attachments: HDFS-6595.patch, HDFS-6595.patch Currently the datanode allows a max of 5 threads to be used for balancing. In some cases, it may make sense to use a different number of threads for the purpose of moving blocks. -- This message was sent by Atlassian JIRA (v6.2#6252)
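With HDFS-6595 committed, the former hard-coded cap of 5 mover threads becomes a per-datanode setting. The key name below is an assumption based on the DFSConfigKeys change in this commit; verify it against your release's hdfs-default.xml before relying on it. A sketch of the hdfs-site.xml override:

```xml
<!-- hdfs-site.xml on each datanode: raise the balancer mover thread cap.
     Key name assumed; check DFSConfigKeys / hdfs-default.xml of your release. -->
<property>
  <name>dfs.datanode.balance.max.concurrent.moves</name>
  <value>20</value>
</property>
```

The datanode reads this at startup, so it must be set on the datanodes themselves (and they must be restarted), not only on the host running the balancer.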
[jira] [Commented] (HDFS-5546) race condition crashes hadoop ls -R when directories are moved/removed
[ https://issues.apache.org/jira/browse/HDFS-5546?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14043956#comment-14043956 ] Colin Patrick McCabe commented on HDFS-5546: I agree with a lot of the stuff that's been presented, but I also think our behavior should be consistent between {{ls /a1/b /a2/b}} and {{ls /a\{1,2\}/b}}, and right now I can't see a good way to achieve that if we catch IOE (since the globber does not catch IOE). On the other hand, if we catch FNF and continue if a top-level directory disappears on us, then we are making things more consistent, since the globber catches and ignores IOEs (when dealing with globs). bq. Colin Patrick McCabe shouldn't the globStatus() be out of scope for this JIRA? Maybe we should open another related JIRA? I'm not sure how the globber would report IOE other than throwing it. We'd have to return a list of {{Option<FileStatus, IOException>}} or something? It doesn't seem like the kind of change that could be made compatibly, since we'd need a new interface. So overall I would lean towards just catching FNF at the top level, like the earlier patch did, and maybe revisiting this later if we have better ideas about how to handle this in the globber as well. [~daryn], [~eddyxu], does that make sense? Or am I trying too hard to be consistent? :) race condition crashes hadoop ls -R when directories are moved/removed Key: HDFS-5546 URL: https://issues.apache.org/jira/browse/HDFS-5546 Project: Hadoop HDFS Issue Type: Bug Affects Versions: 2.2.0 Reporter: Colin Patrick McCabe Assignee: Lei (Eddy) Xu Priority: Minor Fix For: 3.0.0 Attachments: HDFS-5546.1.patch, HDFS-5546.2.000.patch, HDFS-5546.2.001.patch, HDFS-5546.2.002.patch, HDFS-5546.2.003.patch, HDFS-5546.2.004.patch This seems to be a rare race condition where we have a sequence of events like this: 1. org.apache.hadoop.shell.Ls calls DFS#getFileStatus on directory D. 2. someone deletes or moves directory D 3.
org.apache.hadoop.shell.Ls calls PathData#getDirectoryContents(D), which calls DFS#listStatus(D). This throws FileNotFoundException. 4. ls command terminates with FNF -- This message was sent by Atlassian JIRA (v6.2#6252)
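The "catch FNF and continue" behavior discussed above can be sketched in miniature. This uses {{java.nio.file}} as a stand-in for Hadoop's FileSystem/PathData API; the class and method names are illustrative, not from the patch:

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.*;
import java.util.ArrayList;
import java.util.List;
import java.util.stream.Stream;

public class FnfTolerantLs {
    // Recursively list entries under dir. If a directory vanishes between
    // the status check and the listing (the race described above), catch
    // NoSuchFileException and keep going instead of aborting the whole walk.
    public static List<Path> listRecursive(Path dir) {
        List<Path> out = new ArrayList<>();
        try (Stream<Path> entries = Files.list(dir)) {
            for (Path p : (Iterable<Path>) entries::iterator) {
                out.add(p);
                if (Files.isDirectory(p)) {
                    out.addAll(listRecursive(p));
                }
            }
        } catch (NoSuchFileException e) {
            // Directory disappeared mid-walk: skip it, mirroring the
            // "catch FNF and continue" idea from the comment above.
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out;
    }
}
```

A vanished directory simply yields no children instead of crashing the whole `ls -R`.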
[jira] [Commented] (HDFS-6134) Transparent data at rest encryption
[ https://issues.apache.org/jira/browse/HDFS-6134?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14043975#comment-14043975 ] Aaron T. Myers commented on HDFS-6134: -- bq. Aaron, I can't do a meeting tomorrow afternoon. How about tomorrow morning? Say 10am-noon? Sounds good. Here's the address of Cloudera's SF Office: 433 California Street, Floor 6 San Francisco, CA 94104 I'll post the remote meeting details later today on this JIRA once I get those figured out. See you tomorrow! Transparent data at rest encryption --- Key: HDFS-6134 URL: https://issues.apache.org/jira/browse/HDFS-6134 Project: Hadoop HDFS Issue Type: New Feature Components: security Affects Versions: 2.3.0 Reporter: Alejandro Abdelnur Assignee: Alejandro Abdelnur Attachments: HDFSDataatRestEncryptionProposal_obsolete.pdf, HDFSEncryptionConceptualDesignProposal-2014-06-20.pdf Because of privacy and security regulations, for many industries, sensitive data at rest must be in encrypted form. For example: the healthcare industry (HIPAA regulations), the card payment industry (PCI DSS regulations) or the US government (FISMA regulations). This JIRA aims to provide a mechanism to encrypt HDFS data at rest that can be used transparently by any application accessing HDFS via Hadoop Filesystem Java API, Hadoop libhdfs C library, or WebHDFS REST API. The resulting implementation should be able to be used in compliance with different regulation requirements. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6602) PendingDeletionBlocks on SBN keeps increasing
[ https://issues.apache.org/jira/browse/HDFS-6602?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14043993#comment-14043993 ] Kihwal Lee commented on HDFS-6602: -- Block report processing is actually okay. All block report processing goes through {{BlockManager#processReportedBlock()}}, and any report from the future is queued. It is {{delete()}} that causes this queue to be populated. After collecting all blocks to be invalidated, {{BlockManager#removeBlock()}} is called, which calls {{addToInvalidates()}}. If the NN is in standby, {{addToInvalidates()}} should not be called. PendingDeletionBlocks on SBN keeps increasing - Key: HDFS-6602 URL: https://issues.apache.org/jira/browse/HDFS-6602 Project: Hadoop HDFS Issue Type: Bug Affects Versions: 2.5.0 Reporter: Kihwal Lee Priority: Critical PendingDeletionBlocks is from BlockManager.invalidateBlocks.numBlocks(). It means this data structure is populated, but IBR (incremental block reports) do not cause deleted blocks to be removed from it. As a result, the heap usage keeps increasing. -- This message was sent by Atlassian JIRA (v6.2#6252)
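The fix described above amounts to a guard on the invalidation path. The toy class below only illustrates the idea; the real logic lives in {{BlockManager}} and consults the HA state through other means, and all names here are illustrative:

```java
import java.util.ArrayDeque;
import java.util.Queue;

public class InvalidateGuardSketch {
    public enum HAState { ACTIVE, STANDBY }

    private final Queue<String> invalidateQueue = new ArrayDeque<>();
    private final HAState state;

    public InvalidateGuardSketch(HAState state) { this.state = state; }

    // Sketch of the fix: after delete() collects blocks to invalidate,
    // only enqueue them for deletion when this NameNode is active. A
    // standby must not populate invalidateBlocks, or PendingDeletionBlocks
    // grows without bound, as described in the issue.
    public void removeBlock(String blockId) {
        if (state == HAState.STANDBY) {
            return; // skip the addToInvalidates() equivalent on the standby
        }
        invalidateQueue.add(blockId); // stands in for addToInvalidates()
    }

    public int pendingDeletionBlocks() { return invalidateQueue.size(); }
}
```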
[jira] [Resolved] (HDFS-6602) PendingDeletionBlocks on SBN keeps increasing
[ https://issues.apache.org/jira/browse/HDFS-6602?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kihwal Lee resolved HDFS-6602. -- Resolution: Duplicate Assignee: Kihwal Lee It's already fixed by HDFS-6424! PendingDeletionBlocks on SBN keeps increasing - Key: HDFS-6602 URL: https://issues.apache.org/jira/browse/HDFS-6602 Project: Hadoop HDFS Issue Type: Bug Affects Versions: 2.5.0 Reporter: Kihwal Lee Assignee: Kihwal Lee Priority: Critical PendingDeletionBlocks is from BlockManager.invalidateBlocks.numBlocks(). It means this data structure is populated, but IBR (incremental block reports) do not cause deleted blocks to be removed from it. As a result, the heap usage keeps increasing. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-2856) Fix block protocol so that Datanodes don't require root or jsvc
[ https://issues.apache.org/jira/browse/HDFS-2856?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14044008#comment-14044008 ] Hadoop QA commented on HDFS-2856: - {color:green}+1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12652461/HDFS-2856.5.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 7 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 core tests{color}. The patch passed unit tests in hadoop-hdfs-project/hadoop-hdfs. {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/7234//testReport/ Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/7234//console This message is automatically generated. 
Fix block protocol so that Datanodes don't require root or jsvc --- Key: HDFS-2856 URL: https://issues.apache.org/jira/browse/HDFS-2856 Project: Hadoop HDFS Issue Type: Improvement Components: datanode, security Affects Versions: 3.0.0, 2.4.0 Reporter: Owen O'Malley Assignee: Chris Nauroth Attachments: Datanode-Security-Design.pdf, Datanode-Security-Design.pdf, Datanode-Security-Design.pdf, HDFS-2856-Test-Plan-1.pdf, HDFS-2856.1.patch, HDFS-2856.2.patch, HDFS-2856.3.patch, HDFS-2856.4.patch, HDFS-2856.5.patch, HDFS-2856.prototype.patch Since we send the block tokens unencrypted to the datanode, we currently start the datanode as root using jsvc and get a secure (< 1024) port. If we have the datanode generate a nonce and send it on the connection, and the client sends an HMAC of the nonce back instead of the block token, it won't reveal any secrets. Thus, we wouldn't require a secure port and would not require root or jsvc. -- This message was sent by Atlassian JIRA (v6.2#6252)
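The nonce idea in the description can be sketched with a standard HMAC. This is a minimal illustration assuming HMAC-SHA256 and illustrative names; it is not the protocol from the attached design documents:

```java
import java.security.SecureRandom;
import java.util.Arrays;
import javax.crypto.Mac;
import javax.crypto.spec.SecretKeySpec;

public class NonceHmacSketch {
    // Datanode side: generate a random nonce to challenge the client.
    public static byte[] newNonce() {
        byte[] nonce = new byte[16];
        new SecureRandom().nextBytes(nonce);
        return nonce;
    }

    // Client side: prove knowledge of the shared block-token secret by
    // returning HMAC(secret, nonce) instead of the secret itself, so
    // nothing sensitive crosses the unencrypted connection.
    public static byte[] respond(byte[] sharedSecret, byte[] nonce) throws Exception {
        Mac mac = Mac.getInstance("HmacSHA256");
        mac.init(new SecretKeySpec(sharedSecret, "HmacSHA256"));
        return mac.doFinal(nonce);
    }

    // Datanode side: recompute the HMAC and compare with the response.
    public static boolean verify(byte[] sharedSecret, byte[] nonce, byte[] response)
            throws Exception {
        return Arrays.equals(respond(sharedSecret, nonce), response);
    }
}
```

Because only the HMAC of a fresh nonce travels over the wire, a listener learns neither the secret nor a replayable credential, which is what removes the need for a privileged port.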
[jira] [Commented] (HDFS-6560) Byte array native checksumming on DN side
[ https://issues.apache.org/jira/browse/HDFS-6560?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14044060#comment-14044060 ] Tsz Wo Nicholas Sze commented on HDFS-6560: --- For Java 7, NativeCrc32 with direct buffer is faster than zip.CRC32 for bytes-per-crc < 512 but slower than it for bytes-per-crc > 512. For bytes-per-crc == 512 (which is an important case), their performances are similar. {noformat}
java.version = 1.7.0_60
java.runtime.name = Java(TM) SE Runtime Environment
java.runtime.version = 1.7.0_60-b19
java.vm.version = 24.60-b09
java.vm.vendor = Oracle Corporation
java.vm.name = Java HotSpot(TM) 64-Bit Server VM
java.vm.specification.version = 1.7
java.specification.version = 1.7
os.arch = x86_64
os.name = Mac OS X
os.version = 10.9.3
DATA_LENGTH = 67108864
TRIALS = 10
{noformat} Performance Table (bpc = bytes-per-crc; throughput in MB/sec; #T = #Threads)
| bpc | #T || Zip || PureJava | % diff || Native | % diff | % diff |
| 32 | 1 | 237.7 | 789.8 | 232.3% | 1624.2 | 583.3% | 105.7% |
| 32 | 2 | 207.6 | 604.5 | 191.2% | 1608.3 | 674.8% | 166.1% |
| 32 | 4 | 179.8 | 609.8 | 239.2% | 1387.8 | 671.9% | 127.6% |
| 32 | 8 | 163.4 | 356.8 | 118.3% | 910.4 | 457.1% | 155.1% |
| 32 | 16 | 81.6 | 183.7 | 125.0% | 490.9 | 501.4% | 167.3% |
| bpc | #T || Zip || PureJava | % diff || Native | % diff | % diff |
| 64 | 1 | 423.7 | 1027.0 | 142.4% | 1654.4 | 290.4% | 61.1% |
| 64 | 2 | 417.7 | 1031.8 | 147.0% | 1640.1 | 292.7% | 59.0% |
| 64 | 4 | 366.0 | 693.8 | 89.5% | 1381.7 | 277.5% | 99.2% |
| 64 | 8 | 280.2 | 443.5 | 58.3% | 1046.8 | 273.5% | 136.0% |
| 64 | 16 | 143.3 | 233.0 | 62.6% | 556.3 | 288.2% | 138.8% |
| bpc | #T || Zip || PureJava | % diff || Native | % diff | % diff |
| 128 | 1 | 716.1 | 1229.9 | 71.7% | 1628.6 | 127.4% | 32.4% |
| 128 | 2 | 703.0 | 1221.4 | 73.7% | 1610.0 | 129.0% | 31.8% |
| 128 | 4 | 708.1 | 998.7 | 41.0% | 1408.1 | 98.8% | 41.0% |
| 128 | 8 | 503.3 | 583.7 | 16.0% | 1059.4 | 110.5% | 81.5% |
| 128 | 16 | 259.6 | 316.4 | 21.9% | 610.3 | 135.2% | 92.9% |
| bpc | #T || Zip || PureJava | % diff || Native | % diff | % diff |
| 256 | 1 | 1217.4 | 1346.6 | 10.6% | 1554.3 | 27.7% | 15.4% |
| 256 | 2 | 1186.3 | 1339.0 | 12.9% | 1556.6 | 31.2% | 16.3% |
| 256 | 4 | 1094.9 | 1102.9 | 0.7% | 1389.3 | 26.9% | 26.0% |
| 256 | 8 | 768.3 | 656.8 | -14.5% | 1109.4 | 44.4% | 68.9% |
| 256 | 16 | 394.6 | 358.7 | -9.1% | 597.8 | 51.5% | 66.7% |
| bpc | #T || Zip || PureJava | % diff || Native | % diff | % diff |
| 512 | 1 | 1632.0 | 1391.3 | -14.7% | 1548.1 | -5.1% | 11.3% |
| 512 | 2 | 1608.9 | 1377.8 | -14.4% | 1550.1 | -3.7% | 12.5% |
| 512 | 4 | 1465.2 | 1092.6 | -25.4% | 1420.8 | -3.0% | 30.0% |
| 512 | 8 | 1027.7 | 721.7 | -29.8% | 1124.4 | 9.4% | 55.8% |
| 512 | 16 | 551.6 | 397.9 | -27.9% | 628.2 | 13.9% | 57.9% |
| bpc | #T || Zip || PureJava | % diff || Native | % diff | % diff |
| 1024 | 1 | 1980.3 | 1411.7 | -28.7% | 1570.7 | -20.7% | 11.3% |
| 1024 | 2 | 1909.4 | 1396.7 | -26.9% | 1534.7 | -19.6% | 9.9% |
| 1024 | 4 | 1747.4 | 1159.9 | -33.6% | 1426.2 | -18.4% | 23.0% |
| 1024 | 8 | 1245.6 | 752.7 | -39.6% | 1149.8 | -7.7% | 52.8% |
| 1024 | 16 | 660.6 | 380.2 | -42.4% | 618.1 | -6.4% | 62.6% |
| bpc | #T || Zip || PureJava | % diff || Native | % diff | % diff |
| 2048 | 1 | 2140.4 | 1390.2 | -35.0% | 1570.3 | -26.6% | 13.0% |
| 2048 | 2 | 2126.5 | 1374.5 | -35.4% | 1538.9 | -27.6% | 12.0% |
| 2048 | 4 | 1769.0 | 1132.9 | -36.0% | 1411.5 | -20.2% | 24.6% |
| 2048 | 8 | 1358.6 | 754.8 | -44.4% | 1207.0 | -11.2% | 59.9% |
| 2048 | 16 | 749.4 | 394.4 | -47.4% | 639.9 | -14.6% | 62.2% |
| bpc | #T || Zip || PureJava | % diff || Native | % diff | % diff |
| 4096 | 1 | 2325.5 | 1427.0 | -38.6% | 1531.2 | -34.2% | 7.3% |
| 4096 | 2 | 2199.7 | 1375.1 | -37.5% | 1524.4 | -30.7% | 10.9% |
| 4096 | 4 | 1927.3 | 1103.8 | -42.7% | 1412.7 | -26.7% | 28.0% |
| 4096 | 8 | 1427.1 | 773.2 | -45.8% | 1206.2 | -15.5% | 56.0% |
| 4096 | 16 | 761.0 | 401.3 | -47.3% | 632.6 | -16.9% | 57.6% |
| bpc | #T || Zip || PureJava | % diff || Native | % diff | % diff |
| 8192 | 1 | 2364.7 | 1431.6 | -39.5% | 1566.2 | -33.8%
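The benchmark methodology above (checksumming a buffer in bytes-per-crc-sized chunks and reporting MB/sec) can be roughly sketched with {{java.util.zip.CRC32}}. This single-threaded sketch is illustrative only; it is not the harness that produced the table:

```java
import java.util.zip.CRC32;

public class CrcChunkBench {
    // Checksums `data` in chunks of bytesPerCrc (the "bpc" column above),
    // the way a DataNode computes one CRC per fixed-size chunk, and
    // returns the resulting throughput in MB/sec.
    public static double mbPerSec(byte[] data, int bytesPerCrc) {
        long start = System.nanoTime();
        CRC32 crc = new CRC32();
        for (int off = 0; off < data.length; off += bytesPerCrc) {
            crc.reset();
            crc.update(data, off, Math.min(bytesPerCrc, data.length - off));
        }
        double secs = (System.nanoTime() - start) / 1e9;
        return data.length / (1024.0 * 1024.0) / secs;
    }

    public static void main(String[] args) {
        byte[] data = new byte[8 << 20]; // 8 MB buffer
        for (int bpc : new int[] {32, 512, 4096}) {
            System.out.printf("bpc=%4d  %.1f MB/sec%n", bpc, mbPerSec(data, bpc));
        }
    }
}
```

The per-chunk reset/update overhead is what makes small bytes-per-crc values slower, which matches the trend visible in the Zip column.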
[jira] [Commented] (HDFS-6134) Transparent data at rest encryption
[ https://issues.apache.org/jira/browse/HDFS-6134?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14044087#comment-14044087 ] Aaron T. Myers commented on HDFS-6134: -- Here's the WebEx information for those who are planning on joining remotely tomorrow from 10am-noon Pacific Time: {noformat} --- To start or join the online meeting --- Go to https://cloudera.webex.com/cloudera/j.php?MTID=me67e0b50829b1dc39077ac5ca323038a --- Audio Only conference information --- Call-in toll number (US/Canada): 1-650-479-3208 Access code:627 373 149 Global call-in numbers: https://cloudera.webex.com/cloudera/globalcallin.php?serviceType=MCED=321024932tollFree=0 {noformat} Transparent data at rest encryption --- Key: HDFS-6134 URL: https://issues.apache.org/jira/browse/HDFS-6134 Project: Hadoop HDFS Issue Type: New Feature Components: security Affects Versions: 2.3.0 Reporter: Alejandro Abdelnur Assignee: Alejandro Abdelnur Attachments: HDFSDataatRestEncryptionProposal_obsolete.pdf, HDFSEncryptionConceptualDesignProposal-2014-06-20.pdf Because of privacy and security regulations, for many industries, sensitive data at rest must be in encrypted form. For example: the healthcare industry (HIPAA regulations), the card payment industry (PCI DSS regulations) or the US government (FISMA regulations). This JIRA aims to provide a mechanism to encrypt HDFS data at rest that can be used transparently by any application accessing HDFS via Hadoop Filesystem Java API, Hadoop libhdfs C library, or WebHDFS REST API. The resulting implementation should be able to be used in compliance with different regulation requirements. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6591) while loop is executed tens of thousands of times in Hedged Read
[ https://issues.apache.org/jira/browse/HDFS-6591?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14044093#comment-14044093 ] stack commented on HDFS-6591: - Nice test and added metrics, [~xieliang007]. Looks good on first pass. Let me give it another pass. while loop is executed tens of thousands of times in Hedged Read -- Key: HDFS-6591 URL: https://issues.apache.org/jira/browse/HDFS-6591 Project: Hadoop HDFS Issue Type: Bug Components: hdfs-client Affects Versions: 2.4.0 Reporter: LiuLei Assignee: Liang Xie Attachments: HDFS-6591.txt, LoopTooManyTimesTestCase.patch I downloaded the hadoop-2.4.1-rc1 code from http://people.apache.org/~acmurthy/hadoop-2.4.1-rc1/ and tested Hedged Read. I found that the while loop in the hedgedFetchBlockByteRange method is executed tens of thousands of times. -- This message was sent by Atlassian JIRA (v6.2#6252)
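The reported spinning comes from how a retry loop waits on its {{CompletionService}}. A minimal sketch (illustrative names, not the DFSInputStream code) contrasting a non-blocking {{poll()}} busy-spin with a bounded blocking {{poll(timeout)}}:

```java
import java.util.concurrent.*;

public class HedgedPollSketch {
    // Waits for whichever submitted read finishes first and counts loop
    // iterations. With timeoutMs == 0 the loop calls the non-blocking
    // poll() and spins (the tens-of-thousands-of-iterations symptom);
    // with a timeout, poll(timeout) parks the thread between checks, so
    // the loop runs only a handful of times.
    public static int iterationsUntilResult(CompletionService<String> svc, long timeoutMs)
            throws InterruptedException {
        int loops = 0;
        while (true) {
            loops++;
            Future<String> done = (timeoutMs == 0)
                    ? svc.poll()                                   // busy spin
                    : svc.poll(timeoutMs, TimeUnit.MILLISECONDS);  // blocking wait
            if (done != null) {
                return loops;
            }
        }
    }

    public static void main(String[] args) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(1);
        CompletionService<String> svc = new ExecutorCompletionService<>(pool);
        svc.submit(() -> { Thread.sleep(50); return "block data"; });
        System.out.println("loops with blocking poll: " + iterationsUntilResult(svc, 10));
        pool.shutdown();
    }
}
```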
[jira] [Commented] (HDFS-6134) Transparent data at rest encryption
[ https://issues.apache.org/jira/browse/HDFS-6134?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14044109#comment-14044109 ] Owen O'Malley commented on HDFS-6134: - Any chance for the PA office? Otherwise I'll be dialing in. Transparent data at rest encryption --- Key: HDFS-6134 URL: https://issues.apache.org/jira/browse/HDFS-6134 Project: Hadoop HDFS Issue Type: New Feature Components: security Affects Versions: 2.3.0 Reporter: Alejandro Abdelnur Assignee: Alejandro Abdelnur Attachments: HDFSDataatRestEncryptionProposal_obsolete.pdf, HDFSEncryptionConceptualDesignProposal-2014-06-20.pdf Because of privacy and security regulations, for many industries, sensitive data at rest must be in encrypted form. For example: the healthcare industry (HIPAA regulations), the card payment industry (PCI DSS regulations) or the US government (FISMA regulations). This JIRA aims to provide a mechanism to encrypt HDFS data at rest that can be used transparently by any application accessing HDFS via Hadoop Filesystem Java API, Hadoop libhdfs C library, or WebHDFS REST API. The resulting implementation should be able to be used in compliance with different regulation requirements. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6134) Transparent data at rest encryption
[ https://issues.apache.org/jira/browse/HDFS-6134?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14044124#comment-14044124 ] Aaron T. Myers commented on HDFS-6134: -- Unfortunately not, all of Tucu, Andrew, Charlie, Colin, Todd, and I are all based out of the SF office and it's quite a hike for us to get down there. Sure you can't come up to SF? I'll buy you lunch after the meeting. :) Transparent data at rest encryption --- Key: HDFS-6134 URL: https://issues.apache.org/jira/browse/HDFS-6134 Project: Hadoop HDFS Issue Type: New Feature Components: security Affects Versions: 2.3.0 Reporter: Alejandro Abdelnur Assignee: Alejandro Abdelnur Attachments: HDFSDataatRestEncryptionProposal_obsolete.pdf, HDFSEncryptionConceptualDesignProposal-2014-06-20.pdf Because of privacy and security regulations, for many industries, sensitive data at rest must be in encrypted form. For example: the healthcare industry (HIPAA regulations), the card payment industry (PCI DSS regulations) or the US government (FISMA regulations). This JIRA aims to provide a mechanism to encrypt HDFS data at rest that can be used transparently by any application accessing HDFS via Hadoop Filesystem Java API, Hadoop libhdfs C library, or WebHDFS REST API. The resulting implementation should be able to be used in compliance with different regulation requirements. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HDFS-6389) Rename restrictions for encryption zones
[ https://issues.apache.org/jira/browse/HDFS-6389?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Charles Lamb updated HDFS-6389: --- Attachment: HDFS-6389.002.patch [~cmccabe], Thanks for the review. My intent of putting it in FSN was to have the code fail sooner rather than after both the FSN and FSD locks were taken, but longer term I agree that it should be moved down to FSD from FSN. The revised diffs are the .002 version of the file. Rename restrictions for encryption zones Key: HDFS-6389 URL: https://issues.apache.org/jira/browse/HDFS-6389 Project: Hadoop HDFS Issue Type: Sub-task Components: namenode, security Reporter: Alejandro Abdelnur Assignee: Charles Lamb Attachments: HDFS-6389.001.patch, HDFS-6389.002.patch, HDFS-6389.tests.patch Files and directories should not be moved in or out an encryption zone. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6591) while loop is executed tens of thousands of times in Hedged Read
[ https://issues.apache.org/jira/browse/HDFS-6591?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14044172#comment-14044172 ] Akira AJISAKA commented on HDFS-6591: - The fix looks nice to me. In TestPread.java, {code} } isHedgedRead = true; } {code} Would you please create an {{@Before}} method and initialize the variables there, instead of setting them at the end of the {{@Test}} method as above? Minor nits: 1. In DFSInputStream.java:1107, {code} Future<ByteBuffer> future = null; {code} Now that {{future}} is not used in the else clause, would you move the declaration into the try-catch clause? 2. There is a trailing white space in {code} +CompletionService<ByteBuffer> hedgedService = {code} while loop is executed tens of thousands of times in Hedged Read -- Key: HDFS-6591 URL: https://issues.apache.org/jira/browse/HDFS-6591 Project: Hadoop HDFS Issue Type: Bug Components: hdfs-client Affects Versions: 2.4.0 Reporter: LiuLei Assignee: Liang Xie Attachments: HDFS-6591.txt, LoopTooManyTimesTestCase.patch I downloaded the hadoop-2.4.1-rc1 code from http://people.apache.org/~acmurthy/hadoop-2.4.1-rc1/ and tested Hedged Read. I found that the while loop in the hedgedFetchBlockByteRange method is executed tens of thousands of times. -- This message was sent by Atlassian JIRA (v6.2#6252)
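The reviewer's suggestion, restated as a plain-Java sketch without a JUnit dependency (all names are illustrative): reset shared state in a setUp() method, the way a JUnit {{@Before}} method would run before each test, instead of restoring it at the end of each test:

```java
public class BeforeStyleSketch {
    // A shared flag like TestPread's isHedgedRead. Resetting it in setUp()
    // before every test means no test has to remember to restore it, and
    // state cannot leak from one test into the next.
    static boolean isHedgedRead;

    static void setUp() {            // stands in for a JUnit @Before method
        isHedgedRead = false;
    }

    static boolean runHedgedReadTest() {
        isHedgedRead = true;         // the test flips the flag...
        return isHedgedRead;         // ...and no longer needs to restore it
    }

    static boolean runPlainReadTest() {
        return isHedgedRead;         // sees a clean value thanks to setUp()
    }
}
```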
[jira] [Updated] (HDFS-5369) Support negative caching of user-group mapping
[ https://issues.apache.org/jira/browse/HDFS-5369?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lei (Eddy) Xu updated HDFS-5369: Attachment: HDFS-5369.000.patch This patch re-enables the negative cache behavior in user-group mapping, so that it does not result in _*long-lasting*_, frequent retries when the user/group resolution service has transient issues. The difference between this patch and HADOOP-8088 is that this patch adds another timeout for negatively cached items. Thus it should be able to differentiate the expiration times for the normal case and the negatively cached case. It also reduces the error messages generated from the Ldap and Shell based GroupsMappings. It would be great to see whether this patch fits the different cases from [~andrew.wang], [~vinayrpet] and [~kihwal]. All feedback is welcome :) Support negative caching of user-group mapping -- Key: HDFS-5369 URL: https://issues.apache.org/jira/browse/HDFS-5369 Project: Hadoop HDFS Issue Type: Improvement Components: namenode Affects Versions: 2.2.0 Reporter: Andrew Wang Attachments: HDFS-5369.000.patch We've seen a situation at a couple of our customers where interactions from an unknown user lead to a high rate of group mapping calls. In one case, this was happening at a rate of 450 calls per second with the shell-based group mapping, enough to severely impact overall namenode performance and also leading to large amounts of log spam (it prints a stack trace each time). Let's consider negative caching of group mapping, as well as quashing the rate of this log message. -- This message was sent by Atlassian JIRA (v6.2#6252)
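The two-timeout idea above can be sketched as follows. The names and API are illustrative, not the actual {{Groups}} / group-mapping code in the patch: cache successful lookups with one TTL, and cache the *absence* of a user with a separate (typically shorter) TTL, so an unknown user cannot trigger hundreds of lookups per second:

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

public class NegativeCacheSketch {
    private static final class Entry {
        final List<String> groups;   // null marks a negative entry
        final long expiresAtMs;
        Entry(List<String> groups, long expiresAtMs) {
            this.groups = groups;
            this.expiresAtMs = expiresAtMs;
        }
    }

    private final Map<String, Entry> cache = new HashMap<>();
    private final long positiveTtlMs, negativeTtlMs;
    private final Function<String, List<String>> resolver;
    public long resolverCalls; // exposed so the effect is observable

    public NegativeCacheSketch(long positiveTtlMs, long negativeTtlMs,
                               Function<String, List<String>> resolver) {
        this.positiveTtlMs = positiveTtlMs;
        this.negativeTtlMs = negativeTtlMs;
        this.resolver = resolver;
    }

    public List<String> getGroups(String user) {
        long now = System.currentTimeMillis();
        Entry e = cache.get(user);
        if (e != null && now < e.expiresAtMs) {
            return e.groups; // may be null: a cached "no such user"
        }
        resolverCalls++;
        List<String> groups = resolver.apply(user); // null on failure
        long ttl = (groups == null) ? negativeTtlMs : positiveTtlMs;
        cache.put(user, new Entry(groups, now + ttl));
        return groups;
    }
}
```

Keeping the negative TTL separate lets operators expire "unknown user" answers quickly without shortening the lifetime of good entries.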
[jira] [Commented] (HDFS-5369) Support negative caching of user-group mapping
[ https://issues.apache.org/jira/browse/HDFS-5369?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14044201#comment-14044201 ] Lei (Eddy) Xu commented on HDFS-5369: - [~andrew.wang] Thanks for your comments and for pointing to the issues raised in HADOOP-8088. I would consider this patch a reference implementation, so it has not yet addressed the error handling of a transient error. I will address it, as well as SLAs, after getting some input. Moreover, the default value for the negative cache timeout is still not clear to me. It might need more field data to choose an appropriate timeout value here. Support negative caching of user-group mapping -- Key: HDFS-5369 URL: https://issues.apache.org/jira/browse/HDFS-5369 Project: Hadoop HDFS Issue Type: Improvement Components: namenode Affects Versions: 2.2.0 Reporter: Andrew Wang Attachments: HDFS-5369.000.patch We've seen a situation at a couple of our customers where interactions from an unknown user lead to a high rate of group mapping calls. In one case, this was happening at a rate of 450 calls per second with the shell-based group mapping, enough to severely impact overall namenode performance and also leading to large amounts of log spam (it prints a stack trace each time). Let's consider negative caching of group mapping, as well as quashing the rate of this log message. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-5321) Clean up the HTTP-related configuration in HDFS
[ https://issues.apache.org/jira/browse/HDFS-5321?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14044254#comment-14044254 ] Aaron T. Myers commented on HDFS-5321: -- I very much disagree with the notion of private configurations. To my knowledge we've never made such a distinction, and if we have been then it certainly should have been called out more explicitly for each individual setting than the mere absence of them from {{hdfs-default.xml}}. bq. Since HDFS maintains no compatibility guarantees for private configurations, it should be okay to include this in minor releases. Where are you concluding this from? Our compatibility guide makes no mention of it, or the concept of private configurations at all: http://hadoop.apache.org/docs/stable/hadoop-project-dist/hadoop-common/Compatibility.html I think we should seriously consider reverting this change. The stated benefits seem quite minor, these conf settings have never been deprecated properly using DeprecationDelta (meaning users have never seen printed warnings about the deprecation), and this is a clearly incompatible change that has the potential to break existing applications as-written. Clean up the HTTP-related configuration in HDFS --- Key: HDFS-5321 URL: https://issues.apache.org/jira/browse/HDFS-5321 Project: Hadoop HDFS Issue Type: Sub-task Reporter: Haohui Mai Assignee: Haohui Mai Fix For: 2.4.0 Attachments: HDFS-5321.000.patch, HDFS-5321.001.patch Currently there are multiple configuration keys that control the ports that the NameNode and DataNode listen to, and the default ports that the hftp/webhdfs clients are connecting to. 
Below is a quick summary of these configurations:
|| Keys || Description ||
| dfs.namenode.http-address | The address that the namenode http server binds to |
| dfs.namenode.https-address | The address that the namenode https server binds to |
| dfs.http.port | The default port that the hftp/webhdfs clients use to connect to the remote server |
| dfs.https.port | The default port that the hsftp client uses to connect to the remote server |
I propose to deprecate dfs.http.port and dfs.https.port to avoid potential confusion (e.g., HDFS-5316). Note that this removes no functionality, since users can specify ports in hftp / webhdfs URLs when they need to connect to HDFS servers with non-default ports. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-5321) Clean up the HTTP-related configuration in HDFS
[ https://issues.apache.org/jira/browse/HDFS-5321?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14044276#comment-14044276 ] Haohui Mai commented on HDFS-5321: -- Points taken. I'm okay with putting these two configurations back into branch-2, but it looks to me that it requires more work than simply reverting the changes. Can you please create a jira for this task? Clean up the HTTP-related configuration in HDFS --- Key: HDFS-5321 URL: https://issues.apache.org/jira/browse/HDFS-5321 Project: Hadoop HDFS Issue Type: Sub-task Reporter: Haohui Mai Assignee: Haohui Mai Fix For: 2.4.0 Attachments: HDFS-5321.000.patch, HDFS-5321.001.patch Currently there are multiple configuration keys that control the ports that the NameNode and DataNode listen to, and the default ports that the hftp/webhdfs clients are connecting to. Below is a quick summary of these configurations:
|| Keys || Description ||
| dfs.namenode.http-address | The address that the namenode http server binds to |
| dfs.namenode.https-address | The address that the namenode https server binds to |
| dfs.http.port | The default port that the hftp/webhdfs clients use to connect to the remote server |
| dfs.https.port | The default port that the hsftp client uses to connect to the remote server |
I propose to deprecate dfs.http.port and dfs.https.port to avoid potential confusion (e.g., HDFS-5316). Note that this removes no functionality, since users can specify ports in hftp / webhdfs URLs when they need to connect to HDFS servers with non-default ports. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (HDFS-6591) while loop is executed tens of thousands of times in Hedged Read
[ https://issues.apache.org/jira/browse/HDFS-6591?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14044321#comment-14044321 ] Liang Xie commented on HDFS-6591: - The attached v2 should address the above comments. Thank you, [~stack] and [~ajisakaa], for reviewing! while loop is executed tens of thousands of times in Hedged Read -- Key: HDFS-6591 URL: https://issues.apache.org/jira/browse/HDFS-6591 Project: Hadoop HDFS Issue Type: Bug Components: hdfs-client Affects Versions: 2.4.0 Reporter: LiuLei Assignee: Liang Xie Attachments: HDFS-6591-v2.txt, HDFS-6591.txt, LoopTooManyTimesTestCase.patch I downloaded the hadoop-2.4.1-rc1 code from http://people.apache.org/~acmurthy/hadoop-2.4.1-rc1/ and tested Hedged Read. I found that the while loop in the hedgedFetchBlockByteRange method is executed tens of thousands of times. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HDFS-6591) while loop is executed tens of thousands of times in Hedged Read
[ https://issues.apache.org/jira/browse/HDFS-6591?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Liang Xie updated HDFS-6591: Attachment: HDFS-6591-v2.txt while loop is executed tens of thousands of times in Hedged Read -- Key: HDFS-6591 URL: https://issues.apache.org/jira/browse/HDFS-6591 Project: Hadoop HDFS Issue Type: Bug Components: hdfs-client Affects Versions: 2.4.0 Reporter: LiuLei Assignee: Liang Xie Attachments: HDFS-6591-v2.txt, HDFS-6591.txt, LoopTooManyTimesTestCase.patch I downloaded the hadoop-2.4.1-rc1 code from http://people.apache.org/~acmurthy/hadoop-2.4.1-rc1/ and tested Hedged Read. I found that the while loop in the hedgedFetchBlockByteRange method is executed tens of thousands of times. -- This message was sent by Atlassian JIRA (v6.2#6252)