[jira] [Updated] (HDFS-5157) Add StorageType to FsVolume
[ https://issues.apache.org/jira/browse/HDFS-5157?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Junping Du updated HDFS-5157:
-----------------------------
    Description: To support heterogeneous storage, the Datanode should be aware of (and later manage) the different storage types of each volume.
      (was: The Datanode should allow choosing a target Storage or target Storage Type as a parameter when creating a new block. Currently there are two ways in which the target volume is chosen (via {{VolumeChoosingPolicy#chooseVolume}}):
      # AvailableSpaceVolumeChoosingPolicy
      # RoundRobinVolumeChoosingPolicy
      BlockReceiver and receiveBlock should also accept a new parameter for the target storage or storage type.)
        Summary: Add StorageType to FsVolume  (was: Datanode should allow choosing the target storage)

Add StorageType to FsVolume
---------------------------
                 Key: HDFS-5157
                 URL: https://issues.apache.org/jira/browse/HDFS-5157
             Project: Hadoop HDFS
          Issue Type: Sub-task
          Components: datanode
    Affects Versions: Heterogeneous Storage (HDFS-2832)
            Reporter: Arpit Agarwal
            Assignee: Junping Du
         Attachments: HDFS-5157-v1.patch, HDFS-5157-v2.patch, HDFS-5157-v3.patch

To support heterogeneous storage, the Datanode should be aware of (and later manage) the different storage types of each volume.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators.
For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-5157) Add StorageType to FsVolume
[ https://issues.apache.org/jira/browse/HDFS-5157?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13764032#comment-13764032 ]

Junping Du commented on HDFS-5157:
----------------------------------
Updated. Thanks for your review, Nicholas!
[jira] [Updated] (HDFS-5157) Add StorageType to FsVolume
[ https://issues.apache.org/jira/browse/HDFS-5157?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Tsz Wo (Nicholas), SZE updated HDFS-5157:
-----------------------------------------
    Hadoop Flags: Reviewed

+1 patch looks good.
[jira] [Resolved] (HDFS-5157) Add StorageType to FsVolume
[ https://issues.apache.org/jira/browse/HDFS-5157?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Tsz Wo (Nicholas), SZE resolved HDFS-5157.
------------------------------------------
       Resolution: Fixed
    Fix Version/s: Heterogeneous Storage (HDFS-2832)

I have committed this. Thanks, Junping!
[jira] [Commented] (HDFS-5158) add command-line support for manipulating cache directives
[ https://issues.apache.org/jira/browse/HDFS-5158?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13764061#comment-13764061 ]

Aaron T. Myers commented on HDFS-5158:
--------------------------------------
bq. CachedPath would be a misleading name, since we may or may not actually be able to cache the path entries in PathCache. Resources aren't infinite. Bear in mind that there are going to be other caches that don't operate by path name -- one example is the LRU cache we've talked about. PathCache, as well as PathCacheDirective, PathCacheEntry, etc. are named the way they are to distinguish them from the (future) LruCacheDirective, etc. classes which don't exist yet.

Even with this justification and the context of the future Lru* classes, there's just no way you can keep people from interpreting "path cache" to mean a cache of paths, which I find vastly more misleading/confusing than CachedPath would be. How do you feel about CachePath (no 'd'), as I also suggested? That doesn't necessarily imply that anything is already cached, and it also appears to work with the future classes you mentioned here, e.g. CacheLruDirective, CacheLruEntry, etc.

bq. For example, Impala or Hive may want to add many cache directives at once.

I'm a tad skeptical this will in fact be the case, given that a caching directive can potentially provide a directory as the path, but it seems fairly harmless to leave in the RPCs that take lists as arguments.

bq. We discussed this on HDFS-5052. The short summary is that paths don't uniquely identify path cache directives. You can have multiple directives that apply to the same path.

Could you even have multiple directives for the same path within a single pool? Or would the (pool, path) pair uniquely identify the cache directive?

bq. I did not change the prefix in this patch. Does it make sense to put the prefix change stuff in another JIRA? It seems like it will be a bigger effort, if we're moving the -addCachePool, etc. commands as well.

I'd personally do it in this JIRA; it really shouldn't be that much work. But if you feel strongly about it, you can do it in a separate JIRA if you want.

add command-line support for manipulating cache directives
----------------------------------------------------------
                 Key: HDFS-5158
                 URL: https://issues.apache.org/jira/browse/HDFS-5158
             Project: Hadoop HDFS
          Issue Type: Sub-task
          Components: datanode, namenode
    Affects Versions: HDFS-4949
            Reporter: Colin Patrick McCabe
            Assignee: Colin Patrick McCabe
         Attachments: HDFS-5158-caching.003.patch, HDFS-5158-caching.004.patch, HDFS-5158-caching.005.patch, HDFS-5158-caching.006.patch

We should add command-line support for creating, removing, and listing cache directives.
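The point above that "paths don't uniquely identify path cache directives" can be illustrated with a small, self-contained sketch. None of these names are the real HDFS-4949 classes -- DirectiveSketch, Entry, and addDirective are hypothetical stand-ins -- but the sketch shows why a directive needs its own unique id when several directives (possibly in different pools) may target the same path:

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch (not the real HDFS-4949 classes): a path cannot serve
// as a directive's identity, because several directives, possibly in
// different pools, may target the same path. Each entry carries a unique id.
public class DirectiveSketch {
    public static final class Entry {
        final long id;
        final String pool;
        final String path;

        Entry(long id, String pool, String path) {
            this.id = id;
            this.pool = pool;
            this.path = path;
        }
    }

    private final List<Entry> entries = new ArrayList<>();
    private long nextId = 1;

    public long addDirective(String pool, String path) {
        Entry e = new Entry(nextId++, pool, path);
        entries.add(e);
        return e.id; // the id, not the path, identifies the directive
    }

    public long countForPath(String path) {
        return entries.stream().filter(e -> e.path.equals(path)).count();
    }
}
```

Under this model, removal and listing commands would operate on ids, which sidesteps the ambiguity of two directives on one path entirely.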
[jira] [Commented] (HDFS-5031) BlockScanner scans the block multiple times and on restart scans everything
[ https://issues.apache.org/jira/browse/HDFS-5031?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13764063#comment-13764063 ]

Vinay commented on HDFS-5031:
-----------------------------
bq. I ran TestDatanodeBlockScanner#testDuplicateScans without the rest of the code changes and it continues to pass. Do you see the same?

Yes, I also observed this yesterday. I had missed one assertion; it will be updated in the upcoming patch.

bq. I did not understand how the isNewPeriod check works. I will continue to take a look but meanwhile if someone more familiar with this code wants to chime in please do so.

{{processedBlocks}} is reset on every log roll, but {{bytesLeft}} is reset only on every {{startNewPeriod()}}. So on every log roll {{bytesLeft}} was being decremented unnecessarily in {{assignInitialVerificationTimes()}}, which resulted in negative values of bytesLeft. Because of this, scanning was returning from {{workRemainingInCurrentPeriod()}} without scanning the latest blocks. We should decrement it only once after starting a new period.

bq. BlockScanInfo#equals looks redundant now. Can we just remove it?

Yes, I will remove it in the next patch.

bq. In Reader#next, should the assignment to lastReadFile happen after the call to readNext?

{{Reader#next}} is not actually reading again; it returns the previously read line. So assigning {{lastReadFile}} before {{readNext}} is correct.

BlockScanner scans the block multiple times and on restart scans everything
---------------------------------------------------------------------------
                 Key: HDFS-5031
                 URL: https://issues.apache.org/jira/browse/HDFS-5031
             Project: Hadoop HDFS
          Issue Type: Bug
          Components: datanode
    Affects Versions: 3.0.0, 2.1.0-beta
            Reporter: Vinay
            Assignee: Vinay
         Attachments: HDFS-5031.patch, HDFS-5031.patch

BlockScanner scans the block twice, and on restart of the datanode it scans everything.
Steps:
1. Write blocks with an interval of more than 5 seconds; write a new block on completion of the scan for the previous block.
Each time the datanode scans a new block, it also re-scans the previous block, which was already scanned. After a restart, the datanode scans all blocks again.
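Vinay's explanation of the {{bytesLeft}} bug above can be reduced to a small, self-contained sketch. The class and method names here (ScanPeriodSketch, accountVerifiedBlock, finishInitialAssignment) are hypothetical simplifications, not the actual BlockScanner code; the point is only the one-shot guard that prevents the per-log-roll double decrement:

```java
// Hypothetical simplification of the HDFS-5031 fix: the byte budget for
// already-verified blocks must be decremented only once per scan period,
// not on every log roll (which was driving bytesLeft negative).
public class ScanPeriodSketch {
    private long bytesLeft;
    private boolean isNewPeriod;

    public ScanPeriodSketch(long totalBytes) {
        this.bytesLeft = totalBytes;
        this.isNewPeriod = true; // a fresh period allows one decrement pass
    }

    // Analogous to assignInitialVerificationTimes(); may run on every log roll.
    public void accountVerifiedBlock(long blockBytes) {
        if (isNewPeriod) { // the buggy code decremented unconditionally
            bytesLeft -= blockBytes;
        }
    }

    // Close the one-shot window once the initial assignment pass is done.
    public void finishInitialAssignment() {
        isNewPeriod = false;
    }

    // Analogous to the check feeding workRemainingInCurrentPeriod().
    public long bytesLeft() {
        return bytesLeft;
    }
}
```

With the guard, a second log roll leaves the budget untouched, so the scanner still sees the remaining work for the period instead of a spurious negative value.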
[jira] [Commented] (HDFS-5157) Add StorageType to FsVolume
[ https://issues.apache.org/jira/browse/HDFS-5157?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13764062#comment-13764062 ]

Junping Du commented on HDFS-5157:
----------------------------------
Thanks Nicholas for the review and comments!
[jira] [Updated] (HDFS-5031) BlockScanner scans the block multiple times and on restart scans everything
[ https://issues.apache.org/jira/browse/HDFS-5031?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Vinay updated HDFS-5031:
------------------------
    Attachment: HDFS-5031.patch

Updated the patch per the review comments.
[jira] [Commented] (HDFS-5031) BlockScanner scans the block multiple times and on restart scans everything
[ https://issues.apache.org/jira/browse/HDFS-5031?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13764148#comment-13764148 ]

Hadoop QA commented on HDFS-5031:
---------------------------------
{color:red}-1 overall{color}. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12602541/HDFS-5031.patch
against trunk revision .

{color:green}+1 @author{color}. The patch does not contain any @author tags.

{color:green}+1 tests included{color}. The patch appears to include 2 new or modified test files.

{color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings.

{color:green}+1 javadoc{color}. The javadoc tool did not generate any warning messages.

{color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse.

{color:red}-1 findbugs{color}. The patch appears to introduce 1 new Findbugs (version 1.3.9) warning.

{color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings.

{color:green}+1 core tests{color}. The patch passed unit tests in hadoop-hdfs-project/hadoop-hdfs.

{color:green}+1 contrib tests{color}. The patch passed contrib unit tests.

Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/4954//testReport/
Findbugs warnings: https://builds.apache.org/job/PreCommit-HDFS-Build/4954//artifact/trunk/patchprocess/newPatchFindbugsWarningshadoop-hdfs.html
Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/4954//console

This message is automatically generated.
[jira] [Commented] (HDFS-5031) BlockScanner scans the block multiple times and on restart scans everything
[ https://issues.apache.org/jira/browse/HDFS-5031?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13764158#comment-13764158 ]

Vinay commented on HDFS-5031:
-----------------------------
{quote}bq. BlockScanInfo#equals looks redundant now. Can we just remove it?
Yes, I will remove in next patch{quote}

The Findbugs warning is due to this: it seems that overriding equals(), even though redundant, is necessary. Any thoughts?
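The warning Vinay hit is most likely the standard Findbugs equals/hashCode pairing rule: if a class defines one of the pair, Findbugs expects the matching other, even when the override only delegates. A minimal, hypothetical illustration (BlockScanInfoSketch is a stand-in, not the real BlockScanInfo):

```java
// Minimal illustration of the equals/hashCode pairing that Findbugs enforces.
// BlockScanInfoSketch is a hypothetical stand-in, not the real BlockScanInfo.
public class BlockScanInfoSketch {
    private final long blockId;

    public BlockScanInfoSketch(long blockId) {
        this.blockId = blockId;
    }

    // Keeping the override explicit, even when it looks redundant,
    // keeps the equals/hashCode pair together and Findbugs quiet.
    @Override
    public boolean equals(Object o) {
        if (this == o) {
            return true;
        }
        if (!(o instanceof BlockScanInfoSketch)) {
            return false;
        }
        return blockId == ((BlockScanInfoSketch) o).blockId;
    }

    @Override
    public int hashCode() {
        return Long.hashCode(blockId); // must be consistent with equals
    }
}
```

The contract being enforced is that two objects equal under equals() must report the same hashCode(), otherwise hash-based collections silently misbehave.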
[jira] [Commented] (HDFS-5085) Refactor o.a.h.nfs to support different types of authentications
[ https://issues.apache.org/jira/browse/HDFS-5085?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13764204#comment-13764204 ]

Hudson commented on HDFS-5085:
------------------------------
SUCCESS: Integrated in Hadoop-Yarn-trunk #329 (See [https://builds.apache.org/job/Hadoop-Yarn-trunk/329/])
HDFS-5085. Refactor o.a.h.nfs to support different types of authentications. Contributed by Jing Zhao. (jing9: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1521601)
* /hadoop/common/trunk/hadoop-common-project/hadoop-nfs/src/main/java/org/apache/hadoop/mount/MountResponse.java
* /hadoop/common/trunk/hadoop-common-project/hadoop-nfs/src/main/java/org/apache/hadoop/nfs/AccessPrivilege.java
* /hadoop/common/trunk/hadoop-common-project/hadoop-nfs/src/main/java/org/apache/hadoop/nfs/NfsExports.java
* /hadoop/common/trunk/hadoop-common-project/hadoop-nfs/src/main/java/org/apache/hadoop/nfs/nfs3/IdUserGroup.java
* /hadoop/common/trunk/hadoop-common-project/hadoop-nfs/src/main/java/org/apache/hadoop/nfs/nfs3/Nfs3Constant.java
* /hadoop/common/trunk/hadoop-common-project/hadoop-nfs/src/main/java/org/apache/hadoop/nfs/nfs3/Nfs3Interface.java
* /hadoop/common/trunk/hadoop-common-project/hadoop-nfs/src/main/java/org/apache/hadoop/nfs/security/AccessPrivilege.java
* /hadoop/common/trunk/hadoop-common-project/hadoop-nfs/src/main/java/org/apache/hadoop/nfs/security/NfsExports.java
* /hadoop/common/trunk/hadoop-common-project/hadoop-nfs/src/main/java/org/apache/hadoop/oncrpc/RpcAcceptedReply.java
* /hadoop/common/trunk/hadoop-common-project/hadoop-nfs/src/main/java/org/apache/hadoop/oncrpc/RpcAuthInfo.java
* /hadoop/common/trunk/hadoop-common-project/hadoop-nfs/src/main/java/org/apache/hadoop/oncrpc/RpcAuthSys.java
* /hadoop/common/trunk/hadoop-common-project/hadoop-nfs/src/main/java/org/apache/hadoop/oncrpc/RpcCall.java
* /hadoop/common/trunk/hadoop-common-project/hadoop-nfs/src/main/java/org/apache/hadoop/oncrpc/RpcDeniedReply.java
* /hadoop/common/trunk/hadoop-common-project/hadoop-nfs/src/main/java/org/apache/hadoop/oncrpc/XDR.java
* /hadoop/common/trunk/hadoop-common-project/hadoop-nfs/src/main/java/org/apache/hadoop/oncrpc/security
* /hadoop/common/trunk/hadoop-common-project/hadoop-nfs/src/main/java/org/apache/hadoop/oncrpc/security/Credentials.java
* /hadoop/common/trunk/hadoop-common-project/hadoop-nfs/src/main/java/org/apache/hadoop/oncrpc/security/CredentialsGSS.java
* /hadoop/common/trunk/hadoop-common-project/hadoop-nfs/src/main/java/org/apache/hadoop/oncrpc/security/CredentialsNone.java
* /hadoop/common/trunk/hadoop-common-project/hadoop-nfs/src/main/java/org/apache/hadoop/oncrpc/security/CredentialsSys.java
* /hadoop/common/trunk/hadoop-common-project/hadoop-nfs/src/main/java/org/apache/hadoop/oncrpc/security/RpcAuthInfo.java
* /hadoop/common/trunk/hadoop-common-project/hadoop-nfs/src/main/java/org/apache/hadoop/oncrpc/security/SecurityHandler.java
* /hadoop/common/trunk/hadoop-common-project/hadoop-nfs/src/main/java/org/apache/hadoop/oncrpc/security/SysSecurityHandler.java
* /hadoop/common/trunk/hadoop-common-project/hadoop-nfs/src/main/java/org/apache/hadoop/oncrpc/security/Verifier.java
* /hadoop/common/trunk/hadoop-common-project/hadoop-nfs/src/main/java/org/apache/hadoop/oncrpc/security/VerifierGSS.java
* /hadoop/common/trunk/hadoop-common-project/hadoop-nfs/src/main/java/org/apache/hadoop/oncrpc/security/VerifierNone.java
* /hadoop/common/trunk/hadoop-common-project/hadoop-nfs/src/main/java/org/apache/hadoop/portmap/PortmapRequest.java
* /hadoop/common/trunk/hadoop-common-project/hadoop-nfs/src/test/java/org/apache/hadoop/nfs/TestNfsExports.java
* /hadoop/common/trunk/hadoop-common-project/hadoop-nfs/src/test/java/org/apache/hadoop/nfs/security/TestNfsExports.java
* /hadoop/common/trunk/hadoop-common-project/hadoop-nfs/src/test/java/org/apache/hadoop/oncrpc/TestRpcAcceptedReply.java
* /hadoop/common/trunk/hadoop-common-project/hadoop-nfs/src/test/java/org/apache/hadoop/oncrpc/TestRpcAuthInfo.java
* /hadoop/common/trunk/hadoop-common-project/hadoop-nfs/src/test/java/org/apache/hadoop/oncrpc/TestRpcAuthSys.java
* /hadoop/common/trunk/hadoop-common-project/hadoop-nfs/src/test/java/org/apache/hadoop/oncrpc/TestRpcCall.java
* /hadoop/common/trunk/hadoop-common-project/hadoop-nfs/src/test/java/org/apache/hadoop/oncrpc/security
* /hadoop/common/trunk/hadoop-common-project/hadoop-nfs/src/test/java/org/apache/hadoop/oncrpc/security/TestCredentialsSys.java
* /hadoop/common/trunk/hadoop-common-project/hadoop-nfs/src/test/java/org/apache/hadoop/oncrpc/security/TestRpcAuthInfo.java
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs-nfs/src/main/java/org/apache/hadoop/hdfs/nfs/mount/RpcProgramMountd.java
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs-nfs/src/main/java/org/apache/hadoop/hdfs/nfs/nfs3/RpcProgramNfs3.java
[jira] [Commented] (HDFS-5085) Refactor o.a.h.nfs to support different types of authentications
[ https://issues.apache.org/jira/browse/HDFS-5085?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13764295#comment-13764295 ]

Hudson commented on HDFS-5085:
------------------------------
SUCCESS: Integrated in Hadoop-Hdfs-trunk #1519 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk/1519/])
HDFS-5085. Refactor o.a.h.nfs to support different types of authentications. Contributed by Jing Zhao. (jing9: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1521601)
(The changed-file list is identical to the Hadoop-Yarn-trunk #329 notification above.)
[jira] [Commented] (HDFS-5183) Combine ReplicaPlacementPolicy with VolumeChoosingPolicy together to have a global view in choosing DN storage for replica.
[ https://issues.apache.org/jira/browse/HDFS-5183?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13764311#comment-13764311 ]

Eric Sirianni commented on HDFS-5183:
-------------------------------------
Copying over my feedback from HDFS-5157:

bq. I would favor a model like #2 but allow for the DataNode to override the placement decision (where the override was likely only done in exceptional circumstances).

This seems consistent with the overarching design of HDFS, which allows loose synchronization of replica maps between the NameNode and DataNode (re-synchronized periodically by block reports). In particular, if the DataNode cannot honor the specific StorageID chosen by the NameNode (but can still honor the write), I believe this should *not* be an error condition. Do you all agree?

Combine ReplicaPlacementPolicy with VolumeChoosingPolicy together to have a global view in choosing DN storage for replica.
---------------------------------------------------------------------------------------------------------------------------
                 Key: HDFS-5183
                 URL: https://issues.apache.org/jira/browse/HDFS-5183
             Project: Hadoop HDFS
          Issue Type: Sub-task
          Components: datanode, namenode, performance
    Affects Versions: Heterogeneous Storage (HDFS-2832)
            Reporter: Junping Du

Per discussion in HDFS-5157, there are two different ways to handle BlockPlacementPolicy and ReplicaChoosingPolicy in the case of multiple storage types:

1. The client specifies the required storage type when calling addBlock(..) on the NN. The BlockPlacementPolicy in the NN chooses a set of datanodes, accounting for the storage type. The client then passes the required storage type to that set of datanodes, and each datanode picks a particular storage using a VolumeChoosingPolicy.

2. As before, the client specifies the required storage type when calling addBlock(..) on the NN. Now, the BlockPlacementPolicy in the NN chooses a set of storages (instead of datanodes). The client then writes to the corresponding storages. VolumeChoosingPolicy is no longer needed and should be removed.

We think #2 is more powerful, as it brings a global view to volume choosing and brings storage status into consideration in replica choosing, so we propose to combine the two policies. One concern is that it may increase the load on the NameNode, since volume choosing was previously decided by the DN. We may verify this later (that's why I put performance in the component list).
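The override behavior Eric Sirianni describes above -- honor the NameNode's chosen storage when possible, fall back to another local storage of the same type rather than failing the write -- can be sketched in a few lines. Everything here is hypothetical (StorageOverrideSketch, chooseForWrite, and the storage ids are illustrative, not actual HDFS-2832 code):

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch (not real HDFS code) of a DataNode that prefers the
// NameNode-chosen storage id, but overrides the choice with another local
// storage of the same required type rather than failing the write.
public class StorageOverrideSketch {
    public enum StorageType { DISK, SSD }

    private final Map<String, StorageType> storages = new LinkedHashMap<>();
    private final Map<String, Boolean> writable = new LinkedHashMap<>();

    public void addStorage(String id, StorageType type, boolean canWrite) {
        storages.put(id, type);
        writable.put(id, canWrite);
    }

    // Returns the storage id actually used, or null if no local storage of
    // the required type can take the write (only then is it an error).
    public String chooseForWrite(String preferredId, StorageType required) {
        if (Boolean.TRUE.equals(writable.get(preferredId))
                && storages.get(preferredId) == required) {
            return preferredId; // honor the NN's choice
        }
        for (Map.Entry<String, StorageType> e : storages.entrySet()) {
            if (e.getValue() == required
                    && Boolean.TRUE.equals(writable.get(e.getKey()))) {
                return e.getKey(); // override the storage, honor the write
            }
        }
        return null;
    }
}
```

The block report would later re-synchronize the NameNode's replica map with the storage actually used, which is exactly the loose-synchronization pattern the comment appeals to.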
[jira] [Updated] (HDFS-5184) BlockPlacementPolicyWithNodeGroup does not work correctly when avoidStaleNodes is true
[ https://issues.apache.org/jira/browse/HDFS-5184?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Nikola Vujic updated HDFS-5184:
-------------------------------
    Description:
If avoidStaleNodes is true, then choosing targets is potentially done in two attempts. If we don't find enough targets to place replicas in the first attempt, a second attempt is invoked with the aim of using stale nodes to find the remaining targets. This second attempt breaks the node-group rule of not having two replicas in the same node group. Invocation of the second attempt looks like this:

DatanodeDescriptor chooseTarget(excludedNodes, ...) {
  oldExcludedNodes = new HashMap<Node, Node>(excludedNodes);
  // first attempt
  // if we don't find enough targets then
  if (avoidStaleNodes) {
    for (Node node : results) {
      oldExcludedNodes.put(node, node);
    }
    numOfReplicas = totalReplicasExpected - results.size();
    return chooseTarget(numOfReplicas, writer, oldExcludedNodes, blocksize,
        maxNodesPerRack, results, false);
  }
}

So, all excluded nodes from the first attempt which are neither in oldExcludedNodes nor in results will be ignored, and the second invocation of chooseTarget will use an incomplete set of excluded nodes. For example, with this topology:

dn1 - /d1/r1/n1
dn2 - /d1/r1/n1
dn3 - /d1/r1/n2
dn4 - /d1/r1/n2

if we want to choose 3 targets with avoidStaleNodes=true, then in the first attempt we will choose only 2 targets, since we have only two node groups. Let's say we choose dn1 and dn3. We then add dn1 and dn3 to oldExcludedNodes and use that set of excluded nodes in the second attempt. This set of excluded nodes is incomplete and allows us to select dn2 and dn4 in the second attempt, which should not be selected due to node-group awareness, but this is happening in the current code!

Repro:
- add CONF.setBoolean(DFSConfigKeys.DFS_NAMENODE_AVOID_STALE_DATANODE_FOR_WRITE_KEY, true); to TestReplicationPolicyWithNodeGroup.
- testChooseMoreTargetsThanNodeGroups() should fail.
was:
If avoidStaleNodes is true, then choosing targets is potentially done in two attempts. If we don't find enough targets to place replicas in the first attempt, a second attempt is invoked with the aim of using stale nodes to find the remaining targets. This second attempt breaks the node-group rule of not having two replicas in the same node group. Invocation of the second attempt looks like this:

DatanodeDescriptor chooseTarget(excludedNodes, ...) {
  oldExcludedNodes = new HashMap<Node, Node>(excludedNodes);
  // first attempt
  // if we don't find enough targets then
  if (avoidStaleNodes) {
    for (Node node : results) {
      oldExcludedNodes.put(node, node);
    }
    numOfReplicas = totalReplicasExpected - results.size();
    return chooseTarget(numOfReplicas, writer, oldExcludedNodes, blocksize,
        maxNodesPerRack, results, false);
  }
}

So, all excluded nodes from the first attempt which are neither in oldExcludedNodes nor in results will be ignored, and the second invocation of chooseTarget will use an incomplete set of excluded nodes. For example, with this topology:

dn1 - /d1/r1/n1
dn2 - /d1/r1/n1
dn3 - /d1/r1/n2
dn4 - /d1/r1/n2

if we want to choose 3 targets with avoidStaleNodes=true, then in the first attempt we will choose only 2 targets, since we have only two node groups. Let's say we choose dn1 and dn3. We then add dn1 and dn3 to oldExcludedNodes and use that set of excluded nodes in the second attempt. This set of excluded nodes is incomplete and allows us to select dn2 and dn4, which should not be selected due to node-group awareness, but this is happening in the current code!

Quick repro:
- add CONF.setBoolean(DFSConfigKeys.DFS_NAMENODE_AVOID_STALE_DATANODE_FOR_WRITE_KEY, true); to TestReplicationPolicyWithNodeGroup.
- testChooseMoreTargetsThanNodeGroups() should fail.
BlockPlacementPolicyWithNodeGroup does not work correct when avoidStaleNodes is true Key: HDFS-5184 URL: https://issues.apache.org/jira/browse/HDFS-5184 Project: Hadoop HDFS Issue Type: Bug Reporter: Nikola Vujic Priority: Minor If avoidStaleNodes is true then choosing targets is potentially done in two attempts. If we don't find enough targets to place replicas in the first attempt, a second attempt is invoked with the aim of using stale nodes to find the remaining targets. This second attempt breaks the node-group rule of not having two replicas in the same node group. Invocation of the second attempt looks like this:

DatanodeDescriptor chooseTarget(excludedNodes, ...) {
  oldExcludedNodes = new HashMap<Node, Node>(excludedNodes);
  // first attempt
  // if we don't find enough targets then
  if (avoidStaleNodes) {
    for (Node node : results) {
      oldExcludedNodes.put(node, node);
    }
    numOfReplicas = totalReplicasExpected - results.size();
    return chooseTarget(numOfReplicas, writer, oldExcludedNodes, blocksize, maxNodesPerRack, results, false);
  }
}
[jira] [Updated] (HDFS-5184) BlockPlacementPolicyWithNodeGroup does not work correct when avoidStaleNodes is true
[ https://issues.apache.org/jira/browse/HDFS-5184?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nikola Vujic updated HDFS-5184: --- Description: If avoidStaleNodes is true then choosing targets is potentially done in two attempts. If we don't find enough targets to place replicas in the first attempt, a second attempt is invoked with the aim of using stale nodes to find the remaining targets. This second attempt breaks the node-group rule of not having two replicas in the same node group. Invocation of the second attempt looks like this:

DatanodeDescriptor chooseTarget(excludedNodes, ...) {
  oldExcludedNodes = new HashMap<Node, Node>(excludedNodes);
  // first attempt
  // if we don't find enough targets then
  if (avoidStaleNodes) {
    for (Node node : results) {
      oldExcludedNodes.put(node, node);
    }
    numOfReplicas = totalReplicasExpected - results.size();
    return chooseTarget(numOfReplicas, writer, oldExcludedNodes, blocksize, maxNodesPerRack, results, false);
  }
}

So, all excluded nodes from the first attempt which are neither in oldExcludedNodes nor in results will be ignored, and the second invocation of chooseTarget will use an incomplete set of excluded nodes. For example, take the following topology: dn1 - /d1/r1/n1, dn2 - /d1/r1/n1, dn3 - /d1/r1/n2, dn4 - /d1/r1/n2. If we want to choose 3 targets with avoidStaleNodes=true, then in the first attempt we will choose only 2 targets since we have only two node groups. Let's say we choose dn1 and dn3. Then we will add dn1 and dn3 to oldExcludedNodes and use that set of excluded nodes in the second attempt. This set of excluded nodes is incomplete and allows us to select dn2 and dn4 in the second attempt, which should not be selected due to node-group awareness, but that is what happens in the current code! Repro: - add CONF.setBoolean(DFSConfigKeys.DFS_NAMENODE_AVOID_STALE_DATANODE_FOR_WRITE_KEY, true); to TestReplicationPolicyWithNodeGroup. - testChooseMoreTargetsThanNodeGroups() should fail.
was: If avoidStaleNodes is true then choosing targets is potentially done in two attempts. If we don't find enough targets to place replicas in the first attempt, a second attempt is invoked with the aim of using stale nodes to find the remaining targets. This second attempt breaks the node-group rule of not having two replicas in the same node group. Invocation of the second attempt looks like this:

DatanodeDescriptor chooseTarget(excludedNodes, ...) {
  oldExcludedNodes = new HashMap<Node, Node>(excludedNodes);
  // first attempt
  // if we don't find enough targets then
  if (avoidStaleNodes) {
    for (Node node : results) {
      oldExcludedNodes.put(node, node);
    }
    numOfReplicas = totalReplicasExpected - results.size();
    return chooseTarget(numOfReplicas, writer, oldExcludedNodes, blocksize, maxNodesPerRack, results, false);
  }
}

So, all excluded nodes from the first attempt which are neither in oldExcludedNodes nor in results will be ignored, and the second invocation of chooseTarget will use an incomplete set of excluded nodes. For example, take the following topology: dn1 - /d1/r1/n1, dn2 - /d1/r1/n1, dn3 - /d1/r1/n2, dn4 - /d1/r1/n2. If we want to choose 3 targets with avoidStaleNodes=true, then in the first attempt we will choose only 2 targets since we have only two node groups. Let's say we choose dn1 and dn3. Then we will add dn1 and dn3 to oldExcludedNodes and use that set of excluded nodes in the second attempt. This set of excluded nodes is incomplete and allows us to select dn2 and dn4 in the second attempt, which should not be selected due to node-group awareness, but that is what happens in the current code! Repro: - add CONF.setBoolean(DFSConfigKeys.DFS_NAMENODE_AVOID_STALE_DATANODE_FOR_WRITE_KEY, true); to TestReplicationPolicyWithNodeGroup. - testChooseMoreTargetsThanNodeGroups() should fail.
BlockPlacementPolicyWithNodeGroup does not work correct when avoidStaleNodes is true Key: HDFS-5184 URL: https://issues.apache.org/jira/browse/HDFS-5184 Project: Hadoop HDFS Issue Type: Bug Reporter: Nikola Vujic Priority: Minor If avoidStaleNodes is true then choosing targets is potentially done in two attempts. If we don't find enough targets to place replicas in the first attempt, a second attempt is invoked with the aim of using stale nodes to find the remaining targets. This second attempt breaks the node-group rule of not having two replicas in the same node group. Invocation of the second attempt looks like this:

DatanodeDescriptor chooseTarget(excludedNodes, ...) {
  oldExcludedNodes = new HashMap<Node, Node>(excludedNodes);
  // first attempt
  // if we don't find enough targets then
  if (avoidStaleNodes) {
    for (Node node : results) {
      oldExcludedNodes.put(node, node);
    }
    numOfReplicas = totalReplicasExpected - results.size();
    return chooseTarget(numOfReplicas, writer, oldExcludedNodes, blocksize, maxNodesPerRack, results, false);
  }
}
[jira] [Created] (HDFS-5184) BlockPlacementPolicyWithNodeGroup does not work correct when avoidStaleNodes is true
Nikola Vujic created HDFS-5184: -- Summary: BlockPlacementPolicyWithNodeGroup does not work correct when avoidStaleNodes is true Key: HDFS-5184 URL: https://issues.apache.org/jira/browse/HDFS-5184 Project: Hadoop HDFS Issue Type: Bug Reporter: Nikola Vujic Priority: Minor If avoidStaleNodes is true then choosing targets is potentially done in two attempts. If we don't find enough targets to place replicas in the first attempt, a second attempt is invoked with the aim of using stale nodes to find the remaining targets. This second attempt breaks the node-group rule of not having two replicas in the same node group. Invocation of the second attempt looks like this:

DatanodeDescriptor chooseTarget(excludedNodes, ...) {
  oldExcludedNodes = new HashMap<Node, Node>(excludedNodes);
  // first attempt
  // if we don't find enough targets then
  if (avoidStaleNodes) {
    for (Node node : results) {
      oldExcludedNodes.put(node, node);
    }
    numOfReplicas = totalReplicasExpected - results.size();
    return chooseTarget(numOfReplicas, writer, oldExcludedNodes, blocksize, maxNodesPerRack, results, false);
  }
}

So, all excluded nodes from the first attempt which are neither in oldExcludedNodes nor in results will be ignored, and the second invocation of chooseTarget will use an incomplete set of excluded nodes. For example, take the following topology: dn1 - /d1/r1/n1, dn2 - /d1/r1/n1, dn3 - /d1/r1/n2, dn4 - /d1/r1/n2. If we want to choose 3 targets with avoidStaleNodes=true, then in the first attempt we will choose only 2 targets since we have only two node groups. Let's say we choose dn1 and dn3. Then we will add dn1 and dn3 to oldExcludedNodes and use that set of excluded nodes in the second attempt. This set of excluded nodes is incomplete and allows us to select dn2 and dn4, which should not be selected due to node-group awareness, but that is what happens in the current code!
Quick repro:
- add CONF.setBoolean(DFSConfigKeys.DFS_NAMENODE_AVOID_STALE_DATANODE_FOR_WRITE_KEY, true); to TestReplicationPolicyWithNodeGroup.
- testChooseMoreTargetsThanNodeGroups() should fail.
-- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
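The incomplete-exclusion hand-off described above can be reproduced outside Hadoop. The following is a hedged, pure-JDK simulation (not the actual BlockPlacementPolicyWithNodeGroup code; the node-group bookkeeping is reduced to a plain map): it shows that when only the chosen replicas are merged into the pre-attempt snapshot oldExcludedNodes, the same-node-group exclusions accumulated during attempt 1 (dn2, dn4) are lost, so the second attempt may pick them.

```java
import java.util.*;

// Simplified simulation of the excluded-node bookkeeping described in HDFS-5184.
// Names (results, excludedNodes, oldExcludedNodes) mirror the JIRA description;
// the topology is the four-datanode, two-node-group example from the report.
public class ExcludedNodesDemo {
    static final Map<String, String> NODE_GROUP = new LinkedHashMap<>();
    static {
        NODE_GROUP.put("dn1", "/d1/r1/n1");
        NODE_GROUP.put("dn2", "/d1/r1/n1");
        NODE_GROUP.put("dn3", "/d1/r1/n2");
        NODE_GROUP.put("dn4", "/d1/r1/n2");
    }

    // Returns the datanodes the second attempt is still allowed to pick.
    static List<String> simulate() {
        Set<String> excludedNodes = new TreeSet<>();                 // filled during attempt 1
        Set<String> oldExcludedNodes = new TreeSet<>(excludedNodes); // snapshot taken BEFORE attempt 1

        // Attempt 1: one replica per node group; same-group peers go into excludedNodes.
        List<String> results = new ArrayList<>();
        Set<String> usedGroups = new HashSet<>();
        for (String dn : NODE_GROUP.keySet()) {
            if (usedGroups.add(NODE_GROUP.get(dn))) results.add(dn); // picks dn1, dn3
            else excludedNodes.add(dn);                              // excludes dn2, dn4
        }

        // The buggy hand-off: only results are merged into the stale snapshot,
        // so the exclusions accumulated during attempt 1 are lost.
        oldExcludedNodes.addAll(results);

        List<String> attempt2Candidates = new ArrayList<>();
        for (String dn : NODE_GROUP.keySet()) {
            if (!oldExcludedNodes.contains(dn)) attempt2Candidates.add(dn);
        }
        return attempt2Candidates;
    }

    public static void main(String[] args) {
        // dn2 and dn4 are selectable again: the node-group rule is violated.
        System.out.println("attempt 2 may pick: " + simulate());
    }
}
```

Passing the live excludedNodes set (or merging it into oldExcludedNodes) before the second attempt makes the candidate list empty, which is the behavior the reporter expects.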
[jira] [Commented] (HDFS-5085) Refactor o.a.h.nfs to support different types of authentications
[ https://issues.apache.org/jira/browse/HDFS-5085?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13764357#comment-13764357 ] Hudson commented on HDFS-5085: -- FAILURE: Integrated in Hadoop-Mapreduce-trunk #1545 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1545/]) HDFS-5085. Refactor o.a.h.nfs to support different types of authentications. Contributed by Jing Zhao. (jing9: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1521601) * /hadoop/common/trunk/hadoop-common-project/hadoop-nfs/src/main/java/org/apache/hadoop/mount/MountResponse.java * /hadoop/common/trunk/hadoop-common-project/hadoop-nfs/src/main/java/org/apache/hadoop/nfs/AccessPrivilege.java * /hadoop/common/trunk/hadoop-common-project/hadoop-nfs/src/main/java/org/apache/hadoop/nfs/NfsExports.java * /hadoop/common/trunk/hadoop-common-project/hadoop-nfs/src/main/java/org/apache/hadoop/nfs/nfs3/IdUserGroup.java * /hadoop/common/trunk/hadoop-common-project/hadoop-nfs/src/main/java/org/apache/hadoop/nfs/nfs3/Nfs3Constant.java * /hadoop/common/trunk/hadoop-common-project/hadoop-nfs/src/main/java/org/apache/hadoop/nfs/nfs3/Nfs3Interface.java * /hadoop/common/trunk/hadoop-common-project/hadoop-nfs/src/main/java/org/apache/hadoop/nfs/security/AccessPrivilege.java * /hadoop/common/trunk/hadoop-common-project/hadoop-nfs/src/main/java/org/apache/hadoop/nfs/security/NfsExports.java * /hadoop/common/trunk/hadoop-common-project/hadoop-nfs/src/main/java/org/apache/hadoop/oncrpc/RpcAcceptedReply.java * /hadoop/common/trunk/hadoop-common-project/hadoop-nfs/src/main/java/org/apache/hadoop/oncrpc/RpcAuthInfo.java * /hadoop/common/trunk/hadoop-common-project/hadoop-nfs/src/main/java/org/apache/hadoop/oncrpc/RpcAuthSys.java * /hadoop/common/trunk/hadoop-common-project/hadoop-nfs/src/main/java/org/apache/hadoop/oncrpc/RpcCall.java * /hadoop/common/trunk/hadoop-common-project/hadoop-nfs/src/main/java/org/apache/hadoop/oncrpc/RpcDeniedReply.java * 
/hadoop/common/trunk/hadoop-common-project/hadoop-nfs/src/main/java/org/apache/hadoop/oncrpc/XDR.java * /hadoop/common/trunk/hadoop-common-project/hadoop-nfs/src/main/java/org/apache/hadoop/oncrpc/security * /hadoop/common/trunk/hadoop-common-project/hadoop-nfs/src/main/java/org/apache/hadoop/oncrpc/security/Credentials.java * /hadoop/common/trunk/hadoop-common-project/hadoop-nfs/src/main/java/org/apache/hadoop/oncrpc/security/CredentialsGSS.java * /hadoop/common/trunk/hadoop-common-project/hadoop-nfs/src/main/java/org/apache/hadoop/oncrpc/security/CredentialsNone.java * /hadoop/common/trunk/hadoop-common-project/hadoop-nfs/src/main/java/org/apache/hadoop/oncrpc/security/CredentialsSys.java * /hadoop/common/trunk/hadoop-common-project/hadoop-nfs/src/main/java/org/apache/hadoop/oncrpc/security/RpcAuthInfo.java * /hadoop/common/trunk/hadoop-common-project/hadoop-nfs/src/main/java/org/apache/hadoop/oncrpc/security/SecurityHandler.java * /hadoop/common/trunk/hadoop-common-project/hadoop-nfs/src/main/java/org/apache/hadoop/oncrpc/security/SysSecurityHandler.java * /hadoop/common/trunk/hadoop-common-project/hadoop-nfs/src/main/java/org/apache/hadoop/oncrpc/security/Verifier.java * /hadoop/common/trunk/hadoop-common-project/hadoop-nfs/src/main/java/org/apache/hadoop/oncrpc/security/VerifierGSS.java * /hadoop/common/trunk/hadoop-common-project/hadoop-nfs/src/main/java/org/apache/hadoop/oncrpc/security/VerifierNone.java * /hadoop/common/trunk/hadoop-common-project/hadoop-nfs/src/main/java/org/apache/hadoop/portmap/PortmapRequest.java * /hadoop/common/trunk/hadoop-common-project/hadoop-nfs/src/test/java/org/apache/hadoop/nfs/TestNfsExports.java * /hadoop/common/trunk/hadoop-common-project/hadoop-nfs/src/test/java/org/apache/hadoop/nfs/security/TestNfsExports.java * /hadoop/common/trunk/hadoop-common-project/hadoop-nfs/src/test/java/org/apache/hadoop/oncrpc/TestRpcAcceptedReply.java * 
/hadoop/common/trunk/hadoop-common-project/hadoop-nfs/src/test/java/org/apache/hadoop/oncrpc/TestRpcAuthInfo.java * /hadoop/common/trunk/hadoop-common-project/hadoop-nfs/src/test/java/org/apache/hadoop/oncrpc/TestRpcAuthSys.java * /hadoop/common/trunk/hadoop-common-project/hadoop-nfs/src/test/java/org/apache/hadoop/oncrpc/TestRpcCall.java * /hadoop/common/trunk/hadoop-common-project/hadoop-nfs/src/test/java/org/apache/hadoop/oncrpc/security * /hadoop/common/trunk/hadoop-common-project/hadoop-nfs/src/test/java/org/apache/hadoop/oncrpc/security/TestCredentialsSys.java * /hadoop/common/trunk/hadoop-common-project/hadoop-nfs/src/test/java/org/apache/hadoop/oncrpc/security/TestRpcAuthInfo.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs-nfs/src/main/java/org/apache/hadoop/hdfs/nfs/mount/RpcProgramMountd.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs-nfs/src/main/java/org/apache/hadoop/hdfs/nfs/nfs3/RpcProgramNfs3.java *
[jira] [Updated] (HDFS-2882) DN continues to start up, even if block pool fails to initialize
[ https://issues.apache.org/jira/browse/HDFS-2882?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinay updated HDFS-2882: Assignee: Vinay (was: Colin Patrick McCabe) Status: Patch Available (was: Open) DN continues to start up, even if block pool fails to initialize Key: HDFS-2882 URL: https://issues.apache.org/jira/browse/HDFS-2882 Project: Hadoop HDFS Issue Type: Bug Components: datanode Affects Versions: 2.0.2-alpha Reporter: Todd Lipcon Assignee: Vinay Attachments: HDFS-2882.patch, hdfs-2882.txt I started a DN on a machine that was completely out of space on one of its drives. I saw the following: 2012-02-02 09:56:50,499 FATAL org.apache.hadoop.hdfs.server.datanode.DataNode: Initialization failed for block pool Block pool BP-448349972-172.29.5.192-1323816762969 (storage id DS-507718931-172.29.5.194-11072-1297842002148) service to styx01.sf.cloudera.com/172.29.5.192:8021 java.io.IOException: Mkdirs failed to create /data/1/scratch/todd/styx-datadir/current/BP-448349972-172.29.5.192-1323816762969/tmp at org.apache.hadoop.hdfs.server.datanode.FSDataset$BlockPoolSlice.<init>(FSDataset.java:335) but the DN continued to run, spewing NPEs when it tried to do block reports, etc. This was on the HDFS-1623 branch but may affect trunk as well.
[jira] [Created] (HDFS-5185) DN fails to startup if one of the data dir is full
Vinay created HDFS-5185: --- Summary: DN fails to startup if one of the data dir is full Key: HDFS-5185 URL: https://issues.apache.org/jira/browse/HDFS-5185 Project: Hadoop HDFS Issue Type: Bug Components: datanode Reporter: Vinay Priority: Blocker DataNode fails to start up if one of the configured data dirs is out of space. It fails with the following exception:
{noformat}
2013-09-11 17:48:43,680 FATAL org.apache.hadoop.hdfs.server.datanode.DataNode: Initialization failed for block pool Block pool registering (storage id DS-308316523-xx.xx.xx.xx-64015-1378896293604) service to /nn1:65110
java.io.IOException: Mkdirs failed to create /opt/nish/data/current/BP-123456-1234567/tmp
	at org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.BlockPoolSlice.<init>(BlockPoolSlice.java:105)
	at org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsVolumeImpl.addBlockPool(FsVolumeImpl.java:216)
	at org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsVolumeList.addBlockPool(FsVolumeList.java:155)
	at org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl.addBlockPool(FsDatasetImpl.java:1593)
	at org.apache.hadoop.hdfs.server.datanode.DataNode.initBlockPool(DataNode.java:834)
	at org.apache.hadoop.hdfs.server.datanode.BPOfferService.verifyAndSetNamespaceInfo(BPOfferService.java:311)
	at org.apache.hadoop.hdfs.server.datanode.BPServiceActor.connectToNNAndHandshake(BPServiceActor.java:217)
	at org.apache.hadoop.hdfs.server.datanode.BPServiceActor.run(BPServiceActor.java:660)
	at java.lang.Thread.run(Thread.java:662)
{noformat}
It should continue to start up with the other data dirs available.
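A sketch of the behavior this JIRA asks for: initialize each volume independently and abort startup only when more volumes fail than a configured tolerance (in the spirit of dfs.datanode.failed.volumes.tolerated). This is a hypothetical, self-contained illustration, not the actual FsVolumeList/FsDatasetImpl code; the Volume interface and method names are invented.

```java
import java.io.IOException;
import java.util.*;

// Hypothetical sketch: add a block pool on every volume, collecting failures
// instead of aborting on the first one, and fail startup only when the number
// of bad volumes exceeds the configured tolerance.
public class VolumeInitSketch {
    interface Volume {  // stand-in for a DataNode storage directory
        void addBlockPool(String bpid) throws IOException;
    }

    static List<String> addBlockPoolTolerantly(Map<String, Volume> volumes,
                                               String bpid,
                                               int failedVolumesTolerated)
            throws IOException {
        List<String> failed = new ArrayList<>();
        for (Map.Entry<String, Volume> e : volumes.entrySet()) {
            try {
                e.getValue().addBlockPool(bpid);
            } catch (IOException ioe) {
                failed.add(e.getKey()); // keep going; other dirs may be healthy
            }
        }
        if (failed.size() > failedVolumesTolerated) {
            throw new IOException("Too many failed volumes: " + failed);
        }
        return failed; // tolerated failures, for logging / removal from service
    }

    public static void main(String[] args) throws IOException {
        Map<String, Volume> vols = new LinkedHashMap<>();
        vols.put("/data/1", bp -> { throw new IOException("Mkdirs failed (disk full)"); });
        vols.put("/data/2", bp -> { /* initializes fine */ });
        System.out.println("failed but tolerated: "
            + addBlockPoolTolerantly(vols, "BP-123456-1234567", 1));
    }
}
```

With a tolerance of 0 the same call throws, which is essentially today's behavior described in the report.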
[jira] [Assigned] (HDFS-5181) Fail-over support for HA cluster in WebHDFS
[ https://issues.apache.org/jira/browse/HDFS-5181?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Haohui Mai reassigned HDFS-5181: Assignee: Haohui Mai Fail-over support for HA cluster in WebHDFS Key: HDFS-5181 URL: https://issues.apache.org/jira/browse/HDFS-5181 Project: Hadoop HDFS Issue Type: Bug Components: webhdfs Reporter: Haohui Mai Assignee: Haohui Mai HDFS-5122 only teaches WebHDFS client to recognize the logical name in HA clusters. The WebHDFS client should implement fail-over mechanisms in order to fully support HA clusters.
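A minimal sketch of the retry loop such fail-over support implies, with all names invented (this is not the WebHDFS client API): resolve the logical name to its candidate NameNode addresses and retry the operation against the next address when one is unreachable.

```java
import java.io.IOException;
import java.util.List;
import java.util.function.Function;

// Hypothetical fail-over sketch: try each NameNode address in order and
// return the first successful result; surface the last failure if all fail.
public class FailoverSketch {
    static <T> T runWithFailover(List<String> namenodes,
                                 Function<String, T> op) throws IOException {
        IOException last = null;
        for (String nn : namenodes) {
            try {
                return op.apply(nn);
            } catch (RuntimeException e) {  // stand-in for a connect failure
                last = new IOException("NameNode " + nn + " unavailable", e);
            }
        }
        throw last != null ? last : new IOException("no namenodes configured");
    }

    public static void main(String[] args) throws IOException {
        // The first address simulates the inactive (failed-over) NameNode.
        String result = runWithFailover(List.of("nn1:50070", "nn2:50070"),
            nn -> {
                if (nn.startsWith("nn1")) throw new RuntimeException("connection refused");
                return "GETFILESTATUS ok via " + nn;
            });
        System.out.println(result);
    }
}
```

A real client would additionally distinguish StandbyException-style responses from network errors and apply backoff, but the loop above captures the core idea.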
[jira] [Commented] (HDFS-2882) DN continues to start up, even if block pool fails to initialize
[ https://issues.apache.org/jira/browse/HDFS-2882?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13764520#comment-13764520 ] Hadoop QA commented on HDFS-2882: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12597670/HDFS-2882.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 3 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. The javadoc tool did not generate any warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:red}-1 core tests{color}. 
The patch failed these unit tests in hadoop-hdfs-project/hadoop-hdfs: org.apache.hadoop.hdfs.server.balancer.TestBalancerWithMultipleNameNodes org.apache.hadoop.hdfs.server.datanode.TestDataNodeVolumeFailureReporting org.apache.hadoop.hdfs.server.namenode.ha.TestStandbyIsHot org.apache.hadoop.hdfs.security.TestDelegationToken org.apache.hadoop.hdfs.server.datanode.TestDataNodeVolumeFailureToleration org.apache.hadoop.hdfs.TestDFSRollback org.apache.hadoop.hdfs.TestDFSStorageStateRecovery org.apache.hadoop.hdfs.server.blockmanagement.TestOverReplicatedBlocks org.apache.hadoop.hdfs.TestDecommission org.apache.hadoop.hdfs.server.datanode.TestHSync org.apache.hadoop.hdfs.server.blockmanagement.TestPendingReplication org.apache.hadoop.hdfs.server.datanode.TestBlockReplacement org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.TestDatanodeRestart org.apache.hadoop.hdfs.server.blockmanagement.TestHeartbeatHandling org.apache.hadoop.hdfs.server.datanode.TestMultipleNNDataBlockScanner org.apache.hadoop.hdfs.server.datanode.TestDiskError org.apache.hadoop.hdfs.TestFileCreation org.apache.hadoop.hdfs.server.datanode.TestDataNodeMultipleRegistrations org.apache.hadoop.hdfs.server.datanode.TestTransferRbw org.apache.hadoop.hdfs.TestDFSStartupVersions org.apache.hadoop.hdfs.server.blockmanagement.TestBlocksWithNotEnoughRacks org.apache.hadoop.net.TestNetworkTopology org.apache.hadoop.hdfs.TestFileCorruption org.apache.hadoop.hdfs.server.namenode.metrics.TestNameNodeMetrics org.apache.hadoop.hdfs.TestDFSUpgrade org.apache.hadoop.hdfs.qjournal.client.TestQuorumJournalManager org.apache.hadoop.hdfs.TestDatanodeConfig org.apache.hadoop.hdfs.TestEncryptedTransfer org.apache.hadoop.hdfs.TestReplication org.apache.hadoop.hdfs.TestSafeMode org.apache.hadoop.hdfs.server.datanode.TestBPOfferService org.apache.hadoop.hdfs.TestReplaceDatanodeOnFailure The following test timeouts occurred in hadoop-hdfs-project/hadoop-hdfs: org.apache.hadoop.hdfs.server.namenode.TestBackupNode 
org.apache.hadoop.hdfs.server.datanode.TestBlockReport org.apache.hadoop.hdfs.server.blockmanagement.TestBlockTokenWithDFS {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/4955//testReport/ Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/4955//console This message is automatically generated. DN continues to start up, even if block pool fails to initialize Key: HDFS-2882 URL: https://issues.apache.org/jira/browse/HDFS-2882 Project: Hadoop HDFS Issue Type: Bug Components: datanode Affects Versions: 2.0.2-alpha Reporter: Todd Lipcon Assignee: Vinay Attachments: HDFS-2882.patch, hdfs-2882.txt I started a DN on a machine that was completely out of space on one of its drives. I saw the following: 2012-02-02 09:56:50,499 FATAL org.apache.hadoop.hdfs.server.datanode.DataNode: Initialization failed for block pool Block pool BP-448349972-172.29.5.192-1323816762969
[jira] [Updated] (HDFS-5038) Backport several branch-2 APIs to branch-1
[ https://issues.apache.org/jira/browse/HDFS-5038?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jing Zhao updated HDFS-5038: Attachment: HDFS-5038.patch Backport several branch-2 APIs to branch-1 -- Key: HDFS-5038 URL: https://issues.apache.org/jira/browse/HDFS-5038 Project: Hadoop HDFS Issue Type: Bug Affects Versions: 1.3.0 Reporter: Jing Zhao Assignee: Jing Zhao Priority: Minor Attachments: HDFS-5038.patch Backport the following simple APIs to branch-1: 1. FileSystem#newInstance(Configuration) 2. DFSClient#getNamenode() 3. FileStatus#isDirectory()
[jira] [Commented] (HDFS-5158) add command-line support for manipulating cache directives
[ https://issues.apache.org/jira/browse/HDFS-5158?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13764587#comment-13764587 ] Aaron T. Myers commented on HDFS-5158: -- That all sounds good to me. Thanks, Colin. add command-line support for manipulating cache directives -- Key: HDFS-5158 URL: https://issues.apache.org/jira/browse/HDFS-5158 Project: Hadoop HDFS Issue Type: Sub-task Components: datanode, namenode Affects Versions: HDFS-4949 Reporter: Colin Patrick McCabe Assignee: Colin Patrick McCabe Attachments: HDFS-5158-caching.003.patch, HDFS-5158-caching.004.patch, HDFS-5158-caching.005.patch, HDFS-5158-caching.006.patch We should add command-line support for creating, removing, and listing cache directives.
[jira] [Commented] (HDFS-5031) BlockScanner scans the block multiple times and on restart scans everything
[ https://issues.apache.org/jira/browse/HDFS-5031?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13764550#comment-13764550 ] Arpit Agarwal commented on HDFS-5031: - We can add it back if findbugs is unhappy. I'll take a look at the updated patch. Thanks. BlockScanner scans the block multiple times and on restart scans everything --- Key: HDFS-5031 URL: https://issues.apache.org/jira/browse/HDFS-5031 Project: Hadoop HDFS Issue Type: Bug Components: datanode Affects Versions: 3.0.0, 2.1.0-beta Reporter: Vinay Assignee: Vinay Attachments: HDFS-5031.patch, HDFS-5031.patch BlockScanner scans each block twice, and on restart of the datanode it scans everything again. Steps: 1. Write blocks at intervals of more than 5 seconds; write a new block on completion of the scan for the previous block. Each time the datanode scans a new block, it also rescans the previous block, which was already scanned. After a restart, the datanode scans all blocks again.
[jira] [Commented] (HDFS-5158) add command-line support for manipulating cache directives
[ https://issues.apache.org/jira/browse/HDFS-5158?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13764551#comment-13764551 ] Colin Patrick McCabe commented on HDFS-5158: How about PathBasedCache? I think that makes it clearer. Agree that we should move the commands under cacheadmin. I'll see if I can do it in this JIRA. add command-line support for manipulating cache directives -- Key: HDFS-5158 URL: https://issues.apache.org/jira/browse/HDFS-5158 Project: Hadoop HDFS Issue Type: Sub-task Components: datanode, namenode Affects Versions: HDFS-4949 Reporter: Colin Patrick McCabe Assignee: Colin Patrick McCabe Attachments: HDFS-5158-caching.003.patch, HDFS-5158-caching.004.patch, HDFS-5158-caching.005.patch, HDFS-5158-caching.006.patch We should add command-line support for creating, removing, and listing cache directives.
[jira] [Created] (HDFS-5186) TestFileJournalManager fails on Windows due to file handle leaks
Chuan Liu created HDFS-5186: --- Summary: TestFileJournalManager fails on Windows due to file handle leaks Key: HDFS-5186 URL: https://issues.apache.org/jira/browse/HDFS-5186 Project: Hadoop HDFS Issue Type: Bug Affects Versions: 3.0.0 Reporter: Chuan Liu Assignee: Chuan Liu Priority: Minor We have two unit test cases failing in this class on Windows due to a file handle leak in the {{getNumberOfTransactions()}} method. {noformat} Running org.apache.hadoop.hdfs.server.namenode.TestFileJournalManager Tests run: 13, Failures: 0, Errors: 2, Skipped: 0, Time elapsed: 1.693 sec FAILURE! testReadFromMiddleOfEditLog(org.apache.hadoop.hdfs.server.namenode.TestFileJournalManager) Time elapsed: 12 sec ERROR! java.io.IOException: Cannot remove current directory: E:\Monarch\project\hadoop-monarch\hadoop-hdfs-project\hadoop-hdfs\target\test\data\filejournaltest2\current at org.apache.hadoop.hdfs.server.common.Storage$StorageDirectory.clearDirectory(Storage.java:299) at org.apache.hadoop.hdfs.server.namenode.NNStorage.format(NNStorage.java:523) at org.apache.hadoop.hdfs.server.namenode.NNStorage.format(NNStorage.java:544) at org.apache.hadoop.hdfs.server.namenode.TestEditLog.setupEdits(TestEditLog.java:1078) at org.apache.hadoop.hdfs.server.namenode.TestEditLog.setupEdits(TestEditLog.java:1133) at org.apache.hadoop.hdfs.server.namenode.TestFileJournalManager.testReadFromMiddleOfEditLog(TestFileJournalManager.java:436) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25) at java.lang.reflect.Method.invoke(Method.java:597) at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:45) at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:15) at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:42) at 
org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:20) at org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:28) at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:263) at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:68) at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:47) at org.junit.runners.ParentRunner$3.run(ParentRunner.java:231) at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:60) at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:229) at org.junit.runners.ParentRunner.access$000(ParentRunner.java:50) at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:222) at org.junit.runners.ParentRunner.run(ParentRunner.java:300) at org.apache.maven.surefire.junit4.JUnit4Provider.execute(JUnit4Provider.java:252) at org.apache.maven.surefire.junit4.JUnit4Provider.executeTestSet(JUnit4Provider.java:141) at org.apache.maven.surefire.junit4.JUnit4Provider.invoke(JUnit4Provider.java:112) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25) at java.lang.reflect.Method.invoke(Method.java:597) at org.apache.maven.surefire.util.ReflectionUtils.invokeMethodWithArray(ReflectionUtils.java:189) at org.apache.maven.surefire.booter.ProviderFactory$ProviderProxy.invoke(ProviderFactory.java:165) at org.apache.maven.surefire.booter.ProviderFactory.invokeProvider(ProviderFactory.java:85) at org.apache.maven.surefire.booter.ForkedBooter.runSuitesInProcess(ForkedBooter.java:115) at org.apache.maven.surefire.booter.ForkedBooter.main(ForkedBooter.java:75) testExcludeInProgressStreams(org.apache.hadoop.hdfs.server.namenode.TestFileJournalManager) Time elapsed: 10 sec ERROR! 
java.io.IOException: Cannot remove current directory: E:\Monarch\project\hadoop-monarch\hadoop-hdfs-project\hadoop-hdfs\target\test\data\filejournaltest2\current at org.apache.hadoop.hdfs.server.common.Storage$StorageDirectory.clearDirectory(Storage.java:299) at org.apache.hadoop.hdfs.server.namenode.NNStorage.format(NNStorage.java:523) at org.apache.hadoop.hdfs.server.namenode.NNStorage.format(NNStorage.java:544) at org.apache.hadoop.hdfs.server.namenode.TestEditLog.setupEdits(TestEditLog.java:1078) at
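The leak pattern behind these failures — opening an edit-log stream without closing it — is conventionally fixed with try-with-resources, which releases the file handle even on early return or exception, so Windows can later delete the directory. A hedged, pure-JDK sketch (not the actual FileJournalManager/getNumberOfTransactions() code; the "transaction count" here is just line counting over a temp file):

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;

// Sketch of the leak-free pattern: the reader is scoped to a
// try-with-resources block, so the handle is guaranteed to be released
// before the caller tries to delete the file or its parent directory.
public class HandleLeakSketch {
    static long countTransactions(Path editLog) throws IOException {
        try (BufferedReader in = Files.newBufferedReader(editLog)) {
            return in.lines().count();
        } // stream closed here, even on exception
    }

    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("filejournaltest");
        Path log = Files.write(dir.resolve("edits"), Arrays.asList("OP_ADD", "OP_CLOSE"));
        System.out.println("transactions: " + countTransactions(log));
        Files.delete(log); // would fail on Windows if the reader had leaked
        Files.delete(dir);
        System.out.println("cleanup ok");
    }
}
```

If the reader were opened without the try-with-resources block and never closed, the `Files.delete` calls would throw on Windows (open handles block deletion there), which is exactly the "Cannot remove current directory" symptom in the trace above.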
[jira] [Commented] (HDFS-4680) Audit logging of delegation tokens for MR tracing
[ https://issues.apache.org/jira/browse/HDFS-4680?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13764700#comment-13764700 ] Hudson commented on HDFS-4680: -- SUCCESS: Integrated in Hadoop-trunk-Commit #4400 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/4400/]) HDFS-4680. Audit logging of delegation tokens for MR tracing. (Andrew Wang) (wang: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1522012) * /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/security/token/TokenIdentifier.java * /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/security/token/delegation/AbstractDelegationTokenSecretManager.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/DFSConfigKeys.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/security/token/delegation/DelegationTokenSecretManager.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSNamesystem.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/HdfsAuditLogger.java Audit logging of delegation tokens for MR tracing - Key: HDFS-4680 URL: https://issues.apache.org/jira/browse/HDFS-4680 Project: Hadoop HDFS Issue Type: Bug Components: namenode, security Affects Versions: 2.0.3-alpha Reporter: Andrew Wang Assignee: Andrew Wang Fix For: 2.3.0 Attachments: hdfs-4680-1.patch, hdfs-4680-2.patch, hdfs-4680-3.patch, hdfs-4680-4.patch, hdfs-4680-5.patch HDFS audit logging tracks HDFS operations made by different users, e.g. creation and deletion of files. This is useful for after-the-fact root cause analysis and security. However, logging merely the username is insufficient for many use cases. 
For instance, it is common for a single user to run multiple MapReduce jobs (I believe this is the case with Hive). In this scenario, given a particular audit log entry, it is difficult to trace it back to the MR job or task that generated that entry. I see a number of potential options for implementing this. 1. Make an optional client name field part of the NN RPC format. We already pass a {{clientName}} as a parameter in many RPC calls, so this would essentially make it standardized. MR tasks could then set this field to the job and task ID. 2. This could be generalized to a set of optional key-value *tags* in the NN RPC format, which would then be audit logged. This has standalone benefits outside of just verifying MR task ids. 3. Neither of the above two options actually securely verify that MR clients are who they claim they are. Doing this securely requires the JobTracker to sign MR task attempts, and then having the NN verify this signature. However, this is substantially more work, and could be built on after idea #2. Thoughts welcomed. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
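Options 1 and 2 above amount to appending extra key=value fields to the existing audit line. As a rough illustration of that idea (the {{trackingId}} field name and this helper class are hypothetical, not the committed patch's code), such a formatter might look like:

```java
// Hedged sketch of an audit log line carrying an extra per-request tag
// (option 1/2 above). The tab-separated key=value layout mirrors the
// NameNode audit log style; "trackingId" is an illustrative field name.
public class AuditLine {
    static String format(String ugi, String cmd, String src, String trackingId) {
        StringBuilder sb = new StringBuilder();
        sb.append("ugi=").append(ugi)
          .append("\tcmd=").append(cmd)
          .append("\tsrc=").append(src);
        if (trackingId != null) {   // only appended when a tag was supplied
            sb.append("\ttrackingId=").append(trackingId);
        }
        return sb.toString();
    }
}
```

The appeal of this shape is that existing audit-log parsers keep working: the new field is purely additive and optional.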
[jira] [Updated] (HDFS-4680) Audit logging of delegation tokens for MR tracing
[ https://issues.apache.org/jira/browse/HDFS-4680?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Wang updated HDFS-4680: -- Resolution: Fixed Fix Version/s: 2.3.0 Status: Resolved (was: Patch Available) Thanks ATM (and everyone who's looked at this), committed to trunk and branch-2.
[jira] [Updated] (HDFS-5186) TestFileJournalManager fails on Windows due to file handle leaks
[ https://issues.apache.org/jira/browse/HDFS-5186?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chuan Liu updated HDFS-5186: Status: Patch Available (was: Open) TestFileJournalManager fails on Windows due to file handle leaks Key: HDFS-5186 URL: https://issues.apache.org/jira/browse/HDFS-5186 Project: Hadoop HDFS Issue Type: Bug Affects Versions: 3.0.0 Reporter: Chuan Liu Assignee: Chuan Liu Priority: Minor Attachments: HDFS-5186.patch We have two unit test cases in this class failing on Windows due to a file handle leak in the {{getNumberOfTransactions()}} method. {noformat} Running org.apache.hadoop.hdfs.server.namenode.TestFileJournalManager Tests run: 13, Failures: 0, Errors: 2, Skipped: 0, Time elapsed: 1.693 sec FAILURE! testReadFromMiddleOfEditLog(org.apache.hadoop.hdfs.server.namenode.TestFileJournalManager) Time elapsed: 12 sec ERROR! java.io.IOException: Cannot remove current directory: E:\Monarch\project\hadoop-monarch\hadoop-hdfs-project\hadoop-hdfs\target\test\data\filejournaltest2\current at org.apache.hadoop.hdfs.server.common.Storage$StorageDirectory.clearDirectory(Storage.java:299) at org.apache.hadoop.hdfs.server.namenode.NNStorage.format(NNStorage.java:523) at org.apache.hadoop.hdfs.server.namenode.NNStorage.format(NNStorage.java:544) at org.apache.hadoop.hdfs.server.namenode.TestEditLog.setupEdits(TestEditLog.java:1078) at org.apache.hadoop.hdfs.server.namenode.TestEditLog.setupEdits(TestEditLog.java:1133) at org.apache.hadoop.hdfs.server.namenode.TestFileJournalManager.testReadFromMiddleOfEditLog(TestFileJournalManager.java:436) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25) at java.lang.reflect.Method.invoke(Method.java:597) at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:45) at 
org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:15) at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:42) at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:20) at org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:28) at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:263) at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:68) at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:47) at org.junit.runners.ParentRunner$3.run(ParentRunner.java:231) at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:60) at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:229) at org.junit.runners.ParentRunner.access$000(ParentRunner.java:50) at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:222) at org.junit.runners.ParentRunner.run(ParentRunner.java:300) at org.apache.maven.surefire.junit4.JUnit4Provider.execute(JUnit4Provider.java:252) at org.apache.maven.surefire.junit4.JUnit4Provider.executeTestSet(JUnit4Provider.java:141) at org.apache.maven.surefire.junit4.JUnit4Provider.invoke(JUnit4Provider.java:112) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25) at java.lang.reflect.Method.invoke(Method.java:597) at org.apache.maven.surefire.util.ReflectionUtils.invokeMethodWithArray(ReflectionUtils.java:189) at org.apache.maven.surefire.booter.ProviderFactory$ProviderProxy.invoke(ProviderFactory.java:165) at org.apache.maven.surefire.booter.ProviderFactory.invokeProvider(ProviderFactory.java:85) at org.apache.maven.surefire.booter.ForkedBooter.runSuitesInProcess(ForkedBooter.java:115) at org.apache.maven.surefire.booter.ForkedBooter.main(ForkedBooter.java:75) 
testExcludeInProgressStreams(org.apache.hadoop.hdfs.server.namenode.TestFileJournalManager) Time elapsed: 10 sec ERROR! java.io.IOException: Cannot remove current directory: E:\Monarch\project\hadoop-monarch\hadoop-hdfs-project\hadoop-hdfs\target\test\data\filejournaltest2\current at org.apache.hadoop.hdfs.server.common.Storage$StorageDirectory.clearDirectory(Storage.java:299) at org.apache.hadoop.hdfs.server.namenode.NNStorage.format(NNStorage.java:523) at
[jira] [Updated] (HDFS-5186) TestFileJournalManager fails on Windows due to file handle leaks
[ https://issues.apache.org/jira/browse/HDFS-5186?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chuan Liu updated HDFS-5186: Attachment: HDFS-5186.patch Attaching a patch. The EditLogInputStream should be closed on each iteration of the while loop in the {{getNumberOfTransactions()}} method instead of at the very end.
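The shape of the fix described in the comment above can be illustrated with a self-contained sketch. {{FakeEditLogStream}} is a stand-in for Hadoop's {{EditLogInputStream}}, not the real class; the point is the per-iteration try/finally, so no file handle outlives its iteration:

```java
import java.util.List;

public class PerIterationClose {
    // Minimal stand-in for EditLogInputStream: counts transactions and
    // records whether its handle was released.
    static class FakeEditLogStream {
        final long txCount;
        boolean closed = false;
        FakeEditLogStream(long txCount) { this.txCount = txCount; }
        long countTransactions() { return txCount; }
        void close() { closed = true; }
    }

    // Analogous in shape to getNumberOfTransactions(): sums transactions
    // across successive log segments, releasing each stream's handle
    // inside the loop rather than once after it (the leak fix).
    static long countAll(List<FakeEditLogStream> segments) {
        long total = 0;
        for (FakeEditLogStream elis : segments) {
            try {
                total += elis.countTransactions();
            } finally {
                elis.close(); // per-iteration close: no handle leaks out
            }
        }
        return total;
    }
}
```

Closing inside the loop matters on Windows in particular, where an open handle on an edit log file prevents the test from deleting its directory.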
[jira] [Commented] (HDFS-5122) WebHDFS should support logical service names in URIs
[ https://issues.apache.org/jira/browse/HDFS-5122?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13764707#comment-13764707 ] Alejandro Abdelnur commented on HDFS-5122: -- I would take a slightly different approach for this. The WebHdfsFileSystem#getHttpUrlConnection() method is used for every FS call. In this method, if the URI is for the NN (metadata operations; data transfers go to a DN), do the following:
1. The standby NN should have a redirector HTTP servlet that bounces all HTTP calls from the standby to the active.
2. Use the same utility classes the DistributedFileSystem class uses to determine whether the hostname in the URI is a logical name.
3. If the hostname is not a logical name, follow the current logic.
4. If the hostname is a logical name, resolve it to any of the NN hosts and make a cheap FS call using the chosen hostname.
4.1. If it works, cache the chosen hostname and use it for all subsequent FS operations while they succeed.
4.2. If the call returns a redirect (automatic redirects are disabled), you hit the standby; select the other hostname and use it for all subsequent FS operations while they succeed.
4.3. If the call cannot connect or returns an error, you hit a NN that is 'dead'; fall back to the other NN hostname and use it for all subsequent FS operations while they succeed.
5. When a subsequent URL call fails, do #4.3.
6. Make sure there is logic to avoid an infinite loop of bouncing between NNs in case both are dead.
7. The WebHDFS delegation token service should use logic similar to the DFS delegation token's to convert the service in the token from a logical name to a hostname.
NOTE: I'm not familiar with WebHdfsFileSystem's current retry logic, and some of this may already be taken care of. 
WebHDFS should support logical service names in URIs Key: HDFS-5122 URL: https://issues.apache.org/jira/browse/HDFS-5122 Project: Hadoop HDFS Issue Type: Bug Components: ha, webhdfs Affects Versions: 2.1.0-beta Reporter: Arpit Gupta Assignee: Haohui Mai Attachments: HDFS-5122.patch For example, if dfs.nameservices is set to arpit, {code} hdfs dfs -ls webhdfs://arpit:50070/tmp or hdfs dfs -ls webhdfs://arpit/tmp {code} does not work. You have to provide the exact active namenode hostname. On an HA cluster, as with the dfs client, one should not need to provide the active NN hostname.
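The host-selection loop in steps 2-6 of the proposal above could be sketched roughly as follows. Every name here ({{ProbeResult}}, {{pickActive}}, the probe callback) is illustrative, not Hadoop API; the real logic would live inside {{WebHdfsFileSystem}}, and the probe stands in for the "cheap FS call" of step 4:

```java
import java.util.List;
import java.util.function.Function;

public class NnSelector {
    enum ProbeResult { OK, REDIRECT, DEAD } // active / standby / unreachable

    // Walks the candidate NN hosts and returns the first that answers as
    // active (step 4.1). REDIRECT means we hit the standby (step 4.2);
    // DEAD means fall back to the other host (step 4.3). The bounded
    // attempt count implements step 6: no infinite bouncing between NNs
    // when both are down.
    static String pickActive(List<String> hosts, Function<String, ProbeResult> probe) {
        int i = 0;
        for (int attempts = 0; attempts < 2 * hosts.size(); attempts++) {
            String host = hosts.get(i);
            if (probe.apply(host) == ProbeResult.OK) {
                return host; // cache this and reuse it while calls succeed
            }
            i = (i + 1) % hosts.size(); // try the other NN
        }
        throw new IllegalStateException("no active NameNode reachable");
    }
}
```

On a failure of a subsequent call (step 5), the cached result would be discarded and this selection re-run.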
[jira] [Commented] (HDFS-5038) Backport several branch-2 APIs to branch-1
[ https://issues.apache.org/jira/browse/HDFS-5038?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13764724#comment-13764724 ] Suresh Srinivas commented on HDFS-5038: --- +1 for the patch. Backport several branch-2 APIs to branch-1 -- Key: HDFS-5038 URL: https://issues.apache.org/jira/browse/HDFS-5038 Project: Hadoop HDFS Issue Type: Bug Affects Versions: 1.3.0 Reporter: Jing Zhao Assignee: Jing Zhao Priority: Minor Attachments: HDFS-5038.patch Backport the following simple APIs to branch-1: 1. FileSystem#newInstance(Configuration) 2. DFSClient#getNamenode() 3. FileStatus#isDirectory()
[jira] [Commented] (HDFS-4953) enable HDFS local reads via mmap
[ https://issues.apache.org/jira/browse/HDFS-4953?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13764428#comment-13764428 ] Owen O'Malley commented on HDFS-4953: - {quote} You can't know ahead of time whether your call to mmap will succeed. As I said, mmap can fail, for dozens of reasons. And of course blocks move over time. There is a fundamental time of check, time of use (TOCTOU) race condition in this kind of API. {quote} Ok, I guess I'm fine with the exception assuming the user passed in a null factory. It will be expensive in terms of time, but it won't affect the vast majority of users. {quote} is it necessary to read 200 MB at a time to decode the ORC file format? {quote} Actually, yes. The set of rows that are written together is large (typically ~200MB) so that reading them is efficient. For a 100 column table, that means that you have all of the values for column 1 in the first ~2MB, followed by all of the values for column 2 in the next 2MB, etc. To read the first row, you need all 100 of the 2MB sections. Obviously mmapping this is much more efficient, because the pages of the file can be brought in as needed. {quote} There is already a method named FSDataInputStream#read(ByteBuffer buf) in FSDataInputStream. If we create a new method named FSDataInputStream#readByteBuffer, I would expect there to be some confusion between the two. That's why I proposed FSDataInputStream#readZero for the new name. Does that make sense? {quote} I see your point, but readZero, which sounds like it just fills zeros into a byte buffer, doesn't convey the right meaning. The fundamental action that the user is taking is in fact read. I'd propose that we overload it with the other read and comment it saying that this read supports zero copy while the other doesn't. How does this look?
{code}
/**
 * Read a byte buffer from the stream using zero copy if possible. Typically the read will return
 * maxLength bytes, but it may return fewer at the end of the file system block or the end of the
 * file.
 * @param factory a factory that creates ByteBuffers for the read if the region of the file can't
 *                be mmapped.
 * @param maxLength the maximum number of bytes that will be returned
 * @return a ByteBuffer with between 1 and maxLength bytes from the file. The buffer should be
 *         released to the stream when the user is done with it.
 */
public ByteBuffer read(ByteBufferFactory factory, int maxLength) throws IOException;
{code}
{quote} I'd like to get some other prospective zero-copy API users to comment on whether they like the wrapper object or the DFSInputStream#releaseByteBuffer approach better... {quote} Uh, that is exactly what is happening. I'm a user who is trying to use this interface for a very typical use case of quickly reading bytes that may or may not be on the local machine. I also care a lot about APIs and have been working on Hadoop for 7.75 years. {quote} If, instead of returning a ByteBuffer from the readByteBuffer call, we returned a ZeroBuffer object wrapping the ByteBuffer, we could simply call ZeroBuffer#close() {quote} Users don't want to make interfaces for reading from some Hadoop type named ZeroBuffer. The user wants a ByteBuffer because it is a standard Java type. To make this concrete and crystal clear, I have to make Hive and ORC work with both Hadoop 1.x and Hadoop 2.x. Therefore, if you use a non-standard type I need to wrap it in a shim. That sucks. Especially if it is in the inner loop, which this absolutely would be. I *need* a ByteBuffer because I can make a shim that always returns a ByteBuffer that works regardless of which version of Hadoop the user is using. 
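To make the proposed signature concrete, here is a self-contained sketch of its fallback (non-mmap) path. {{ByteBufferFactory}} and {{HeapBackedStream}} are illustrative stand-ins under the API shape proposed above, not Hadoop classes; a real {{FSDataInputStream}} would return an mmapped buffer when it can and go through the factory only when it cannot:

```java
import java.nio.ByteBuffer;

public class ZeroCopySketch {
    // Caller-supplied factory, per the proposed API: used when the file
    // region cannot be mmapped.
    interface ByteBufferFactory {
        ByteBuffer allocate(int capacity);
    }

    // Simulates a stream that can never mmap, so every read goes through
    // the factory. Mirrors the documented contract: up to maxLength bytes,
    // fewer near the end of the data; here, null signals exhaustion (an
    // assumption, since the proposed javadoc does not pin down EOF behavior).
    static class HeapBackedStream {
        private final byte[] data;
        private int pos = 0;
        HeapBackedStream(byte[] data) { this.data = data; }

        ByteBuffer read(ByteBufferFactory factory, int maxLength) {
            if (pos >= data.length) return null; // end of stream
            int n = Math.min(maxLength, data.length - pos);
            ByteBuffer buf = factory.allocate(n);
            buf.put(data, pos, n);
            buf.flip(); // ready for the caller to consume
            pos += n;
            return buf;
        }
    }
}
```

An ORC-style reader would call this in a loop with a large maxLength, and a shim layer could hand back the same plain {{ByteBuffer}} type on both Hadoop 1.x and 2.x, which is the portability point made above.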
enable HDFS local reads via mmap Key: HDFS-4953 URL: https://issues.apache.org/jira/browse/HDFS-4953 Project: Hadoop HDFS Issue Type: New Feature Affects Versions: 2.3.0 Reporter: Colin Patrick McCabe Assignee: Colin Patrick McCabe Fix For: HDFS-4949 Attachments: benchmark.png, HDFS-4953.001.patch, HDFS-4953.002.patch, HDFS-4953.003.patch, HDFS-4953.004.patch, HDFS-4953.005.patch, HDFS-4953.006.patch, HDFS-4953.007.patch, HDFS-4953.008.patch Currently, the short-circuit local read pathway allows HDFS clients to access files directly without going through the DataNode. However, all of these reads involve a copy at the operating system level, since they rely on the read() / pread() / etc family of kernel interfaces. We would like to enable HDFS to read local files via mmap. This would enable truly zero-copy reads. In the initial implementation, zero-copy reads will only be performed when checksums are disabled. Later, we can use the DataNode's cache awareness to only perform zero-copy reads when we know that
[jira] [Commented] (HDFS-5186) TestFileJournalManager fails on Windows due to file handle leaks
[ https://issues.apache.org/jira/browse/HDFS-5186?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13764757#comment-13764757 ] Hadoop QA commented on HDFS-5186: - {color:green}+1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12602627/HDFS-5186.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 1 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. The javadoc tool did not generate any warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 core tests{color}. The patch passed unit tests in hadoop-hdfs-project/hadoop-hdfs. {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/4956//testReport/ Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/4956//console This message is automatically generated.
[jira] [Commented] (HDFS-5122) WebHDFS should support logical service names in URIs
[ https://issues.apache.org/jira/browse/HDFS-5122?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13764788#comment-13764788 ] Jing Zhao commented on HDFS-5122: - bq. 1* the standby NM should have a redirector HTTP servlet that bounces all HTTP calls from the standby to the active. Maybe we do not need to change the NN side here. A WebHdfsFileSystem call will get a StandbyException if it hits the SBN, and we can use that for the WebHdfsFileSystem side's failover. This is the same logic as the current DFSClient's failover and retry mechanism. bq. 4* If the hostname is a logical name, resolve to any of the NN hosts, do a cheap FS call using the chosen hostname. bq. 4.1**If it works cache the chosen hostname and use it for all subsequent FS operations while successful. After the cheap call, we cannot guarantee that subsequent calls will succeed, since an NN failover can happen in between. So I think we can skip the cheap call here. The current WebHdfsFileSystem retry logic does not consider NN failover. So I think we only need to add failover (between the 2 NN URLs) and retry logic, and use FailoverOnNetworkExceptionRetry as the retry policy in WebHdfsFileSystem for an HA setup. Looks like this is what Haohui is doing in his current patch.
[jira] [Commented] (HDFS-5122) WebHDFS should support logical service names in URIs
[ https://issues.apache.org/jira/browse/HDFS-5122?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13764801#comment-13764801 ] Haohui Mai commented on HDFS-5122: -- Hi Alejandro, I believe that this patch realizes your ideas from steps 3-6, mostly reusing the current retry logic of WebHDFS. For (1), can you please elaborate on why an additional redirector is needed on the server side? Right now the client will get a StandbyException and simply retry. For (2), I would appreciate a pointer to the actual method. For (7), this patch forgets about the delegation token when the client connects to a different namenode. Based on my understanding, that should be a simple way to get things working, but I'm wondering whether WebHDFS needs a more sophisticated approach here. Can you elaborate on how the DFS client handles this case?
[jira] [Updated] (HDFS-5038) Backport several branch-2 APIs to branch-1
[ https://issues.apache.org/jira/browse/HDFS-5038?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jing Zhao updated HDFS-5038: Issue Type: Improvement (was: Bug)
[jira] [Updated] (HDFS-4680) Audit logging of delegation tokens for MR tracing
[ https://issues.apache.org/jira/browse/HDFS-4680?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Andrew Wang updated HDFS-4680:
------------------------------
Target Version/s: 3.0.0, 2.3.0, 2.1.1-beta (was: 3.0.0, 2.3.0)
Fix Version/s: (was: 2.3.0) 2.1.1-beta

Additionally committed to branch-2.1-beta; CHANGES.txt updated.

Audit logging of delegation tokens for MR tracing
-------------------------------------------------
Key: HDFS-4680
URL: https://issues.apache.org/jira/browse/HDFS-4680
Project: Hadoop HDFS
Issue Type: Bug
Components: namenode, security
Affects Versions: 2.0.3-alpha
Reporter: Andrew Wang
Assignee: Andrew Wang
Fix For: 2.1.1-beta
Attachments: hdfs-4680-1.patch, hdfs-4680-2.patch, hdfs-4680-3.patch, hdfs-4680-4.patch, hdfs-4680-5.patch

HDFS audit logging tracks HDFS operations made by different users, e.g. creation and deletion of files. This is useful for after-the-fact root cause analysis and security. However, logging merely the username is insufficient for many use cases. For instance, it is common for a single user to run multiple MapReduce jobs (I believe this is the case with Hive). In this scenario, given a particular audit log entry, it is difficult to trace it back to the MR job or task that generated it. I see a number of potential options for implementing this:

1. Make an optional client name field part of the NN RPC format. We already pass a {{clientName}} as a parameter in many RPC calls, so this would essentially make it standardized. MR tasks could then set this field to the job and task ID.
2. This could be generalized to a set of optional key-value *tags* in the NN RPC format, which would then be audit logged. This has standalone benefits beyond verifying MR task IDs.
3. Neither of the above two options actually verifies that MR clients are who they claim to be. Doing this securely requires the JobTracker to sign MR task attempts and the NN to verify that signature. However, this is substantially more work, and could be built on top of idea #2.

Thoughts welcomed.

--
This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators. For more information on JIRA, see: http://www.atlassian.com/software/jira
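Idea #2 above can be illustrated with a small sketch: a hypothetical formatter that appends optional key-value tags (such as an MR job id) to an audit log entry. The field layout mimics the NameNode audit log, but the tag mechanism itself is an assumption of this sketch, not an existing Hadoop API:

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class AuditEntry {
    // Format one audit log line: the standard ugi/cmd/src fields, followed by
    // any optional key-value tags (hypothetical per idea #2 in the discussion).
    static String format(String ugi, String cmd, String src,
                         Map<String, String> tags) {
        StringBuilder sb = new StringBuilder();
        sb.append("ugi=").append(ugi)
          .append("\tcmd=").append(cmd)
          .append("\tsrc=").append(src);
        for (Map.Entry<String, String> e : tags.entrySet()) {
            sb.append('\t').append(e.getKey()).append('=').append(e.getValue());
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // A tag a well-behaved MR task might attach (illustrative value).
        Map<String, String> tags = new LinkedHashMap<>();
        tags.put("mr.job.id", "job_201309120001_0042");
        System.out.println(format("hive", "create", "/tmp/out", tags));
    }
}
```

With a tag like this in every entry, grouping audit lines by job becomes a simple grep, which is the traceability the issue asks for.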
[jira] [Commented] (HDFS-4680) Audit logging of delegation tokens for MR tracing
[ https://issues.apache.org/jira/browse/HDFS-4680?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13764802#comment-13764802 ]

Hudson commented on HDFS-4680:
------------------------------
SUCCESS: Integrated in Hadoop-trunk-Commit #4401 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/4401/])
Move HDFS-4680 in CHANGES.txt (wang: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1522049)
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
[jira] [Resolved] (HDFS-5038) Backport several branch-2 APIs to branch-1
[ https://issues.apache.org/jira/browse/HDFS-5038?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jing Zhao resolved HDFS-5038.
-----------------------------
Resolution: Fixed
Fix Version/s: 1.3.0
Hadoop Flags: Reviewed

Thanks for the review, Suresh! I've committed this to branch-1.

Backport several branch-2 APIs to branch-1
------------------------------------------
Key: HDFS-5038
URL: https://issues.apache.org/jira/browse/HDFS-5038
Project: Hadoop HDFS
Issue Type: Improvement
Affects Versions: 1.3.0
Reporter: Jing Zhao
Assignee: Jing Zhao
Priority: Minor
Fix For: 1.3.0
Attachments: HDFS-5038.patch

Backport the following simple APIs to branch-1:
1. FileSystem#newInstance(Configuration)
2. DFSClient#getNamenode()
3. FileStatus#isDirectory()
[jira] [Created] (HDFS-5187) Deletion of non-existing cluster succeeds
Suresh Srinivas created HDFS-5187:
----------------------------------

Summary: Deletion of non-existing cluster succeeds
Key: HDFS-5187
URL: https://issues.apache.org/jira/browse/HDFS-5187
Project: Hadoop HDFS
Issue Type: Bug
Reporter: Suresh Srinivas

The following command succeeds even though no cluster named non-existent has been added:
{noformat}
bin/falcon entity -delete -type cluster -name non-existent
falcon/abc(cluster) removed successfully
{noformat}
[jira] [Commented] (HDFS-5187) Deletion of non-existing cluster succeeds
[ https://issues.apache.org/jira/browse/HDFS-5187?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13764745#comment-13764745 ]

Suresh Srinivas commented on HDFS-5187:
---------------------------------------
Wrong project.
[jira] [Updated] (HDFS-5122) WebHDFS should support logical service names in URIs
[ https://issues.apache.org/jira/browse/HDFS-5122?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Haohui Mai updated HDFS-5122:
-----------------------------
Attachment: HDFS-5122.patch

This patch fully implements HDFS-5122 and HDFS-5181. It reuses the WebHDFS client's current retry logic for failovers. The key parts of the patch are the following:

1. Generalize the resolution of the URL. A URL can now correspond to a list of authorities (i.e., the IPs and ports of the NN servers).
2. When a failure happens, the client maps the URL to the next available authority and then retries; the existing retry logic bounds the number of retries so a failure cannot cause unbounded retrying.
3. The runner class constructs the URL to the real server on demand so that it always picks up the latest server.

WebHDFS should support logical service names in URIs
----------------------------------------------------
Key: HDFS-5122
URL: https://issues.apache.org/jira/browse/HDFS-5122
Project: Hadoop HDFS
Issue Type: Bug
Components: ha, webhdfs
Affects Versions: 2.1.0-beta
Reporter: Arpit Gupta
Assignee: Haohui Mai
Attachments: HDFS-5122.patch

For example, if dfs.nameservices is set to arpit,
{code}
hdfs dfs -ls webhdfs://arpit:50070/tmp
or
hdfs dfs -ls webhdfs://arpit/tmp
{code}
does not work; you have to provide the exact active NameNode hostname. On an HA cluster using the DFS client, one should not need to provide the active NN hostname.
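A minimal sketch of the failover scheme described in the comment, under the assumption that a logical URI resolves to a list of NameNode authorities and that each failure advances to the next one with a bounded attempt count. All class and method names here are illustrative, not the patch's actual API:

```java
import java.util.Arrays;
import java.util.List;

public class LogicalUriResolver {
    private final List<String> authorities;
    private int current = 0;

    LogicalUriResolver(List<String> authorities) {
        this.authorities = authorities;
    }

    // Return the authority to try next; called again after each failure,
    // cycling through the configured NameNodes.
    String nextAuthority() {
        String a = authorities.get(current);
        current = (current + 1) % authorities.size();
        return a;
    }

    public static void main(String[] args) {
        LogicalUriResolver r = new LogicalUriResolver(
            Arrays.asList("nn1.example.com:50070", "nn2.example.com:50070"));
        int maxRetries = 3;  // bounded, so a full outage cannot retry forever
        for (int attempt = 0; attempt <= maxRetries; attempt++) {
            String authority = r.nextAuthority();
            // A real client would build the request URL on demand here (point
            // 3 in the comment) and break out of the loop on success.
            System.out.println("attempt " + attempt + " -> webhdfs://" + authority);
        }
    }
}
```

Constructing the URL per attempt, rather than once up front, is what lets the client pick up whichever NameNode is currently active.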
[jira] [Updated] (HDFS-5122) WebHDFS should support logical service names in URIs
[ https://issues.apache.org/jira/browse/HDFS-5122?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Haohui Mai updated HDFS-5122:
-----------------------------
Attachment: (was: HDFS-5122.patch)
[jira] [Updated] (HDFS-5156) SafeModeTime metrics sometimes includes non-Safemode time.
[ https://issues.apache.org/jira/browse/HDFS-5156?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Akira AJISAKA updated HDFS-5156:
--------------------------------
Affects Version/s: 1.2.1

SafeModeTime metrics sometimes includes non-Safemode time.
----------------------------------------------------------
Key: HDFS-5156
URL: https://issues.apache.org/jira/browse/HDFS-5156
Project: Hadoop HDFS
Issue Type: Bug
Components: namenode
Affects Versions: 2.1.0-beta, 1.2.1
Reporter: Akira AJISAKA
Assignee: Akira AJISAKA
Labels: metrics
Attachments: HDFS-5156.2.patch, HDFS-5156.patch

The SafeModeTime metric is meant to show the duration of safe mode at startup. However, the metric is set to the time elapsed since FSNamesystem started whenever safe mode is left. As a result, after executing hdfs dfsadmin -safemode enter and then hdfs dfsadmin -safemode leave, the metric includes non-safemode time.
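The reported behavior can be sketched as follows; the field and method names are hypothetical, not the actual FSNamesystem code. The buggy version measures from system start, so a manual enter/leave long after startup inflates the metric, while the fix measures from the moment safe mode was actually entered:

```java
public class SafeModeTimeDemo {
    // Buggy behavior described in the issue: on leaving safe mode, the metric
    // records time since FSNamesystem start, which includes non-safemode time.
    static long buggySafeModeTime(long systemStart, long enterTime, long leaveTime) {
        return leaveTime - systemStart;
    }

    // Intended behavior: record only the time actually spent in safe mode.
    static long fixedSafeModeTime(long systemStart, long enterTime, long leaveTime) {
        return leaveTime - enterTime;
    }

    public static void main(String[] args) {
        // Safe mode entered 60s after startup and left 1s later (ms values).
        long start = 0, enter = 60_000, leave = 61_000;
        System.out.println("buggy: " + buggySafeModeTime(start, enter, leave));  // 61000
        System.out.println("fixed: " + fixedSafeModeTime(start, enter, leave));  // 1000
    }
}
```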
[jira] [Commented] (HDFS-5180) Add time taken to process the command to audit log
[ https://issues.apache.org/jira/browse/HDFS-5180?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13764889#comment-13764889 ]

Shinichi Yamashita commented on HDFS-5180:
------------------------------------------
Thank you for your comment. As you said, checking in the RPC layer is good, and it should log requests with long processing times. I will consider implementing it in the RPC layer.

Add time taken to process the command to audit log
--------------------------------------------------
Key: HDFS-5180
URL: https://issues.apache.org/jira/browse/HDFS-5180
Project: Hadoop HDFS
Issue Type: Improvement
Components: namenode
Affects Versions: 3.0.0
Reporter: Shinichi Yamashita

The NameNode audit log currently outputs the command and ugi, but not the time taken to process the command. For example, when a problem such as a slowdown occurs in the NameNode, we must check which command is responsible. The processing time should be added to the audit log so that abnormal signs can be detected.
[jira] [Commented] (HDFS-5122) WebHDFS should support logical service names in URIs
[ https://issues.apache.org/jira/browse/HDFS-5122?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13764923#comment-13764923 ]

Hadoop QA commented on HDFS-5122:
---------------------------------
{color:red}-1 overall{color}. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12602663/HDFS-5122.patch
against trunk revision .

{color:green}+1 @author{color}. The patch does not contain any @author tags.
{color:green}+1 tests included{color}. The patch appears to include 2 new or modified test files.
{color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings.
{color:green}+1 javadoc{color}. The javadoc tool did not generate any warning messages.
{color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse.
{color:red}-1 findbugs{color}. The patch appears to introduce 2 new Findbugs (version 1.3.9) warnings.
{color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings.
{color:green}+1 core tests{color}. The patch passed unit tests in hadoop-hdfs-project/hadoop-hdfs.
{color:green}+1 contrib tests{color}. The patch passed contrib unit tests.

Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/4957//testReport/
Findbugs warnings: https://builds.apache.org/job/PreCommit-HDFS-Build/4957//artifact/trunk/patchprocess/newPatchFindbugsWarningshadoop-hdfs.html
Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/4957//console

This message is automatically generated.
[jira] [Updated] (HDFS-5158) add command-line support for manipulating cache directives
[ https://issues.apache.org/jira/browse/HDFS-5158?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Colin Patrick McCabe updated HDFS-5158:
---------------------------------------
Attachment: HDFS-5158-caching.007.patch

* 'hdfs pathCache' -> 'hdfs cacheadmin'
* rename all PathCache* classes to PathBasedCache*
* fix exception translation issue with 'write permission denied'

add command-line support for manipulating cache directives
----------------------------------------------------------
Key: HDFS-5158
URL: https://issues.apache.org/jira/browse/HDFS-5158
Project: Hadoop HDFS
Issue Type: Sub-task
Components: datanode, namenode
Affects Versions: HDFS-4949
Reporter: Colin Patrick McCabe
Assignee: Colin Patrick McCabe
Attachments: HDFS-5158-caching.003.patch, HDFS-5158-caching.004.patch, HDFS-5158-caching.005.patch, HDFS-5158-caching.006.patch, HDFS-5158-caching.007.patch

We should add command-line support for creating, removing, and listing cache directives.
[jira] [Updated] (HDFS-4299) WebHDFS Should Support HA Configuration
[ https://issues.apache.org/jira/browse/HDFS-4299?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Haohui Mai updated HDFS-4299:
-----------------------------
Assignee: Haohui Mai

WebHDFS Should Support HA Configuration
---------------------------------------
Key: HDFS-4299
URL: https://issues.apache.org/jira/browse/HDFS-4299
Project: Hadoop HDFS
Issue Type: Improvement
Reporter: Daisuke Kobayashi
Assignee: Haohui Mai

WebHDFS clients connect directly to NameNodes rather than using a Hadoop client, so there is no failover capability. Though a workaround is available using HttpFS with an HA client, WebHDFS should also support HA configuration. Please see also: https://issues.cloudera.org/browse/DISTRO-403
[jira] [Updated] (HDFS-5122) WebHDFS should support logical service names in URIs
[ https://issues.apache.org/jira/browse/HDFS-5122?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Haohui Mai updated HDFS-5122:
-----------------------------
Attachment: HDFS-5122.001.patch

Fix the FindBugs warnings.
[jira] [Commented] (HDFS-5167) Add metrics about the NameNode retry cache
[ https://issues.apache.org/jira/browse/HDFS-5167?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13765006#comment-13765006 ]

Haohui Mai commented on HDFS-5167:
----------------------------------
Hi [~ozawa], I'm wondering whether you're still working on it. I'm happy to work on it if you're off the hook.

Add metrics about the NameNode retry cache
------------------------------------------
Key: HDFS-5167
URL: https://issues.apache.org/jira/browse/HDFS-5167
Project: Hadoop HDFS
Issue Type: Sub-task
Components: ha, namenode
Affects Versions: 3.0.0
Reporter: Jing Zhao
Priority: Minor
Attachments: HDFS-5167.1.patch

It will be helpful to have metrics in the NameNode about the retry cache, such as the retry count.
[jira] [Commented] (HDFS-4299) WebHDFS Should Support HA Configuration
[ https://issues.apache.org/jira/browse/HDFS-4299?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13765029#comment-13765029 ]

Aaron T. Myers commented on HDFS-4299:
--------------------------------------
This seems like a duplicate of HDFS-5122; given that that one is further along, I suggest we close this one as a dupe. Haohui/Daisuke - do you guys agree?
[jira] [Updated] (HDFS-4096) Add snapshot information to namenode WebUI
[ https://issues.apache.org/jira/browse/HDFS-4096?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jing Zhao updated HDFS-4096:
----------------------------
Assignee: Haohui Mai (was: Jing Zhao)

Add snapshot information to namenode WebUI
------------------------------------------
Key: HDFS-4096
URL: https://issues.apache.org/jira/browse/HDFS-4096
Project: Hadoop HDFS
Issue Type: Sub-task
Components: datanode, namenode
Reporter: Jing Zhao
Assignee: Haohui Mai
Attachments: HDFS-4096.relative.001.patch

Add snapshot information to the namenode WebUI.
[jira] [Commented] (HDFS-5158) add command-line support for manipulating cache directives
[ https://issues.apache.org/jira/browse/HDFS-5158?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13765047#comment-13765047 ]

Aaron T. Myers commented on HDFS-5158:
--------------------------------------
Looking better, but still a few little issues. +1 once these are addressed:
# Need to do an s/pathCache/cacheadmin/g, and also change the description to not mention the path cache:
{code}
+ echo pathCacheconfigure the HDFS path cache
{code}
# Pretty sure this won't work, since you changed the class name to CacheAdmin:
{code}
+ CLASS=org.apache.hadoop.hdfs.tools.PathCacheAdmin
{code}
# There are a bunch of places in comments/error messages/help messages where you still reference a "path cache", for example:
{code}
+ * This class implements command-line operations on the HDFS Path Cache.
{code}
In the above example you should probably just remove the word "path" since this is the more general CacheAdmin class, but in most of the cases you should change it to "path-based cache". Just do `grep -i 'path cache'` in the patch to see what I mean.
# This patch doesn't move the already-existing cache commands into cacheadmin, but it seems fine to me to do that in a separate JIRA.
[jira] [Commented] (HDFS-4299) WebHDFS Should Support HA Configuration
[ https://issues.apache.org/jira/browse/HDFS-4299?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13765056#comment-13765056 ]

Daisuke Kobayashi commented on HDFS-4299:
-----------------------------------------
Looks perfect, ATM. Yeah, this should be closed as a dupe. Thanks!
[jira] [Resolved] (HDFS-4299) WebHDFS Should Support HA Configuration
[ https://issues.apache.org/jira/browse/HDFS-4299?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Aaron T. Myers resolved HDFS-4299.
----------------------------------
Resolution: Duplicate
Assignee: (was: Haohui Mai)

Thanks, Daisuke. Closing this one out.
[jira] [Commented] (HDFS-5167) Add metrics about the NameNode retry cache
[ https://issues.apache.org/jira/browse/HDFS-5167?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13765086#comment-13765086 ]

Tsuyoshi OZAWA commented on HDFS-5167:
--------------------------------------
Hi [~wheat9], I apologize for the delay. I've been working on this. I'll share the current status later.
[jira] [Commented] (HDFS-5122) WebHDFS should support logical service names in URIs
[ https://issues.apache.org/jira/browse/HDFS-5122?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13765110#comment-13765110 ]

Hadoop QA commented on HDFS-5122:
---------------------------------
{color:green}+1 overall{color}. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12602711/HDFS-5122.001.patch
against trunk revision .

{color:green}+1 @author{color}. The patch does not contain any @author tags.
{color:green}+1 tests included{color}. The patch appears to include 2 new or modified test files.
{color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings.
{color:green}+1 javadoc{color}. The javadoc tool did not generate any warning messages.
{color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse.
{color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings.
{color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings.
{color:green}+1 core tests{color}. The patch passed unit tests in hadoop-hdfs-project/hadoop-hdfs.
{color:green}+1 contrib tests{color}. The patch passed contrib unit tests.

Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/4958//testReport/
Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/4958//console

This message is automatically generated.
[jira] [Created] (HDFS-5188) Clean up BlockPlacementPolicy and its implementations
Tsz Wo (Nicholas), SZE created HDFS-5188:
-----------------------------------------

Summary: Clean up BlockPlacementPolicy and its implementations
Key: HDFS-5188
URL: https://issues.apache.org/jira/browse/HDFS-5188
Project: Hadoop HDFS
Issue Type: Improvement
Components: namenode
Reporter: Tsz Wo (Nicholas), SZE
Assignee: Tsz Wo (Nicholas), SZE

Below is a list of code cleanup tasks for BlockPlacementPolicy and its implementations:
- Define constants for dfs.block.replicator.classname.
- Combine adjustExcludedNodes and addToExcludedNodes: subclasses should always add all the related nodes in addToExcludedNodes(..).
- Remove duplicated code.
[jira] [Commented] (HDFS-4990) Block placement for storage types
[ https://issues.apache.org/jira/browse/HDFS-4990?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13765129#comment-13765129 ]

Tsz Wo (Nicholas), SZE commented on HDFS-4990:
----------------------------------------------
I am going to clean up the code in HDFS-5188 before adding storage type to BlockPlacementPolicy.

Block placement for storage types
---------------------------------
Key: HDFS-4990
URL: https://issues.apache.org/jira/browse/HDFS-4990
Project: Hadoop HDFS
Issue Type: Sub-task
Components: namenode
Reporter: Suresh Srinivas
Assignee: Tsz Wo (Nicholas), SZE
Attachments: h4990_20130909.patch

Currently, block locations for writes are chosen based on:
# Datanode load (number of transceivers)
# Space left on the datanode
# Topology

With the storage abstraction, the namenode must choose a storage instead of a datanode for block placement. It also needs to consider the storage type, the load on the storage, etc. As an additional benefit, HDFS currently supports heterogeneous nodes (nodes with different numbers of spindles, etc.) poorly; this work should help solve that issue as well.
[jira] [Updated] (HDFS-5158) add command-line support for manipulating cache directives
[ https://issues.apache.org/jira/browse/HDFS-5158?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Colin Patrick McCabe updated HDFS-5158:
---------------------------------------
Attachment: HDFS-5158-caching.008.patch
[jira] [Commented] (HDFS-5158) add command-line support for manipulating cache directives
[ https://issues.apache.org/jira/browse/HDFS-5158?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13765136#comment-13765136 ]

Colin Patrick McCabe commented on HDFS-5158:
--------------------------------------------
I changed all the remaining "path cache" instances to "PathBasedCache", fixed the help, and ran some command tests. Will commit shortly if there are no more comments.
[jira] [Commented] (HDFS-4953) enable HDFS local reads via mmap
[ https://issues.apache.org/jira/browse/HDFS-4953?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13765144#comment-13765144 ]

Suresh Srinivas commented on HDFS-4953:
---------------------------------------
There is general consensus that the existing APIs can be cleaned up. I like what [~owen.omalley] is proposing. I am going to create a new jira where we can decide the final details of the API.

enable HDFS local reads via mmap
--------------------------------
Key: HDFS-4953
URL: https://issues.apache.org/jira/browse/HDFS-4953
Project: Hadoop HDFS
Issue Type: New Feature
Affects Versions: 2.3.0
Reporter: Colin Patrick McCabe
Assignee: Colin Patrick McCabe
Fix For: HDFS-4949
Attachments: benchmark.png, HDFS-4953.001.patch, HDFS-4953.002.patch, HDFS-4953.003.patch, HDFS-4953.004.patch, HDFS-4953.005.patch, HDFS-4953.006.patch, HDFS-4953.007.patch, HDFS-4953.008.patch

Currently, the short-circuit local read pathway allows HDFS clients to access files directly without going through the DataNode. However, all of these reads involve a copy at the operating system level, since they rely on the read() / pread() / etc. family of kernel interfaces. We would like to enable HDFS to read local files via mmap, which would enable truly zero-copy reads. In the initial implementation, zero-copy reads will only be performed when checksums are disabled. Later, we can use the DataNode's cache awareness to perform zero-copy reads only when we know the checksum has already been verified.
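As background on the mechanism, the mmap-based read the issue describes can be sketched with plain JDK NIO rather than the HDFS client API: FileChannel.map exposes the file through the OS page cache as a MappedByteBuffer, so bytes are consumed without the extra kernel-to-userspace copy that read()/pread() makes.

```java
import java.io.IOException;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

public class MmapReadDemo {
    // Read a whole (non-empty) file through a memory mapping instead of read().
    static byte[] readMapped(Path file) throws IOException {
        try (FileChannel ch = FileChannel.open(file, StandardOpenOption.READ)) {
            MappedByteBuffer buf = ch.map(FileChannel.MapMode.READ_ONLY, 0, ch.size());
            byte[] out = new byte[buf.remaining()];
            buf.get(out);  // copies out of the mapping for demo purposes only
            return out;
        }
    }

    public static void main(String[] args) throws IOException {
        Path tmp = Files.createTempFile("mmap", ".bin");
        Files.write(tmp, "hello, mmap".getBytes());
        System.out.println(new String(readMapped(tmp)));
        Files.delete(tmp);
    }
}
```

In the real feature the client would hand the MappedByteBuffer itself to the application rather than copying out of it, which is where the zero-copy benefit comes from; the copy here is only to make the demo easy to assert on.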
[jira] [Created] (HDFS-5189) Rename the CorruptBlocks metric to CorruptReplicas
Harsh J created HDFS-5189:
--------------------------

             Summary: Rename the CorruptBlocks metric to CorruptReplicas
                 Key: HDFS-5189
                 URL: https://issues.apache.org/jira/browse/HDFS-5189
             Project: Hadoop HDFS
          Issue Type: Improvement
          Components: namenode
    Affects Versions: 2.1.0-beta
            Reporter: Harsh J
            Assignee: Harsh J
            Priority: Minor

The NameNode increments the CorruptBlocks metric even if only one of a block's replicas is reported corrupt (a genuine checksum failure, or even a replica with a bad genstamp). In cases where it is incremented, fsck still reports a healthy state. This is confusing to users and causes false alarms, as they assume this metric (rather than MissingBlocks) is the one to monitor. The metric is truly trying to report only corrupt replicas, not whole blocks, and ought to be renamed. FWIW, dfsadmin -report uses the proper label "Blocks with corrupt replicas:" when printing this count.
[jira] [Resolved] (HDFS-5158) add command-line support for manipulating cache directives
[ https://issues.apache.org/jira/browse/HDFS-5158?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Colin Patrick McCabe resolved HDFS-5158.
----------------------------------------
       Resolution: Fixed
    Fix Version/s: HDFS-4949

committed; thanks all

> add command-line support for manipulating cache directives
> ----------------------------------------------------------
>
>                 Key: HDFS-5158
>                 URL: https://issues.apache.org/jira/browse/HDFS-5158
>             Project: Hadoop HDFS
>          Issue Type: Sub-task
>          Components: datanode, namenode
>    Affects Versions: HDFS-4949
>            Reporter: Colin Patrick McCabe
>            Assignee: Colin Patrick McCabe
>             Fix For: HDFS-4949
>         Attachments: HDFS-5158-caching.003.patch, HDFS-5158-caching.004.patch, HDFS-5158-caching.005.patch, HDFS-5158-caching.006.patch, HDFS-5158-caching.007.patch, HDFS-5158-caching.008.patch
>
> We should add command-line support for creating, removing, and listing cache directives.
[jira] [Resolved] (HDFS-5173) prettier dfsadmin -listCachePools output
[ https://issues.apache.org/jira/browse/HDFS-5173?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Colin Patrick McCabe resolved HDFS-5173.
----------------------------------------
          Resolution: Duplicate
       Fix Version/s: HDFS-4949
    Target Version/s: HDFS-4949

did this in HDFS-5158

> prettier dfsadmin -listCachePools output
> ----------------------------------------
>
>                 Key: HDFS-5173
>                 URL: https://issues.apache.org/jira/browse/HDFS-5173
>             Project: Hadoop HDFS
>          Issue Type: Sub-task
>          Components: datanode, namenode
>    Affects Versions: HDFS-4949
>            Reporter: Colin Patrick McCabe
>            Assignee: Colin Patrick McCabe
>            Priority: Minor
>             Fix For: HDFS-4949
>
> We can make the output of {{-listCachePools}} prettier by doing the same kinds of things as {{ls -l}} does. We probably need to find (or write) some code to properly line up columns to avoid ragged-looking output. Finally, column headers will be necessary here to let people know what they're looking at. Ideally, we'd implement a format-string type argument like {{--format=}} for {{ls}}.
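The column lining-up that HDFS-5173 asks for can be sketched with a small width-computing formatter. This is not the code that landed in HDFS-5158; it is a minimal illustration of the {{ls -l}}-style alignment the issue describes, and the cache-pool rows in {{main}} (names, owners, modes) are hypothetical sample data.

```java
import java.util.List;

public class ColumnFormatter {
    // Left-justify each column to the width of its widest cell, ls -l style.
    // Every row (including the header row) must have the same number of cells.
    public static String format(List<String[]> rows) {
        int cols = rows.get(0).length;
        int[] width = new int[cols];
        for (String[] row : rows) {
            for (int c = 0; c < cols; c++) {
                width[c] = Math.max(width[c], row[c].length());
            }
        }
        StringBuilder sb = new StringBuilder();
        for (String[] row : rows) {
            for (int c = 0; c < cols; c++) {
                // Pad the cell out to the column width, then a two-space gutter.
                sb.append(String.format("%-" + width[c] + "s", row[c]));
                if (c < cols - 1) sb.append("  ");
            }
            sb.append('\n');
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Hypothetical cache-pool listing; field names are illustrative only.
        System.out.print(format(List.of(
                new String[]{"NAME", "OWNER", "GROUP", "MODE"},
                new String[]{"research", "alice", "hadoop", "0755"},
                new String[]{"ops", "bob", "hadoop", "0750"})));
    }
}
```

Pre-scanning all rows for widths is what keeps the output from looking ragged; a format-string argument like the proposed {{--format=}} would simply replace the fixed column order with a user-selected one.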
[jira] [Created] (HDFS-5190) move cache pool manipulation commands to dfsadmin, add to TestHDFSCLI
Colin Patrick McCabe created HDFS-5190:
---------------------------------------

             Summary: move cache pool manipulation commands to dfsadmin, add to TestHDFSCLI
                 Key: HDFS-5190
                 URL: https://issues.apache.org/jira/browse/HDFS-5190
             Project: Hadoop HDFS
          Issue Type: Sub-task
            Reporter: Colin Patrick McCabe

As per the discussion in HDFS-5158, we should move the cache pool add, remove, and list commands into cacheadmin. We should also write a unit test in TestHDFSCLI for these commands.
[jira] [Updated] (HDFS-5189) Rename the CorruptBlocks metric to CorruptReplicas
[ https://issues.apache.org/jira/browse/HDFS-5189?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Harsh J updated HDFS-5189:
--------------------------
    Target Version/s: 3.0.0, 2.3.0

> Rename the CorruptBlocks metric to CorruptReplicas
> --------------------------------------------------
>
>                 Key: HDFS-5189
>                 URL: https://issues.apache.org/jira/browse/HDFS-5189
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>          Components: namenode
>    Affects Versions: 2.1.0-beta
>            Reporter: Harsh J
>            Assignee: Harsh J
>            Priority: Minor
>
> The NameNode increments the CorruptBlocks metric even if only one of a block's replicas is reported corrupt (a genuine checksum failure, or even a replica with a bad genstamp). In cases where it is incremented, fsck still reports a healthy state. This is confusing to users and causes false alarms, as they assume this metric (rather than MissingBlocks) is the one to monitor. The metric is truly trying to report only corrupt replicas, not whole blocks, and ought to be renamed. FWIW, dfsadmin -report uses the proper label "Blocks with corrupt replicas:" when printing this count.