[jira] [Commented] (HDFS-3535) audit logging should log denied accesses as well as permitted ones
[ https://issues.apache.org/jira/browse/HDFS-3535?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13401197#comment-13401197 ]

Hadoop QA commented on HDFS-3535:
---------------------------------

+1 overall. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12533399/hdfs-3535-2.txt
against trunk revision .

+1 @author. The patch does not contain any @author tags.
+1 tests included. The patch appears to include 2 new or modified test files.
+1 javac. The applied patch does not increase the total number of javac compiler warnings.
+1 javadoc. The javadoc tool did not generate any warning messages.
+1 eclipse:eclipse. The patch built with eclipse:eclipse.
+1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings.
+1 release audit. The applied patch does not increase the total number of release audit warnings.
+1 core tests. The patch passed unit tests in hadoop-hdfs-project/hadoop-hdfs.
+1 contrib tests. The patch passed contrib unit tests.

Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/2700//testReport/
Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/2700//console

This message is automatically generated.

> audit logging should log denied accesses as well as permitted ones
> ------------------------------------------------------------------
>
> Key: HDFS-3535
> URL: https://issues.apache.org/jira/browse/HDFS-3535
> Project: Hadoop HDFS
> Issue Type: New Feature
> Components: name-node
> Affects Versions: 2.0.0-alpha
> Reporter: Andy Isaacson
> Assignee: Andy Isaacson
> Attachments: hdfs-3535-1.txt, hdfs-3535-2.txt, hdfs-3535.txt
>
> FSNamesystem.java logs an audit log entry when a user successfully accesses the filesystem:
> {code}
> logAuditEvent(UserGroupInformation.getLoginUser(),
>               Server.getRemoteIp(),
>               "concat", Arrays.toString(srcs), target, resultingStat);
> {code}
> but there is no similar log when a user attempts to access the filesystem and is denied due to permissions. Competing systems do provide such logging of denied access attempts; we should too.

--
This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
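The pattern requested in HDFS-3535 can be sketched as follows: audit on success, and catch the permission failure to audit the denial before rethrowing. This is a minimal self-contained illustration only; the class, the `checkPermission` stand-in, and the log format here are hypothetical, not the actual FSNamesystem API.

```java
import java.util.ArrayList;
import java.util.List;

// Sketch: audit both permitted and denied accesses.
// Names (AuditSketch, checkPermission, logAuditEvent) are illustrative.
public class AuditSketch {
    static final List<String> auditLog = new ArrayList<>();

    // Stand-in for a permission check that throws on denial.
    static void checkPermission(String user, String path) {
        if (!"alice".equals(user)) {
            throw new SecurityException("Permission denied: " + user + " on " + path);
        }
    }

    static void logAuditEvent(boolean allowed, String user, String cmd, String path) {
        auditLog.add("allowed=" + allowed + " ugi=" + user + " cmd=" + cmd + " src=" + path);
    }

    // Audit on success AND on denial, rethrowing the denial to the caller.
    static void concat(String user, String path) {
        try {
            checkPermission(user, path);
            logAuditEvent(true, user, "concat", path);
        } catch (SecurityException e) {
            logAuditEvent(false, user, "concat", path);
            throw e;
        }
    }

    public static void main(String[] args) {
        concat("alice", "/data/a");
        try {
            concat("bob", "/data/a");
        } catch (SecurityException expected) {
            // denial surfaced to the caller, but audited first
        }
        System.out.println(auditLog.get(0)); // allowed=true ugi=alice cmd=concat src=/data/a
        System.out.println(auditLog.get(1)); // allowed=false ugi=bob cmd=concat src=/data/a
    }
}
```

The key point is that the denial is logged on the exception path, so an audit trail exists even when the operation never completes.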
[jira] [Updated] (HDFS-3559) DFSTestUtil: use Builder class to construct DFSTestUtil instances
[ https://issues.apache.org/jira/browse/HDFS-3559?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Colin Patrick McCabe updated HDFS-3559: --- Attachment: HDFS-3559.002.patch * "public static" instead of "static public" class * make some of the instance variables of DFSTestUtil final > DFSTestUtil: use Builder class to construct DFSTestUtil instances > - > > Key: HDFS-3559 > URL: https://issues.apache.org/jira/browse/HDFS-3559 > Project: Hadoop HDFS > Issue Type: Bug >Affects Versions: 2.0.0-alpha >Reporter: Colin Patrick McCabe >Assignee: Colin Patrick McCabe >Priority: Minor > Fix For: 2.0.1-alpha > > Attachments: HDFS-3559.001.patch, HDFS-3559.002.patch > > > The number of parameters in DFSTestUtil's constructor has grown over time. > It would be nice to have a Builder class similar to MiniDFSClusterBuilder, > which could construct an instance of DFSTestUtil. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
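The Builder idiom described above (replacing a constructor whose parameter list keeps growing) can be sketched like this. The field names and defaults are hypothetical stand-ins, not the actual DFSTestUtil fields:

```java
// Sketch of the Builder idiom HDFS-3559 introduces for DFSTestUtil.
// Fields and defaults here are illustrative, not the actual class.
public class TestUtilSketch {
    // final instance fields, set exactly once by the Builder
    private final String testName;
    private final int numFiles;
    private final int maxLevels;

    private TestUtilSketch(Builder b) {
        this.testName = b.testName;
        this.numFiles = b.numFiles;
        this.maxLevels = b.maxLevels;
    }

    public String describe() {
        return testName + ": files=" + numFiles + ", levels=" + maxLevels;
    }

    // Callers set only the parameters they care about; new parameters
    // can be added without breaking existing call sites.
    public static class Builder {
        private String testName = "test";
        private int numFiles = 10;
        private int maxLevels = 3;

        public Builder setName(String name) { this.testName = name; return this; }
        public Builder setNumFiles(int n) { this.numFiles = n; return this; }
        public Builder setMaxLevels(int l) { this.maxLevels = l; return this; }
        public TestUtilSketch build() { return new TestUtilSketch(this); }
    }

    public static void main(String[] args) {
        TestUtilSketch util = new Builder().setName("myTest").setNumFiles(5).build();
        System.out.println(util.describe()); // myTest: files=5, levels=3
    }
}
```

This also pairs naturally with the patch's other change: once construction goes through a Builder, the instance fields can be made `final`.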
[jira] [Commented] (HDFS-3498) Make Replica Removal Policy pluggable and ReplicaPlacementPolicyDefault extensible for reusing code in subclass
[ https://issues.apache.org/jira/browse/HDFS-3498?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13401190#comment-13401190 ] Hudson commented on HDFS-3498: -- Integrated in Hadoop-Common-trunk-Commit #2388 (See [https://builds.apache.org/job/Hadoop-Common-trunk-Commit/2388/]) HDFS-3498. Support replica removal in BlockPlacementPolicy and make BlockPlacementPolicyDefault extensible for reusing code in subclasses. Contributed by Junping Du (Revision 1353807) Result = SUCCESS szetszwo : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1353807 Files : * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockManager.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockPlacementPolicy.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockPlacementPolicyDefault.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/blockmanagement/TestReplicationPolicy.java > Make Replica Removal Policy pluggable and ReplicaPlacementPolicyDefault > extensible for reusing code in subclass > --- > > Key: HDFS-3498 > URL: https://issues.apache.org/jira/browse/HDFS-3498 > Project: Hadoop HDFS > Issue Type: Bug > Components: name-node >Affects Versions: 1.0.0, 2.0.0-alpha >Reporter: Junping Du >Assignee: Junping Du > Fix For: 3.0.0 > > Attachments: HDFS-3498-v2.patch, HDFS-3498-v3.patch, > HDFS-3498-v4.patch, HDFS-3498-v5.patch, HDFS-3498.patch, > Hadoop-8471-BlockPlacementDefault-extensible.patch > > > ReplicaPlacementPolicy is already a pluggable component in Hadoop. 
However, the replica removal policy is still embedded in BlockManager; it needs to be
> separated out into ReplicaPlacementPolicy so that it can be overridden later. The
> Hadoop unit tests also lack coverage of the replica removal policy, so we add it here.
> On the other hand, as an implementation of ReplicaPlacementPolicy,
> ReplicaPlacementPolicyDefault remains largely generic and applies to other topology
> cases such as virtualization; to make its code reusable as much as possible, a few of
> its methods were changed from private to protected. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
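The private-to-protected change described in this issue enables the template-method shape sketched below: a subclass reuses the default placement flow and overrides only the topology-specific hook. The classes and methods here are hypothetical illustrations, not the actual HDFS BlockPlacementPolicy API:

```java
// Why changing some private methods to protected matters: a subclass can
// reuse the default logic and override only the topology-specific piece.
// Class and method names here are illustrative, not the actual HDFS API.
public class PlacementSketch {
    static class DefaultPolicy {
        // protected (not private) so subclasses can override or call it
        protected String chooseRack(int replicaIndex) {
            return "/default-rack";
        }

        // the shared placement flow stays in the base class
        public String place(int replicaIndex) {
            return chooseRack(replicaIndex) + "/node" + replicaIndex;
        }
    }

    // A virtualization-aware policy reuses place() and swaps only the hook.
    static class VirtualizationPolicy extends DefaultPolicy {
        @Override
        protected String chooseRack(int replicaIndex) {
            return "/hypervisor-" + (replicaIndex % 2);
        }
    }

    public static void main(String[] args) {
        System.out.println(new DefaultPolicy().place(0));        // /default-rack/node0
        System.out.println(new VirtualizationPolicy().place(1)); // /hypervisor-1/node1
    }
}
```

Had `chooseRack` stayed private, the subclass would have to copy the whole `place` flow instead of overriding one method.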
[jira] [Commented] (HDFS-3516) Check content-type in WebHdfsFileSystem
[ https://issues.apache.org/jira/browse/HDFS-3516?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13401189#comment-13401189 ] Hudson commented on HDFS-3516: -- Integrated in Hadoop-Common-trunk-Commit #2388 (See [https://builds.apache.org/job/Hadoop-Common-trunk-Commit/2388/]) HDFS-3516. Check content-type in WebHdfsFileSystem. (Revision 1353800) Result = SUCCESS szetszwo : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1353800 Files : * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/web/WebHdfsFileSystem.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/web/TestWebHdfsFileSystemContract.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/web/WebHdfsTestUtil.java > Check content-type in WebHdfsFileSystem > --- > > Key: HDFS-3516 > URL: https://issues.apache.org/jira/browse/HDFS-3516 > Project: Hadoop HDFS > Issue Type: Improvement > Components: hdfs client >Reporter: Tsz Wo (Nicholas), SZE >Assignee: Tsz Wo (Nicholas), SZE > Fix For: 2.0.1-alpha > > Attachments: h3516_20120607.patch, h3516_20120608.patch, > h3516_20120609.patch > > > WebHdfsFileSystem currently tries to parse the response as json. It may be a > good idea to check the content-type before parsing it. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
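The check HDFS-3516 adds can be sketched as follows: verify the response's content type looks like JSON before parsing the body, so a bad response fails fast with a useful error instead of a confusing JSON parse failure. The method names and error message here are illustrative, not the actual WebHdfsFileSystem code:

```java
// Sketch of validating the content type before JSON parsing.
// Names here are illustrative, not the actual WebHdfsFileSystem code.
public class ContentTypeSketch {
    static boolean isJson(String contentType) {
        // content types may carry parameters, e.g. "application/json; charset=utf-8"
        return contentType != null
                && contentType.toLowerCase().startsWith("application/json");
    }

    static String parseResponse(String contentType, String body) {
        if (!isJson(contentType)) {
            // e.g. an HTML error page from a proxy would be rejected here
            throw new IllegalStateException("Unexpected content type: " + contentType);
        }
        return body.trim(); // stand-in for the actual JSON parsing step
    }

    public static void main(String[] args) {
        System.out.println(parseResponse("application/json; charset=utf-8", " {\"ok\":true} "));
        try {
            parseResponse("text/html", "<html>error page</html>");
        } catch (IllegalStateException expected) {
            System.out.println(expected.getMessage()); // Unexpected content type: text/html
        }
    }
}
```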
[jira] [Commented] (HDFS-3516) Check content-type in WebHdfsFileSystem
[ https://issues.apache.org/jira/browse/HDFS-3516?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13401185#comment-13401185 ] Hudson commented on HDFS-3516: -- Integrated in Hadoop-Hdfs-trunk-Commit #2457 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk-Commit/2457/]) HDFS-3516. Check content-type in WebHdfsFileSystem. (Revision 1353800) Result = SUCCESS szetszwo : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1353800 Files : * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/web/WebHdfsFileSystem.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/web/TestWebHdfsFileSystemContract.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/web/WebHdfsTestUtil.java > Check content-type in WebHdfsFileSystem > --- > > Key: HDFS-3516 > URL: https://issues.apache.org/jira/browse/HDFS-3516 > Project: Hadoop HDFS > Issue Type: Improvement > Components: hdfs client >Reporter: Tsz Wo (Nicholas), SZE >Assignee: Tsz Wo (Nicholas), SZE > Fix For: 2.0.1-alpha > > Attachments: h3516_20120607.patch, h3516_20120608.patch, > h3516_20120609.patch > > > WebHdfsFileSystem currently tries to parse the response as json. It may be a > good idea to check the content-type before parsing it. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-3498) Make Replica Removal Policy pluggable and ReplicaPlacementPolicyDefault extensible for reusing code in subclass
[ https://issues.apache.org/jira/browse/HDFS-3498?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13401186#comment-13401186 ] Hudson commented on HDFS-3498: -- Integrated in Hadoop-Hdfs-trunk-Commit #2457 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk-Commit/2457/]) HDFS-3498. Support replica removal in BlockPlacementPolicy and make BlockPlacementPolicyDefault extensible for reusing code in subclasses. Contributed by Junping Du (Revision 1353807) Result = SUCCESS szetszwo : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1353807 Files : * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockManager.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockPlacementPolicy.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockPlacementPolicyDefault.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/blockmanagement/TestReplicationPolicy.java > Make Replica Removal Policy pluggable and ReplicaPlacementPolicyDefault > extensible for reusing code in subclass > --- > > Key: HDFS-3498 > URL: https://issues.apache.org/jira/browse/HDFS-3498 > Project: Hadoop HDFS > Issue Type: Bug > Components: name-node >Affects Versions: 1.0.0, 2.0.0-alpha >Reporter: Junping Du >Assignee: Junping Du > Fix For: 3.0.0 > > Attachments: HDFS-3498-v2.patch, HDFS-3498-v3.patch, > HDFS-3498-v4.patch, HDFS-3498-v5.patch, HDFS-3498.patch, > Hadoop-8471-BlockPlacementDefault-extensible.patch > > > ReplicaPlacementPolicy is already a pluggable component in Hadoop. However, > the Replica Removal Policy is still nested in BlockManager that need to be > separated out into a ReplicaPlacementPolicy then can be override later. 
Also, the Hadoop unit tests lack coverage of the replica removal policy, so
> we add it here. On the other hand, as an implementation of ReplicaPlacementPolicy,
> ReplicaPlacementPolicyDefault remains largely generic and applies to other topology
> cases such as virtualization; to make its code reusable as much as possible, a few of
> its methods were changed from private to protected. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-3541) Deadlock between recovery, xceiver and packet responder
[ https://issues.apache.org/jira/browse/HDFS-3541?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13401184#comment-13401184 ] Uma Maheswara Rao G commented on HDFS-3541: --- For the comment: {quote} 2. This chunk of code confuses me, since you don't use written again after the loop, and there doesn't seem to be any need to call write(...) many times: {quote} try using the util APIs already available for writing data. @Kihwal, good point, worth asserting block finalization. > Deadlock between recovery, xceiver and packet responder > --- > > Key: HDFS-3541 > URL: https://issues.apache.org/jira/browse/HDFS-3541 > Project: Hadoop HDFS > Issue Type: Bug > Components: data-node >Affects Versions: 0.23.3, 2.0.1-alpha >Reporter: suja s >Assignee: Vinay > Attachments: DN_dump.rar, HDFS-3541.patch > > > Block recovery was initiated while a write was in progress on the Datanode side. Found a deadlock between recovery, xceiver and packet responder. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-2617) Replaced Kerberized SSL for image transfer and fsck with SPNEGO-based solution
[ https://issues.apache.org/jira/browse/HDFS-2617?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13401175#comment-13401175 ] Hadoop QA commented on HDFS-2617: - -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12533408/hdfs-2617-1.1.patch against trunk revision . +1 @author. The patch does not contain any @author tags. -1 tests included. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. -1 patch. The patch command could not apply the patch. Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/2699//console This message is automatically generated. > Replaced Kerberized SSL for image transfer and fsck with SPNEGO-based solution > -- > > Key: HDFS-2617 > URL: https://issues.apache.org/jira/browse/HDFS-2617 > Project: Hadoop HDFS > Issue Type: Improvement > Components: security >Reporter: Jakob Homan >Assignee: Jakob Homan > Fix For: 2.0.1-alpha > > Attachments: HDFS-2617-a.patch, HDFS-2617-b.patch, > HDFS-2617-config.patch, HDFS-2617-trunk.patch, HDFS-2617-trunk.patch, > HDFS-2617-trunk.patch, HDFS-2617-trunk.patch, hdfs-2617-1.1.patch > > > The current approach to secure and authenticate nn web services is based on > Kerberized SSL and was developed when a SPNEGO solution wasn't available. Now > that we have one, we can get rid of the non-standard KSSL and use SPNEGO > throughout. This will simplify setup and configuration. Also, Kerberized > SSL is a non-standard approach with its own quirks and dark corners > (HDFS-2386). -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HDFS-3551) WebHDFS CREATE does not use client location for redirection
[ https://issues.apache.org/jira/browse/HDFS-3551?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tsz Wo (Nicholas), SZE updated HDFS-3551: - Status: Patch Available (was: Open) > WebHDFS CREATE does not use client location for redirection > --- > > Key: HDFS-3551 > URL: https://issues.apache.org/jira/browse/HDFS-3551 > Project: Hadoop HDFS > Issue Type: Bug > Components: name-node >Affects Versions: 2.0.0-alpha, 1.0.0 >Reporter: Tsz Wo (Nicholas), SZE >Assignee: Tsz Wo (Nicholas), SZE > Attachments: h3551_20120620.patch, h3551_20120625.patch > > > CREATE currently redirects client to a random datanode but not using the > client location information. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HDFS-3551) WebHDFS CREATE does not use client location for redirection
[ https://issues.apache.org/jira/browse/HDFS-3551?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tsz Wo (Nicholas), SZE updated HDFS-3551: - Attachment: h3551_20120625.patch h3551_20120625: adds a test > WebHDFS CREATE does not use client location for redirection > --- > > Key: HDFS-3551 > URL: https://issues.apache.org/jira/browse/HDFS-3551 > Project: Hadoop HDFS > Issue Type: Bug > Components: name-node >Affects Versions: 1.0.0, 2.0.0-alpha >Reporter: Tsz Wo (Nicholas), SZE >Assignee: Tsz Wo (Nicholas), SZE > Attachments: h3551_20120620.patch, h3551_20120625.patch > > > CREATE currently redirects client to a random datanode but not using the > client location information. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-3498) Make Replica Removal Policy pluggable and ReplicaPlacementPolicyDefault extensible for reusing code in subclass
[ https://issues.apache.org/jira/browse/HDFS-3498?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13401166#comment-13401166 ] Hudson commented on HDFS-3498: -- Integrated in Hadoop-Mapreduce-trunk-Commit #2407 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk-Commit/2407/]) HDFS-3498. Support replica removal in BlockPlacementPolicy and make BlockPlacementPolicyDefault extensible for reusing code in subclasses. Contributed by Junping Du (Revision 1353807) Result = FAILURE szetszwo : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1353807 Files : * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockManager.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockPlacementPolicy.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockPlacementPolicyDefault.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/blockmanagement/TestReplicationPolicy.java > Make Replica Removal Policy pluggable and ReplicaPlacementPolicyDefault > extensible for reusing code in subclass > --- > > Key: HDFS-3498 > URL: https://issues.apache.org/jira/browse/HDFS-3498 > Project: Hadoop HDFS > Issue Type: Bug > Components: name-node >Affects Versions: 1.0.0, 2.0.0-alpha >Reporter: Junping Du >Assignee: Junping Du > Fix For: 3.0.0 > > Attachments: HDFS-3498-v2.patch, HDFS-3498-v3.patch, > HDFS-3498-v4.patch, HDFS-3498-v5.patch, HDFS-3498.patch, > Hadoop-8471-BlockPlacementDefault-extensible.patch > > > ReplicaPlacementPolicy is already a pluggable component in Hadoop. 
However, the replica removal policy is still embedded in BlockManager; it needs to be
> separated out into ReplicaPlacementPolicy so that it can be overridden later. The
> Hadoop unit tests also lack coverage of the replica removal policy, so we add it here.
> On the other hand, as an implementation of ReplicaPlacementPolicy,
> ReplicaPlacementPolicyDefault remains largely generic and applies to other topology
> cases such as virtualization; to make its code reusable as much as possible, a few of
> its methods were changed from private to protected. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-3507) DFS#isInSafeMode needs to execute only on Active NameNode
[ https://issues.apache.org/jira/browse/HDFS-3507?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13401163#comment-13401163 ] Vinay commented on HDFS-3507: - Thanks Aaron, I really did not know about it. > DFS#isInSafeMode needs to execute only on Active NameNode > - > > Key: HDFS-3507 > URL: https://issues.apache.org/jira/browse/HDFS-3507 > Project: Hadoop HDFS > Issue Type: Bug > Components: ha >Affects Versions: 2.0.1-alpha, 3.0.0 >Reporter: Vinay >Assignee: Vinay > Attachments: HDFS-3507.patch > > > Currently DFS#isInSafeMode does not check the NN state; it can be > executed on any of the NNs. > But HBase will use this API to check for NN safemode before starting up > its service. > If the first NN configured is in standby, DFS#isInSafeMode will check the standby > NN's safemode, but HBase wants the state of the active NN. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HDFS-3516) Check content-type in WebHdfsFileSystem
[ https://issues.apache.org/jira/browse/HDFS-3516?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tsz Wo (Nicholas), SZE updated HDFS-3516: - Resolution: Fixed Fix Version/s: 2.0.1-alpha Hadoop Flags: Reviewed Status: Resolved (was: Patch Available) I have committed this. > Check content-type in WebHdfsFileSystem > --- > > Key: HDFS-3516 > URL: https://issues.apache.org/jira/browse/HDFS-3516 > Project: Hadoop HDFS > Issue Type: Improvement > Components: hdfs client >Reporter: Tsz Wo (Nicholas), SZE >Assignee: Tsz Wo (Nicholas), SZE > Fix For: 2.0.1-alpha > > Attachments: h3516_20120607.patch, h3516_20120608.patch, > h3516_20120609.patch > > > WebHdfsFileSystem currently tries to parse the response as json. It may be a > good idea to check the content-type before parsing it. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HDFS-3498) Make Replica Removal Policy pluggable and ReplicaPlacementPolicyDefault extensible for reusing code in subclass
[ https://issues.apache.org/jira/browse/HDFS-3498?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tsz Wo (Nicholas), SZE updated HDFS-3498: - Resolution: Fixed Fix Version/s: 3.0.0 Status: Resolved (was: Patch Available) I have committed this. Thanks, Junping! > Make Replica Removal Policy pluggable and ReplicaPlacementPolicyDefault > extensible for reusing code in subclass > --- > > Key: HDFS-3498 > URL: https://issues.apache.org/jira/browse/HDFS-3498 > Project: Hadoop HDFS > Issue Type: Bug > Components: name-node >Affects Versions: 1.0.0, 2.0.0-alpha >Reporter: Junping Du >Assignee: Junping Du > Fix For: 3.0.0 > > Attachments: HDFS-3498-v2.patch, HDFS-3498-v3.patch, > HDFS-3498-v4.patch, HDFS-3498-v5.patch, HDFS-3498.patch, > Hadoop-8471-BlockPlacementDefault-extensible.patch > > > ReplicaPlacementPolicy is already a pluggable component in Hadoop. However, > the Replica Removal Policy is still nested in BlockManager that need to be > separated out into a ReplicaPlacementPolicy then can be override later. Also > it looks like hadoop unit test lack the testing on replica removal policy, so > we add it here. > On the other hand, as a implementation of ReplicaPlacementPolicy, > ReplicaPlacementDefault still show lots of generic for other topology cases > like: virtualization, and we want to make code in > ReplicaPlacementPolicyDefault can be reused as much as possible so a few of > its methods were changed from private to protected. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-3498) Make Replica Removal Policy pluggable and ReplicaPlacementPolicyDefault extensible for reusing code in subclass
[ https://issues.apache.org/jira/browse/HDFS-3498?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13401138#comment-13401138 ] Tsz Wo (Nicholas), SZE commented on HDFS-3498: -- +1 The v5 patch looks good. Since the changes are minor, I will commit it without waiting for Jenkins again. > Make Replica Removal Policy pluggable and ReplicaPlacementPolicyDefault > extensible for reusing code in subclass > --- > > Key: HDFS-3498 > URL: https://issues.apache.org/jira/browse/HDFS-3498 > Project: Hadoop HDFS > Issue Type: Bug > Components: name-node >Affects Versions: 1.0.0, 2.0.0-alpha >Reporter: Junping Du >Assignee: Junping Du > Fix For: 3.0.0 > > Attachments: HDFS-3498-v2.patch, HDFS-3498-v3.patch, > HDFS-3498-v4.patch, HDFS-3498-v5.patch, HDFS-3498.patch, > Hadoop-8471-BlockPlacementDefault-extensible.patch > > > ReplicaPlacementPolicy is already a pluggable component in Hadoop. However, > the Replica Removal Policy is still nested in BlockManager that need to be > separated out into a ReplicaPlacementPolicy then can be override later. Also > it looks like hadoop unit test lack the testing on replica removal policy, so > we add it here. > On the other hand, as a implementation of ReplicaPlacementPolicy, > ReplicaPlacementDefault still show lots of generic for other topology cases > like: virtualization, and we want to make code in > ReplicaPlacementPolicyDefault can be reused as much as possible so a few of > its methods were changed from private to protected. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-3498) Make Replica Removal Policy pluggable and ReplicaPlacementPolicyDefault extensible for reusing code in subclass
[ https://issues.apache.org/jira/browse/HDFS-3498?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13401136#comment-13401136 ] Junping Du commented on HDFS-3498: -- Thanks. Nicholas. I add a few comments of javadoc in new patch (without code change). > Make Replica Removal Policy pluggable and ReplicaPlacementPolicyDefault > extensible for reusing code in subclass > --- > > Key: HDFS-3498 > URL: https://issues.apache.org/jira/browse/HDFS-3498 > Project: Hadoop HDFS > Issue Type: Bug > Components: name-node >Affects Versions: 1.0.0, 2.0.0-alpha >Reporter: Junping Du >Assignee: Junping Du > Attachments: HDFS-3498-v2.patch, HDFS-3498-v3.patch, > HDFS-3498-v4.patch, HDFS-3498-v5.patch, HDFS-3498.patch, > Hadoop-8471-BlockPlacementDefault-extensible.patch > > > ReplicaPlacementPolicy is already a pluggable component in Hadoop. However, > the Replica Removal Policy is still nested in BlockManager that need to be > separated out into a ReplicaPlacementPolicy then can be override later. Also > it looks like hadoop unit test lack the testing on replica removal policy, so > we add it here. > On the other hand, as a implementation of ReplicaPlacementPolicy, > ReplicaPlacementDefault still show lots of generic for other topology cases > like: virtualization, and we want to make code in > ReplicaPlacementPolicyDefault can be reused as much as possible so a few of > its methods were changed from private to protected. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HDFS-3498) Make Replica Removal Policy pluggable and ReplicaPlacementPolicyDefault extensible for reusing code in subclass
[ https://issues.apache.org/jira/browse/HDFS-3498?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Junping Du updated HDFS-3498: - Attachment: HDFS-3498-v5.patch > Make Replica Removal Policy pluggable and ReplicaPlacementPolicyDefault > extensible for reusing code in subclass > --- > > Key: HDFS-3498 > URL: https://issues.apache.org/jira/browse/HDFS-3498 > Project: Hadoop HDFS > Issue Type: Bug > Components: name-node >Affects Versions: 1.0.0, 2.0.0-alpha >Reporter: Junping Du >Assignee: Junping Du > Attachments: HDFS-3498-v2.patch, HDFS-3498-v3.patch, > HDFS-3498-v4.patch, HDFS-3498-v5.patch, HDFS-3498.patch, > Hadoop-8471-BlockPlacementDefault-extensible.patch > > > ReplicaPlacementPolicy is already a pluggable component in Hadoop. However, > the Replica Removal Policy is still nested in BlockManager that need to be > separated out into a ReplicaPlacementPolicy then can be override later. Also > it looks like hadoop unit test lack the testing on replica removal policy, so > we add it here. > On the other hand, as a implementation of ReplicaPlacementPolicy, > ReplicaPlacementDefault still show lots of generic for other topology cases > like: virtualization, and we want to make code in > ReplicaPlacementPolicyDefault can be reused as much as possible so a few of > its methods were changed from private to protected. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-3541) Deadlock between recovery, xceiver and packet responder
[ https://issues.apache.org/jira/browse/HDFS-3541?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13401135#comment-13401135 ] Kihwal Lee commented on HDFS-3541: -- The patch looks okay but I was wondering whether the test can be improved. The test in the current patch does not directly recreate the original race condition. Probably an artificial deadlock can be created by creating a thread which does sleep and then kills the writer inside a {{synchronized(datanode.data)}} block. While it's sleeping, another thread could try closing the {{DFSOutputStream}}. This should fail when the writer (i.e. the {{DataXceiver}} thread) is killed and streams get closed. After this we could verify the block is not finalized. Then we know the {{PacketResponder}} thread didn't finalize the block. Does it make sense? > Deadlock between recovery, xceiver and packet responder > --- > > Key: HDFS-3541 > URL: https://issues.apache.org/jira/browse/HDFS-3541 > Project: Hadoop HDFS > Issue Type: Bug > Components: data-node >Affects Versions: 0.23.3, 2.0.1-alpha >Reporter: suja s >Assignee: Vinay > Attachments: DN_dump.rar, HDFS-3541.patch > > > Block Recovery initiated while write in progress at Datanode side. Found a > lock between recovery, xceiver and packet responder. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
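The contention shape Kihwal describes can be illustrated generically: one thread holds a coarse lock (standing in for the `synchronized(datanode.data)` block) while it sleeps, and a second thread that needs the same lock to close the stream cannot make progress. This is a hedged, self-contained sketch, not the DataNode code; it uses `tryLock` with a timeout and latches so the blockage can be observed deterministically instead of actually deadlocking:

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.ReentrantLock;

// Illustration of the lock-contention shape in the proposed test: a
// "recovery" thread holds the coarse lock while a "close" path needs it.
// All names here are illustrative, not the actual DataNode internals.
public class DeadlockSketch {
    public static boolean closeBlockedWhileLockHeld() throws InterruptedException {
        ReentrantLock dataLock = new ReentrantLock();  // stands in for datanode.data
        CountDownLatch lockHeld = new CountDownLatch(1);
        CountDownLatch release = new CountDownLatch(1);

        Thread recovery = new Thread(() -> {
            dataLock.lock();           // "recovery" takes the coarse lock...
            try {
                lockHeld.countDown();
                release.await();       // ...and holds it (the sleep in the scenario)
            } catch (InterruptedException ignored) {
            } finally {
                dataLock.unlock();
            }
        });
        recovery.start();
        lockHeld.await();

        // The "close" path needs the same lock; the timeout lets us observe
        // the blockage rather than hanging the test forever.
        boolean acquired = dataLock.tryLock(200, TimeUnit.MILLISECONDS);
        if (acquired) {
            dataLock.unlock();
        }
        release.countDown();
        recovery.join();
        return !acquired;              // true: close was blocked while lock was held
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("close blocked: " + closeBlockedWhileLockHeld());
    }
}
```

In a real regression test, the step after observing the blocked close would be the assertion Kihwal suggests: verify the block was not finalized by the PacketResponder.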
[jira] [Commented] (HDFS-3516) Check content-type in WebHdfsFileSystem
[ https://issues.apache.org/jira/browse/HDFS-3516?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13401133#comment-13401133 ] Hudson commented on HDFS-3516: -- Integrated in Hadoop-Mapreduce-trunk-Commit #2406 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk-Commit/2406/]) HDFS-3516. Check content-type in WebHdfsFileSystem. (Revision 1353800) Result = FAILURE szetszwo : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1353800 Files : * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/web/WebHdfsFileSystem.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/web/TestWebHdfsFileSystemContract.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/web/WebHdfsTestUtil.java > Check content-type in WebHdfsFileSystem > --- > > Key: HDFS-3516 > URL: https://issues.apache.org/jira/browse/HDFS-3516 > Project: Hadoop HDFS > Issue Type: Improvement > Components: hdfs client >Reporter: Tsz Wo (Nicholas), SZE >Assignee: Tsz Wo (Nicholas), SZE > Attachments: h3516_20120607.patch, h3516_20120608.patch, > h3516_20120609.patch > > > WebHdfsFileSystem currently tries to parse the response as json. It may be a > good idea to check the content-type before parsing it.
[jira] [Updated] (HDFS-3498) Make Replica Removal Policy pluggable and ReplicaPlacementPolicyDefault extensible for reusing code in subclass
[ https://issues.apache.org/jira/browse/HDFS-3498?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tsz Wo (Nicholas), SZE updated HDFS-3498: - Component/s: (was: data-node) name-node Hadoop Flags: Reviewed +1 patch looks good. > Make Replica Removal Policy pluggable and ReplicaPlacementPolicyDefault > extensible for reusing code in subclass > --- > > Key: HDFS-3498 > URL: https://issues.apache.org/jira/browse/HDFS-3498 > Project: Hadoop HDFS > Issue Type: Bug > Components: name-node >Affects Versions: 1.0.0, 2.0.0-alpha >Reporter: Junping Du >Assignee: Junping Du > Attachments: HDFS-3498-v2.patch, HDFS-3498-v3.patch, > HDFS-3498-v4.patch, HDFS-3498.patch, > Hadoop-8471-BlockPlacementDefault-extensible.patch > > > ReplicaPlacementPolicy is already a pluggable component in Hadoop. However, > the Replica Removal Policy is still nested in BlockManager and needs to be > separated out into a ReplicaPlacementPolicy so that it can be overridden later. > The Hadoop unit tests also lack coverage of the replica removal policy, so > we add it here. > On the other hand, as an implementation of ReplicaPlacementPolicy, > ReplicaPlacementPolicyDefault is still largely generic and applies to other topology > cases such as virtualization. We want the code in > ReplicaPlacementPolicyDefault to be reusable as much as possible, so a few of > its methods were changed from private to protected.
[jira] [Updated] (HDFS-3498) Make Replica Removal Policy pluggable and ReplicaPlacementPolicyDefault extensible for reusing code in subclass
[ https://issues.apache.org/jira/browse/HDFS-3498?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Junping Du updated HDFS-3498: - Summary: Make Replica Removal Policy pluggable and ReplicaPlacementPolicyDefault extensible for reusing code in subclass (was: Make ReplicaPlacementPolicyDefault extensible for reuse code in subclass) > Make Replica Removal Policy pluggable and ReplicaPlacementPolicyDefault > extensible for reusing code in subclass > --- > > Key: HDFS-3498 > URL: https://issues.apache.org/jira/browse/HDFS-3498 > Project: Hadoop HDFS > Issue Type: Bug > Components: data-node >Affects Versions: 1.0.0, 2.0.0-alpha >Reporter: Junping Du >Assignee: Junping Du > Attachments: HDFS-3498-v2.patch, HDFS-3498-v3.patch, > HDFS-3498-v4.patch, HDFS-3498.patch, > Hadoop-8471-BlockPlacementDefault-extensible.patch > > > ReplicaPlacementPolicy is already a pluggable component in Hadoop. However, > the Replica Removal Policy is still nested in BlockManager and needs to be > separated out into a ReplicaPlacementPolicy so that it can be overridden later. > The Hadoop unit tests also lack coverage of the replica removal policy, so > we add it here. > On the other hand, as an implementation of ReplicaPlacementPolicy, > ReplicaPlacementPolicyDefault is still largely generic and applies to other topology > cases such as virtualization. We want the code in > ReplicaPlacementPolicyDefault to be reusable as much as possible, so a few of > its methods were changed from private to protected.
[jira] [Commented] (HDFS-2988) Improve error message when storage directory lock fails
[ https://issues.apache.org/jira/browse/HDFS-2988?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13401131#comment-13401131 ] Hadoop QA commented on HDFS-2988: - -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12533385/HDFS-2988.patch against trunk revision . +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 1 new or modified test files. +1 javac. The applied patch does not increase the total number of javac compiler warnings. -1 javadoc. The javadoc tool appears to have generated 2 warning messages. +1 eclipse:eclipse. The patch built with eclipse:eclipse. +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. +1 core tests. The patch passed unit tests in hadoop-hdfs-project/hadoop-hdfs. +1 contrib tests. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/2697//testReport/ Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/2697//console This message is automatically generated. > Improve error message when storage directory lock fails > --- > > Key: HDFS-2988 > URL: https://issues.apache.org/jira/browse/HDFS-2988 > Project: Hadoop HDFS > Issue Type: Improvement > Components: name-node >Reporter: Todd Lipcon >Priority: Minor > Labels: newbie > Attachments: HDFS-2988.patch, HDFS-2988.patch, HDFS-2988.patch > > > Currently, the error message is fairly opaque to a non-developer ("Cannot > lock storage" or something). 
Instead, we should have some improvements: > - when we create the in_use.lock file, we should write the hostname/PID that > locked the file > - if the lock fails, and in_use.lock exists, the error message should say > something like "It appears that another namenode (pid 23423 on host > foo.example.com) has already locked the storage directory." > - if the lock fails, and no lock file exists, the error message should say > something like "if this storage directory is mounted via NFS, ensure that the > appropriate nfs lock services are running."
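The in_use.lock proposal above can be sketched in plain Java. This is only an illustration of the idea under stated assumptions, not the actual NameNode code; the class and method names (StorageLock, tryLockStorage) are hypothetical, and the real patch would live inside the Storage/StorageDirectory classes:

```java
import java.io.File;
import java.io.IOException;
import java.io.RandomAccessFile;
import java.lang.management.ManagementFactory;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.channels.FileLock;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;

/** Illustrative sketch of the in_use.lock proposal; names are hypothetical. */
public class StorageLock {

  /**
   * Tries to lock dir/in_use.lock. On success, writes "pid@hostname" into the
   * file and returns null; on failure, returns the friendlier error message
   * proposed in the JIRA.
   */
  public static String tryLockStorage(File dir) throws IOException {
    File lockFile = new File(dir, "in_use.lock");
    RandomAccessFile raf = new RandomAccessFile(lockFile, "rw");
    FileChannel channel = raf.getChannel();
    FileLock lock = channel.tryLock();
    if (lock != null) {
      // Record who holds the lock so a later failure can name the holder.
      // On HotSpot JVMs this returns "pid@hostname".
      String holder = ManagementFactory.getRuntimeMXBean().getName();
      channel.truncate(0);
      channel.write(ByteBuffer.wrap(holder.getBytes(StandardCharsets.UTF_8)));
      channel.force(true);
      return null; // channel stays open on purpose: closing it releases the lock
    }
    raf.close();
    String holder =
        new String(Files.readAllBytes(lockFile.toPath()), StandardCharsets.UTF_8).trim();
    if (!holder.isEmpty()) {
      return "It appears that another namenode (" + holder
          + ") has already locked the storage directory " + dir + ".";
    }
    return "Cannot lock storage " + dir + ". If this storage directory is"
        + " mounted via NFS, ensure that the appropriate nfs lock services are running.";
  }

  /** Usage example: a fresh directory should lock cleanly. */
  public static boolean selfTest() {
    try {
      File dir = Files.createTempDirectory("storage").toFile();
      return tryLockStorage(dir) == null;
    } catch (IOException e) {
      return false;
    }
  }

  public static void main(String[] args) {
    System.out.println(selfTest() ? "locked" : "lock failed");
  }
}
```

One caveat of a FileLock-based sketch: within a single JVM, a second tryLock on the same file throws OverlappingFileLockException rather than returning null, so the "another namenode" path is only exercised across processes.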
[jira] [Created] (HDFS-3566) Custom Replication Policy for Azure
Sumadhur Reddy Bolli created HDFS-3566: -- Summary: Custom Replication Policy for Azure Key: HDFS-3566 URL: https://issues.apache.org/jira/browse/HDFS-3566 Project: Hadoop HDFS Issue Type: Improvement Components: name-node Reporter: Sumadhur Reddy Bolli Azure has logical concepts like fault and upgrade domains. Each fault domain spans multiple upgrade domains and each upgrade domain spans multiple fault domains. Machines are typically spread evenly across both fault and upgrade domains. Fault domain failures are typically catastrophic/unplanned, and the possibility of data loss is high. An upgrade domain can be taken down by Azure periodically for maintenance. Each time an upgrade domain is taken down, a small percentage of the machines in it (typically 1-2%) are replaced due to disk failures, thus losing data. Assuming the default replication factor of 3, any 3 datanodes going down at the same time would mean potential data loss. So, it is important to have a policy that spreads replicas across both fault and upgrade domains to ensure practically no data loss. The problem here is two-dimensional, while the default policy in Hadoop is one-dimensional. This policy would spread the replicas across at least 2 fault domains and three upgrade domains to prevent data loss.
[jira] [Commented] (HDFS-3507) DFS#isInSafeMode needs to execute only on Active NameNode
[ https://issues.apache.org/jira/browse/HDFS-3507?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13401078#comment-13401078 ] Hadoop QA commented on HDFS-3507: - -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12532364/HDFS-3507.patch against trunk revision . +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 11 new or modified test files. +1 javac. The applied patch does not increase the total number of javac compiler warnings. -1 javadoc. The javadoc tool appears to have generated 2 warning messages. +1 eclipse:eclipse. The patch built with eclipse:eclipse. +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. +1 core tests. The patch passed unit tests in hadoop-hdfs-project/hadoop-hdfs. +1 contrib tests. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/2698//testReport/ Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/2698//console This message is automatically generated. > DFS#isInSafeMode needs to execute only on Active NameNode > - > > Key: HDFS-3507 > URL: https://issues.apache.org/jira/browse/HDFS-3507 > Project: Hadoop HDFS > Issue Type: Bug > Components: ha >Affects Versions: 2.0.1-alpha, 3.0.0 >Reporter: Vinay >Assignee: Vinay > Attachments: HDFS-3507.patch > > > Currently DFS#isInSafeMode does not check the NN state; it can be > executed on any of the NNs. > But HBase will use this API to check for NN safemode before starting up > its service. > If the first NN configured is in standby, DFS#isInSafeMode will check the standby > NN's safemode, but HBase wants the state of the Active NN.
[jira] [Created] (HDFS-3565) Fix streaming job failures with WindowsResourceCalculatorPlugin
Bikas Saha created HDFS-3565: Summary: Fix streaming job failures with WindowsResourceCalculatorPlugin Key: HDFS-3565 URL: https://issues.apache.org/jira/browse/HDFS-3565 Project: Hadoop HDFS Issue Type: Bug Reporter: Bikas Saha Assignee: Bikas Saha Some streaming jobs use local mode job runs that do not start task trackers. In these cases, the jvm context is not set up, and hence local mode execution causes the code to crash. The fix is either to not use ResourceCalculatorPlugin in such cases, or to make local job runs create dummy jvm contexts. We choose the first option because that is the current implicit behavior on Linux: the ProcfsBasedProcessTree (used inside the LinuxResourceCalculatorPlugin) does no real work when the process pid is not set up correctly, which is what happens in local job mode runs.
[jira] [Resolved] (HDFS-2386) with security enabled fsck calls lead to handshake_failure and hftp fails throwing the same exception in the logs
[ https://issues.apache.org/jira/browse/HDFS-2386?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Owen O'Malley resolved HDFS-2386. - Resolution: Invalid Fixed via HDFS-2617. > with security enabled fsck calls lead to handshake_failure and hftp fails > throwing the same exception in the logs > - > > Key: HDFS-2386 > URL: https://issues.apache.org/jira/browse/HDFS-2386 > Project: Hadoop HDFS > Issue Type: Bug >Affects Versions: 0.20.205.0 >Reporter: Arpit Gupta >
[jira] [Updated] (HDFS-2617) Replaced Kerberized SSL for image transfer and fsck with SPNEGO-based solution
[ https://issues.apache.org/jira/browse/HDFS-2617?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Owen O'Malley updated HDFS-2617: Attachment: hdfs-2617-1.1.patch Here's the same patch resolving some conflicts for branch-1. This compiles, but I still need to test it out. > Replaced Kerberized SSL for image transfer and fsck with SPNEGO-based solution > -- > > Key: HDFS-2617 > URL: https://issues.apache.org/jira/browse/HDFS-2617 > Project: Hadoop HDFS > Issue Type: Improvement > Components: security >Reporter: Jakob Homan >Assignee: Jakob Homan > Fix For: 2.0.1-alpha > > Attachments: HDFS-2617-a.patch, HDFS-2617-b.patch, > HDFS-2617-config.patch, HDFS-2617-trunk.patch, HDFS-2617-trunk.patch, > HDFS-2617-trunk.patch, HDFS-2617-trunk.patch, hdfs-2617-1.1.patch > > > The current approach to secure and authenticate nn web services is based on > Kerberized SSL and was developed when a SPNEGO solution wasn't available. Now > that we have one, we can get rid of the non-standard KSSL and use SPNEGO > throughout. This will simplify setup and configuration. Also, Kerberized > SSL is a non-standard approach with its own quirks and dark corners > (HDFS-2386).
[jira] [Created] (HDFS-3564) Make the replication policy pluggable to allow custom replication policies
Sumadhur Reddy Bolli created HDFS-3564: -- Summary: Make the replication policy pluggable to allow custom replication policies Key: HDFS-3564 URL: https://issues.apache.org/jira/browse/HDFS-3564 Project: Hadoop HDFS Issue Type: Improvement Components: name-node Reporter: Sumadhur Reddy Bolli ReplicationTargetChooser currently determines the placement of replicas in hadoop. Making the replication policy pluggable would help in having custom replication policies that suit the environment. Eg1: Enabling placing replicas across different datacenters (not just racks) Eg2: Enabling placing replicas across multiple (more than 2) racks Eg3: Cloud environments like Azure have logical concepts like fault and upgrade domains. Each fault domain spans multiple upgrade domains and each upgrade domain spans multiple fault domains. Machines are typically spread evenly across both fault and upgrade domains. Fault domain failures are typically catastrophic/unplanned, and the possibility of data loss is high. An upgrade domain can be taken down by Azure periodically for maintenance. Each time an upgrade domain is taken down, a small percentage of the machines in it (typically 1-2%) are replaced due to disk failures, thus losing data. Assuming the default replication factor of 3, any 3 datanodes going down at the same time would mean potential data loss. So, it is important to have a policy that spreads replicas across both fault and upgrade domains to ensure practically no data loss. The problem here is two-dimensional, while the default policy in Hadoop is one-dimensional. Custom policies to address issues like these can be written if we make the policy pluggable.
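As a rough illustration of what a custom two-dimensional policy could do, here is a self-contained Java sketch. The Node and chooseTargets names are made up for this example; a real implementation would plug into whatever placement-policy interface this JIRA ends up exposing. The greedy rule simply prefers candidates that add a fault domain or upgrade domain not yet covered by the chosen set:

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Illustrative sketch of a two-dimensional placement policy; API is hypothetical. */
public class TwoDimensionalPlacement {

  public static class Node {
    final String name;
    final int faultDomain;
    final int upgradeDomain;

    public Node(String name, int fd, int ud) {
      this.name = name;
      this.faultDomain = fd;
      this.upgradeDomain = ud;
    }
  }

  /**
   * Greedily picks {@code replicas} nodes, preferring nodes that add a new
   * fault domain or a new upgrade domain to the set chosen so far.
   */
  public static List<Node> chooseTargets(List<Node> candidates, int replicas) {
    List<Node> chosen = new ArrayList<>();
    Set<Integer> faultDomains = new HashSet<>();
    Set<Integer> upgradeDomains = new HashSet<>();
    while (chosen.size() < replicas) {
      Node best = null;
      int bestScore = -1;
      for (Node n : candidates) {
        if (chosen.contains(n)) continue;
        // Score = number of new domains (0..2) this node would add.
        int score = (faultDomains.contains(n.faultDomain) ? 0 : 1)
                  + (upgradeDomains.contains(n.upgradeDomain) ? 0 : 1);
        if (score > bestScore) {
          best = n;
          bestScore = score;
        }
      }
      if (best == null) break; // fewer candidates than requested replicas
      chosen.add(best);
      faultDomains.add(best.faultDomain);
      upgradeDomains.add(best.upgradeDomain);
    }
    return chosen;
  }

  /** Usage example: 6 nodes spread over 2 fault domains and 3 upgrade domains. */
  public static boolean selfTest() {
    List<Node> nodes = new ArrayList<>();
    for (int fd = 0; fd < 2; fd++)
      for (int ud = 0; ud < 3; ud++)
        nodes.add(new Node("n" + fd + ud, fd, ud));
    List<Node> picked = chooseTargets(nodes, 3);
    Set<Integer> fds = new HashSet<>();
    Set<Integer> uds = new HashSet<>();
    for (Node n : picked) {
      fds.add(n.faultDomain);
      uds.add(n.upgradeDomain);
    }
    // The goal from the description: at least 2 fault domains, 3 upgrade domains.
    return picked.size() == 3 && fds.size() >= 2 && uds.size() == 3;
  }

  public static void main(String[] args) {
    System.out.println(selfTest() ? "placement spread ok" : "placement spread failed");
  }
}
```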
[jira] [Commented] (HDFS-3559) DFSTestUtil: use Builder class to construct DFSTestUtil instances
[ https://issues.apache.org/jira/browse/HDFS-3559?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13401037#comment-13401037 ] Aaron T. Myers commented on HDFS-3559: -- Patch looks really good to me. Just a few little nits: # I think doing "public static class" is a little more common than "static public class" throughout the project. # Seems like the instance vars in DFSTestUtil can reasonably be made final. +1 once these are addressed. > DFSTestUtil: use Builder class to construct DFSTestUtil instances > - > > Key: HDFS-3559 > URL: https://issues.apache.org/jira/browse/HDFS-3559 > Project: Hadoop HDFS > Issue Type: Bug >Affects Versions: 2.0.0-alpha >Reporter: Colin Patrick McCabe >Assignee: Colin Patrick McCabe >Priority: Minor > Fix For: 2.0.1-alpha > > Attachments: HDFS-3559.001.patch > > > The number of parameters in DFSTestUtil's constructor has grown over time. > It would be nice to have a Builder class similar to MiniDFSClusterBuilder, > which could construct an instance of DFSTestUtil.
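The Builder shape being reviewed here, with both nits applied ("public static class" and final instance variables), might look roughly like this standalone sketch. The field and setter names are illustrative, not the real DFSTestUtil API:

```java
/** Hypothetical sketch of the Builder pattern discussed in the review. */
public class TestUtil {
  // Final instance vars, per the review comment.
  private final String testName;
  private final int numFiles;
  private final int maxLevels;
  private final int maxSize;

  // Private constructor: instances are only created via the Builder.
  private TestUtil(Builder b) {
    this.testName = b.testName;
    this.numFiles = b.numFiles;
    this.maxLevels = b.maxLevels;
    this.maxSize = b.maxSize;
  }

  public String describe() {
    return testName + ": " + numFiles + " files, depth " + maxLevels
        + ", maxSize " + maxSize;
  }

  /** "public static class" ordering, per the review comment. */
  public static class Builder {
    private String testName = "test";
    private int numFiles = 1;
    private int maxLevels = 3;
    private int maxSize = 8192;

    public Builder setName(String name) { this.testName = name; return this; }
    public Builder setNumFiles(int n) { this.numFiles = n; return this; }
    public Builder setMaxLevels(int n) { this.maxLevels = n; return this; }
    public Builder setMaxSize(int n) { this.maxSize = n; return this; }

    public TestUtil build() { return new TestUtil(this); }
  }

  public static void main(String[] args) {
    // Usage example: override only the parameters a test cares about.
    TestUtil util = new TestUtil.Builder().setName("testCase").setNumFiles(10).build();
    System.out.println(util.describe());
  }
}
```

The appeal over a many-argument constructor is that each test names only the parameters it overrides, so adding a new parameter later does not break existing call sites.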
[jira] [Updated] (HDFS-3535) audit logging should log denied accesses as well as permitted ones
[ https://issues.apache.org/jira/browse/HDFS-3535?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andy Isaacson updated HDFS-3535: Attachment: hdfs-3535-2.txt Attaching hdfs-3535-2.txt adopting @Before/@After annotations. > audit logging should log denied accesses as well as permitted ones > -- > > Key: HDFS-3535 > URL: https://issues.apache.org/jira/browse/HDFS-3535 > Project: Hadoop HDFS > Issue Type: New Feature > Components: name-node >Affects Versions: 2.0.0-alpha >Reporter: Andy Isaacson >Assignee: Andy Isaacson > Attachments: hdfs-3535-1.txt, hdfs-3535-2.txt, hdfs-3535.txt > > > FSNamesystem.java logs an audit log entry when a user successfully accesses > the filesystem: > {code} > logAuditEvent(UserGroupInformation.getLoginUser(), > Server.getRemoteIp(), > "concat", Arrays.toString(srcs), target, resultingStat); > {code} > but there is no similar log when a user attempts to access the filesystem and > is denied due to permissions. Competing systems do provide such logging of > denied access attempts; we should too.
[jira] [Commented] (HDFS-3526) Standy NameNode is entering into Safemode even after HDFS-2914 due to resources low
[ https://issues.apache.org/jira/browse/HDFS-3526?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13401022#comment-13401022 ] Aaron T. Myers commented on HDFS-3526: -- Hi Vinay, I'm not sure I agree with the premise of this JIRA. The issue that precipitated HDFS-2914 was that if the shared edits dir temporarily disappeared, the Standby NN should not enter safemode. If the standby starts up fresh and its disks are full, I see no reason it shouldn't go into safemode. Thoughts? > Standy NameNode is entering into Safemode even after HDFS-2914 due to > resources low > --- > > Key: HDFS-3526 > URL: https://issues.apache.org/jira/browse/HDFS-3526 > Project: Hadoop HDFS > Issue Type: Bug >Affects Versions: 2.0.1-alpha, 3.0.0 >Reporter: Brahma Reddy Battula >Assignee: Vinay > Attachments: HDFS-3526.patch > > > Scenario: > = > Start ANN and SNN with one DN > Make the SNN's disk 100% full > Now restart the SNN. > The SNN enters safemode, but it should not, according to HDFS-2914.
[jira] [Commented] (HDFS-3461) HFTP should use the same port & protocol for getting the delegation token
[ https://issues.apache.org/jira/browse/HDFS-3461?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13401013#comment-13401013 ] Owen O'Malley commented on HDFS-3461: - This is the 1.1 branch with HDFS-2617 applied. > HFTP should use the same port & protocol for getting the delegation token > - > > Key: HDFS-3461 > URL: https://issues.apache.org/jira/browse/HDFS-3461 > Project: Hadoop HDFS > Issue Type: Bug >Reporter: Owen O'Malley >Assignee: Owen O'Malley > Fix For: 1.1.0 > > > Currently, hftp uses http to the Namenode's https port, which doesn't work.
[jira] [Commented] (HDFS-3554) TestRaidNode is failing
[ https://issues.apache.org/jira/browse/HDFS-3554?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13401009#comment-13401009 ] Weiyan Wang commented on HDFS-3554: --- Do you mean I should use MiniMRYarnCluster instead of MiniMRCluster? Is there any example I could follow to start a job history server? > TestRaidNode is failing > --- > > Key: HDFS-3554 > URL: https://issues.apache.org/jira/browse/HDFS-3554 > Project: Hadoop HDFS > Issue Type: Bug > Components: contrib/raid, test >Affects Versions: 3.0.0 >Reporter: Jason Lowe >Assignee: Weiyan Wang > > After MAPREDUCE-3868 re-enabled raid, TestRaidNode has been failing in > Jenkins builds.
[jira] [Commented] (HDFS-3507) DFS#isInSafeMode needs to execute only on Active NameNode
[ https://issues.apache.org/jira/browse/HDFS-3507?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13401006#comment-13401006 ] Aaron T. Myers commented on HDFS-3507: -- Hi Vinay, merely marking a patch open/PA again won't trigger another build. You either need to attach another file (it could have the same content) or get someone to kick the HDFS pre-commit build. I've just done the latter for you. > DFS#isInSafeMode needs to execute only on Active NameNode > - > > Key: HDFS-3507 > URL: https://issues.apache.org/jira/browse/HDFS-3507 > Project: Hadoop HDFS > Issue Type: Bug > Components: ha >Affects Versions: 2.0.1-alpha, 3.0.0 >Reporter: Vinay >Assignee: Vinay > Attachments: HDFS-3507.patch > > > Currently DFS#isInSafeMode does not check the NN state; it can be > executed on any of the NNs. > But HBase will use this API to check for NN safemode before starting up > its service. > If the first NN configured is in standby, DFS#isInSafeMode will check the standby > NN's safemode, but HBase wants the state of the Active NN.
[jira] [Resolved] (HDFS-1469) TestBlockTokenWithDFS fails on trunk
[ https://issues.apache.org/jira/browse/HDFS-1469?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Todd Lipcon resolved HDFS-1469. --- Resolution: Cannot Reproduce > TestBlockTokenWithDFS fails on trunk > > > Key: HDFS-1469 > URL: https://issues.apache.org/jira/browse/HDFS-1469 > Project: Hadoop HDFS > Issue Type: Bug > Components: name-node >Affects Versions: 0.22.0 >Reporter: Todd Lipcon >Assignee: Konstantin Boudnik >Priority: Blocker > Attachments: failed-TestBlockTokenWithDFS.txt, log.gz > > > TestBlockTokenWithDFS is failing on trunk: > Testcase: testAppend took 31.569 sec > FAILED > null > junit.framework.AssertionFailedError: null > at > org.apache.hadoop.hdfs.server.namenode.TestBlockTokenWithDFS.testAppend(TestBlockTokenWithDFS.java:223)
[jira] [Commented] (HDFS-3170) Add more useful metrics for write latency
[ https://issues.apache.org/jira/browse/HDFS-3170?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13400985#comment-13400985 ] Hadoop QA commented on HDFS-3170: - +1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12533205/hdfs-3170.txt against trunk revision . +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 1 new or modified test files. +1 javac. The applied patch does not increase the total number of javac compiler warnings. +1 javadoc. The javadoc tool did not generate any warning messages. +1 eclipse:eclipse. The patch built with eclipse:eclipse. +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. +1 core tests. The patch passed unit tests in hadoop-hdfs-project/hadoop-hdfs. +1 contrib tests. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/2696//testReport/ Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/2696//console This message is automatically generated. > Add more useful metrics for write latency > - > > Key: HDFS-3170 > URL: https://issues.apache.org/jira/browse/HDFS-3170 > Project: Hadoop HDFS > Issue Type: Improvement > Components: data-node >Affects Versions: 2.0.0-alpha >Reporter: Todd Lipcon >Assignee: Matthew Jacobs > Attachments: hdfs-3170.txt > > > Currently, the only write-latency related metric we expose is the total > amount of time taken by opWriteBlock. This is practically useless, since (a) > different blocks may be wildly different sizes, and (b) if the writer is only > generating data slowly, it will make a block write take longer by no fault of > the DN. 
I would like to propose two new metrics: > 1) *flush-to-disk time*: count how long it takes for each call to flush an > incoming packet to disk (including the checksums). In most cases this will be > close to 0, as it only flushes to buffer cache, but if the backing block > device enters congested writeback, it can take much longer, which provides an > interesting metric. > 2) *round trip to downstream pipeline node*: track the round trip latency for > the part of the pipeline between the local node and its downstream neighbors. > When we add a new packet to the ack queue, save the current timestamp. When > we receive an ack, update the metric based on how long since we sent the > original packet. This gives a metric of the total RTT through the pipeline. > If we also include this metric in the ack to upstream, we can subtract the > amount of time due to the later stages in the pipeline and have an accurate > count of this particular link.
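Metric (2) above can be sketched in plain Java: record a timestamp when a packet joins the ack queue, and compute the elapsed time when the matching ack arrives. This is an illustration of the bookkeeping only, with made-up names; the real change would live in the DataNode's PacketResponder and its metrics registry:

```java
import java.util.ArrayDeque;
import java.util.Deque;

/** Sketch of the ack round-trip metric idea; names are illustrative, not the DN's real API. */
public class AckRttMetric {
  // Send timestamps for packets awaiting acks; acks come back in order.
  private final Deque<Long> sendTimesNanos = new ArrayDeque<>();
  private long totalRttNanos = 0;
  private long ackCount = 0;

  /** Called when a packet is appended to the ack queue. */
  public synchronized void packetSent() {
    sendTimesNanos.addLast(System.nanoTime());
  }

  /** Called when the corresponding ack arrives from downstream. */
  public synchronized void ackReceived() {
    Long sent = sendTimesNanos.pollFirst();
    if (sent == null) return; // ack with no recorded send; ignore
    totalRttNanos += System.nanoTime() - sent;
    ackCount++;
  }

  public synchronized double avgRttMillis() {
    return ackCount == 0 ? 0.0 : (totalRttNanos / 1e6) / ackCount;
  }

  /** Usage example: one packet with ~20 ms of simulated pipeline delay. */
  public static boolean selfTest() {
    AckRttMetric m = new AckRttMetric();
    m.packetSent();
    try {
      Thread.sleep(20); // stand-in for downstream pipeline latency
    } catch (InterruptedException e) {
      return false;
    }
    m.ackReceived();
    return m.avgRttMillis() >= 15.0;
  }

  public static void main(String[] args) {
    System.out.println(selfTest() ? "rtt metric ok" : "rtt metric failed");
  }
}
```

Subtracting the downstream link's reported RTT, as the description suggests, would then isolate the latency of just the local hop.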
[jira] [Commented] (HDFS-3535) audit logging should log denied accesses as well as permitted ones
[ https://issues.apache.org/jira/browse/HDFS-3535?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13400963#comment-13400963 ] Andy Isaacson commented on HDFS-3535: - {quote} Forgot to mention, in TestAuditLogs use @Before and @After to setup/teardown cluster and fs in one place (see other tests for an example) {quote} Thanks, that makes the tests a lot nicer! I'll post a new patch using that. > audit logging should log denied accesses as well as permitted ones > -- > > Key: HDFS-3535 > URL: https://issues.apache.org/jira/browse/HDFS-3535 > Project: Hadoop HDFS > Issue Type: New Feature > Components: name-node >Affects Versions: 2.0.0-alpha >Reporter: Andy Isaacson >Assignee: Andy Isaacson > Attachments: hdfs-3535-1.txt, hdfs-3535.txt > > > FSNamesystem.java logs an audit log entry when a user successfully accesses > the filesystem: > {code} > logAuditEvent(UserGroupInformation.getLoginUser(), > Server.getRemoteIp(), > "concat", Arrays.toString(srcs), target, resultingStat); > {code} > but there is no similar log when a user attempts to access the filesystem and > is denied due to permissions. Competing systems do provide such logging of > denied access attempts; we should too.
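The shape of the change under discussion is to thread the request's outcome through the audit call, so a denied access (e.g. a caught permission failure) logs a line just like a permitted one. The sketch below is a standalone illustration with a hypothetical class, not the actual FSNamesystem patch; the tab-separated key=value layout merely mirrors the general audit-log style:

```java
/** Sketch of audit logging that records denied accesses too; class is hypothetical. */
public class AuditLogger {
  private final StringBuilder log = new StringBuilder();

  /**
   * One audit line per request. 'succeeded' distinguishes permitted accesses
   * (true) from ones denied due to permissions (false).
   */
  public void logAuditEvent(boolean succeeded, String user, String ip,
                            String cmd, String src, String dst) {
    log.append("allowed=").append(succeeded)
       .append("\tugi=").append(user)
       .append("\tip=").append(ip)
       .append("\tcmd=").append(cmd)
       .append("\tsrc=").append(src)
       .append("\tdst=").append(dst)
       .append('\n');
  }

  public String contents() {
    return log.toString();
  }

  public static void main(String[] args) {
    AuditLogger audit = new AuditLogger();
    // Permitted access, as in the existing code path.
    audit.logAuditEvent(true, "hdfs", "10.0.0.1", "concat", "[/a,/b]", "/c");
    // Denied access: same call, made from the permission-failure path.
    audit.logAuditEvent(false, "guest", "10.0.0.2", "concat", "[/a,/b]", "/c");
    System.out.print(audit.contents());
  }
}
```

In FSNamesystem terms, the denied case would be invoked from wherever the permission check fails, before the exception is rethrown to the client.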
[jira] [Commented] (HDFS-2617) Replaced Kerberized SSL for image transfer and fsck with SPNEGO-based solution
[ https://issues.apache.org/jira/browse/HDFS-2617?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13400948#comment-13400948 ] Owen O'Malley commented on HDFS-2617: - The patch in HDP-1 is just the one above. I have a variant of it for Hadoop 1.1 that I'll upload shortly. > Replaced Kerberized SSL for image transfer and fsck with SPNEGO-based solution > -- > > Key: HDFS-2617 > URL: https://issues.apache.org/jira/browse/HDFS-2617 > Project: Hadoop HDFS > Issue Type: Improvement > Components: security >Reporter: Jakob Homan >Assignee: Jakob Homan > Fix For: 2.0.1-alpha > > Attachments: HDFS-2617-a.patch, HDFS-2617-b.patch, > HDFS-2617-config.patch, HDFS-2617-trunk.patch, HDFS-2617-trunk.patch, > HDFS-2617-trunk.patch, HDFS-2617-trunk.patch > > > The current approach to secure and authenticate nn web services is based on > Kerberized SSL and was developed when a SPNEGO solution wasn't available. Now > that we have one, we can get rid of the non-standard KSSL and use SPNEGO > throughout. This will simplify setup and configuration. Also, Kerberized > SSL is a non-standard approach with its own quirks and dark corners > (HDFS-2386).
[jira] [Updated] (HDFS-2988) Improve error message when storage directory lock fails
[ https://issues.apache.org/jira/browse/HDFS-2988?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Miomir Boljanovic updated HDFS-2988: Attachment: HDFS-2988.patch Supposedly, the previous patch caused org.apache.hadoop.hdfs.TestDatanodeBlockScanner to fail because I wrongly instantiated StorageDirectory. I realized afterwards that MiniDFSCluster should be used to instantiate StorageDirectory. > Improve error message when storage directory lock fails > --- > > Key: HDFS-2988 > URL: https://issues.apache.org/jira/browse/HDFS-2988 > Project: Hadoop HDFS > Issue Type: Improvement > Components: name-node >Reporter: Todd Lipcon >Priority: Minor > Labels: newbie > Attachments: HDFS-2988.patch, HDFS-2988.patch, HDFS-2988.patch > > > Currently, the error message is fairly opaque to a non-developer ("Cannot > lock storage" or something). Instead, we should have some improvements: > - when we create the in_use.lock file, we should write the hostname/PID that > locked the file > - if the lock fails, and in_use.lock exists, the error message should say > something like "It appears that another namenode (pid 23423 on host > foo.example.com) has already locked the storage directory." > - if the lock fails, and no lock file exists, the error message should say > something like "if this storage directory is mounted via NFS, ensure that the > appropriate nfs lock services are running." -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
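The first improvement suggested in HDFS-2988, writing the hostname/PID of the lock holder into in_use.lock, can be sketched as follows. This is purely a hypothetical illustration (LockInfoSketch and its method names are invented, not the actual Storage code); it relies on the HotSpot convention that RuntimeMXBean.getName() returns "pid@hostname":

```java
import java.io.IOException;
import java.lang.management.ManagementFactory;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

public class LockInfoSketch {
    // On HotSpot JVMs, RuntimeMXBean.getName() conventionally returns
    // "pid@hostname", which is exactly the identity the proposed error
    // message wants to report.
    static String lockHolderInfo() {
        return ManagementFactory.getRuntimeMXBean().getName();
    }

    // Record the holder in the lock file so a later, failed locker can say
    // "another namenode (pid ... on host ...) has already locked this directory".
    static void writeLockInfo(Path lockFile) throws IOException {
        Files.write(lockFile, lockHolderInfo().getBytes(StandardCharsets.UTF_8));
    }

    public static void main(String[] args) throws IOException {
        Path lock = Files.createTempFile("in_use", ".lock");
        writeLockInfo(lock);
        System.out.println("lock holder: "
                + new String(Files.readAllBytes(lock), StandardCharsets.UTF_8));
    }
}
```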
[jira] [Updated] (HDFS-3170) Add more useful metrics for write latency
[ https://issues.apache.org/jira/browse/HDFS-3170?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matthew Jacobs updated HDFS-3170: - Status: Patch Available (was: Open) The attached patch adds the write-latency related metrics described in this JIRA. The tests verify that the metrics are added. I manually checked that the averaged latency values were reasonable. For example, I added a sleep before taking the ack end time and then verified that the resulting metric (via jmx) was greater than the sleep time. > Add more useful metrics for write latency > - > > Key: HDFS-3170 > URL: https://issues.apache.org/jira/browse/HDFS-3170 > Project: Hadoop HDFS > Issue Type: Improvement > Components: data-node >Affects Versions: 2.0.0-alpha >Reporter: Todd Lipcon >Assignee: Matthew Jacobs > Attachments: hdfs-3170.txt > > > Currently, the only write-latency related metric we expose is the total > amount of time taken by opWriteBlock. This is practically useless, since (a) > different blocks may be wildly different sizes, and (b) if the writer is only > generating data slowly, it will make a block write take longer by no fault of > the DN. I would like to propose two new metrics: > 1) *flush-to-disk time*: count how long it takes for each call to flush an > incoming packet to disk (including the checksums). In most cases this will be > close to 0, as it only flushes to buffer cache, but if the backing block > device enters congested writeback, it can take much longer, which provides an > interesting metric. > 2) *round trip to downstream pipeline node*: track the round trip latency for > the part of the pipeline between the local node and its downstream neighbors. > When we add a new packet to the ack queue, save the current timestamp. When > we receive an ack, update the metric based on how long since we sent the > original packet. This gives a metric of the total RTT through the pipeline. 
> If we also include this metric in the ack to upstream, we can subtract the > amount of time due to the later stages in the pipeline and have an accurate > count of this particular link. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
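The bookkeeping for the proposed round-trip metric, save a timestamp when a packet joins the ack queue and subtract it when the ack returns, can be sketched with a simple FIFO of send times. This is an illustrative data structure only, not the actual PacketResponder code:

```java
import java.util.ArrayDeque;
import java.util.Deque;

public class PipelineRttSketch {
    // Send timestamps for packets still awaiting acks; acks return in FIFO
    // order, so the head of the queue always matches the next ack.
    private final Deque<Long> sendTimesNanos = new ArrayDeque<>();
    private long lastRttNanos = -1;

    // Packet added to the ack queue: remember when it was sent downstream.
    public void packetSent(long nowNanos) {
        sendTimesNanos.addLast(nowNanos);
    }

    // Ack received from downstream: RTT is "now" minus the saved send time.
    public void ackReceived(long nowNanos) {
        lastRttNanos = nowNanos - sendTimesNanos.removeFirst();
    }

    public long lastRttNanos() {
        return lastRttNanos;
    }

    public static void main(String[] args) {
        PipelineRttSketch rtt = new PipelineRttSketch();
        rtt.packetSent(1_000L);
        rtt.ackReceived(6_000L);
        System.out.println("pipeline RTT: " + rtt.lastRttNanos() + " ns");
    }
}
```

As the description notes, if each downstream node also reports its own measured RTT in the ack, subtracting that from lastRttNanos isolates the latency of just the local link.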
[jira] [Commented] (HDFS-3541) Deadlock between recovery, xceiver and packet responder
[ https://issues.apache.org/jira/browse/HDFS-3541?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13400888#comment-13400888 ] Aaron T. Myers commented on HDFS-3541: -- Patch looks pretty good to me. Just two small comments: # Misspelled "interrupted": "Finalizing block from Inturrupted thread should fail" # This chunk of code confuses me, since you don't use {{written}} again after the loop, and there doesn't seem to be any need to call {{write(...)}} many times: {code} + int written = 0; + for (; written < 512;) { +out.writeBytes(data); +written += 4; + } {code} Kihwal, how does this patch look to you? > Deadlock between recovery, xceiver and packet responder > --- > > Key: HDFS-3541 > URL: https://issues.apache.org/jira/browse/HDFS-3541 > Project: Hadoop HDFS > Issue Type: Bug > Components: data-node >Affects Versions: 0.23.3, 2.0.1-alpha >Reporter: suja s >Assignee: Vinay > Attachments: DN_dump.rar, HDFS-3541.patch > > > Block Recovery initiated while write in progress at Datanode side. Found a > deadlock between recovery, xceiver and packet responder. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
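If the test merely needs at least 512 bytes written, and {{data}} is a 4-byte string as the {{written += 4}} implies, the quoted loop can indeed be collapsed into a single write. A hypothetical, self-contained version of that simplification (WriteLoopSketch is invented for illustration, not the test's actual code):

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;

public class WriteLoopSketch {
    // Build 512 bytes from a 4-byte marker with one write call, instead of
    // 128 write calls tracking a counter that is never read again.
    static byte[] fillChunk(String data) {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(buf)) {
            StringBuilder sb = new StringBuilder(512);
            for (int i = 0; i < 512 / data.length(); i++) {
                sb.append(data);
            }
            out.writeBytes(sb.toString()); // single write of the whole chunk
        } catch (IOException impossible) {
            throw new AssertionError(impossible); // in-memory stream cannot fail
        }
        return buf.toByteArray();
    }

    public static void main(String[] args) {
        System.out.println(fillChunk("abcd").length + " bytes written");
    }
}
```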
[jira] [Commented] (HDFS-3535) audit logging should log denied accesses as well as permitted ones
[ https://issues.apache.org/jira/browse/HDFS-3535?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13400859#comment-13400859 ] Andy Isaacson commented on HDFS-3535: - {quote} The one audit log that doesn't have a corresponding log for failure is logFsckEvent, though given that we get the ugi from the request it seems like that case could result in an ACE as well right? {quote} the fsck audit event is logged before the fsck command is run, so it can't fail to generate the audit event. Also fsck is special in that it's implemented as a URL fetch, so I don't think the UGI is enforced. This is probably a bug, and the audit logging will need to be fixed when that bug is fixed. {quote} Let's use fooInternal vs fooInt to match the existing "fooInternal" methods {quote} That would collide with several existing uses: concatInternal, createSymlinkInternal, startFileInternal, renameToInternal, etc. I specifically chose a suffix not previously used to avoid code churn. Perhaps a different suffix than "Int" would convey this better, LMK if you have any good ideas. {quote} Normally the checks are used before the method invocation if we're doing expensive things to create the args (eg lots of string concatenation) not to save the cost of the method invocation. Doesn't look like that's the case here (we're not constructing args) so we could just call logAuditEvent directly everywhere. {quote} There are a bunch of uses of logAuditEvent that do need to check if audit logging is enabled before constructing log messages, etc. I considered refactoring them all and concluded that it was out of scope for this change. I decided not to change the existing idiom (verbose though it is) before refactoring all users of the interface, which should be a separate change. 
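The guard idiom being discussed, checking whether the audit log is enabled before paying for message construction, looks roughly like this. This is a generic sketch; AuditGuardSketch and its names are illustrative, not FSNamesystem's actual API:

```java
public class AuditGuardSketch {
    interface AuditLog {
        boolean isEnabled();
        void log(String message);
    }

    // Simple recording implementation, handy for demonstration.
    static class RecordingLog implements AuditLog {
        String last = "";
        public boolean isEnabled() { return true; }
        public void log(String message) { last = message; }
    }

    // The isEnabled() guard skips the string concatenation entirely when
    // auditing is off; calling log(...) unconditionally would always pay it.
    // Note the event is logged for denied accesses too, per this JIRA.
    static void auditRename(AuditLog audit, String src, String dst, boolean allowed) {
        if (audit.isEnabled()) {
            audit.log("cmd=rename src=" + src + " dst=" + dst
                    + " result=" + (allowed ? "allowed" : "denied"));
        }
    }

    public static void main(String[] args) {
        RecordingLog log = new RecordingLog();
        auditRename(log, "/user/a", "/user/b", false);
        System.out.println(log.last);
    }
}
```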
> audit logging should log denied accesses as well as permitted ones > -- > > Key: HDFS-3535 > URL: https://issues.apache.org/jira/browse/HDFS-3535 > Project: Hadoop HDFS > Issue Type: New Feature > Components: name-node >Affects Versions: 2.0.0-alpha >Reporter: Andy Isaacson >Assignee: Andy Isaacson > Attachments: hdfs-3535-1.txt, hdfs-3535.txt > > > FSNamesystem.java logs an audit log entry when a user successfully accesses > the filesystem: > {code} > logAuditEvent(UserGroupInformation.getLoginUser(), > Server.getRemoteIp(), > "concat", Arrays.toString(srcs), target, resultingStat); > {code} > but there is no similar log when a user attempts to access the filesystem and > is denied due to permissions. Competing systems do provide such logging of > denied access attempts; we should too. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-3549) dist tar build fails in hadoop-hdfs-raid project
[ https://issues.apache.org/jira/browse/HDFS-3549?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13400851#comment-13400851 ] Hudson commented on HDFS-3549: -- Integrated in Hadoop-Mapreduce-trunk-Commit #2403 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk-Commit/2403/]) HDFS-3549. Fix dist tar build fails in hadoop-hdfs-raid project. (Jason Lowe via daryn) (Revision 1353695) Result = FAILURE daryn : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1353695 Files : * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs-raid/pom.xml * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt > dist tar build fails in hadoop-hdfs-raid project > > > Key: HDFS-3549 > URL: https://issues.apache.org/jira/browse/HDFS-3549 > Project: Hadoop HDFS > Issue Type: Bug > Components: build >Affects Versions: 3.0.0 >Reporter: Jason Lowe >Assignee: Jason Lowe >Priority: Critical > Fix For: 3.0.0 > > Attachments: HDFS-3549.patch, HDFS-3549.patch, HDFS-3549.patch, > HDFS-3549.patch > > > Trying to build the distribution tarball in a clean tree via {{mvn install > -Pdist -Dtar -DskipTests -Dmaven.javadoc.skip}} fails with this error: > {noformat} > main: > [exec] tar: hadoop-hdfs-raid-3.0.0-SNAPSHOT: Cannot stat: No such file > or directory > [exec] tar: Exiting with failure status due to previous errors > {noformat} -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-3561) ZKFC retries for 45 times to connect to other NN during fencing when network between NNs broken and standby Nn will not take over as active
[ https://issues.apache.org/jira/browse/HDFS-3561?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13400847#comment-13400847 ] Todd Lipcon commented on HDFS-3561: --- +1 for setting it to 0 or 1 for the graceful fence attempt. > ZKFC retries for 45 times to connect to other NN during fencing when network > between NNs broken and standby Nn will not take over as active > > > Key: HDFS-3561 > URL: https://issues.apache.org/jira/browse/HDFS-3561 > Project: Hadoop HDFS > Issue Type: Bug > Components: auto-failover >Reporter: suja s >Assignee: Vinay > > Scenario: > Active NN on machine1 > Standby NN on machine2 > Machine1 is isolated from the network (machine1 network cable unplugged) > After zk session timeout ZKFC at machine2 side gets notification that NN1 is > not there. > ZKFC tries to failover NN2 as active. > As part of this during fencing it tries to connect to machine1 and kill NN1. > (sshfence technique configured) > This connection retry happens for 45 times( as it takes > ipc.client.connect.max.socket.retries) > Also after that standby NN is not able to take over as active (because of > fencing failure). > Suggestion: If ZKFC is not able to reach other NN for specified time/no of > retries it can consider that NN as dead and instruct the other NN to take > over as active as there is no chance of the other NN (NN1) retaining its > state as active after zk session timeout when its isolated from network > From ZKFC log: > {noformat} > 2012-06-21 17:46:14,378 INFO org.apache.hadoop.ipc.Client: Retrying connect > to server: HOST-xx-xx-xx-102/xx.xx.xx.102:65110. Already tried 22 time(s). > 2012-06-21 17:46:35,378 INFO org.apache.hadoop.ipc.Client: Retrying connect > to server: HOST-xx-xx-xx-102/xx.xx.xx.102:65110. Already tried 23 time(s). > 2012-06-21 17:46:56,378 INFO org.apache.hadoop.ipc.Client: Retrying connect > to server: HOST-xx-xx-xx-102/xx.xx.xx.102:65110. Already tried 24 time(s). 
> 2012-06-21 17:47:17,378 INFO org.apache.hadoop.ipc.Client: Retrying connect > to server: HOST-xx-xx-xx-102/xx.xx.xx.102:65110. Already tried 25 time(s). > 2012-06-21 17:47:38,382 INFO org.apache.hadoop.ipc.Client: Retrying connect > to server: HOST-xx-xx-xx-102/xx.xx.xx.102:65110. Already tried 26 time(s). > 2012-06-21 17:47:59,382 INFO org.apache.hadoop.ipc.Client: Retrying connect > to server: HOST-xx-xx-xx-102/xx.xx.xx.102:65110. Already tried 27 time(s). > 2012-06-21 17:48:20,386 INFO org.apache.hadoop.ipc.Client: Retrying connect > to server: HOST-xx-xx-xx-102/xx.xx.xx.102:65110. Already tried 28 time(s). > 2012-06-21 17:48:41,386 INFO org.apache.hadoop.ipc.Client: Retrying connect > to server: HOST-xx-xx-xx-102/xx.xx.xx.102:65110. Already tried 29 time(s). > 2012-06-21 17:49:02,386 INFO org.apache.hadoop.ipc.Client: Retrying connect > to server: HOST-xx-xx-xx-102/xx.xx.xx.102:65110. Already tried 30 time(s). > 2012-06-21 17:49:23,386 INFO org.apache.hadoop.ipc.Client: Retrying connect > to server: HOST-xx-xx-xx-102/xx.xx.xx.102:65110. Already tried 31 time(s). > {noformat} > -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-3561) ZKFC retries for 45 times to connect to other NN during fencing when network between NNs broken and standby Nn will not take over as active
[ https://issues.apache.org/jira/browse/HDFS-3561?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13400833#comment-13400833 ] Aaron T. Myers commented on HDFS-3561: -- bq. Suggestion: If ZKFC is not able to reach other NN for specified time/no of retries it can consider that NN as dead and instruct the other NN to take over as active as there is no chance of the other NN (NN1) retaining its state as active after zk session timeout when its isolated from network This isn't acceptable. The point of fencing is to ensure that if the previously-active NN returns from appearing to have been down, it doesn't start writing to the shared directory again while the new active is also writing to that directory. bq. I think we can set retries to 1/2 for avoiding unnecessary actions on small nw fluctuations? or we can set it to 0 as we are already setting the same values in ConfiguredFailoverProxyProvider for failover clients. We set it to 0 in ConfiguredFailoverProxyProvider because we want to trying failing over immediately as the retry mechanism, instead of repeatedly trying to contact a machine that may in fact be completely down. I agree, though, that setting it to a lower number than 45 makes sense in the case of the client in the ZKFC, and perhaps making it configurable separately. > ZKFC retries for 45 times to connect to other NN during fencing when network > between NNs broken and standby Nn will not take over as active > > > Key: HDFS-3561 > URL: https://issues.apache.org/jira/browse/HDFS-3561 > Project: Hadoop HDFS > Issue Type: Bug > Components: auto-failover >Reporter: suja s >Assignee: Vinay > > Scenario: > Active NN on machine1 > Standby NN on machine2 > Machine1 is isolated from the network (machine1 network cable unplugged) > After zk session timeout ZKFC at machine2 side gets notification that NN1 is > not there. > ZKFC tries to failover NN2 as active. 
> As part of this during fencing it tries to connect to machine1 and kill NN1. > (sshfence technique configured) > This connection retry happens for 45 times( as it takes > ipc.client.connect.max.socket.retries) > Also after that standby NN is not able to take over as active (because of > fencing failure). > Suggestion: If ZKFC is not able to reach other NN for specified time/no of > retries it can consider that NN as dead and instruct the other NN to take > over as active as there is no chance of the other NN (NN1) retaining its > state as active after zk session timeout when its isolated from network > From ZKFC log: > {noformat} > 2012-06-21 17:46:14,378 INFO org.apache.hadoop.ipc.Client: Retrying connect > to server: HOST-xx-xx-xx-102/xx.xx.xx.102:65110. Already tried 22 time(s). > 2012-06-21 17:46:35,378 INFO org.apache.hadoop.ipc.Client: Retrying connect > to server: HOST-xx-xx-xx-102/xx.xx.xx.102:65110. Already tried 23 time(s). > 2012-06-21 17:46:56,378 INFO org.apache.hadoop.ipc.Client: Retrying connect > to server: HOST-xx-xx-xx-102/xx.xx.xx.102:65110. Already tried 24 time(s). > 2012-06-21 17:47:17,378 INFO org.apache.hadoop.ipc.Client: Retrying connect > to server: HOST-xx-xx-xx-102/xx.xx.xx.102:65110. Already tried 25 time(s). > 2012-06-21 17:47:38,382 INFO org.apache.hadoop.ipc.Client: Retrying connect > to server: HOST-xx-xx-xx-102/xx.xx.xx.102:65110. Already tried 26 time(s). > 2012-06-21 17:47:59,382 INFO org.apache.hadoop.ipc.Client: Retrying connect > to server: HOST-xx-xx-xx-102/xx.xx.xx.102:65110. Already tried 27 time(s). > 2012-06-21 17:48:20,386 INFO org.apache.hadoop.ipc.Client: Retrying connect > to server: HOST-xx-xx-xx-102/xx.xx.xx.102:65110. Already tried 28 time(s). > 2012-06-21 17:48:41,386 INFO org.apache.hadoop.ipc.Client: Retrying connect > to server: HOST-xx-xx-xx-102/xx.xx.xx.102:65110. Already tried 29 time(s). 
> 2012-06-21 17:49:02,386 INFO org.apache.hadoop.ipc.Client: Retrying connect > to server: HOST-xx-xx-xx-102/xx.xx.xx.102:65110. Already tried 30 time(s). > 2012-06-21 17:49:23,386 INFO org.apache.hadoop.ipc.Client: Retrying connect > to server: HOST-xx-xx-xx-102/xx.xx.xx.102:65110. Already tried 31 time(s). > {noformat} > -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
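The log excerpt above shows retries landing roughly 21 seconds apart, so the default of 45 attempts keeps fencing, and hence failover, blocked for about 15 minutes. A quick back-of-the-envelope check (class and numbers are illustrative, taken from the log excerpt):

```java
public class FencingDelaySketch {
    // Worst-case time the ZKFC spends retrying the unreachable NN before
    // the graceful fence attempt finally fails.
    static long totalRetrySeconds(int retries, int gapSeconds) {
        return (long) retries * gapSeconds;
    }

    public static void main(String[] args) {
        long secs = totalRetrySeconds(45, 21); // values observed in the log
        System.out.println(secs + " s, i.e. about " + (secs / 60) + " minutes");
    }
}
```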
[jira] [Commented] (HDFS-3557) provide means of escaping special characters to `hadoop fs` command
[ https://issues.apache.org/jira/browse/HDFS-3557?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13400819#comment-13400819 ] Daryn Sharp commented on HDFS-3557: --- Erg, forgot the escapes: {code}hadoop -ls '/foobar/\{18,19,20\}'{code} > provide means of escaping special characters to `hadoop fs` command > --- > > Key: HDFS-3557 > URL: https://issues.apache.org/jira/browse/HDFS-3557 > Project: Hadoop HDFS > Issue Type: Improvement >Reporter: Jeff Hodges >Priority: Minor > > When running an investigative job, I used a date parameter that selected > multiple directories for the input (e.g. "my_data/2012/06/{18,19,20}"). It > used this same date parameter when creating the output directory. > But `hadoop fs` was unable to ls, getmerge, or rmr it until I used the regex > operator "?" and mv to change the name (that is, `-mv > output/2012/06/?18,19,20? foobar"). > Shells and filesystems for other systems provide a means of escaping "special > characters" generically, but there seems to be no such means in HDFS/`hadoop > fs`. Providing one would be a great way to make accessing HDFS more robust. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-3557) provide means of escaping special characters to `hadoop fs` command
[ https://issues.apache.org/jira/browse/HDFS-3557?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13400816#comment-13400816 ] Daryn Sharp commented on HDFS-3557: --- Try placing quotes around the paths, otherwise your unix shell is expanding the glob instead of hadoop expanding the glob. Ie. {code}hadoop -ls '/foobar/{18,19,20}'{code} > provide means of escaping special characters to `hadoop fs` command > --- > > Key: HDFS-3557 > URL: https://issues.apache.org/jira/browse/HDFS-3557 > Project: Hadoop HDFS > Issue Type: Improvement >Reporter: Jeff Hodges >Priority: Minor > > When running an investigative job, I used a date parameter that selected > multiple directories for the input (e.g. "my_data/2012/06/{18,19,20}"). It > used this same date parameter when creating the output directory. > But `hadoop fs` was unable to ls, getmerge, or rmr it until I used the regex > operator "?" and mv to change the name (that is, `-mv > output/2012/06/?18,19,20? foobar"). > Shells and filesystems for other systems provide a means of escaping "special > characters" generically, but there seems to be no such means in HDFS/`hadoop > fs`. Providing one would be a great way to make accessing HDFS more robust. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
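The two behaviours under discussion, glob expansion versus matching a literal name that happens to contain glob metacharacters, can be demonstrated with the JDK's own glob matcher. This is shown purely as an analogy; `hadoop fs` globbing is implemented by Hadoop itself, not java.nio:

```java
import java.nio.file.FileSystems;
import java.nio.file.Paths;

public class GlobEscapeSketch {
    // Unescaped braces form an alternation: matches "18", "19" or "20".
    static boolean matchesAlternation(String name) {
        return FileSystems.getDefault()
                .getPathMatcher("glob:{18,19,20}")
                .matches(Paths.get(name));
    }

    // Backslash-escaped braces match the literal name "{18,19,20}".
    static boolean matchesLiteral(String name) {
        return FileSystems.getDefault()
                .getPathMatcher("glob:\\{18,19,20\\}")
                .matches(Paths.get(name));
    }

    public static void main(String[] args) {
        System.out.println(matchesAlternation("18"));          // alternation hits plain "18"
        System.out.println(matchesLiteral("{18,19,20}"));      // escape hits the odd dir name
        System.out.println(matchesAlternation("{18,19,20}"));  // alternation does not
    }
}
```

The JIRA's request amounts to the same thing: a documented escape so that a path component literally named {18,19,20} can be addressed without glob interpretation.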
[jira] [Commented] (HDFS-3549) dist tar build fails in hadoop-hdfs-raid project
[ https://issues.apache.org/jira/browse/HDFS-3549?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13400792#comment-13400792 ] Hudson commented on HDFS-3549: -- Integrated in Hadoop-Hdfs-trunk-Commit #2455 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk-Commit/2455/]) HDFS-3549. Fix dist tar build fails in hadoop-hdfs-raid project. (Jason Lowe via daryn) (Revision 1353695) Result = SUCCESS daryn : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1353695 Files : * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs-raid/pom.xml * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt > dist tar build fails in hadoop-hdfs-raid project > > > Key: HDFS-3549 > URL: https://issues.apache.org/jira/browse/HDFS-3549 > Project: Hadoop HDFS > Issue Type: Bug > Components: build >Affects Versions: 3.0.0 >Reporter: Jason Lowe >Assignee: Jason Lowe >Priority: Critical > Fix For: 3.0.0 > > Attachments: HDFS-3549.patch, HDFS-3549.patch, HDFS-3549.patch, > HDFS-3549.patch > > > Trying to build the distribution tarball in a clean tree via {{mvn install > -Pdist -Dtar -DskipTests -Dmaven.javadoc.skip}} fails with this error: > {noformat} > main: > [exec] tar: hadoop-hdfs-raid-3.0.0-SNAPSHOT: Cannot stat: No such file > or directory > [exec] tar: Exiting with failure status due to previous errors > {noformat} -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Assigned] (HDFS-3563) Fix findbug warnings in raid
[ https://issues.apache.org/jira/browse/HDFS-3563?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jason Lowe reassigned HDFS-3563: Assignee: Weiyan Wang Weiyan, could you look into the findbugs warnings at some point? Thanks! > Fix findbug warnings in raid > > > Key: HDFS-3563 > URL: https://issues.apache.org/jira/browse/HDFS-3563 > Project: Hadoop HDFS > Issue Type: Bug > Components: contrib/raid >Affects Versions: 3.0.0 >Reporter: Jason Lowe >Assignee: Weiyan Wang > > MAPREDUCE-3868 re-enabled raid but introduced 31 new findbugs warnings. > Those warnings should be fixed or appropriate items placed in an exclude file. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-3557) provide means of escaping special characters to `hadoop fs` command
[ https://issues.apache.org/jira/browse/HDFS-3557?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13400774#comment-13400774 ] Jeff Hodges commented on HDFS-3557: --- The version is cdhu3.2 and the commands are (where /foobar/{18,19,20} is the name of a real directory created by a mapreduce job, and not multiple ones) {code} hadoop -ls /foobar/{18,19,20} hadoop -mv /foobar/{18,19,20} /foobar/new hadoop -rmr /foobar/{18,19,20} {code} All fail with errors that say the directories don't exist or that selecting multiple directories does not work. > provide means of escaping special characters to `hadoop fs` command > --- > > Key: HDFS-3557 > URL: https://issues.apache.org/jira/browse/HDFS-3557 > Project: Hadoop HDFS > Issue Type: Improvement >Reporter: Jeff Hodges >Priority: Minor > > When running an investigative job, I used a date parameter that selected > multiple directories for the input (e.g. "my_data/2012/06/{18,19,20}"). It > used this same date parameter when creating the output directory. > But `hadoop fs` was unable to ls, getmerge, or rmr it until I used the regex > operator "?" and mv to change the name (that is, `-mv > output/2012/06/?18,19,20? foobar"). > Shells and filesystems for other systems provide a means of escaping "special > characters" generically, but there seems to be no such means in HDFS/`hadoop > fs`. Providing one would be a great way to make accessing HDFS more robust. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-3549) dist tar build fails in hadoop-hdfs-raid project
[ https://issues.apache.org/jira/browse/HDFS-3549?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13400771#comment-13400771 ] Hudson commented on HDFS-3549: -- Integrated in Hadoop-Common-trunk-Commit #2385 (See [https://builds.apache.org/job/Hadoop-Common-trunk-Commit/2385/]) HDFS-3549. Fix dist tar build fails in hadoop-hdfs-raid project. (Jason Lowe via daryn) (Revision 1353695) Result = SUCCESS daryn : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1353695 Files : * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs-raid/pom.xml * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt > dist tar build fails in hadoop-hdfs-raid project > > > Key: HDFS-3549 > URL: https://issues.apache.org/jira/browse/HDFS-3549 > Project: Hadoop HDFS > Issue Type: Bug > Components: build >Affects Versions: 3.0.0 >Reporter: Jason Lowe >Assignee: Jason Lowe >Priority: Critical > Fix For: 3.0.0 > > Attachments: HDFS-3549.patch, HDFS-3549.patch, HDFS-3549.patch, > HDFS-3549.patch > > > Trying to build the distribution tarball in a clean tree via {{mvn install > -Pdist -Dtar -DskipTests -Dmaven.javadoc.skip}} fails with this error: > {noformat} > main: > [exec] tar: hadoop-hdfs-raid-3.0.0-SNAPSHOT: Cannot stat: No such file > or directory > [exec] tar: Exiting with failure status due to previous errors > {noformat} -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Created] (HDFS-3563) Fix findbug warnings in raid
Jason Lowe created HDFS-3563: Summary: Fix findbug warnings in raid Key: HDFS-3563 URL: https://issues.apache.org/jira/browse/HDFS-3563 Project: Hadoop HDFS Issue Type: Bug Components: contrib/raid Affects Versions: 3.0.0 Reporter: Jason Lowe MAPREDUCE-3868 re-enabled raid but introduced 31 new findbugs warnings. Those warnings should be fixed or appropriate items placed in an exclude file. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-3549) dist tar build fails in hadoop-hdfs-raid project
[ https://issues.apache.org/jira/browse/HDFS-3549?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13400757#comment-13400757 ] Jason Lowe commented on HDFS-3549: -- Thanks Daryn! Filed HDFS-3563 to track fixing the 31 findbugs warnings. > dist tar build fails in hadoop-hdfs-raid project > > > Key: HDFS-3549 > URL: https://issues.apache.org/jira/browse/HDFS-3549 > Project: Hadoop HDFS > Issue Type: Bug > Components: build >Affects Versions: 3.0.0 >Reporter: Jason Lowe >Assignee: Jason Lowe >Priority: Critical > Fix For: 3.0.0 > > Attachments: HDFS-3549.patch, HDFS-3549.patch, HDFS-3549.patch, > HDFS-3549.patch > > > Trying to build the distribution tarball in a clean tree via {{mvn install > -Pdist -Dtar -DskipTests -Dmaven.javadoc.skip}} fails with this error: > {noformat} > main: > [exec] tar: hadoop-hdfs-raid-3.0.0-SNAPSHOT: Cannot stat: No such file > or directory > [exec] tar: Exiting with failure status due to previous errors > {noformat} -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HDFS-3549) dist tar build fails in hadoop-hdfs-raid project
[ https://issues.apache.org/jira/browse/HDFS-3549?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Daryn Sharp updated HDFS-3549: -- Resolution: Fixed Fix Version/s: 3.0.0 Hadoop Flags: Reviewed Status: Resolved (was: Patch Available) Committed to trunk, thanks Jason. > dist tar build fails in hadoop-hdfs-raid project > > > Key: HDFS-3549 > URL: https://issues.apache.org/jira/browse/HDFS-3549 > Project: Hadoop HDFS > Issue Type: Bug > Components: build >Affects Versions: 3.0.0 >Reporter: Jason Lowe >Assignee: Jason Lowe >Priority: Critical > Fix For: 3.0.0 > > Attachments: HDFS-3549.patch, HDFS-3549.patch, HDFS-3549.patch, > HDFS-3549.patch > > > Trying to build the distribution tarball in a clean tree via {{mvn install > -Pdist -Dtar -DskipTests -Dmaven.javadoc.skip}} fails with this error: > {noformat} > main: > [exec] tar: hadoop-hdfs-raid-3.0.0-SNAPSHOT: Cannot stat: No such file > or directory > [exec] tar: Exiting with failure status due to previous errors > {noformat} -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-3551) WebHDFS CREATE does not use client location for redirection
[ https://issues.apache.org/jira/browse/HDFS-3551?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13400699#comment-13400699 ] Suresh Srinivas commented on HDFS-3551: --- Nicholas, took a quick look at the patch. It looks good. Can you please add some tests? > WebHDFS CREATE does not use client location for redirection > --- > > Key: HDFS-3551 > URL: https://issues.apache.org/jira/browse/HDFS-3551 > Project: Hadoop HDFS > Issue Type: Bug > Components: name-node >Affects Versions: 1.0.0, 2.0.0-alpha >Reporter: Tsz Wo (Nicholas), SZE >Assignee: Tsz Wo (Nicholas), SZE > Attachments: h3551_20120620.patch > > > CREATE currently redirects client to a random datanode but not using the > client location information. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-3481) Refactor HttpFS handling of JAX-RS query string parameters
[ https://issues.apache.org/jira/browse/HDFS-3481?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13400697#comment-13400697 ] Hadoop QA commented on HDFS-3481: - -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/1258/HDFS-3481.patch against trunk revision . +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 3 new or modified test files. -1 javac. The applied patch generated 2070 javac compiler warnings (more than the trunk's current 2053 warnings). +1 javadoc. The javadoc tool did not generate any warning messages. +1 eclipse:eclipse. The patch built with eclipse:eclipse. +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. +1 core tests. The patch passed unit tests in hadoop-hdfs-project/hadoop-hdfs-httpfs. +1 contrib tests. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/2694//testReport/ Javac warnings: https://builds.apache.org/job/PreCommit-HDFS-Build/2694//artifact/trunk/trunk/patchprocess/diffJavacWarnings.txt Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/2694//console This message is automatically generated. > Refactor HttpFS handling of JAX-RS query string parameters > -- > > Key: HDFS-3481 > URL: https://issues.apache.org/jira/browse/HDFS-3481 > Project: Hadoop HDFS > Issue Type: Improvement >Affects Versions: 2.0.1-alpha >Reporter: Alejandro Abdelnur >Assignee: Alejandro Abdelnur > Fix For: 2.0.1-alpha > > Attachments: HDFS-3481.patch, HDFS-3481.patch, HDFS-3481.patch > > > Explicit parameters in the HttpFSServer became quite messy as they are the > union of all possible parameters for all operations. -- This message is automatically generated by JIRA. 
[jira] [Updated] (HDFS-3491) HttpFs does not set permissions correctly
[ https://issues.apache.org/jira/browse/HDFS-3491?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alejandro Abdelnur updated HDFS-3491: - Attachment: HDFS-3491.patch updated patch adds testcase for octal shorts. > HttpFs does not set permissions correctly > - > > Key: HDFS-3491 > URL: https://issues.apache.org/jira/browse/HDFS-3491 > Project: Hadoop HDFS > Issue Type: Bug >Affects Versions: 2.0.0-alpha >Reporter: Romain Rigaux >Assignee: Alejandro Abdelnur > Attachments: HDFS-3491.patch, HDFS-3491.patch > > > HttpFs seems to have these problems: > # can't set permissions to 777 at file creation or 1777 with setpermission > # does not accept 01777 permissions (which is valid in WebHdfs) > WebHdfs > curl -X PUT > "http://localhost:50070/webhdfs/v1/tmp/test-perm-webhdfs?permission=1777&op=MKDIRS&user.name=hue&doas=hue"; > {"boolean":true} > curl > "http://localhost:50070/webhdfs/v1/tmp/test-perm-webhdfs?op=GETFILESTATUS&user.name=hue&doas=hue"; > {"FileStatus":{"accessTime":0,"blockSize":0,"group":"supergroup","length":0,"modificationTime":1338581075040,"owner":"hue","pathSuffix":"","permission":"1777","replication":0,"type":"DIRECTORY"}} > curl -X PUT > "http://localhost:50070/webhdfs/v1/tmp/test-perm-webhdfs?permission=01777&op=MKDIRS&user.name=hue&doas=hue"; > {"boolean":true} > HttpFs > curl -X PUT > "http://localhost:14000/webhdfs/v1/tmp/test-perm-httpfs?permission=1777&op=MKDIRS&user.name=hue&doas=hue"; > {"boolean":true} > curl > "http://localhost:14000/webhdfs/v1/tmp/test-perm-httpfs?op=GETFILESTATUS&user.name=hue&doas=hue"; > {"FileStatus":{"pathSuffix":"","type":"DIRECTORY","length":0,"owner":"hue","group":"supergroup","permission":"755","accessTime":0,"modificationTime":1338580912205,"blockSize":0,"replication":0}} > curl -X PUT > "http://localhost:14000/webhdfs/v1/tmp/test-perm-httpfs?op=SETPERMISSION&PERMISSION=1777&user.name=hue&doas=hue"; > curl > 
"http://localhost:14000/webhdfs/v1/tmp/test-perm-httpfs?op=GETFILESTATUS&user.name=hue&doas=hue"; > {"FileStatus":{"pathSuffix":"","type":"DIRECTORY","length":0,"owner":"hue","group":"supergroup","permission":"777","accessTime":0,"modificationTime":1338581075040,"blockSize":0,"replication":0}} > curl -X PUT > "http://localhost:14000/webhdfs/v1/tmp/test-perm-httpfs?permission=01777&op=MKDIRS&user.name=hue&doas=hue"; > {"RemoteException":{"message":"java.lang.IllegalArgumentException: Parameter > [permission], invalid value [01777], value must be > [default|[0-1]?[0-7][0-7][0-7]]","exception":"QueryParamException","javaClassName":"com.sun.jersey.api.ParamException$QueryParamException"}} -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HDFS-3481) Refactor HttpFS handling of JAX-RS query string parameters
[ https://issues.apache.org/jira/browse/HDFS-3481?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alejandro Abdelnur updated HDFS-3481: - Attachment: HDFS-3481.patch Thx Eli. The attached patch removes the commented return in the testcase (there was only one occurrence of this). The Parameter class, while a simple wrapper over Map, has a generic method that simplifies access to parameter values significantly, making the code cleaner. For example: Using Parameters: {code} String doAs = params.get(DoAsParam.NAME, DoAsParam.class); {code} Using Map: {code} String doAs = ((DoAsParam)map.get(DoAsParam.NAME)).value(); {code} It also hides the rest of the Map API, which is not relevant for this use (if we used a Map directly, we would have to wrap it in an unmodifiable map to avoid modification). Regarding using Guava ImmutableMap.of(), I'm getting similar warnings. Finally, regarding sharing Param code with webhdfs: the idea is, once webhdfs and httpfs are 100% equivalent from a functional perspective (HDFS-3113 & HDFS-3509 would achieve that), we can tackle unifying the code (HDFS-2645). > Refactor HttpFS handling of JAX-RS query string parameters > -- > > Key: HDFS-3481 > URL: https://issues.apache.org/jira/browse/HDFS-3481 > Project: Hadoop HDFS > Issue Type: Improvement >Affects Versions: 2.0.1-alpha >Reporter: Alejandro Abdelnur >Assignee: Alejandro Abdelnur > Fix For: 2.0.1-alpha > > Attachments: HDFS-3481.patch, HDFS-3481.patch, HDFS-3481.patch > > > Explicit parameters in the HttpFSServer became quite messy as they are the > union of all possible parameters for all operations.
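The typed-parameter access described in the comment above can be sketched minimally as follows; all class names here are illustrative stand-ins, not the actual HttpFS implementation. The generic accessor does the cast once, so call sites stay clean and the rest of the Map API is not exposed:

```java
import java.util.HashMap;
import java.util.Map;

// Base class for a typed query-string parameter (hypothetical, minimal).
abstract class Param<T> {
    private final T value;
    Param(T value) { this.value = value; }
    T value() { return value; }
}

// One concrete parameter, mirroring the DoAsParam example from the comment.
class DoAsParam extends Param<String> {
    static final String NAME = "doas";
    DoAsParam(String v) { super(v); }
}

// Thin wrapper over Map: only put() and a generic typed get() are exposed.
class Parameters {
    private final Map<String, Param<?>> map = new HashMap<>();
    void put(String name, Param<?> param) { map.put(name, param); }

    // The cast lives here instead of at every call site.
    <V, P extends Param<V>> V get(String name, Class<P> klass) {
        return klass.cast(map.get(name)).value();
    }
}

public class ParametersDemo {
    public static void main(String[] args) {
        Parameters params = new Parameters();
        params.put(DoAsParam.NAME, new DoAsParam("hue"));
        String doAs = params.get(DoAsParam.NAME, DoAsParam.class);
        System.out.println(doAs);
    }
}
```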
[jira] [Commented] (HDFS-3370) HDFS hardlink
[ https://issues.apache.org/jira/browse/HDFS-3370?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13400658#comment-13400658 ] Sanjay Radia commented on HDFS-3370: Konstantine * How can one implement hard links in a library? If you have an alternate library implementation in mind, please explain. * I am fine with having hard links and renames restricted to volumes; this should then give you the freedom to implement a distributed NN. > HDFS hardlink > - > > Key: HDFS-3370 > URL: https://issues.apache.org/jira/browse/HDFS-3370 > Project: Hadoop HDFS > Issue Type: New Feature >Reporter: Hairong Kuang >Assignee: Liyin Tang > Attachments: HDFS-HardLink.pdf > > > We'd like to add a new feature, hardlink, to HDFS that allows hardlinked files > to share data without copying. Currently we will support hardlinking only > closed files, but it could be extended to unclosed files as well. > Among many potential use cases of the feature, the following two are > primarily used at Facebook: > 1. This provides a lightweight way for applications like HBase to create a > snapshot; > 2. This also allows an application like Hive to move a table to a different > directory without breaking currently running Hive queries.
[jira] [Commented] (HDFS-2881) org.apache.hadoop.hdfs.TestDatanodeBlockScanner Fails Intermittently
[ https://issues.apache.org/jira/browse/HDFS-2881?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13400637#comment-13400637 ] Kihwal Lee commented on HDFS-2881: -- It failed in one of the precommit builds. It looks different this time. https://builds.apache.org/job/PreCommit-HDFS-Build/2683//testReport/ While waiting for the two bad replicas, the blocks were fixed. So waitCorruptReplicas() only saw < 2 in each loop. Moreover, the first corrupt block was reported by DFSClient while this method was reading the file and got fixed (rereplicate/invalidate) before it looped, without involving BlockScanner. > org.apache.hadoop.hdfs.TestDatanodeBlockScanner Fails Intermittently > > > Key: HDFS-2881 > URL: https://issues.apache.org/jira/browse/HDFS-2881 > Project: Hadoop HDFS > Issue Type: Bug > Components: test >Affects Versions: 0.24.0 >Reporter: Robert Joseph Evans > Attachments: > TEST-org.apache.hadoop.hdfs.TestDatanodeBlockScanner.xml, > org.apache.hadoop.hdfs.TestDatanodeBlockScanner-output.txt, > org.apache.hadoop.hdfs.TestDatanodeBlockScanner.txt > > > org.apache.hadoop.hdfs.TestDatanodeBlockScanner fails intermittently durring > test-patch. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
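The race Kihwal describes — the corrupt replicas were repaired between polls, so the waiter never observed the expected count — can be illustrated with a hypothetical polling helper. The names below are illustrative, not the actual test code:

```java
// Sketch of a waitCorruptReplicas-style polling loop and the race it is subject to:
// if a corrupt replica is re-replicated and invalidated between polls, the observed
// count never reaches the expected value and the wait fails even though the
// cluster healed itself.
public class WaitCorruptReplicasSketch {
    interface CorruptReplicaCounter { int count(); }

    static boolean waitCorruptReplicas(CorruptReplicaCounter counter,
                                       int expected, int retries, long sleepMs)
            throws InterruptedException {
        for (int i = 0; i < retries; i++) {
            if (counter.count() >= expected) {
                return true;           // saw the expected number of corrupt replicas
            }
            Thread.sleep(sleepMs);     // replicas may be fixed during this window
        }
        return false;                  // raced: corruption repaired between polls
    }

    public static void main(String[] args) throws InterruptedException {
        // Simulated counter: one replica is briefly corrupt, then "fixed",
        // so a waiter expecting 2 corrupt replicas never sees them.
        int[] calls = {0};
        CorruptReplicaCounter healed = () -> (calls[0]++ == 0) ? 1 : 0;
        System.out.println(waitCorruptReplicas(healed, 2, 3, 1L));
    }
}
```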
[jira] [Commented] (HDFS-3550) raid added javadoc warnings
[ https://issues.apache.org/jira/browse/HDFS-3550?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13400600#comment-13400600 ] Hudson commented on HDFS-3550: -- Integrated in Hadoop-Mapreduce-trunk-Commit #2400 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk-Commit/2400/]) HDFS-3550. Fix raid javadoc warnings. (Jason Lowe via daryn) (Revision 1353592) Result = FAILURE daryn : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1353592 Files : * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs-raid/src/main/java/org/apache/hadoop/raid/Decoder.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs-raid/src/main/java/org/apache/hadoop/raid/DistRaidNode.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs-raid/src/main/java/org/apache/hadoop/raid/Encoder.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs-raid/src/main/java/org/apache/hadoop/raid/RaidConfigurationException.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt > raid added javadoc warnings > --- > > Key: HDFS-3550 > URL: https://issues.apache.org/jira/browse/HDFS-3550 > Project: Hadoop HDFS > Issue Type: Bug > Components: build >Affects Versions: 3.0.0 >Reporter: Thomas Graves >Assignee: Jason Lowe >Priority: Critical > Fix For: 3.0.0 > > Attachments: HDFS-3550.patch > > > hdfs raid which I believe was introduced by MAPREDUCE-3868 has added the > following javadoc warnings and now all the builds complain about them: > [WARNING] > /home/jenkins/jenkins-slave/workspace/PreCommit-MAPREDUCE-Build/trunk/hadoop-hdfs-project/hadoop-hdfs-raid/src/main/java/org/apache/hadoop/raid/Decoder.java:180: > warning - @param argument "parityFile" is not a parameter name. > [WARNING] > /home/jenkins/jenkins-slave/workspace/PreCommit-MAPREDUCE-Build/trunk/hadoop-hdfs-project/hadoop-hdfs-raid/src/main/java/org/apache/hadoop/raid/DistRaidNode.java:58: > warning - @inheritDocs is an unknown tag. 
> [WARNING] > /home/jenkins/jenkins-slave/workspace/PreCommit-MAPREDUCE-Build/trunk/hadoop-hdfs-project/hadoop-hdfs-raid/src/main/java/org/apache/hadoop/raid/DistRaidNode.java:71: > warning - @inheritDocs is an unknown tag. > [WARNING] > /home/jenkins/jenkins-slave/workspace/PreCommit-MAPREDUCE-Build/trunk/hadoop-hdfs-project/hadoop-hdfs-raid/src/main/java/org/apache/hadoop/raid/DistRaidNode.java:58: > warning - @inheritDocs is an unknown tag. > [WARNING] > /home/jenkins/jenkins-slave/workspace/PreCommit-MAPREDUCE-Build/trunk/hadoop-hdfs-project/hadoop-hdfs-raid/src/main/java/org/apache/hadoop/raid/DistRaidNode.java:58: > warning - @inheritDocs is an unknown tag. > [WARNING] > /home/jenkins/jenkins-slave/workspace/PreCommit-MAPREDUCE-Build/trunk/hadoop-hdfs-project/hadoop-hdfs-raid/src/main/java/org/apache/hadoop/raid/DistRaidNode.java:71: > warning - @inheritDocs is an unknown tag. > [WARNING] > /home/jenkins/jenkins-slave/workspace/PreCommit-MAPREDUCE-Build/trunk/hadoop-hdfs-project/hadoop-hdfs-raid/src/main/java/org/apache/hadoop/raid/DistRaidNode.java:71: > warning - @inheritDocs is an unknown tag. > [WARNING] > /home/jenkins/jenkins-slave/workspace/PreCommit-MAPREDUCE-Build/trunk/hadoop-hdfs-project/hadoop-hdfs-raid/src/main/java/org/apache/hadoop/raid/Encoder.java:340: > warning - @param argument "srcFile" is not a parameter name. 
> [WARNING] > /home/jenkins/jenkins-slave/workspace/PreCommit-MAPREDUCE-Build/trunk/hadoop-hdfs-project/hadoop-hdfs-raid/src/main/java/org/apache/hadoop/raid/RaidConfigurationException.java:24: > warning - Tag @link: reference not found: CronNode > [WARNING] > /home/jenkins/jenkins-slave/workspace/PreCommit-MAPREDUCE-Build/trunk/hadoop-hdfs-project/hadoop-hdfs-raid/src/main/java/org/apache/hadoop/raid/RaidConfigurationException.java:24: > warning - Tag @link: reference not found: CronNode > [WARNING] > /home/jenkins/jenkins-slave/workspace/PreCommit-MAPREDUCE-Build/trunk/hadoop-hdfs-project/hadoop-hdfs-raid/src/main/java/org/apache/hadoop/raid/DistRaidNode.java:58: > warning - @inheritDocs is an unknown tag. > [WARNING] > /home/jenkins/jenkins-slave/workspace/PreCommit-MAPREDUCE-Build/trunk/hadoop-hdfs-project/hadoop-hdfs-raid/src/main/java/org/apache/hadoop/raid/RaidConfigurationException.java:24: > warning - Tag @link: reference not found: CronNode > [WARNING] > /home/jenkins/jenkins-slave/workspace/PreCommit-MAPREDUCE-Build/trunk/hadoop-hdfs-project/hadoop-hdfs-raid/src/main/java/org/apache/hadoop/raid/DistRaidNode.java:71: > warning - @inheritDocs is an unknown tag. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA,
[jira] [Commented] (HDFS-3475) Make the replication monitor multipliers configurable
[ https://issues.apache.org/jira/browse/HDFS-3475?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13400599#comment-13400599 ] Aaron T. Myers commented on HDFS-3475: -- One small comment: I think you should add some info to the hdfs-default.xml description for "{{dfs.namenode.invalidate.work.pct.per.iteration}}" saying that the value should be between 0-100, or whatever's appropriate. For that matter, since this is a brand new config, you might want to change it to be in the range 0 - 1.0, which I think is a more common way in the Hadoop code base to represent percentages. Other than that the patch looks good. +1 pending a fix for the above and an explanation of the two test failures. > Make the replication monitor multipliers configurable > - > > Key: HDFS-3475 > URL: https://issues.apache.org/jira/browse/HDFS-3475 > Project: Hadoop HDFS > Issue Type: Improvement >Affects Versions: 2.0.0-alpha >Reporter: Harsh J >Assignee: Harsh J >Priority: Trivial > Attachments: HDFS-3475.patch, HDFS-3475.patch > > > BlockManager currently hardcodes the following two constants: > {code} > private static final int INVALIDATE_WORK_PCT_PER_ITERATION = 32; > private static final int REPLICATION_WORK_MULTIPLIER_PER_ITERATION = 2; > {code} > These are used to throttle/limit the amount of deletion and > replication-to-other-DN work done per heartbeat interval of a live DN. > Not many have had reasons to want these changed so far but there have been a > few requests I've faced over the past year from a variety of clusters I've > helped maintain. I think with the improvements in disks and network thats > already started to be rolled out in production environments out there, > changing these may start making sense to some. > Lets at least make it advanced-configurable with proper docs that warn > adequately, with the defaults being what they are today. With hardcodes, it > comes down to a recompile for admins, which is not something they may like. 
> Please let me know your thoughts.
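A hedged sketch of the range check Aaron suggests above, treating the new knob as a fraction in (0, 1.0] rather than an integer percentage. The property name is the one under discussion, but the validation logic is illustrative, not the committed patch:

```java
public class InvalidateWorkPctCheck {
    // Property name from the discussion above.
    static final String KEY = "dfs.namenode.invalidate.work.pct.per.iteration";

    // Validate a fraction-style percentage in the range (0, 1.0].
    static float checkFraction(String key, float value) {
        if (value <= 0f || value > 1.0f) {
            throw new IllegalArgumentException(
                key + " = " + value + " must be in the range (0, 1.0]");
        }
        return value;
    }

    public static void main(String[] args) {
        System.out.println(checkFraction(KEY, 0.32f)); // the old 32% default as a fraction
        try {
            checkFraction(KEY, 32f);                   // an integer percentage is rejected
        } catch (IllegalArgumentException e) {
            System.out.println(e.getMessage());
        }
    }
}
```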
[jira] [Resolved] (HDFS-3464) BKJM: Deleting currentLedger and leaving 'inprogress_x' on exceptions can throw BKNoSuchLedgerExistsException later.
[ https://issues.apache.org/jira/browse/HDFS-3464?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Uma Maheswara Rao G resolved HDFS-3464. --- Resolution: Fixed > BKJM: Deleting currentLedger and leaving 'inprogress_x' on exceptions can > throw BKNoSuchLedgerExistsException later. > - > > Key: HDFS-3464 > URL: https://issues.apache.org/jira/browse/HDFS-3464 > Project: Hadoop HDFS > Issue Type: Sub-task >Affects Versions: 2.0.1-alpha, 3.0.0 >Reporter: Uma Maheswara Rao G >Assignee: Uma Maheswara Rao G > > HDFS-3058 will clean currentLedgers on exception. > In BookKeeperJournalManager, startLogSegment() is deleting the corresponding > 'inprogress_ledger' ledger on exception. Here leaving the 'inprogress_x' > ledger metadata in ZooKeeper. When the other node becomes active, he will see > the 'inprogress_x' znode and tries to recoverLastTxId() it would throw > exception, since there is no 'inprogress_ledger' exists. > {noformat} > Caused by: > org.apache.bookkeeper.client.BKException$BKNoSuchLedgerExistsException > at > org.apache.bookkeeper.client.BookKeeper.openLedger(BookKeeper.java:393) > at > org.apache.hadoop.contrib.bkjournal.BookKeeperJournalManager.recoverLastTxId(BookKeeperJournalManager.java:493) > {noformat} > As per the discussion in HDFS-3058, we will handle the coment as part of this > JIRA. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-3549) dist tar build fails in hadoop-hdfs-raid project
[ https://issues.apache.org/jira/browse/HDFS-3549?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13400550#comment-13400550 ] Daryn Sharp commented on HDFS-3549: --- +1 TestRaidNode appears to fail due to race condition with querying the job status. Findbugs warnings are of course due to making findbugs work again. > dist tar build fails in hadoop-hdfs-raid project > > > Key: HDFS-3549 > URL: https://issues.apache.org/jira/browse/HDFS-3549 > Project: Hadoop HDFS > Issue Type: Bug > Components: build >Affects Versions: 3.0.0 >Reporter: Jason Lowe >Assignee: Jason Lowe >Priority: Critical > Attachments: HDFS-3549.patch, HDFS-3549.patch, HDFS-3549.patch, > HDFS-3549.patch > > > Trying to build the distribution tarball in a clean tree via {{mvn install > -Pdist -Dtar -DskipTests -Dmaven.javadoc.skip}} fails with this error: > {noformat} > main: > [exec] tar: hadoop-hdfs-raid-3.0.0-SNAPSHOT: Cannot stat: No such file > or directory > [exec] tar: Exiting with failure status due to previous errors > {noformat} -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-3550) raid added javadoc warnings
[ https://issues.apache.org/jira/browse/HDFS-3550?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13400547#comment-13400547 ] Hudson commented on HDFS-3550: -- Integrated in Hadoop-Common-trunk-Commit #2381 (See [https://builds.apache.org/job/Hadoop-Common-trunk-Commit/2381/]) HDFS-3550. Fix raid javadoc warnings. (Jason Lowe via daryn) (Revision 1353592) Result = SUCCESS daryn : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1353592 Files : * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs-raid/src/main/java/org/apache/hadoop/raid/Decoder.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs-raid/src/main/java/org/apache/hadoop/raid/DistRaidNode.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs-raid/src/main/java/org/apache/hadoop/raid/Encoder.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs-raid/src/main/java/org/apache/hadoop/raid/RaidConfigurationException.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt > raid added javadoc warnings > --- > > Key: HDFS-3550 > URL: https://issues.apache.org/jira/browse/HDFS-3550 > Project: Hadoop HDFS > Issue Type: Bug > Components: build >Affects Versions: 3.0.0 >Reporter: Thomas Graves >Assignee: Jason Lowe >Priority: Critical > Fix For: 3.0.0 > > Attachments: HDFS-3550.patch > > > hdfs raid which I believe was introduced by MAPREDUCE-3868 has added the > following javadoc warnings and now all the builds complain about them: > [WARNING] > /home/jenkins/jenkins-slave/workspace/PreCommit-MAPREDUCE-Build/trunk/hadoop-hdfs-project/hadoop-hdfs-raid/src/main/java/org/apache/hadoop/raid/Decoder.java:180: > warning - @param argument "parityFile" is not a parameter name. > [WARNING] > /home/jenkins/jenkins-slave/workspace/PreCommit-MAPREDUCE-Build/trunk/hadoop-hdfs-project/hadoop-hdfs-raid/src/main/java/org/apache/hadoop/raid/DistRaidNode.java:58: > warning - @inheritDocs is an unknown tag. 
> [WARNING] > /home/jenkins/jenkins-slave/workspace/PreCommit-MAPREDUCE-Build/trunk/hadoop-hdfs-project/hadoop-hdfs-raid/src/main/java/org/apache/hadoop/raid/DistRaidNode.java:71: > warning - @inheritDocs is an unknown tag. > [WARNING] > /home/jenkins/jenkins-slave/workspace/PreCommit-MAPREDUCE-Build/trunk/hadoop-hdfs-project/hadoop-hdfs-raid/src/main/java/org/apache/hadoop/raid/DistRaidNode.java:58: > warning - @inheritDocs is an unknown tag. > [WARNING] > /home/jenkins/jenkins-slave/workspace/PreCommit-MAPREDUCE-Build/trunk/hadoop-hdfs-project/hadoop-hdfs-raid/src/main/java/org/apache/hadoop/raid/DistRaidNode.java:58: > warning - @inheritDocs is an unknown tag. > [WARNING] > /home/jenkins/jenkins-slave/workspace/PreCommit-MAPREDUCE-Build/trunk/hadoop-hdfs-project/hadoop-hdfs-raid/src/main/java/org/apache/hadoop/raid/DistRaidNode.java:71: > warning - @inheritDocs is an unknown tag. > [WARNING] > /home/jenkins/jenkins-slave/workspace/PreCommit-MAPREDUCE-Build/trunk/hadoop-hdfs-project/hadoop-hdfs-raid/src/main/java/org/apache/hadoop/raid/DistRaidNode.java:71: > warning - @inheritDocs is an unknown tag. > [WARNING] > /home/jenkins/jenkins-slave/workspace/PreCommit-MAPREDUCE-Build/trunk/hadoop-hdfs-project/hadoop-hdfs-raid/src/main/java/org/apache/hadoop/raid/Encoder.java:340: > warning - @param argument "srcFile" is not a parameter name. 
> [WARNING] > /home/jenkins/jenkins-slave/workspace/PreCommit-MAPREDUCE-Build/trunk/hadoop-hdfs-project/hadoop-hdfs-raid/src/main/java/org/apache/hadoop/raid/RaidConfigurationException.java:24: > warning - Tag @link: reference not found: CronNode > [WARNING] > /home/jenkins/jenkins-slave/workspace/PreCommit-MAPREDUCE-Build/trunk/hadoop-hdfs-project/hadoop-hdfs-raid/src/main/java/org/apache/hadoop/raid/RaidConfigurationException.java:24: > warning - Tag @link: reference not found: CronNode > [WARNING] > /home/jenkins/jenkins-slave/workspace/PreCommit-MAPREDUCE-Build/trunk/hadoop-hdfs-project/hadoop-hdfs-raid/src/main/java/org/apache/hadoop/raid/DistRaidNode.java:58: > warning - @inheritDocs is an unknown tag. > [WARNING] > /home/jenkins/jenkins-slave/workspace/PreCommit-MAPREDUCE-Build/trunk/hadoop-hdfs-project/hadoop-hdfs-raid/src/main/java/org/apache/hadoop/raid/RaidConfigurationException.java:24: > warning - Tag @link: reference not found: CronNode > [WARNING] > /home/jenkins/jenkins-slave/workspace/PreCommit-MAPREDUCE-Build/trunk/hadoop-hdfs-project/hadoop-hdfs-raid/src/main/java/org/apache/hadoop/raid/DistRaidNode.java:71: > warning - @inheritDocs is an unknown tag. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see:
[jira] [Commented] (HDFS-3550) raid added javadoc warnings
[ https://issues.apache.org/jira/browse/HDFS-3550?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13400543#comment-13400543 ] Hudson commented on HDFS-3550: -- Integrated in Hadoop-Hdfs-trunk-Commit #2451 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk-Commit/2451/]) HDFS-3550. Fix raid javadoc warnings. (Jason Lowe via daryn) (Revision 1353592) Result = SUCCESS daryn : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1353592 Files : * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs-raid/src/main/java/org/apache/hadoop/raid/Decoder.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs-raid/src/main/java/org/apache/hadoop/raid/DistRaidNode.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs-raid/src/main/java/org/apache/hadoop/raid/Encoder.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs-raid/src/main/java/org/apache/hadoop/raid/RaidConfigurationException.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt > raid added javadoc warnings > --- > > Key: HDFS-3550 > URL: https://issues.apache.org/jira/browse/HDFS-3550 > Project: Hadoop HDFS > Issue Type: Bug > Components: build >Affects Versions: 3.0.0 >Reporter: Thomas Graves >Assignee: Jason Lowe >Priority: Critical > Fix For: 3.0.0 > > Attachments: HDFS-3550.patch > > > hdfs raid which I believe was introduced by MAPREDUCE-3868 has added the > following javadoc warnings and now all the builds complain about them: > [WARNING] > /home/jenkins/jenkins-slave/workspace/PreCommit-MAPREDUCE-Build/trunk/hadoop-hdfs-project/hadoop-hdfs-raid/src/main/java/org/apache/hadoop/raid/Decoder.java:180: > warning - @param argument "parityFile" is not a parameter name. > [WARNING] > /home/jenkins/jenkins-slave/workspace/PreCommit-MAPREDUCE-Build/trunk/hadoop-hdfs-project/hadoop-hdfs-raid/src/main/java/org/apache/hadoop/raid/DistRaidNode.java:58: > warning - @inheritDocs is an unknown tag. 
> [WARNING] > /home/jenkins/jenkins-slave/workspace/PreCommit-MAPREDUCE-Build/trunk/hadoop-hdfs-project/hadoop-hdfs-raid/src/main/java/org/apache/hadoop/raid/DistRaidNode.java:71: > warning - @inheritDocs is an unknown tag. > [WARNING] > /home/jenkins/jenkins-slave/workspace/PreCommit-MAPREDUCE-Build/trunk/hadoop-hdfs-project/hadoop-hdfs-raid/src/main/java/org/apache/hadoop/raid/DistRaidNode.java:58: > warning - @inheritDocs is an unknown tag. > [WARNING] > /home/jenkins/jenkins-slave/workspace/PreCommit-MAPREDUCE-Build/trunk/hadoop-hdfs-project/hadoop-hdfs-raid/src/main/java/org/apache/hadoop/raid/DistRaidNode.java:58: > warning - @inheritDocs is an unknown tag. > [WARNING] > /home/jenkins/jenkins-slave/workspace/PreCommit-MAPREDUCE-Build/trunk/hadoop-hdfs-project/hadoop-hdfs-raid/src/main/java/org/apache/hadoop/raid/DistRaidNode.java:71: > warning - @inheritDocs is an unknown tag. > [WARNING] > /home/jenkins/jenkins-slave/workspace/PreCommit-MAPREDUCE-Build/trunk/hadoop-hdfs-project/hadoop-hdfs-raid/src/main/java/org/apache/hadoop/raid/DistRaidNode.java:71: > warning - @inheritDocs is an unknown tag. > [WARNING] > /home/jenkins/jenkins-slave/workspace/PreCommit-MAPREDUCE-Build/trunk/hadoop-hdfs-project/hadoop-hdfs-raid/src/main/java/org/apache/hadoop/raid/Encoder.java:340: > warning - @param argument "srcFile" is not a parameter name. 
> [WARNING] > /home/jenkins/jenkins-slave/workspace/PreCommit-MAPREDUCE-Build/trunk/hadoop-hdfs-project/hadoop-hdfs-raid/src/main/java/org/apache/hadoop/raid/RaidConfigurationException.java:24: > warning - Tag @link: reference not found: CronNode > [WARNING] > /home/jenkins/jenkins-slave/workspace/PreCommit-MAPREDUCE-Build/trunk/hadoop-hdfs-project/hadoop-hdfs-raid/src/main/java/org/apache/hadoop/raid/RaidConfigurationException.java:24: > warning - Tag @link: reference not found: CronNode > [WARNING] > /home/jenkins/jenkins-slave/workspace/PreCommit-MAPREDUCE-Build/trunk/hadoop-hdfs-project/hadoop-hdfs-raid/src/main/java/org/apache/hadoop/raid/DistRaidNode.java:58: > warning - @inheritDocs is an unknown tag. > [WARNING] > /home/jenkins/jenkins-slave/workspace/PreCommit-MAPREDUCE-Build/trunk/hadoop-hdfs-project/hadoop-hdfs-raid/src/main/java/org/apache/hadoop/raid/RaidConfigurationException.java:24: > warning - Tag @link: reference not found: CronNode > [WARNING] > /home/jenkins/jenkins-slave/workspace/PreCommit-MAPREDUCE-Build/trunk/hadoop-hdfs-project/hadoop-hdfs-raid/src/main/java/org/apache/hadoop/raid/DistRaidNode.java:71: > warning - @inheritDocs is an unknown tag. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http
[jira] [Commented] (HDFS-3464) BKJM: Deleting currentLedger and leaving 'inprogress_x' on exceptions can throw BKNoSuchLedgerExistsException later.
[ https://issues.apache.org/jira/browse/HDFS-3464?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13400541#comment-13400541 ] Uma Maheswara Rao G commented on HDFS-3464: --- Yep, We should not get this situation now after handling the specialized exceptions. I will close this JIRA. > BKJM: Deleting currentLedger and leaving 'inprogress_x' on exceptions can > throw BKNoSuchLedgerExistsException later. > - > > Key: HDFS-3464 > URL: https://issues.apache.org/jira/browse/HDFS-3464 > Project: Hadoop HDFS > Issue Type: Sub-task >Affects Versions: 2.0.1-alpha, 3.0.0 >Reporter: Uma Maheswara Rao G >Assignee: Uma Maheswara Rao G > > HDFS-3058 will clean currentLedgers on exception. > In BookKeeperJournalManager, startLogSegment() is deleting the corresponding > 'inprogress_ledger' ledger on exception. Here leaving the 'inprogress_x' > ledger metadata in ZooKeeper. When the other node becomes active, he will see > the 'inprogress_x' znode and tries to recoverLastTxId() it would throw > exception, since there is no 'inprogress_ledger' exists. > {noformat} > Caused by: > org.apache.bookkeeper.client.BKException$BKNoSuchLedgerExistsException > at > org.apache.bookkeeper.client.BookKeeper.openLedger(BookKeeper.java:393) > at > org.apache.hadoop.contrib.bkjournal.BookKeeperJournalManager.recoverLastTxId(BookKeeperJournalManager.java:493) > {noformat} > As per the discussion in HDFS-3058, we will handle the coment as part of this > JIRA. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
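One way to avoid the orphaned 'inprogress_x' metadata described in this issue is to remove the znode and the ledger together on failure, so a later recoverLastTxId() never finds metadata pointing at a ledger that no longer exists. The sketch below uses hypothetical interfaces, not the BKJM or BookKeeper API:

```java
import java.util.ArrayList;
import java.util.List;

public class SegmentCleanupSketch {
    interface LedgerStore { void deleteLedger(long ledgerId) throws Exception; }
    interface MetadataStore { void deleteZnode(String path) throws Exception; }

    static void abortSegment(LedgerStore bk, MetadataStore zk,
                             long ledgerId, String inprogressZnode) {
        try {
            zk.deleteZnode(inprogressZnode);  // remove the metadata first ...
        } catch (Exception e) {
            return;  // keep the ledger too, so metadata and data stay consistent
        }
        try {
            bk.deleteLedger(ledgerId);        // ... then the now-unreferenced ledger
        } catch (Exception e) {
            // an orphaned ledger is mere garbage; recovery no longer trips over it
        }
    }

    public static void main(String[] args) {
        List<String> ops = new ArrayList<>();
        abortSegment(id -> ops.add("ledger:" + id),
                     path -> ops.add("znode:" + path),
                     7L, "/ledgers/inprogress_7");
        System.out.println(ops); // the znode is removed before the ledger
    }
}
```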
[jira] [Commented] (HDFS-3562) Handle disconnect and session timeout events at BKJM
[ https://issues.apache.org/jira/browse/HDFS-3562?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13400534#comment-13400534 ] Uma Maheswara Rao G commented on HDFS-3562: --- Just moved this issue to verify an INFRA bug with moving issues from BK to here. > Handle disconnect and session timeout events at BKJM > > > Key: HDFS-3562 > URL: https://issues.apache.org/jira/browse/HDFS-3562 > Project: Hadoop HDFS > Issue Type: Sub-task >Reporter: Vinay >Assignee: Vinay > > # Retry zookeeper operations for some amount of time in case of > CONNECTIONLOSS/OPERATIONTIMEOUT exceptions. > # In case of Session expiry trigger shutdown -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Assigned] (HDFS-3562) Handle disconnect and session timeout events at BKJM
[ https://issues.apache.org/jira/browse/HDFS-3562?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Uma Maheswara Rao G reassigned HDFS-3562: - Assignee: Vinay > Handle disconnect and session timeout events at BKJM > > > Key: HDFS-3562 > URL: https://issues.apache.org/jira/browse/HDFS-3562 > Project: Hadoop HDFS > Issue Type: Sub-task >Reporter: Vinay >Assignee: Vinay > > # Retry zookeeper operations for some amount of time in case of > CONNECTIONLOSS/OPERATIONTIMEOUT exceptions. > # In case of Session expiry trigger shutdown -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HDFS-3562) Handle disconnect and session timeout events at BKJM
[ https://issues.apache.org/jira/browse/HDFS-3562?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Uma Maheswara Rao G updated HDFS-3562: -- Issue Type: Sub-task (was: Bug) Parent: HDFS-3399 > Handle disconnect and session timeout events at BKJM > > > Key: HDFS-3562 > URL: https://issues.apache.org/jira/browse/HDFS-3562 > Project: Hadoop HDFS > Issue Type: Sub-task >Reporter: Vinay > > # Retry zookeeper operations for some amount of time in case of > CONNECTIONLOSS/OPERATIONTIMEOUT exceptions. > # In case of Session expiry trigger shutdown -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HDFS-3550) raid added javadoc warnings
[ https://issues.apache.org/jira/browse/HDFS-3550?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Daryn Sharp updated HDFS-3550: -- Resolution: Fixed Fix Version/s: 3.0.0 Hadoop Flags: Reviewed Status: Resolved (was: Patch Available) I have committed to trunk, thanks Jason! > raid added javadoc warnings > --- > > Key: HDFS-3550 > URL: https://issues.apache.org/jira/browse/HDFS-3550 > Project: Hadoop HDFS > Issue Type: Bug > Components: build >Affects Versions: 3.0.0 >Reporter: Thomas Graves >Assignee: Jason Lowe >Priority: Critical > Fix For: 3.0.0 > > Attachments: HDFS-3550.patch > > > hdfs raid which I believe was introduced by MAPREDUCE-3868 has added the > following javadoc warnings and now all the builds complain about them: > [WARNING] > /home/jenkins/jenkins-slave/workspace/PreCommit-MAPREDUCE-Build/trunk/hadoop-hdfs-project/hadoop-hdfs-raid/src/main/java/org/apache/hadoop/raid/Decoder.java:180: > warning - @param argument "parityFile" is not a parameter name. > [WARNING] > /home/jenkins/jenkins-slave/workspace/PreCommit-MAPREDUCE-Build/trunk/hadoop-hdfs-project/hadoop-hdfs-raid/src/main/java/org/apache/hadoop/raid/DistRaidNode.java:58: > warning - @inheritDocs is an unknown tag. > [WARNING] > /home/jenkins/jenkins-slave/workspace/PreCommit-MAPREDUCE-Build/trunk/hadoop-hdfs-project/hadoop-hdfs-raid/src/main/java/org/apache/hadoop/raid/DistRaidNode.java:71: > warning - @inheritDocs is an unknown tag. > [WARNING] > /home/jenkins/jenkins-slave/workspace/PreCommit-MAPREDUCE-Build/trunk/hadoop-hdfs-project/hadoop-hdfs-raid/src/main/java/org/apache/hadoop/raid/DistRaidNode.java:58: > warning - @inheritDocs is an unknown tag. > [WARNING] > /home/jenkins/jenkins-slave/workspace/PreCommit-MAPREDUCE-Build/trunk/hadoop-hdfs-project/hadoop-hdfs-raid/src/main/java/org/apache/hadoop/raid/DistRaidNode.java:58: > warning - @inheritDocs is an unknown tag. 
> [WARNING] > /home/jenkins/jenkins-slave/workspace/PreCommit-MAPREDUCE-Build/trunk/hadoop-hdfs-project/hadoop-hdfs-raid/src/main/java/org/apache/hadoop/raid/DistRaidNode.java:71: > warning - @inheritDocs is an unknown tag. > [WARNING] > /home/jenkins/jenkins-slave/workspace/PreCommit-MAPREDUCE-Build/trunk/hadoop-hdfs-project/hadoop-hdfs-raid/src/main/java/org/apache/hadoop/raid/DistRaidNode.java:71: > warning - @inheritDocs is an unknown tag. > [WARNING] > /home/jenkins/jenkins-slave/workspace/PreCommit-MAPREDUCE-Build/trunk/hadoop-hdfs-project/hadoop-hdfs-raid/src/main/java/org/apache/hadoop/raid/Encoder.java:340: > warning - @param argument "srcFile" is not a parameter name. > [WARNING] > /home/jenkins/jenkins-slave/workspace/PreCommit-MAPREDUCE-Build/trunk/hadoop-hdfs-project/hadoop-hdfs-raid/src/main/java/org/apache/hadoop/raid/RaidConfigurationException.java:24: > warning - Tag @link: reference not found: CronNode > [WARNING] > /home/jenkins/jenkins-slave/workspace/PreCommit-MAPREDUCE-Build/trunk/hadoop-hdfs-project/hadoop-hdfs-raid/src/main/java/org/apache/hadoop/raid/RaidConfigurationException.java:24: > warning - Tag @link: reference not found: CronNode > [WARNING] > /home/jenkins/jenkins-slave/workspace/PreCommit-MAPREDUCE-Build/trunk/hadoop-hdfs-project/hadoop-hdfs-raid/src/main/java/org/apache/hadoop/raid/DistRaidNode.java:58: > warning - @inheritDocs is an unknown tag. > [WARNING] > /home/jenkins/jenkins-slave/workspace/PreCommit-MAPREDUCE-Build/trunk/hadoop-hdfs-project/hadoop-hdfs-raid/src/main/java/org/apache/hadoop/raid/RaidConfigurationException.java:24: > warning - Tag @link: reference not found: CronNode > [WARNING] > /home/jenkins/jenkins-slave/workspace/PreCommit-MAPREDUCE-Build/trunk/hadoop-hdfs-project/hadoop-hdfs-raid/src/main/java/org/apache/hadoop/raid/DistRaidNode.java:71: > warning - @inheritDocs is an unknown tag. -- This message is automatically generated by JIRA. 
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Moved] (HDFS-3562) Handle disconnect and session timeout events at BKJM
[ https://issues.apache.org/jira/browse/HDFS-3562?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Uma Maheswara Rao G moved BOOKKEEPER-316 to HDFS-3562: -- Target Version/s: 2.0.1-alpha, 3.0.0 Key: HDFS-3562 (was: BOOKKEEPER-316) Project: Hadoop HDFS (was: Bookkeeper) > Handle disconnect and session timeout events at BKJM > > > Key: HDFS-3562 > URL: https://issues.apache.org/jira/browse/HDFS-3562 > Project: Hadoop HDFS > Issue Type: Bug >Reporter: Vinay > > # Retry zookeeper operations for some amount of time in case of > CONNECTIONLOSS/OPERATIONTIMEOUT exceptions. > # In case of Session expiry trigger shutdown -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-3530) TestFileAppend2.testComplexAppend occasionally fails
[ https://issues.apache.org/jira/browse/HDFS-3530?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13400532#comment-13400532 ] Tomohiko Kinebuchi commented on HDFS-3530: -- It seems that this test has failed only once, as far as I can tell from the Jenkins test result history. I have been trying to reproduce the failure but have not succeeded so far, so I am now inspecting the log messages. Does anyone know what this test case tests? > TestFileAppend2.testComplexAppend occasionally fails > > > Key: HDFS-3530 > URL: https://issues.apache.org/jira/browse/HDFS-3530 > Project: Hadoop HDFS > Issue Type: Bug > Components: test >Reporter: Eli Collins >Assignee: Tomohiko Kinebuchi > Attachments: HDFS-3530-for-debug.txt, PreCommit-HADOOP-Build #1116 > test - testComplexAppend.html.gz > > > TestFileAppend2.testComplexAppend occasionally fails with the following: > junit.framework.AssertionFailedError: testComplexAppend Worker encountered > exceptions. > at junit.framework.Assert.fail(Assert.java:47) > at junit.framework.Assert.assertTrue(Assert.java:20) > at > org.apache.hadoop.hdfs.TestFileAppend2.testComplexAppend(TestFileAppend2.java:385) -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HDFS-3547) Handle disconnect and session timeout events at BKJM
[ https://issues.apache.org/jira/browse/HDFS-3547?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Uma Maheswara Rao G updated HDFS-3547: -- Issue Type: Bug (was: Sub-task) Parent: (was: HDFS-3399) > Handle disconnect and session timeout events at BKJM > > > Key: HDFS-3547 > URL: https://issues.apache.org/jira/browse/HDFS-3547 > Project: Hadoop HDFS > Issue Type: Bug >Reporter: Vinay >Assignee: Vinay > > # Retry zookeeper operations for some amount of time in case of > CONNECTIONLOSS/OPERATIONTIMEOUT exceptions. > # In case of Session expiry trigger shutdown -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-3561) ZKFC retries for 45 times to connect to other NN during fencing when network between NNs broken and standby Nn will not take over as active
[ https://issues.apache.org/jira/browse/HDFS-3561?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13400504#comment-13400504 ] Uma Maheswara Rao G commented on HDFS-3561: --- I think we can set retries to 1 or 2 to avoid unnecessary actions on small network fluctuations, or we can set it to 0, since we already set the same values in ConfiguredFailoverProxyProvider for failover clients. {code} public static final String DFS_CLIENT_FAILOVER_CONNECTION_RETRIES_KEY = "dfs.client.failover.connection.retries"; public static final int DFS_CLIENT_FAILOVER_CONNECTION_RETRIES_DEFAULT = 0; public static final String DFS_CLIENT_FAILOVER_CONNECTION_RETRIES_ON_SOCKET_TIMEOUTS_KEY = "dfs.client.failover.connection.retries.on.timeouts"; public static final int DFS_CLIENT_FAILOVER_CONNECTION_RETRIES_ON_SOCKET_TIMEOUTS_DEFAULT = 0; {code} > ZKFC retries for 45 times to connect to other NN during fencing when network > between NNs broken and standby Nn will not take over as active > > > Key: HDFS-3561 > URL: https://issues.apache.org/jira/browse/HDFS-3561 > Project: Hadoop HDFS > Issue Type: Bug > Components: auto-failover >Reporter: suja s >Assignee: Vinay > > Scenario: > Active NN on machine1 > Standby NN on machine2 > Machine1 is isolated from the network (machine1 network cable unplugged) > After zk session timeout ZKFC at machine2 side gets notification that NN1 is > not there. > ZKFC tries to failover NN2 as active. > As part of this during fencing it tries to connect to machine1 and kill NN1. > (sshfence technique configured) > This connection retry happens for 45 times( as it takes > ipc.client.connect.max.socket.retries) > Also after that standby NN is not able to take over as active (because of > fencing failure). 
> Suggestion: If ZKFC is not able to reach other NN for specified time/no of > retries it can consider that NN as dead and instruct the other NN to take > over as active as there is no chance of the other NN (NN1) retaining its > state as active after zk session timeout when its isolated from network > From ZKFC log: > {noformat} > 2012-06-21 17:46:14,378 INFO org.apache.hadoop.ipc.Client: Retrying connect > to server: HOST-xx-xx-xx-102/xx.xx.xx.102:65110. Already tried 22 time(s). > 2012-06-21 17:46:35,378 INFO org.apache.hadoop.ipc.Client: Retrying connect > to server: HOST-xx-xx-xx-102/xx.xx.xx.102:65110. Already tried 23 time(s). > 2012-06-21 17:46:56,378 INFO org.apache.hadoop.ipc.Client: Retrying connect > to server: HOST-xx-xx-xx-102/xx.xx.xx.102:65110. Already tried 24 time(s). > 2012-06-21 17:47:17,378 INFO org.apache.hadoop.ipc.Client: Retrying connect > to server: HOST-xx-xx-xx-102/xx.xx.xx.102:65110. Already tried 25 time(s). > 2012-06-21 17:47:38,382 INFO org.apache.hadoop.ipc.Client: Retrying connect > to server: HOST-xx-xx-xx-102/xx.xx.xx.102:65110. Already tried 26 time(s). > 2012-06-21 17:47:59,382 INFO org.apache.hadoop.ipc.Client: Retrying connect > to server: HOST-xx-xx-xx-102/xx.xx.xx.102:65110. Already tried 27 time(s). > 2012-06-21 17:48:20,386 INFO org.apache.hadoop.ipc.Client: Retrying connect > to server: HOST-xx-xx-xx-102/xx.xx.xx.102:65110. Already tried 28 time(s). > 2012-06-21 17:48:41,386 INFO org.apache.hadoop.ipc.Client: Retrying connect > to server: HOST-xx-xx-xx-102/xx.xx.xx.102:65110. Already tried 29 time(s). > 2012-06-21 17:49:02,386 INFO org.apache.hadoop.ipc.Client: Retrying connect > to server: HOST-xx-xx-xx-102/xx.xx.xx.102:65110. Already tried 30 time(s). > 2012-06-21 17:49:23,386 INFO org.apache.hadoop.ipc.Client: Retrying connect > to server: HOST-xx-xx-xx-102/xx.xx.xx.102:65110. Already tried 31 time(s). > {noformat} > -- This message is automatically generated by JIRA. 
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
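The log excerpt in HDFS-3561 shows retry attempts roughly 21 seconds apart. A quick back-of-the-envelope sketch (the interval is read off the log excerpt, not taken from the actual IPC client code) shows why 45 retries stalls failover for roughly a quarter of an hour:

```java
public class FencingRetryBudget {
    // Worst-case time the ZKFC spends trying to gracefully fence an
    // unreachable NameNode: number of retries times the interval between
    // attempts (no backoff is visible in the log excerpt).
    static long worstCaseSeconds(int retries, long intervalSeconds) {
        return retries * intervalSeconds;
    }

    public static void main(String[] args) {
        long total = worstCaseSeconds(45, 21);
        // 45 retries at ~21s apart: 945s, i.e. more than 15 minutes during
        // which the standby NameNode cannot complete the failover.
        System.out.println(total + "s (~" + (total / 60) + " minutes)");
    }
}
```

This is the delay the commenters propose cutting by lowering the retry count for the ZKFC's graceful-fence attempt, mirroring the 0-retry defaults already used for failover clients in ConfiguredFailoverProxyProvider.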
[jira] [Commented] (HDFS-3550) raid added javadoc warnings
[ https://issues.apache.org/jira/browse/HDFS-3550?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13400501#comment-13400501 ] Daryn Sharp commented on HDFS-3550: --- +1 Thanks for removing the warnings introduced by raid. > raid added javadoc warnings > --- > > Key: HDFS-3550 > URL: https://issues.apache.org/jira/browse/HDFS-3550 > Project: Hadoop HDFS > Issue Type: Bug > Components: build >Affects Versions: 3.0.0 >Reporter: Thomas Graves >Assignee: Jason Lowe >Priority: Critical > Attachments: HDFS-3550.patch > > > hdfs raid which I believe was introduced by MAPREDUCE-3868 has added the > following javadoc warnings and now all the builds complain about them: > [WARNING] > /home/jenkins/jenkins-slave/workspace/PreCommit-MAPREDUCE-Build/trunk/hadoop-hdfs-project/hadoop-hdfs-raid/src/main/java/org/apache/hadoop/raid/Decoder.java:180: > warning - @param argument "parityFile" is not a parameter name. > [WARNING] > /home/jenkins/jenkins-slave/workspace/PreCommit-MAPREDUCE-Build/trunk/hadoop-hdfs-project/hadoop-hdfs-raid/src/main/java/org/apache/hadoop/raid/DistRaidNode.java:58: > warning - @inheritDocs is an unknown tag. > [WARNING] > /home/jenkins/jenkins-slave/workspace/PreCommit-MAPREDUCE-Build/trunk/hadoop-hdfs-project/hadoop-hdfs-raid/src/main/java/org/apache/hadoop/raid/DistRaidNode.java:71: > warning - @inheritDocs is an unknown tag. > [WARNING] > /home/jenkins/jenkins-slave/workspace/PreCommit-MAPREDUCE-Build/trunk/hadoop-hdfs-project/hadoop-hdfs-raid/src/main/java/org/apache/hadoop/raid/DistRaidNode.java:58: > warning - @inheritDocs is an unknown tag. > [WARNING] > /home/jenkins/jenkins-slave/workspace/PreCommit-MAPREDUCE-Build/trunk/hadoop-hdfs-project/hadoop-hdfs-raid/src/main/java/org/apache/hadoop/raid/DistRaidNode.java:58: > warning - @inheritDocs is an unknown tag. 
> [WARNING] > /home/jenkins/jenkins-slave/workspace/PreCommit-MAPREDUCE-Build/trunk/hadoop-hdfs-project/hadoop-hdfs-raid/src/main/java/org/apache/hadoop/raid/DistRaidNode.java:71: > warning - @inheritDocs is an unknown tag. > [WARNING] > /home/jenkins/jenkins-slave/workspace/PreCommit-MAPREDUCE-Build/trunk/hadoop-hdfs-project/hadoop-hdfs-raid/src/main/java/org/apache/hadoop/raid/DistRaidNode.java:71: > warning - @inheritDocs is an unknown tag. > [WARNING] > /home/jenkins/jenkins-slave/workspace/PreCommit-MAPREDUCE-Build/trunk/hadoop-hdfs-project/hadoop-hdfs-raid/src/main/java/org/apache/hadoop/raid/Encoder.java:340: > warning - @param argument "srcFile" is not a parameter name. > [WARNING] > /home/jenkins/jenkins-slave/workspace/PreCommit-MAPREDUCE-Build/trunk/hadoop-hdfs-project/hadoop-hdfs-raid/src/main/java/org/apache/hadoop/raid/RaidConfigurationException.java:24: > warning - Tag @link: reference not found: CronNode > [WARNING] > /home/jenkins/jenkins-slave/workspace/PreCommit-MAPREDUCE-Build/trunk/hadoop-hdfs-project/hadoop-hdfs-raid/src/main/java/org/apache/hadoop/raid/RaidConfigurationException.java:24: > warning - Tag @link: reference not found: CronNode > [WARNING] > /home/jenkins/jenkins-slave/workspace/PreCommit-MAPREDUCE-Build/trunk/hadoop-hdfs-project/hadoop-hdfs-raid/src/main/java/org/apache/hadoop/raid/DistRaidNode.java:58: > warning - @inheritDocs is an unknown tag. > [WARNING] > /home/jenkins/jenkins-slave/workspace/PreCommit-MAPREDUCE-Build/trunk/hadoop-hdfs-project/hadoop-hdfs-raid/src/main/java/org/apache/hadoop/raid/RaidConfigurationException.java:24: > warning - Tag @link: reference not found: CronNode > [WARNING] > /home/jenkins/jenkins-slave/workspace/PreCommit-MAPREDUCE-Build/trunk/hadoop-hdfs-project/hadoop-hdfs-raid/src/main/java/org/apache/hadoop/raid/DistRaidNode.java:71: > warning - @inheritDocs is an unknown tag. -- This message is automatically generated by JIRA. 
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-3554) TestRaidNode is failing
[ https://issues.apache.org/jira/browse/HDFS-3554?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13400500#comment-13400500 ] Robert Joseph Evans commented on HDFS-3554: --- It looks like there is no history server up and running. In YARN there is a race in the client: if the client asks for status while the AM is still up and running, it will talk to the AM. If the AM has exited, which it tends to do when the MR job has completed, the client falls over to the history server. It looks like while you are running under the minicluster there is no corresponding history server to fulfill the request. > TestRaidNode is failing > --- > > Key: HDFS-3554 > URL: https://issues.apache.org/jira/browse/HDFS-3554 > Project: Hadoop HDFS > Issue Type: Bug > Components: contrib/raid, test >Affects Versions: 3.0.0 >Reporter: Jason Lowe >Assignee: Weiyan Wang > > After MAPREDUCE-3868 re-enabled raid, TestRaidNode has been failing in > Jenkins builds. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-3561) ZKFC retries for 45 times to connect to other NN during fencing when network between NNs broken and standby Nn will not take over as active
[ https://issues.apache.org/jira/browse/HDFS-3561?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13400487#comment-13400487 ] Vinay commented on HDFS-3561: - During transition, the old active will be fenced. Before actually using the configured fencing method, graceful fencing is tried: the ZKFC tries to get a proxy to the other machine's NameNode. Since the network is down, it cannot get a connection and retries 45 times, as configured by *ipc.client.connect.max.retries.on.timeouts* {code}LOG.info("Should fence: " + target); boolean gracefulWorked = new FailoverController(conf, RequestSource.REQUEST_BY_ZKFC).tryGracefulFence(target); if (gracefulWorked) { // It's possible that it's in standby but just about to go into active, // no? Is there some race here? LOG.info("Successfully transitioned " + target + " to standby " + "state without fencing"); return; }{code} I think in the ZKFC case we can reduce the number of retries. Any thoughts? > ZKFC retries for 45 times to connect to other NN during fencing when network > between NNs broken and standby Nn will not take over as active > > > Key: HDFS-3561 > URL: https://issues.apache.org/jira/browse/HDFS-3561 > Project: Hadoop HDFS > Issue Type: Bug > Components: auto-failover >Reporter: suja s >Assignee: Vinay > > Scenario: > Active NN on machine1 > Standby NN on machine2 > Machine1 is isolated from the network (machine1 network cable unplugged) > After zk session timeout ZKFC at machine2 side gets notification that NN1 is > not there. > ZKFC tries to failover NN2 as active. > As part of this during fencing it tries to connect to machine1 and kill NN1. > (sshfence technique configured) > This connection retry happens for 45 times( as it takes > ipc.client.connect.max.socket.retries) > Also after that standby NN is not able to take over as active (because of > fencing failure). 
> Suggestion: If ZKFC is not able to reach other NN for specified time/no of > retries it can consider that NN as dead and instruct the other NN to take > over as active as there is no chance of the other NN (NN1) retaining its > state as active after zk session timeout when its isolated from network > From ZKFC log: > {noformat} > 2012-06-21 17:46:14,378 INFO org.apache.hadoop.ipc.Client: Retrying connect > to server: HOST-xx-xx-xx-102/xx.xx.xx.102:65110. Already tried 22 time(s). > 2012-06-21 17:46:35,378 INFO org.apache.hadoop.ipc.Client: Retrying connect > to server: HOST-xx-xx-xx-102/xx.xx.xx.102:65110. Already tried 23 time(s). > 2012-06-21 17:46:56,378 INFO org.apache.hadoop.ipc.Client: Retrying connect > to server: HOST-xx-xx-xx-102/xx.xx.xx.102:65110. Already tried 24 time(s). > 2012-06-21 17:47:17,378 INFO org.apache.hadoop.ipc.Client: Retrying connect > to server: HOST-xx-xx-xx-102/xx.xx.xx.102:65110. Already tried 25 time(s). > 2012-06-21 17:47:38,382 INFO org.apache.hadoop.ipc.Client: Retrying connect > to server: HOST-xx-xx-xx-102/xx.xx.xx.102:65110. Already tried 26 time(s). > 2012-06-21 17:47:59,382 INFO org.apache.hadoop.ipc.Client: Retrying connect > to server: HOST-xx-xx-xx-102/xx.xx.xx.102:65110. Already tried 27 time(s). > 2012-06-21 17:48:20,386 INFO org.apache.hadoop.ipc.Client: Retrying connect > to server: HOST-xx-xx-xx-102/xx.xx.xx.102:65110. Already tried 28 time(s). > 2012-06-21 17:48:41,386 INFO org.apache.hadoop.ipc.Client: Retrying connect > to server: HOST-xx-xx-xx-102/xx.xx.xx.102:65110. Already tried 29 time(s). > 2012-06-21 17:49:02,386 INFO org.apache.hadoop.ipc.Client: Retrying connect > to server: HOST-xx-xx-xx-102/xx.xx.xx.102:65110. Already tried 30 time(s). > 2012-06-21 17:49:23,386 INFO org.apache.hadoop.ipc.Client: Retrying connect > to server: HOST-xx-xx-xx-102/xx.xx.xx.102:65110. Already tried 31 time(s). > {noformat} > -- This message is automatically generated by JIRA. 
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Assigned] (HDFS-3561) ZKFC retries for 45 times to connect to other NN during fencing when network between NNs broken and standby Nn will not take over as active
[ https://issues.apache.org/jira/browse/HDFS-3561?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Uma Maheswara Rao G reassigned HDFS-3561: - Assignee: Vinay Good catch Suja. Thanks for filing the JIRA. > ZKFC retries for 45 times to connect to other NN during fencing when network > between NNs broken and standby Nn will not take over as active > > > Key: HDFS-3561 > URL: https://issues.apache.org/jira/browse/HDFS-3561 > Project: Hadoop HDFS > Issue Type: Bug > Components: auto-failover >Reporter: suja s >Assignee: Vinay > > Scenario: > Active NN on machine1 > Standby NN on machine2 > Machine1 is isolated from the network (machine1 network cable unplugged) > After zk session timeout ZKFC at machine2 side gets notification that NN1 is > not there. > ZKFC tries to failover NN2 as active. > As part of this during fencing it tries to connect to machine1 and kill NN1. > (sshfence technique configured) > This connection retry happens for 45 times( as it takes > ipc.client.connect.max.socket.retries) > Also after that standby NN is not able to take over as active (because of > fencing failure). > Suggestion: If ZKFC is not able to reach other NN for specified time/no of > retries it can consider that NN as dead and instruct the other NN to take > over as active as there is no chance of the other NN (NN1) retaining its > state as active after zk session timeout when its isolated from network > From ZKFC log: > {noformat} > 2012-06-21 17:46:14,378 INFO org.apache.hadoop.ipc.Client: Retrying connect > to server: HOST-xx-xx-xx-102/xx.xx.xx.102:65110. Already tried 22 time(s). > 2012-06-21 17:46:35,378 INFO org.apache.hadoop.ipc.Client: Retrying connect > to server: HOST-xx-xx-xx-102/xx.xx.xx.102:65110. Already tried 23 time(s). > 2012-06-21 17:46:56,378 INFO org.apache.hadoop.ipc.Client: Retrying connect > to server: HOST-xx-xx-xx-102/xx.xx.xx.102:65110. Already tried 24 time(s). 
> 2012-06-21 17:47:17,378 INFO org.apache.hadoop.ipc.Client: Retrying connect > to server: HOST-xx-xx-xx-102/xx.xx.xx.102:65110. Already tried 25 time(s). > 2012-06-21 17:47:38,382 INFO org.apache.hadoop.ipc.Client: Retrying connect > to server: HOST-xx-xx-xx-102/xx.xx.xx.102:65110. Already tried 26 time(s). > 2012-06-21 17:47:59,382 INFO org.apache.hadoop.ipc.Client: Retrying connect > to server: HOST-xx-xx-xx-102/xx.xx.xx.102:65110. Already tried 27 time(s). > 2012-06-21 17:48:20,386 INFO org.apache.hadoop.ipc.Client: Retrying connect > to server: HOST-xx-xx-xx-102/xx.xx.xx.102:65110. Already tried 28 time(s). > 2012-06-21 17:48:41,386 INFO org.apache.hadoop.ipc.Client: Retrying connect > to server: HOST-xx-xx-xx-102/xx.xx.xx.102:65110. Already tried 29 time(s). > 2012-06-21 17:49:02,386 INFO org.apache.hadoop.ipc.Client: Retrying connect > to server: HOST-xx-xx-xx-102/xx.xx.xx.102:65110. Already tried 30 time(s). > 2012-06-21 17:49:23,386 INFO org.apache.hadoop.ipc.Client: Retrying connect > to server: HOST-xx-xx-xx-102/xx.xx.xx.102:65110. Already tried 31 time(s). > {noformat} > -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-3558) OfflineImageViewer throws an NPE
[ https://issues.apache.org/jira/browse/HDFS-3558?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13400461#comment-13400461 ] Daryn Sharp commented on HDFS-3558: --- +1 Although I'd consider putting the annotation {{@VisibleForTesting}} on the method {{processDelegationTokens}} whose scope was relaxed for testing. > OfflineImageViewer throws an NPE > > > Key: HDFS-3558 > URL: https://issues.apache.org/jira/browse/HDFS-3558 > Project: Hadoop HDFS > Issue Type: Bug >Affects Versions: 0.23.3 >Reporter: Ravi Prakash >Assignee: Ravi Prakash > Attachments: HDFS-3558.branch-23.patch > > > Courtesy [~mithun] > Exception in thread "main" java.lang.NullPointerException > at > org.apache.hadoop.security.authentication.util.KerberosName.getShortName(KerberosName.java:371) > at org.apache.hadoop.security.User.(User.java:48) > at org.apache.hadoop.security.User.(User.java:43) > at > org.apache.hadoop.security.UserGroupInformation.createRemoteUser(UserGroupInformation.java:857) > at > org.apache.hadoop.security.token.delegation.AbstractDelegationTokenIdentifier.getUser(AbstractDelegationTokenIdentifier.java:91) > at > org.apache.hadoop.hdfs.security.token.delegation.DelegationTokenIdentifier.toString(DelegationTokenIdentifier.java:61) > at > org.apache.hadoop.hdfs.tools.offlineImageViewer.ImageLoaderCurrent.processDelegationTokens(ImageLoaderCurrent.java:222) > at > org.apache.hadoop.hdfs.tools.offlineImageViewer.ImageLoaderCurrent.loadImage(ImageLoaderCurrent.java:185) > at > org.apache.hadoop.hdfs.tools.offlineImageViewer.OfflineImageViewer.go(OfflineImageViewer.java:129) > at > org.apache.hadoop.hdfs.tools.offlineImageViewer.OfflineImageViewer.main(OfflineImageViewer.java:250) -- This message is automatically generated by JIRA. 
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
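Daryn's suggestion on HDFS-3558 — annotating a method whose visibility was widened only for tests — can be illustrated with a minimal, self-contained sketch. The real annotation is Guava's com.google.common.annotations.VisibleForTesting; a local stand-in is declared here so the example compiles without the Guava dependency, and the method name is a hypothetical placeholder rather than the actual ImageLoaderCurrent signature:

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;

public class VisibleForTestingSketch {
    // Local stand-in for Guava's @VisibleForTesting marker annotation.
    @Retention(RetentionPolicy.RUNTIME)
    @Target(ElementType.METHOD)
    @interface VisibleForTesting {}

    // Scope widened only so tests can call it; the annotation documents
    // that the wider visibility is not part of the public contract.
    @VisibleForTesting
    static int processDelegationTokensForTest() {
        return 0; // placeholder body; the real method parses fsimage tokens
    }

    // True if the method above carries the marker annotation.
    static boolean isMarked() {
        try {
            return VisibleForTestingSketch.class
                .getDeclaredMethod("processDelegationTokensForTest")
                .isAnnotationPresent(VisibleForTesting.class);
        } catch (NoSuchMethodException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println("annotated: " + isMarked());
    }
}
```

The annotation has no runtime effect; its value is that readers (and static-analysis tools that know the Guava annotation) can tell the relaxed scope exists only for testing.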
[jira] [Commented] (HDFS-1469) TestBlockTokenWithDFS fails on trunk
[ https://issues.apache.org/jira/browse/HDFS-1469?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13400377#comment-13400377 ] Junping Du commented on HDFS-1469: -- I just checked on trunk that the unit test TestBlockTokenWithDFS passes. Can somebody mark this resolved and close it? > TestBlockTokenWithDFS fails on trunk > > > Key: HDFS-1469 > URL: https://issues.apache.org/jira/browse/HDFS-1469 > Project: Hadoop HDFS > Issue Type: Bug > Components: name-node >Affects Versions: 0.22.0 >Reporter: Todd Lipcon >Assignee: Konstantin Boudnik >Priority: Blocker > Attachments: failed-TestBlockTokenWithDFS.txt, log.gz > > > TestBlockTokenWithDFS is failing on trunk: > Testcase: testAppend took 31.569 sec > FAILED > null > junit.framework.AssertionFailedError: null > at > org.apache.hadoop.hdfs.server.namenode.TestBlockTokenWithDFS.testAppend(TestBlockTokenWithDFS.java:223) -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-3498) Make ReplicaPlacementPolicyDefault extensible for reuse code in subclass
[ https://issues.apache.org/jira/browse/HDFS-3498?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13400363#comment-13400363 ] Junping Du commented on HDFS-3498: -- Filed a separate JIRA, HADOOP-8526, to fix this issue. > Make ReplicaPlacementPolicyDefault extensible for reuse code in subclass > > > Key: HDFS-3498 > URL: https://issues.apache.org/jira/browse/HDFS-3498 > Project: Hadoop HDFS > Issue Type: Bug > Components: data-node >Affects Versions: 1.0.0, 2.0.0-alpha >Reporter: Junping Du >Assignee: Junping Du > Attachments: HDFS-3498-v2.patch, HDFS-3498-v3.patch, > HDFS-3498-v4.patch, HDFS-3498.patch, > Hadoop-8471-BlockPlacementDefault-extensible.patch > > > ReplicaPlacementPolicy is already a pluggable component in Hadoop. However, > the replica removal policy is still nested in BlockManager and needs to be > separated out into a ReplicaPlacementPolicy so it can be overridden later. Also, > the Hadoop unit tests lack coverage of the replica removal policy, so we add it here. > On the other hand, as an implementation of ReplicaPlacementPolicy, > ReplicaPlacementPolicyDefault is still largely generic across other topology cases > like virtualization, and we want its code to be reusable as much as possible, > so a few of its methods were changed from private to protected. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Created] (HDFS-3561) ZKFC retries 45 times to connect to the other NN during fencing when the network between NNs is broken, and the standby NN will not take over as active
suja s created HDFS-3561: Summary: ZKFC retries 45 times to connect to the other NN during fencing when the network between NNs is broken, and the standby NN will not take over as active Key: HDFS-3561 URL: https://issues.apache.org/jira/browse/HDFS-3561 Project: Hadoop HDFS Issue Type: Bug Components: auto-failover Reporter: suja s Scenario: Active NN on machine1, standby NN on machine2. Machine1 is isolated from the network (machine1 network cable unplugged). After the ZK session timeout, the ZKFC on machine2 is notified that NN1 is gone and tries to fail over, making NN2 active. As part of fencing it tries to connect to machine1 and kill NN1 (sshfence technique configured). This connection is retried 45 times (as governed by ipc.client.connect.max.socket.retries). After that, the standby NN is still unable to take over as active (because of the fencing failure). Suggestion: if ZKFC cannot reach the other NN within a specified time or number of retries, it can consider that NN dead and instruct the local NN to take over as active, since there is no chance of the other NN (NN1) retaining its active state after the ZK session timeout while it is isolated from the network. From the ZKFC log: {noformat}
2012-06-21 17:46:14,378 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: HOST-xx-xx-xx-102/xx.xx.xx.102:65110. Already tried 22 time(s).
2012-06-21 17:46:35,378 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: HOST-xx-xx-xx-102/xx.xx.xx.102:65110. Already tried 23 time(s).
2012-06-21 17:46:56,378 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: HOST-xx-xx-xx-102/xx.xx.xx.102:65110. Already tried 24 time(s).
2012-06-21 17:47:17,378 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: HOST-xx-xx-xx-102/xx.xx.xx.102:65110. Already tried 25 time(s).
2012-06-21 17:47:38,382 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: HOST-xx-xx-xx-102/xx.xx.xx.102:65110. Already tried 26 time(s).
2012-06-21 17:47:59,382 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: HOST-xx-xx-xx-102/xx.xx.xx.102:65110. Already tried 27 time(s).
2012-06-21 17:48:20,386 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: HOST-xx-xx-xx-102/xx.xx.xx.102:65110. Already tried 28 time(s).
2012-06-21 17:48:41,386 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: HOST-xx-xx-xx-102/xx.xx.xx.102:65110. Already tried 29 time(s).
2012-06-21 17:49:02,386 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: HOST-xx-xx-xx-102/xx.xx.xx.102:65110. Already tried 30 time(s).
2012-06-21 17:49:23,386 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: HOST-xx-xx-xx-102/xx.xx.xx.102:65110. Already tried 31 time(s). {noformat}
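The retry count the reporter traces to ipc.client.connect.max.socket.retries is a client-side setting, so one mitigation implied by the report is bounding it so fencing gives up on an unreachable NameNode sooner. A minimal, illustrative fragment (the value shown is an example, not a tested recommendation):

```xml
<!-- core-site.xml (illustrative fragment, assuming this is the property
     driving the ~45 fencing retries observed in the ZKFC log above).
     Lowering it bounds how long sshfence spends retrying a connection
     to an unreachable NameNode; 10 is an example value only. -->
<property>
  <name>ipc.client.connect.max.socket.retries</name>
  <value>10</value>
</property>
```

This only shortens the stall; the issue's actual suggestion, letting ZKFC declare the isolated NN dead after a bounded number of retries, would require a code change rather than configuration.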
[jira] [Commented] (HDFS-3498) Make ReplicaPlacementPolicyDefault extensible for reuse code in subclass
[ https://issues.apache.org/jira/browse/HDFS-3498?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13400322#comment-13400322 ] Junping Du commented on HDFS-3498: -- I took a look at the javadoc check log: https://builds.apache.org/job/PreCommit-HDFS-Build/2693/artifact/trunk/patchprocess/patchJavadocWarnings.txt and it looks like all 13 javadoc warnings are in hadoop-hdfs-raid. Is there anything I should fix in this patch?