[jira] [Created] (HDFS-3644) OEV should recognize and deal with 0.20.20x opcode versions
Todd Lipcon created HDFS-3644: - Summary: OEV should recognize and deal with 0.20.20x opcode versions Key: HDFS-3644 URL: https://issues.apache.org/jira/browse/HDFS-3644 Project: Hadoop HDFS Issue Type: Bug Components: tools Affects Versions: 2.0.0-alpha, 3.0.0 Reporter: Todd Lipcon Priority: Minor We have some opcode conflicts for edit logs between 0.20.20x (LV -19, -31) vs newer versions. For edit log loading, we dealt with this by forcing users to save namespace on an earlier version before upgrading. But, using a trunk OEV on an older version is useful since the OEV has had so many improvements. It would be nice to be able to specify a flag to the OEV to be able to run on older edit logs. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-799) libhdfs must call DetachCurrentThread when a thread is destroyed
[ https://issues.apache.org/jira/browse/HDFS-799?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13412541#comment-13412541 ] Hadoop QA commented on HDFS-799: -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12536171/HDFS-799.005.patch against trunk revision . +1 @author. The patch does not contain any @author tags. -1 tests included. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. +1 javac. The applied patch does not increase the total number of javac compiler warnings. +1 javadoc. The javadoc tool did not generate any warning messages. +1 eclipse:eclipse. The patch built with eclipse:eclipse. +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. -1 core tests. The patch failed these unit tests in hadoop-hdfs-project/hadoop-hdfs: org.apache.hadoop.hdfs.TestDatanodeBlockScanner +1 contrib tests. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/2804//testReport/ Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/2804//console This message is automatically generated. libhdfs must call DetachCurrentThread when a thread is destroyed Key: HDFS-799 URL: https://issues.apache.org/jira/browse/HDFS-799 Project: Hadoop HDFS Issue Type: Improvement Affects Versions: 2.0.0-alpha Reporter: Christian Kunz Assignee: Colin Patrick McCabe Attachments: HDFS-799.001.patch, HDFS-799.003.patch, HDFS-799.004.patch, HDFS-799.005.patch Threads that call AttachCurrentThread in libhdfs and disappear without calling DetachCurrentThread cause a memory leak. Libhdfs should detach the current thread when this thread exits. 
[jira] [Commented] (HDFS-3606) libhdfs: create self-contained unit test
[ https://issues.apache.org/jira/browse/HDFS-3606?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13412544#comment-13412544 ] Hadoop QA commented on HDFS-3606: - -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12536173/HDFS-3606.004.patch against trunk revision . +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 2 new or modified test files. +1 javac. The applied patch does not increase the total number of javac compiler warnings. +1 javadoc. The javadoc tool did not generate any warning messages. +1 eclipse:eclipse. The patch built with eclipse:eclipse. +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. -1 core tests. The patch failed these unit tests in hadoop-hdfs-project/hadoop-hdfs: org.apache.hadoop.hdfs.server.namenode.TestBackupNode org.apache.hadoop.hdfs.server.common.TestJspHelper +1 contrib tests. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/2805//testReport/ Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/2805//console This message is automatically generated. libhdfs: create self-contained unit test Key: HDFS-3606 URL: https://issues.apache.org/jira/browse/HDFS-3606 Project: Hadoop HDFS Issue Type: Test Components: libhdfs Affects Versions: 2.0.1-alpha Reporter: Colin Patrick McCabe Assignee: Colin Patrick McCabe Priority: Minor Attachments: HDFS-3606.001.patch, HDFS-3606.003.patch, HDFS-3606.004.patch We should have a self-contained unit test for libhdfs and also for FUSE. We do have hdfs_test, but it is not self-contained (it requires a cluster to already be running before it can be used.) -- This message is automatically generated by JIRA. 
[jira] [Updated] (HDFS-3497) Update Balancer for performance optimization with Node Group- choose the target and source node on the same node group for balancing as the first priority
[ https://issues.apache.org/jira/browse/HDFS-3497?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Junping Du updated HDFS-3497: - Attachment: HDFS-3497.patch Update Balancer for performance optimization with Node Group- choose the target and source node on the same node group for balancing as the first priority --- Key: HDFS-3497 URL: https://issues.apache.org/jira/browse/HDFS-3497 Project: Hadoop HDFS Issue Type: Sub-task Components: balancer Affects Versions: 1.0.0, 2.0.0-alpha Reporter: Junping Du Assignee: Junping Du Attachments: HDFS-3497.patch
[jira] [Updated] (HDFS-3497) Update Balancer for performance optimization with Node Group- choose the target and source node on the same node group for balancing as the first priority
[ https://issues.apache.org/jira/browse/HDFS-3497?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Junping Du updated HDFS-3497: - Status: Patch Available (was: Open) Update Balancer for performance optimization with Node Group- choose the target and source node on the same node group for balancing as the first priority --- Key: HDFS-3497 URL: https://issues.apache.org/jira/browse/HDFS-3497 Project: Hadoop HDFS Issue Type: Sub-task Components: balancer Affects Versions: 2.0.0-alpha, 1.0.0 Reporter: Junping Du Assignee: Junping Du Attachments: HDFS-3497.patch
[jira] [Commented] (HDFS-3497) Update Balancer for performance optimization with Node Group- choose the target and source node on the same node group for balancing as the first priority
[ https://issues.apache.org/jira/browse/HDFS-3497?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13412570#comment-13412570 ] Junping Du commented on HDFS-3497: -- This patch adds an additional nodegroup layer to the Balancer. It is only one part (the major part); the other part is tracked in HDFS-3496. Update Balancer for performance optimization with Node Group- choose the target and source node on the same node group for balancing as the first priority --- Key: HDFS-3497 URL: https://issues.apache.org/jira/browse/HDFS-3497 Project: Hadoop HDFS Issue Type: Sub-task Components: balancer Affects Versions: 1.0.0, 2.0.0-alpha Reporter: Junping Du Assignee: Junping Du Attachments: HDFS-3497.patch
[jira] [Updated] (HDFS-3497) Update Balancer policy with NodeGroup layer
[ https://issues.apache.org/jira/browse/HDFS-3497?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Junping Du updated HDFS-3497: - Description: 1. Make sure the Network Topology and BlockPlacementPolicy checks in the balancer are compatible with the new ones that add a NodeGroup layer. 2. Update the balancer policy for performance optimization with Node Group - choose the target and source node on the same node group for balancing as the first priority. 3. Make sure the balancing policy will not reduce reliability in environments with node groups (virtualization) by verifying good targets based on the NodeGroup relationship. (This part of the work is separated out and tracked in HDFS-3496) Summary: Update Balancer policy with NodeGroup layer (was: Update Balancer for performance optimization with Node Group- choose the target and source node on the same node group for balancing as the first priority) Update Balancer policy with NodeGroup layer --- Key: HDFS-3497 URL: https://issues.apache.org/jira/browse/HDFS-3497 Project: Hadoop HDFS Issue Type: Sub-task Components: balancer Affects Versions: 1.0.0, 2.0.0-alpha Reporter: Junping Du Assignee: Junping Du Attachments: HDFS-3497.patch
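The "same node group first" priority described above can be sketched as a tiered target selection. This is a minimal illustrative sketch only, not the actual Balancer code; the `Node` class and `chooseTarget` method are hypothetical names:

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch of the pairing priority from HDFS-3497: when matching
// an over-utilized source with an under-utilized target, prefer a target in
// the same node group, then the same rack, then any remaining candidate.
public class NodeGroupPairing {
    public static class Node {
        public final String name, rack, nodeGroup;
        public Node(String name, String rack, String nodeGroup) {
            this.name = name; this.rack = rack; this.nodeGroup = nodeGroup;
        }
    }

    public static Node chooseTarget(Node source, List<Node> candidates) {
        for (Node t : candidates)            // 1st priority: same node group
            if (t.nodeGroup.equals(source.nodeGroup)) return t;
        for (Node t : candidates)            // 2nd priority: same rack
            if (t.rack.equals(source.rack)) return t;
        return candidates.isEmpty() ? null : candidates.get(0); // fallback: any
    }

    public static void main(String[] args) {
        Node src = new Node("dn1", "/rack1", "/rack1/ng1");
        List<Node> cands = Arrays.asList(
            new Node("dn2", "/rack2", "/rack2/ng3"),
            new Node("dn3", "/rack1", "/rack1/ng1"));
        // dn3 shares src's node group, so it wins even though dn2 is listed first.
        System.out.println(chooseTarget(src, cands).name);
    }
}
```

Moving blocks within a node group avoids cross-rack (or cross-hypervisor) traffic, which is why it is the first priority for balancing performance.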
[jira] [Commented] (HDFS-3497) Update Balancer policy with NodeGroup layer
[ https://issues.apache.org/jira/browse/HDFS-3497?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13412613#comment-13412613 ] Hadoop QA commented on HDFS-3497: - -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12536183/HDFS-3497.patch against trunk revision . +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 3 new or modified test files. +1 javac. The applied patch does not increase the total number of javac compiler warnings. +1 javadoc. The javadoc tool did not generate any warning messages. +1 eclipse:eclipse. The patch built with eclipse:eclipse. +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. -1 core tests. The patch failed these unit tests in hadoop-hdfs-project/hadoop-hdfs: org.apache.hadoop.hdfs.server.namenode.TestBackupNode org.apache.hadoop.hdfs.server.common.TestJspHelper +1 contrib tests. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/2806//testReport/ Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/2806//console This message is automatically generated. Update Balancer policy with NodeGroup layer --- Key: HDFS-3497 URL: https://issues.apache.org/jira/browse/HDFS-3497 Project: Hadoop HDFS Issue Type: Sub-task Components: balancer Affects Versions: 1.0.0, 2.0.0-alpha Reporter: Junping Du Assignee: Junping Du Attachments: HDFS-3497.patch 1. Make sure Network Topology and BlockPlacementPolicy check in balancer is compatible with new one with adding NodeGroup layer. 2. Update balancer policy for performance optimization with Node Group - choose the target and source node on the same node group for balancing as the first priority. 3. 
Make sure the balancing policy will not reduce reliability in environments with node groups (virtualization) by verifying good targets based on the NodeGroup relationship. (This part of the work is separated out and tracked in HDFS-3496)
[jira] [Commented] (HDFS-3477) FormatZK and ZKFC startup can fail due to zkclient connection establishment delay
[ https://issues.apache.org/jira/browse/HDFS-3477?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13412675#comment-13412675 ] Rakesh R commented on HDFS-3477: Added links to HDFS-3635 as I feel the cause is same and failing after timeout: {code}java.lang.Exception: test timed out after 3 milliseconds at java.lang.Object.wait(Native Method) at org.apache.hadoop.ha.ZKFailoverController.waitForActiveAttempt(ZKFailoverController.java:457) at org.apache.hadoop.ha.ZKFailoverController.doGracefulFailover(ZKFailoverController.java:645) at org.apache.hadoop.ha.ZKFailoverController.access$400(ZKFailoverController.java:58) at org.apache.hadoop.ha.ZKFailoverController$3.run(ZKFailoverController.java:590) at org.apache.hadoop.ha.ZKFailoverController$3.run(ZKFailoverController.java:587){code} FormatZK and ZKFC startup can fail due to zkclient connection establishment delay - Key: HDFS-3477 URL: https://issues.apache.org/jira/browse/HDFS-3477 Project: Hadoop HDFS Issue Type: Sub-task Components: auto-failover Affects Versions: 2.0.1-alpha Reporter: suja s Assignee: Rakesh R Attachments: HDFS-3477.1.patch, HDFS-3477.2.patch, HDFS-3477.3.patch, HDFS-3477.3.patch, HDFS-3477.patch Format and ZKFC startup flows continue further after creation of zkclient connection without waiting to check whether the connection is completely established. This leads to failure at the subsequent point if connection was not complete by then. 
Exception trace for format {noformat} 12/05/30 19:48:24 INFO zookeeper.ClientCnxn: Socket connection established to HOST-xx-xx-xx-55/xx.xx.xx.55:2182, initiating session 12/05/30 19:48:24 INFO zookeeper.ClientCnxn: Session establishment complete on server HOST-xx-xx-xx-55/xx.xx.xx.55:2182, sessionid = 0x1379da4660c0014, negotiated timeout = 5000 12/05/30 19:48:24 WARN ha.ActiveStandbyElector: Ignoring stale result from old client with sessionId 0x1379da4660c0014 12/05/30 19:48:24 INFO zookeeper.ZooKeeper: Session: 0x1379da4660c0014 closed 12/05/30 19:48:24 INFO zookeeper.ClientCnxn: EventThread shut down Exception in thread main java.io.IOException: Couldn't determine existence of znode '/hadoop-ha/hacluster' at org.apache.hadoop.ha.ActiveStandbyElector.parentZNodeExists(ActiveStandbyElector.java:263) at org.apache.hadoop.ha.ZKFailoverController.formatZK(ZKFailoverController.java:257) at org.apache.hadoop.ha.ZKFailoverController.doRun(ZKFailoverController.java:195) at org.apache.hadoop.ha.ZKFailoverController.access$000(ZKFailoverController.java:58) at org.apache.hadoop.ha.ZKFailoverController$1.run(ZKFailoverController.java:163) at org.apache.hadoop.ha.ZKFailoverController$1.run(ZKFailoverController.java:159) at org.apache.hadoop.security.SecurityUtil.doAsLoginUserOrFatal(SecurityUtil.java:438) at org.apache.hadoop.ha.ZKFailoverController.run(ZKFailoverController.java:159) at org.apache.hadoop.hdfs.tools.DFSZKFailoverController.main(DFSZKFailoverController.java:171) Caused by: org.apache.zookeeper.KeeperException$ConnectionLossException: KeeperErrorCode = ConnectionLoss for /hadoop-ha/hacluster at org.apache.zookeeper.KeeperException.create(KeeperException.java:99) at org.apache.zookeeper.KeeperException.create(KeeperException.java:51) at org.apache.zookeeper.ZooKeeper.exists(ZooKeeper.java:1021) at org.apache.zookeeper.ZooKeeper.exists(ZooKeeper.java:1049) at org.apache.hadoop.ha.ActiveStandbyElector.parentZNodeExists(ActiveStandbyElector.java:261) ... 
8 more {noformat}
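The failure mode above is proceeding to use a ZooKeeper session before the connection is fully established. The general fix direction is to block on the connection-established event (with a timeout) before issuing any requests. A minimal self-contained sketch of that wait pattern, not the actual ZKFC code (class and method names here are illustrative):

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

// Illustrative sketch: gate all ZK operations behind a latch that is released
// from the client's event thread once the session is established (e.g. a
// ZooKeeper Watcher observing KeeperState.SyncConnected), and fail fast with
// a clear error if the connection does not come up within the timeout.
public class ConnectionGate {
    private final CountDownLatch connected = new CountDownLatch(1);

    // Called from the event thread when the session is established.
    public void onConnected() { connected.countDown(); }

    // Called before the first ZK operation (e.g. formatZK's exists() check).
    public void awaitConnection(long timeoutMs) throws InterruptedException {
        if (!connected.await(timeoutMs, TimeUnit.MILLISECONDS)) {
            throw new IllegalStateException(
                "ZK connection not established within " + timeoutMs + " ms");
        }
    }

    public static void main(String[] args) throws Exception {
        ConnectionGate gate = new ConnectionGate();
        new Thread(gate::onConnected).start(); // simulated event thread
        gate.awaitConnection(5000);            // proceed only once connected
        System.out.println("connected");
    }
}
```

Without such a gate, the `exists()` call races the session handshake and surfaces as the `ConnectionLossException` in the trace above.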
[jira] [Created] (HDFS-3645) Improve the way we do detection of a busy DN in the cluster, when choosing it for a block write
Harsh J created HDFS-3645: - Summary: Improve the way we do detection of a busy DN in the cluster, when choosing it for a block write Key: HDFS-3645 URL: https://issues.apache.org/jira/browse/HDFS-3645 Project: Hadoop HDFS Issue Type: Improvement Components: name-node Affects Versions: 2.0.0-alpha Reporter: Harsh J Priority: Minor Right now, I think we do too naive a computation for detecting if a chosen DN target is busy by itself. We currently do {{node.getXceiverCount() > (2.0 * avgLoad)}}. We should improve on this computation with a more realistic measure of whether a DN is really busy by itself or not (rather than checking against the cluster average, where there's a good chance the value can be wrong to compare with, for some cases).
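The check quoted above reduces to a single comparison against twice the cluster-wide average. A simplified sketch (not the actual NameNode block placement code) of what that load test computes:

```java
// Sketch of the load check described in HDFS-3645 (simplified; the real
// logic lives in the NameNode's block placement code): a DataNode is
// considered too busy to receive a block write when its active transceiver
// (xceiver) count exceeds twice the cluster-wide average per node.
public class BusyNodeCheck {
    public static boolean isOverloaded(int xceiverCount, double avgXceiversPerNode) {
        return xceiverCount > 2.0 * avgXceiversPerNode;
    }

    public static void main(String[] args) {
        // With a cluster average of 10 xceivers per DN:
        System.out.println(isOverloaded(25, 10.0)); // true  -> skipped as a target
        System.out.println(isOverloaded(15, 10.0)); // false -> eligible
    }
}
```

The issue's point is that this average is a weak baseline: in small or unevenly loaded clusters, a node can be flagged busy (or not) purely because the average itself is unrepresentative.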
[jira] [Commented] (HDFS-3497) Update Balancer policy with NodeGroup layer
[ https://issues.apache.org/jira/browse/HDFS-3497?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13412703#comment-13412703 ] Junping Du commented on HDFS-3497: -- The test failures are tracked by HDFS-3625 and are not related to this patch. Update Balancer policy with NodeGroup layer --- Key: HDFS-3497 URL: https://issues.apache.org/jira/browse/HDFS-3497 Project: Hadoop HDFS Issue Type: Sub-task Components: balancer Affects Versions: 1.0.0, 2.0.0-alpha Reporter: Junping Du Assignee: Junping Du Attachments: HDFS-3497.patch 1. Make sure the Network Topology and BlockPlacementPolicy checks in the balancer are compatible with the new ones that add a NodeGroup layer. 2. Update the balancer policy for performance optimization with Node Group - choose the target and source node on the same node group for balancing as the first priority. 3. Make sure the balancing policy will not reduce reliability in environments with node groups (virtualization) by verifying good targets based on the NodeGroup relationship. (This part of the work is separated out and tracked in HDFS-3496)
[jira] [Commented] (HDFS-3615) Two BlockTokenSecretManager findbugs warnings
[ https://issues.apache.org/jira/browse/HDFS-3615?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13412714#comment-13412714 ] Hudson commented on HDFS-3615: -- Integrated in Hadoop-Hdfs-trunk #1101 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk/1101/]) HDFS-3615. Two BlockTokenSecretManager findbugs warnings. Contributed by Aaron T. Myers. (Revision 1360255) Result = FAILURE atm : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1360255 Files : * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/security/token/block/BlockTokenSecretManager.java Two BlockTokenSecretManager findbugs warnings - Key: HDFS-3615 URL: https://issues.apache.org/jira/browse/HDFS-3615 Project: Hadoop HDFS Issue Type: Bug Components: security Affects Versions: 2.0.0-alpha Reporter: Eli Collins Assignee: Aaron T. Myers Fix For: 2.0.1-alpha Attachments: HDFS-3615.patch Looks like two findbugs warnings were introduced recently (seen across a couple of recent patches). Unclear what change introduced them, as the file hasn't been modified and recently committed changes pass the findbugs check. IS: Inconsistent synchronization of org.apache.hadoop.hdfs.security.token.block.BlockTokenSecretManager.keyUpdateInterval; locked 75% of time IS: Inconsistent synchronization of org.apache.hadoop.hdfs.security.token.block.BlockTokenSecretManager.serialNo; locked 75% of time
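The "inconsistent synchronization; locked 75% of time" warning means a field is accessed both with and without the same lock held. A generic illustration of the pattern and the usual fix (this is not the actual HDFS-3615 patch; the class and field names just mirror the warning):

```java
// Findbugs flags a field when, say, 3 of 4 accesses hold the object's lock
// and one does not ("locked 75% of time"). The standard fix is to guard
// every access with the same lock, as below, or declare the field volatile.
public class GuardedField {
    private long keyUpdateInterval;  // guarded by "this"

    public synchronized void setKeyUpdateInterval(long v) {
        keyUpdateInterval = v;
    }

    // If this getter were unsynchronized, reads could race the writer and
    // the IS warning would fire; synchronizing it makes locking consistent.
    public synchronized long getKeyUpdateInterval() {
        return keyUpdateInterval;
    }

    public static void main(String[] args) {
        GuardedField g = new GuardedField();
        g.setKeyUpdateInterval(600_000L);
        System.out.println(g.getKeyUpdateInterval());
    }
}
```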
[jira] [Commented] (HDFS-3582) Hook daemon process exit for testing
[ https://issues.apache.org/jira/browse/HDFS-3582?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13412713#comment-13412713 ] Hudson commented on HDFS-3582: -- Integrated in Hadoop-Hdfs-trunk #1101 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk/1101/]) HDFS-3582. Hook daemon process exit for testing. Contributed by Eli Collins (Revision 1360329) Result = FAILURE eli : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVNview=revrev=1360329 Files : * /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/util/ExitUtil.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/contrib/bkjournal/src/test/java/org/apache/hadoop/contrib/bkjournal/TestBookKeeperAsHASharedDir.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/contrib/bkjournal/src/test/java/org/apache/hadoop/hdfs/server/namenode/FSEditLogTestUtil.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockManager.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/DataNode.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSEditLog.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/JournalSet.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/NameNode.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/SecondaryNameNode.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/ha/EditLogTailer.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/MiniDFSCluster.java * 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/TestClusterId.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/TestEditLogJournalFailures.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/ha/TestFailureOfSharedDir.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/ha/TestFailureToReadEdits.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/ha/TestHASafeMode.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/ha/TestStandbyIsHot.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/ha/TestStateTransitionFailure.java Hook daemon process exit for testing - Key: HDFS-3582 URL: https://issues.apache.org/jira/browse/HDFS-3582 Project: Hadoop HDFS Issue Type: Improvement Components: test Affects Versions: 2.0.0-alpha Reporter: Eli Collins Assignee: Eli Collins Priority: Minor Fix For: 2.0.1-alpha Attachments: hdfs-3582.txt, hdfs-3582.txt, hdfs-3582.txt, hdfs-3582.txt, hdfs-3582.txt, hdfs-3582.txt, hdfs-3582.txt, hdfs-3582.txt, hdfs-3582.txt, hdfs-3582.txt, hdfs-3582.txt Occasionally the tests fail with java.util.concurrent.ExecutionException: org.apache.maven.surefire.booter.SurefireBooterForkException: Error occurred in starting fork, check output in log because the NN is exit'ing (via System#exit or Runtime#exit). Unfortunately Surefire doesn't retain the log output (see SUREFIRE-871) so the test log is empty, we don't know which part of the test triggered which exit in HDFS. To make this easier to debug let's hook all daemon process exits when running the tests. -- This message is automatically generated by JIRA. 
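The "hook daemon process exit" idea above can be sketched in a few lines. This is a self-contained approximation modeled on the pattern of Hadoop's `ExitUtil` (the class below is illustrative, not the committed code): daemons call `terminate()` instead of `System.exit()`, and tests disable the real exit so the call surfaces as a catchable exception carrying the status and message.

```java
// Sketch of an exit hook for testability: in production terminate() exits
// the JVM; in tests it throws, so the test log records which code path
// tried to exit and why, instead of the forked JVM silently dying.
public class ExitHook {
    public static class ExitException extends RuntimeException {
        public final int status;
        ExitException(int status, String msg) { super(msg); this.status = status; }
    }

    private static volatile boolean systemExitDisabled = false;

    public static void disableSystemExit() { systemExitDisabled = true; }

    public static void terminate(int status, String msg) {
        if (systemExitDisabled) {
            throw new ExitException(status, msg); // tests can catch and assert
        }
        System.exit(status);
    }

    public static void main(String[] args) {
        disableSystemExit();
        try {
            terminate(1, "simulated fatal error in NameNode");
        } catch (ExitException e) {
            System.out.println("caught exit: status=" + e.status
                + " msg=" + e.getMessage());
        }
    }
}
```

This directly addresses the Surefire symptom described above: with the hook, the exit reason lands in the test output rather than being lost when the fork dies.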
[jira] [Commented] (HDFS-3639) JspHelper#getUGI should always verify the token if security is enabled
[ https://issues.apache.org/jira/browse/HDFS-3639?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13412716#comment-13412716 ] Hudson commented on HDFS-3639: -- Integrated in Hadoop-Hdfs-trunk #1101 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk/1101/]) HDFS-3639. JspHelper#getUGI should always verify the token if security is enabled. Contributed by Eli Collins (Revision 1360485) Result = FAILURE eli : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1360485 Files : * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/common/JspHelper.java JspHelper#getUGI should always verify the token if security is enabled -- Key: HDFS-3639 URL: https://issues.apache.org/jira/browse/HDFS-3639 Project: Hadoop HDFS Issue Type: Bug Components: security Affects Versions: 1.0.0, 2.0.0-alpha Reporter: Eli Collins Assignee: Eli Collins Priority: Minor Fix For: 1.2.0, 2.0.1-alpha Attachments: hdfs-3639-b1.txt, hdfs-3639.txt JspHelper#getUGI only verifies the given token if the context and nn are set (added in HDFS-2416). We should unconditionally verify the token, i.e. a bug where name.node is not set in the context object should not result in the token not being verified. In practice this shouldn't be an issue, as per HDFS-3434 the context and NN should never be null.
[jira] [Created] (HDFS-3646) LeaseRenewer can hold reference to inactive DFSClient instances
Kihwal Lee created HDFS-3646: Summary: LeaseRenewer can hold reference to inactive DFSClient instances Key: HDFS-3646 URL: https://issues.apache.org/jira/browse/HDFS-3646 Project: Hadoop HDFS Issue Type: Bug Components: hdfs client Affects Versions: 2.0.0-alpha, 0.23.3 Reporter: Kihwal Lee Assignee: Kihwal Lee Priority: Critical Fix For: 0.23.3, 2.0.1-alpha, 3.0.0 If {{LeaseRenewer#closeClient()}} is not called, {{LeaseRenewer}} keeps the reference to a {{DFSClient}} instance in {{dfsclients}} forever. This prevents {{DFSClient}}, {{LeaseRenewer}}, conf, etc. from being garbage collected, leading to a memory leak. {{LeaseRenewer}} should remove the reference after some delay if a {{DFSClient}} instance no longer has active streams.
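The fix direction stated above (drop the reference after some delay once a client has no active streams) can be sketched as an idle-expiry sweep. This is a hypothetical illustration, not the actual LeaseRenewer code; the class, method names, and grace period are invented:

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch of the HDFS-3646 fix direction: remember when each
// client last had open streams, and on each sweep drop any client that has
// been idle longer than a grace period, so the renewer no longer pins
// inactive DFSClient instances (and their conf, etc.) in memory forever.
public class ClientRegistry {
    static final long GRACE_PERIOD_MS = 30_000; // illustrative value

    // client name -> time (ms) at which it last had active streams
    private final Map<String, Long> lastActive = new HashMap<>();

    public synchronized void recordActivity(String client, long nowMs) {
        lastActive.put(client, nowMs);
    }

    /** Drop clients idle longer than the grace period; return how many remain. */
    public synchronized int expireIdle(long nowMs) {
        lastActive.values().removeIf(t -> nowMs - t > GRACE_PERIOD_MS);
        return lastActive.size();
    }

    public static void main(String[] args) {
        ClientRegistry r = new ClientRegistry();
        r.recordActivity("client-a", 0);
        r.recordActivity("client-b", 40_000);
        // At t=50s, client-a has been idle 50s (> grace) and is dropped;
        // client-b has been idle 10s and survives.
        System.out.println(r.expireIdle(50_000));
    }
}
```

Holding strong references keyed only by registration, with no idle expiry, is exactly the leak pattern the issue describes.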
[jira] [Updated] (HDFS-3646) LeaseRenewer can hold reference to inactive DFSClient instances forever
[ https://issues.apache.org/jira/browse/HDFS-3646?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kihwal Lee updated HDFS-3646: - Summary: LeaseRenewer can hold reference to inactive DFSClient instances forever (was: LeaseRenewer can hold reference to inactive DFSClient instances) LeaseRenewer can hold reference to inactive DFSClient instances forever --- Key: HDFS-3646 URL: https://issues.apache.org/jira/browse/HDFS-3646 Project: Hadoop HDFS Issue Type: Bug Components: hdfs client Affects Versions: 0.23.3, 2.0.0-alpha Reporter: Kihwal Lee Assignee: Kihwal Lee Priority: Critical Fix For: 0.23.3, 2.0.1-alpha, 3.0.0 If {{LeaseRenewer#closeClient()}} is not called, {{LeaseRenewer}} keeps the reference to a {{DFSClient}} instance in {{dfsclients}} forever. This prevents {{DFSClient}}, {{LeaseRenewer}}, conf, etc. from being garbage collected, leading to a memory leak. {{LeaseRenewer}} should remove the reference after some delay if a {{DFSClient}} instance no longer has active streams.
[jira] [Commented] (HDFS-3615) Two BlockTokenSecretManager findbugs warnings
[ https://issues.apache.org/jira/browse/HDFS-3615?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13412805#comment-13412805 ] Hudson commented on HDFS-3615: -- Integrated in Hadoop-Mapreduce-trunk #1134 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1134/]) HDFS-3615. Two BlockTokenSecretManager findbugs warnings. Contributed by Aaron T. Myers. (Revision 1360255) Result = SUCCESS atm : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1360255 Files : * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/security/token/block/BlockTokenSecretManager.java Two BlockTokenSecretManager findbugs warnings - Key: HDFS-3615 URL: https://issues.apache.org/jira/browse/HDFS-3615 Project: Hadoop HDFS Issue Type: Bug Components: security Affects Versions: 2.0.0-alpha Reporter: Eli Collins Assignee: Aaron T. Myers Fix For: 2.0.1-alpha Attachments: HDFS-3615.patch Looks like two findbugs warnings were introduced recently (seen across a couple of recent patches). Unclear what change introduced them, as the file hasn't been modified and recently committed changes pass the findbugs check. IS: Inconsistent synchronization of org.apache.hadoop.hdfs.security.token.block.BlockTokenSecretManager.keyUpdateInterval; locked 75% of time IS: Inconsistent synchronization of org.apache.hadoop.hdfs.security.token.block.BlockTokenSecretManager.serialNo; locked 75% of time
[jira] [Commented] (HDFS-3582) Hook daemon process exit for testing
[ https://issues.apache.org/jira/browse/HDFS-3582?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13412804#comment-13412804 ] Hudson commented on HDFS-3582: -- Integrated in Hadoop-Mapreduce-trunk #1134 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1134/]) HDFS-3582. Hook daemon process exit for testing. Contributed by Eli Collins (Revision 1360329) Result = SUCCESS eli : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1360329 Files : * /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/util/ExitUtil.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/contrib/bkjournal/src/test/java/org/apache/hadoop/contrib/bkjournal/TestBookKeeperAsHASharedDir.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/contrib/bkjournal/src/test/java/org/apache/hadoop/hdfs/server/namenode/FSEditLogTestUtil.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockManager.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/DataNode.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSEditLog.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/JournalSet.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/NameNode.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/SecondaryNameNode.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/ha/EditLogTailer.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/MiniDFSCluster.java * 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/TestClusterId.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/TestEditLogJournalFailures.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/ha/TestFailureOfSharedDir.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/ha/TestFailureToReadEdits.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/ha/TestHASafeMode.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/ha/TestStandbyIsHot.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/ha/TestStateTransitionFailure.java Hook daemon process exit for testing - Key: HDFS-3582 URL: https://issues.apache.org/jira/browse/HDFS-3582 Project: Hadoop HDFS Issue Type: Improvement Components: test Affects Versions: 2.0.0-alpha Reporter: Eli Collins Assignee: Eli Collins Priority: Minor Fix For: 2.0.1-alpha Attachments: hdfs-3582.txt, hdfs-3582.txt, hdfs-3582.txt, hdfs-3582.txt, hdfs-3582.txt, hdfs-3582.txt, hdfs-3582.txt, hdfs-3582.txt, hdfs-3582.txt, hdfs-3582.txt, hdfs-3582.txt Occasionally the tests fail with java.util.concurrent.ExecutionException: org.apache.maven.surefire.booter.SurefireBooterForkException: Error occurred in starting fork, check output in log because the NN is exit'ing (via System#exit or Runtime#exit). Unfortunately Surefire doesn't retain the log output (see SUREFIRE-871), so the test log is empty and we don't know which part of the test triggered which exit in HDFS. To make this easier to debug, let's hook all daemon process exits when running the tests.
[jira] [Commented] (HDFS-3639) JspHelper#getUGI should always verify the token if security is enabled
[ https://issues.apache.org/jira/browse/HDFS-3639?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13412807#comment-13412807 ] Hudson commented on HDFS-3639: -- Integrated in Hadoop-Mapreduce-trunk #1134 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1134/]) HDFS-3639. JspHelper#getUGI should always verify the token if security is enabled. Contributed by Eli Collins (Revision 1360485) Result = SUCCESS eli : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1360485 Files : * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/common/JspHelper.java JspHelper#getUGI should always verify the token if security is enabled -- Key: HDFS-3639 URL: https://issues.apache.org/jira/browse/HDFS-3639 Project: Hadoop HDFS Issue Type: Bug Components: security Affects Versions: 1.0.0, 2.0.0-alpha Reporter: Eli Collins Assignee: Eli Collins Priority: Minor Fix For: 1.2.0, 2.0.1-alpha Attachments: hdfs-3639-b1.txt, hdfs-3639.txt JspHelper#getUGI only verifies the given token if the context and nn are set (added in HDFS-2416). We should unconditionally verify the token, i.e. a bug where name.node is not set in the context object should not result in not verifying the token. In practice this shouldn't be an issue, as per HDFS-3434 the context and NN should never be null.
[jira] [Commented] (HDFS-3646) LeaseRenewer can hold reference to inactive DFSClient instances forever
[ https://issues.apache.org/jira/browse/HDFS-3646?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13412815#comment-13412815 ] Uma Maheswara Rao G commented on HDFS-3646: --- Kihwal, thanks for filing the JIRA. I have seen this too. One possible option to fix this issue: the lease renewer is only required for open files, so opening a file can add the client to the renewer if that client is not already in the renewer's list of clients, and closing a file can remove the DFSClient instance completely if that client has no filesBeingWritten left. That means that if a DFSClient has no open files, it will not be tracked by the renewer; if the same DFSClient opens a new file, that will take care of re-adding the client to the renewer. How does this sound to you?
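The bookkeeping Uma proposes can be sketched as follows. This is a hypothetical, heavily simplified model for illustration only; the class and method names (RenewerModel, fileOpened, fileClosed) are not the actual DFSClient/LeaseRenewer API:

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Simplified model: the renewer tracks a client only while that client
// has files being written. Register on first open, drop on last close.
class RenewerModel {
    private final Set<String> trackedClients = new HashSet<>();
    private final Map<String, Integer> filesBeingWritten = new HashMap<>();

    // Called when a client opens a file for write: register the client
    // with the renewer if it is not already tracked.
    void fileOpened(String client) {
        filesBeingWritten.merge(client, 1, Integer::sum);
        trackedClients.add(client);
    }

    // Called when a file is closed: if the client has no files being
    // written left, drop it so it becomes eligible for garbage collection.
    void fileClosed(String client) {
        int remaining = filesBeingWritten.merge(client, -1, Integer::sum);
        if (remaining <= 0) {
            filesBeingWritten.remove(client);
            trackedClients.remove(client);
        }
    }

    boolean isTracked(String client) {
        return trackedClients.contains(client);
    }
}

public class RenewerSketch {
    public static void main(String[] args) {
        RenewerModel r = new RenewerModel();
        r.fileOpened("client-1");
        r.fileOpened("client-1");
        r.fileClosed("client-1");
        System.out.println(r.isTracked("client-1")); // true: one file still open
        r.fileClosed("client-1");
        System.out.println(r.isTracked("client-1")); // false: client can be collected
    }
}
```

Under this scheme the renewer never holds a reference to a client with no open files, which is exactly the leak described in the issue.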
[jira] [Commented] (HDFS-3646) LeaseRenewer can hold reference to inactive DFSClient instances forever
[ https://issues.apache.org/jira/browse/HDFS-3646?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13412820#comment-13412820 ] Kihwal Lee commented on HDFS-3646: -- Thanks Uma. That makes sense.
[jira] [Commented] (HDFS-3646) LeaseRenewer can hold reference to inactive DFSClient instances forever
[ https://issues.apache.org/jira/browse/HDFS-3646?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13412821#comment-13412821 ] Daryn Sharp commented on HDFS-3646: --- There will be caveats; for example, the leak will still occur if client code doesn't explicitly close all streams. I'm not sure how you can tell that there are no more references, since {{DFSClient}} holds references to all open streams. Maybe weak references to the streams could be used?
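The weak-reference idea can be illustrated with a small sketch. This is an assumption-laden illustration, not the real DFSClient code: if the client tracked its streams only through java.lang.ref.WeakReference, a stream the application loses would eventually be collected, after which the client observes that it has no live streams and the renewer could drop it:

```java
import java.lang.ref.WeakReference;
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

// Hypothetical tracker: holds streams weakly so a lost stream does not
// pin the client forever. Names are illustrative, not the real API.
public class WeakStreamTracker {
    private final List<WeakReference<AutoCloseable>> streams = new ArrayList<>();

    void register(AutoCloseable stream) {
        streams.add(new WeakReference<>(stream));
    }

    // Prune cleared references and report whether any stream is still live.
    boolean hasLiveStreams() {
        for (Iterator<WeakReference<AutoCloseable>> it = streams.iterator(); it.hasNext();) {
            if (it.next().get() == null) {
                it.remove(); // the application dropped this stream
            }
        }
        return !streams.isEmpty();
    }

    public static void main(String[] args) {
        WeakStreamTracker tracker = new WeakStreamTracker();
        AutoCloseable stream = () -> { }; // stands in for a DFSOutputStream
        tracker.register(stream);
        System.out.println(tracker.hasLiveStreams()); // true while strongly reachable
        stream = null;   // the application "loses" the stream
        System.gc();     // a hint only; collection is not guaranteed
        // Once the reference is cleared, hasLiveStreams() returns false and
        // the lease renewer could safely remove the client.
    }
}
```

The caveat with weak stream references is the one Daryn raises implicitly: a lease must keep being renewed for a lost-but-unclosed stream, so clearing it silently trades a memory leak for a possibly abandoned file; logging loudly at that point would make the application bug visible.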
[jira] [Commented] (HDFS-3513) HttpFS should cache filesystems
[ https://issues.apache.org/jira/browse/HDFS-3513?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13412824#comment-13412824 ] Daryn Sharp commented on HDFS-3513: --- The Hive jira has highlighted complexities and possible issues with trying to cache ugis/filesystems. Out of curiosity, have you benchmarked whether the ugi cache provides a significant benefit? HttpFS should cache filesystems --- Key: HDFS-3513 URL: https://issues.apache.org/jira/browse/HDFS-3513 Project: Hadoop HDFS Issue Type: Improvement Affects Versions: 2.0.0-alpha Reporter: Alejandro Abdelnur Assignee: Alejandro Abdelnur Attachments: HDFS-3513.patch, HDFS-3513.patch, HDFS-3513.patch HttpFS opens and closes a FileSystem instance against the backend filesystem (typically HDFS) on every request. The FileSystem cache is not used, as it has no expiration/timeout and filesystem instances in it live forever; for a long-running service like HttpFS this is not a good thing, as it would keep connections to the NN open.
[jira] [Commented] (HDFS-3646) LeaseRenewer can hold reference to inactive DFSClient instances forever
[ https://issues.apache.org/jira/browse/HDFS-3646?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13412863#comment-13412863 ] Uma Maheswara Rao G commented on HDFS-3646: --- {quote} There will be caveats, such as the leak will still occur if client code doesn't explicitly close all streams. {quote} If client code doesn't close the file, the DFSClient object should still be there and lease renewal should happen, since the file is in the open state. At that point, keeping the reference in the LeaseRenewer is not a leak. Please correct me if I have understood your point wrongly.
[jira] [Commented] (HDFS-3646) LeaseRenewer can hold reference to inactive DFSClient instances forever
[ https://issues.apache.org/jira/browse/HDFS-3646?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13412885#comment-13412885 ] Daryn Sharp commented on HDFS-3646: --- I agree with everything you said if client code is still holding a reference to the stream. Unfortunately accidents do happen and streams don't always get closed. Since {{DFSClient}} has a hard reference to the stream, the lost stream will remain open as long as the client is open. In turn, the lost stream will prevent the lease renewer from removing the client when all other streams are closed.
[jira] [Commented] (HDFS-3646) LeaseRenewer can hold reference to inactive DFSClient instances forever
[ https://issues.apache.org/jira/browse/HDFS-3646?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13412893#comment-13412893 ] Uma Maheswara Rao G commented on HDFS-3646: --- {quote} Unfortunately accidents do happen and streams don't always get closed. Since {{DFSClient}} has a hard reference to the stream, the lost stream will remain open as long as the client is open. {quote} IMO, this would be a leak on the application side, since the bug is in the application not closing its streams.
[jira] [Commented] (HDFS-3646) LeaseRenewer can hold reference to inactive DFSClient instances forever
[ https://issues.apache.org/jira/browse/HDFS-3646?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13412901#comment-13412901 ] Kihwal Lee commented on HDFS-3646: -- bq. the lost stream will remain open as long as the client is open. I think Daryn is bringing up the issue because its solution also takes care of this jira. If we had a finalizer for FileSystem, we could have it call close(), and then everything would go away. But short of such automatic cleaning, this issue still remains: currently DFSClient won't get garbage collected even if lost streams are automatically closed. I think we should still fix it, even if we eventually implement automatic clean-up.
[jira] [Commented] (HDFS-3646) LeaseRenewer can hold reference to inactive DFSClient instances forever
[ https://issues.apache.org/jira/browse/HDFS-3646?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13412916#comment-13412916 ] Daryn Sharp commented on HDFS-3646: --- bq. IMO, this will a leak from application side, since there is a bug in closing the streams from app. Agreed, but it can have pretty severe consequences that aren't easily detected unless explicitly hunting for leaks. It makes me uneasy that an out-of-scope fs stream can cause a massive leak of heavy objects and threads, and tie up sockets that may exhaust fds and/or memory for long-running processes. Emitting an angry log error for lost unclosed streams may be more beneficial. I don't think a finalizer on the fs will work. If I do {{in = path.getFileSystem(conf).open(...)}}, the fs might get garbage collected, but we certainly don't want its finalizer to shoot the dfs client that is still holding open a stream. Maybe a finalizer on the dfs client, but in any case, the circular hard references need to be broken somehow.
[jira] [Resolved] (HDFS-3644) OEV should recognize and deal with 0.20.20x opcode versions
[ https://issues.apache.org/jira/browse/HDFS-3644?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Suresh Srinivas resolved HDFS-3644. --- Resolution: Won't Fix This jira is not necessary. The conflict code was only in 0.20.203. Post upgrade to later releases the conflicting opcode is not used. I am closing this as Won't Fix. Reopen if you disagree. OEV should recognize and deal with 0.20.20x opcode versions --- Key: HDFS-3644 URL: https://issues.apache.org/jira/browse/HDFS-3644 Project: Hadoop HDFS Issue Type: Bug Components: tools Affects Versions: 2.0.0-alpha, 3.0.0 Reporter: Todd Lipcon Priority: Minor We have some opcode conflicts for edit logs between 0.20.20x (LV -19, -31) vs newer versions. For edit log loading, we dealt with this by forcing users to save the namespace on an earlier version before upgrading. But using a trunk OEV on an older version is useful, since the OEV has had so many improvements. It would be nice to be able to specify a flag to the OEV to be able to run on older edit logs.
[jira] [Commented] (HDFS-3645) Improve the way we do detection of a busy DN in the cluster, when choosing it for a block write
[ https://issues.apache.org/jira/browse/HDFS-3645?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13412929#comment-13412929 ] Suresh Srinivas commented on HDFS-3645: --- bq. a more realistic measure of if a DN is really busy by itself Can you elaborate on what this means? Without comparing it with the other DNs available in the cluster, the local state of a DN is incomplete, no? Improve the way we do detection of a busy DN in the cluster, when choosing it for a block write --- Key: HDFS-3645 URL: https://issues.apache.org/jira/browse/HDFS-3645 Project: Hadoop HDFS Issue Type: Improvement Components: name-node Affects Versions: 2.0.0-alpha Reporter: Harsh J Priority: Minor Right now, I think we do too naive a computation for detecting if a chosen DN target is busy by itself. We currently do {{node.getXceiverCount() > (2.0 * avgLoad)}}. We should improve on this computation with a more realistic measure of if a DN is really busy by itself or not (rather than checking against the cluster average, where there's a good chance the value can be wrong to compare with, for some cases)
[jira] [Commented] (HDFS-3644) OEV should recognize and deal with 0.20.20x opcode versions
[ https://issues.apache.org/jira/browse/HDFS-3644?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13412946#comment-13412946 ] Suresh Srinivas commented on HDFS-3644: --- BTW, a comment relevant to my previous comment - https://issues.apache.org/jira/browse/HDFS-1842?focusedCommentId=13021839&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13021839
[jira] [Commented] (HDFS-2802) Support for RW/RO snapshots in HDFS
[ https://issues.apache.org/jira/browse/HDFS-2802?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13412969#comment-13412969 ] Hari Mankude commented on HDFS-2802: A quick user's guide:

hadoop dfsadmin -createsnap <snapname> <path where snap is to be taken> <ro/rw>
will create a snap with snapname at the location mentioned

hadoop dfsadmin -removesnap <snapname>
will remove the snapshot

hadoop dfsadmin -listsnap /
will list all snaps that have been taken under /

Support for RW/RO snapshots in HDFS --- Key: HDFS-2802 URL: https://issues.apache.org/jira/browse/HDFS-2802 Project: Hadoop HDFS Issue Type: New Feature Components: data-node, name-node Affects Versions: 0.24.0 Reporter: Hari Mankude Assignee: Hari Mankude Attachments: snap.patch, snapshot-one-pager.pdf Snapshots are point-in-time images of parts of the filesystem or the entire filesystem. Snapshots can be a read-only or a read-write point-in-time copy of the filesystem. There are several use cases for snapshots in HDFS. I will post a detailed write-up soon with more information.
[jira] [Reopened] (HDFS-3644) OEV should recognize and deal with 0.20.20x opcode versions
[ https://issues.apache.org/jira/browse/HDFS-3644?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Todd Lipcon reopened HDFS-3644: --- I disagree. There are people running systems with LV -19, which has the conflicted opcodes. Currently if you run the OEV on these logs, you end up getting errors because it reads delegation token ops as, e.g., symlink ops. If we don't support OEVing a given LV, we should raise an error.
[jira] [Commented] (HDFS-3645) Improve the way we do detection of a busy DN in the cluster, when choosing it for a block write
[ https://issues.apache.org/jira/browse/HDFS-3645?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13413034#comment-13413034 ] Harsh J commented on HDFS-3645: --- Hi Suresh, Thank you, your question got me thinking some more. I filed this JIRA as a thought dump from some thoughts I was having while going through the current policy impl. Sorry for the lack of clarification. Let me explain the case I imagine may exist with this specific check: # node.getXceiverCount() is a total 'socket' count. It includes writes, _and_ reads. # Consider a cluster situation such as this when computing the average (it may sound a little hypothetical in this explanation, but a near enough case is possible in some situations): 100 DNs are present. The average is about 250, but there are possibly some (very few) nodes with much higher xceiver counts, at about 600-800. A likely possibility for such a state is that these nodes are serving a very hot, local-block region (a bad HBase case, but quite plausible). # Now consider that this DN wanted to get a block allocated to it. We computed the xceiver average and found it to be 250, and then we checked the node's count; it was 700. 700 > 250 leads to it not getting selected, due to us ignoring the fact that most of the 700 were actually reads and not writes. Perhaps it may have been OK to do a write in this case, if we knew the ratio of reads:writes aside from the count(reads+writes) on the DN? I've not seen any major issues with this way of write selection at all, but it does seem to expose a certain edge case. Do you think we should account for such a scenario, or let it be as-is and continue to keep the load count aggregated? If not, let us close this out.
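The edge case above can be made concrete with a small sketch. This is not the actual BlockPlacementPolicy code; the tooBusyWritesOnly variant is a hypothetical alternative that only illustrates the read/write split Harsh describes:

```java
// Contrast the current "total xceivers vs. 2x cluster average" check with
// a hypothetical write-only variant. Method names are illustrative.
public class LoadCheckSketch {
    // Current policy: a node is "too busy" when its total xceiver count
    // (reads + writes combined) exceeds twice the cluster average.
    static boolean tooBusyCurrent(int xceiverCount, double avgLoad) {
        return xceiverCount > 2.0 * avgLoad;
    }

    // Hypothetical variant: compare only the write load, so a node serving
    // a hot, read-mostly region is not excluded from block allocation.
    static boolean tooBusyWritesOnly(int writeCount, double avgWriteLoad) {
        return writeCount > 2.0 * avgWriteLoad;
    }

    public static void main(String[] args) {
        // Harsh's example: 700 total xceivers vs. a cluster average of 250.
        System.out.println(tooBusyCurrent(700, 250));   // true: 700 > 500, node skipped
        // If, say, 650 of those 700 are reads, the write load may be modest
        // (50 writes vs. a hypothetical average write load of 100).
        System.out.println(tooBusyWritesOnly(50, 100)); // false: node usable for writes
    }
}
```

The two checks disagree on exactly the hot-read node in the example, which is the scenario the comment asks whether the policy should account for.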
[jira] [Commented] (HDFS-3644) OEV should recognize and deal with 0.20.20x opcode versions
[ https://issues.apache.org/jira/browse/HDFS-3644?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13413036#comment-13413036 ] Suresh Srinivas commented on HDFS-3644: --- Todd, can you tell me which Apache release LV -19 is from? It saves me time, since you have already done this analysis.
[jira] [Commented] (HDFS-3644) OEV should recognize and deal with 0.20.20x opcode versions
[ https://issues.apache.org/jira/browse/HDFS-3644?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13413053#comment-13413053 ] Eli Collins commented on HDFS-3644: --- Suresh, {code} hadoop-branch-1 $ grep -r LAYOUT_VERSIONS_203 src/ src/hdfs/org/apache/hadoop/hdfs/server/common/Storage.java: public static final int[] LAYOUT_VERSIONS_203 = {-19, -31}; {code}
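A flag-driven check in the OEV could reuse the array Eli quotes; a minimal sketch, where the method name is hypothetical rather than actual OEV code:

```java
// Sketch only: decide whether an edit log's layout version is one of the
// conflicting 0.20.20x versions, using the LAYOUT_VERSIONS_203 constant
// quoted above from branch-1's Storage.java.
class LayoutCheck {
    static final int[] LAYOUT_VERSIONS_203 = {-19, -31};

    static boolean is203LayoutVersion(int layoutVersion) {
        for (int v : LAYOUT_VERSIONS_203) {
            if (v == layoutVersion) {
                return true;
            }
        }
        return false;
    }
}
```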
[jira] [Commented] (HDFS-3644) OEV should recognize and deal with 0.20.20x opcode versions
[ https://issues.apache.org/jira/browse/HDFS-3644?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13413057#comment-13413057 ] Suresh Srinivas commented on HDFS-3644: --- @Eli, not sure if you saw my previous comment: bq. The conflict code was only in 0.20.203. Post upgrade to later releases the conflicting opcode is not used. Given that, a tool that handles the conflicting opcodes seems unnecessary, since the problem exists in 0.20.203 alone. Even the editlog code does not handle these conflicts in 0.20.204; we make users save the namespace to work around it.
[jira] [Created] (HDFS-3647) Expose dfs.datanode.max.xcievers as metric
Steve Hoffman created HDFS-3647: --- Summary: Expose dfs.datanode.max.xcievers as metric Key: HDFS-3647 URL: https://issues.apache.org/jira/browse/HDFS-3647 Project: Hadoop HDFS Issue Type: Improvement Components: data-node, performance Affects Versions: 0.20.2 Reporter: Steve Hoffman Not sure if this is in a newer version of Hadoop, but in CDH3u3 it isn't there. There is a lot of mystery surrounding how large to set dfs.datanode.max.xcievers. Most people say to just up it to 4096, but given that exceeding it will cause an HBase RegionServer shutdown (see Lars' blog post here: http://www.larsgeorge.com/2012/03/hadoop-hbase-and-xceivers.html), it would be nice if we could expose the current count via the built-in metrics framework (most likely under dfs). That way we could watch it to see if we have it set too high or too low, whether it's time to bump it up, etc. Thoughts?
[jira] [Updated] (HDFS-3647) Expose current xcievers count as metric
[ https://issues.apache.org/jira/browse/HDFS-3647?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Steve Hoffman updated HDFS-3647: Summary: Expose current xcievers count as metric (was: Expose dfs.datanode.max.xcievers as metric)
[jira] [Commented] (HDFS-3563) Fix findbug warnings in raid
[ https://issues.apache.org/jira/browse/HDFS-3563?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13413065#comment-13413065 ] Eli Collins commented on HDFS-3563: --- Hi Weiyan, What's the ETA on this? These warnings are causing jenkins to -1 other changes, like HDFS-3641, that update the raid code. Also, per Jason's comment on MAPREDUCE-3868, TestRaidNode consistently fails; I filed HDFS-3648 for this. Thanks, Eli Fix findbug warnings in raid Key: HDFS-3563 URL: https://issues.apache.org/jira/browse/HDFS-3563 Project: Hadoop HDFS Issue Type: Bug Components: contrib/raid Affects Versions: 3.0.0 Reporter: Jason Lowe Assignee: Weiyan Wang MAPREDUCE-3868 re-enabled raid but introduced 31 new findbugs warnings. Those warnings should be fixed or appropriate entries placed in an exclude file.
[jira] [Commented] (HDFS-3647) Expose current xcievers count as metric
[ https://issues.apache.org/jira/browse/HDFS-3647?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13413069#comment-13413069 ] Harsh J commented on HDFS-3647: --- I believe I've already done this for Hadoop 0.23+ (now 2.x), via HDFS-2868. Perhaps we can backport that onto 1.x as well, for which we can re-purpose this JIRA. For CDH requests, though, this is the wrong place; the right open channel to use is https://issues.cloudera.org/browse/DISTRO or its mailing lists.
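The idea behind exposing the live count, as opposed to only configuring the dfs.datanode.max.xcievers ceiling, can be sketched as below; the class and method names are illustrative, not the HDFS-2868 implementation:

```java
import java.util.concurrent.atomic.AtomicInteger;

// Minimal sketch: keep a thread-safe counter of active transfer threads
// that a metrics gauge or JMX attribute can read at any time.
class XceiverGauge {
    private final AtomicInteger active = new AtomicInteger();

    // Called when a transfer (xceiver) thread starts.
    int onThreadStart() { return active.incrementAndGet(); }

    // Called when a transfer thread exits.
    int onThreadExit() { return active.decrementAndGet(); }

    // Read by the metrics framework to report the current load.
    int getXceiverCount() { return active.get(); }
}
```

Graphing a gauge like this against the configured maximum is what would let operators see whether the limit is set too high or too low.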
[jira] [Commented] (HDFS-3641) Move server Util time methods to common and use now instead of System#currentTimeMillis
[ https://issues.apache.org/jira/browse/HDFS-3641?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13413068#comment-13413068 ] Eli Collins commented on HDFS-3641: --- The findbugs failures are from hdfs-raid; HDFS-3563 tracks those. Per MAPREDUCE-3868 this has been failing since hdfs-raid was re-introduced; I filed HDFS-3648 to track it. Move server Util time methods to common and use now instead of System#currentTimeMillis --- Key: HDFS-3641 URL: https://issues.apache.org/jira/browse/HDFS-3641 Project: Hadoop HDFS Issue Type: Improvement Affects Versions: 2.0.0-alpha Reporter: Eli Collins Assignee: Eli Collins Priority: Minor Attachments: hdfs-3641.txt To help HDFS-3640, let's move the time methods from the HDFS server Util class to common and use now instead of System#currentTimeMillis.
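The refactoring the issue describes amounts to a tiny shared utility along these lines; this is a plausible sketch of such a wrapper, not the committed org.apache.hadoop.util.Time source:

```java
// Sketch of a shared time utility: callers use now() instead of calling
// System.currentTimeMillis() directly, giving one place to later swap in
// a monotonic or fake clock for tests.
final class Time {
    private Time() {} // static utility, no instances

    static long now() {
        return System.currentTimeMillis();
    }
}
```

Centralizing the call is what makes follow-up work like HDFS-3640 practical: only one method body has to change to alter how every caller reads the clock.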
[jira] [Updated] (HDFS-385) Design a pluggable interface to place replicas of blocks in HDFS
[ https://issues.apache.org/jira/browse/HDFS-385?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sumadhur Reddy Bolli updated HDFS-385: -- Attachment: blockplacementpolicy-branch-1-win.patch blockplacementpolicy-branch-1.patch Patches to port the pluggable interface to branch-1 and branch-1-win. Design a pluggable interface to place replicas of blocks in HDFS Key: HDFS-385 URL: https://issues.apache.org/jira/browse/HDFS-385 Project: Hadoop HDFS Issue Type: Improvement Reporter: dhruba borthakur Assignee: dhruba borthakur Fix For: 0.21.0 Attachments: BlockPlacementPluggable.txt, BlockPlacementPluggable2.txt, BlockPlacementPluggable3.txt, BlockPlacementPluggable4.txt, BlockPlacementPluggable4.txt, BlockPlacementPluggable5.txt, BlockPlacementPluggable6.txt, BlockPlacementPluggable7.txt, blockplacementpolicy-branch-1-win.patch, blockplacementpolicy-branch-1.patch The current HDFS code typically places one replica on the local rack, the second replica on a random remote rack, and the third replica on a random node of that remote rack. This algorithm is baked into the NameNode's code. It would be nice to make the block placement algorithm a pluggable interface. This would allow experimentation with different placement algorithms based on workloads, availability guarantees and failure models.
[jira] [Commented] (HDFS-3647) Expose current xcievers count as metric
[ https://issues.apache.org/jira/browse/HDFS-3647?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13413075#comment-13413075 ] Steve Hoffman commented on HDFS-3647: - https://issues.cloudera.org/browse/DISTRO-414 opened with Cloudera. Thx. I'll leave it to you guys to decide whether to use this to track a 1.x Apache backport.
[jira] [Commented] (HDFS-3641) Move server Util time methods to common and use now instead of System#currentTimeMillis
[ https://issues.apache.org/jira/browse/HDFS-3641?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13413095#comment-13413095 ] Hudson commented on HDFS-3641: -- Integrated in Hadoop-Hdfs-trunk-Commit #2523 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk-Commit/2523/]) HDFS-3641. Move server Util time methods to common and use now instead of System#currentTimeMillis. Contributed by Eli Collins (Revision 1360858) Result = SUCCESS eli : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1360858 Files : * /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/DelegationTokenRenewer.java * /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/TrashPolicyDefault.java * /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/viewfs/ViewFileSystem.java * /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/viewfs/ViewFs.java * /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/io/SequenceFile.java * /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/ipc/Client.java * /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/ipc/ProtobufRpcEngine.java * /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/ipc/RPC.java * /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/ipc/Server.java * /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/ipc/WritableRpcEngine.java * /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/metrics2/impl/MetricsRecordBuilderImpl.java * /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/metrics2/impl/MetricsSinkAdapter.java * 
/hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/metrics2/impl/MetricsSourceAdapter.java * /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/metrics2/impl/MetricsSystemImpl.java * /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/net/SocketIOWithTimeout.java * /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/security/Groups.java * /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/security/UserGroupInformation.java * /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/security/token/delegation/AbstractDelegationTokenSecretManager.java * /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/util/AsyncDiskService.java * /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/util/ReflectionUtils.java * /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/util/Shell.java * /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/util/ThreadUtil.java * /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/util/Time.java * /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/conf/TestReconfiguration.java * /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/fs/TestTrash.java * /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/fs/loadGenerator/LoadGenerator.java * /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/fs/s3native/InMemoryNativeFileSystemStore.java * /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/ha/ActiveStandbyElectorTestUtil.java * 
/hadoop/common/trunk/hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/ha/ClientBaseWithFixes.java * /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/ha/TestHealthMonitor.java * /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/ha/TestZKFailoverController.java * /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/ha/TestZKFailoverControllerStress.java * /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/io/file/tfile/TestTFileSeqFileComparison.java * /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/io/file/tfile/Timer.java * /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/io/nativeio/TestNativeIO.java * /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/ipc/MiniRPCBenchmark.java *
[jira] [Commented] (HDFS-3641) Move server Util time methods to common and use now instead of System#currentTimeMillis
[ https://issues.apache.org/jira/browse/HDFS-3641?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13413097#comment-13413097 ] Hudson commented on HDFS-3641: -- Integrated in Hadoop-Common-trunk-Commit #2457 (See [https://builds.apache.org/job/Hadoop-Common-trunk-Commit/2457/]) HDFS-3641. Move server Util time methods to common and use now instead of System#currentTimeMillis. Contributed by Eli Collins (Revision 1360858) Result = SUCCESS eli : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1360858
[jira] [Resolved] (HDFS-3648) TestRaidNode.testDistRaid fails
[ https://issues.apache.org/jira/browse/HDFS-3648?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jason Lowe resolved HDFS-3648. -- Resolution: Duplicate TestRaidNode.testDistRaid fails --- Key: HDFS-3648 URL: https://issues.apache.org/jira/browse/HDFS-3648 Project: Hadoop HDFS Issue Type: Bug Affects Versions: 3.0.0 Reporter: Eli Collins Per MAPREDUCE-3868 TestRaidNode fails consistently; here's a recent example from HDFS-3641. Error Message expected:<0> but was:<2> Stacktrace junit.framework.AssertionFailedError: expected:<0> but was:<2> at junit.framework.Assert.fail(Assert.java:47) at junit.framework.Assert.failNotEquals(Assert.java:283) at junit.framework.Assert.assertEquals(Assert.java:64) at junit.framework.Assert.assertEquals(Assert.java:130) at junit.framework.Assert.assertEquals(Assert.java:136) at org.apache.hadoop.raid.TestRaidNode.testDistRaid(TestRaidNode.java:583) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
[jira] [Commented] (HDFS-3641) Move server Util time methods to common and use now instead of System#currentTimeMillis
[ https://issues.apache.org/jira/browse/HDFS-3641?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13413137#comment-13413137 ] Hudson commented on HDFS-3641: -- Integrated in Hadoop-Mapreduce-trunk-Commit #2476 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk-Commit/2476/]) HDFS-3641. Move server Util time methods to common and use now instead of System#currentTimeMillis. Contributed by Eli Collins (Revision 1360858) Result = FAILURE eli : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1360858
[jira] [Created] (HDFS-3649) Port HDFS-385 to branch-1-win
Sumadhur Reddy Bolli created HDFS-3649: -- Summary: Port HDFS-385 to branch-1-win Key: HDFS-3649 URL: https://issues.apache.org/jira/browse/HDFS-3649 Project: Hadoop HDFS Issue Type: Improvement Affects Versions: 1-win Reporter: Sumadhur Reddy Bolli Added a patch to HDFS-385 to port the existing pluggable placement policy to branch-1-win.
[jira] [Updated] (HDFS-3564) Make the replication policy pluggable to allow custom replication policies
[ https://issues.apache.org/jira/browse/HDFS-3564?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eli Collins updated HDFS-3564: -- Target Version/s: (was: 1.1.0) Hi Sumadhur, I'm unsetting the target version from 1.1.0 since that release is already under way. Btw, branch-1 is our sustaining branch, so we'll need to make sure this is compatible and well tested. Make the replication policy pluggable to allow custom replication policies -- Key: HDFS-3564 URL: https://issues.apache.org/jira/browse/HDFS-3564 Project: Hadoop HDFS Issue Type: Improvement Components: name-node Reporter: Sumadhur Reddy Bolli Original Estimate: 24h Remaining Estimate: 24h ReplicationTargetChooser currently determines the placement of replicas in Hadoop. Making the replication policy pluggable would help in having custom replication policies that suit the environment. Eg1: Enabling placing replicas across different datacenters (not just racks). Eg2: Enabling placing replicas across multiple (more than 2) racks. Eg3: Cloud environments like Azure have logical concepts like fault and upgrade domains. Each fault domain spans multiple upgrade domains and each upgrade domain spans multiple fault domains. Machines are typically spread evenly across both fault and upgrade domains. Fault domain failures are typically catastrophic/unplanned, and the possibility of data loss is high. An upgrade domain can be taken down by Azure for maintenance periodically. Each time an upgrade domain is taken down, a small percentage of machines in it (typically 1-2%) are replaced due to disk failures, thus losing data. Assuming the default replication factor of 3, any 3 data nodes going down at the same time would mean potential data loss. So it is important to have a policy that spreads replicas across both fault and upgrade domains to ensure practically no data loss. The problem here is two-dimensional, and the default policy in Hadoop is one-dimensional. 
Custom policies to address issues like these can be written if we make the policy pluggable. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
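As a rough illustration of what such a pluggable, two-dimensional policy could look like, here is a minimal sketch. The `PlacementPolicy` interface, `Node`, and the domain fields are invented for this example; they are not Hadoop's actual `BlockPlacementPolicy` API.

```java
import java.util.*;

// Hypothetical pluggable placement interface (illustrative only, not the
// real Hadoop BlockPlacementPolicy API).
interface PlacementPolicy {
    List<Node> chooseTargets(int replicas, List<Node> candidates);
}

class Node {
    final String name;
    final int faultDomain;   // catastrophic/unplanned failure unit
    final int upgradeDomain; // planned-maintenance unit
    Node(String name, int fd, int ud) {
        this.name = name; this.faultDomain = fd; this.upgradeDomain = ud;
    }
}

// Greedy two-dimensional spread: prefer candidates whose fault AND upgrade
// domains are not already used by a chosen replica, so no single fault
// domain or upgrade domain outage can take out all copies.
class DomainSpreadPolicy implements PlacementPolicy {
    public List<Node> chooseTargets(int replicas, List<Node> candidates) {
        List<Node> chosen = new ArrayList<>();
        Set<Integer> usedFd = new HashSet<>(), usedUd = new HashSet<>();
        for (Node n : candidates) {
            if (chosen.size() == replicas) break;
            if (!usedFd.contains(n.faultDomain) && !usedUd.contains(n.upgradeDomain)) {
                chosen.add(n);
                usedFd.add(n.faultDomain);
                usedUd.add(n.upgradeDomain);
            }
        }
        // If the domains are exhausted before we have enough replicas,
        // fall back to any remaining candidates.
        for (Node n : candidates) {
            if (chosen.size() == replicas) break;
            if (!chosen.contains(n)) chosen.add(n);
        }
        return chosen;
    }
}
```

The point of the sketch is only that the choice logic lives behind an interface the NameNode calls, so an Azure-specific (or datacenter-aware, or multi-rack) policy can be dropped in without touching core code.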
[jira] [Updated] (HDFS-3647) Backport HDFS-2868 (Add number of active transfer threads to the DataNode status) to branch-1
[ https://issues.apache.org/jira/browse/HDFS-3647?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eli Collins updated HDFS-3647: -- Target Version/s: 1.2.0 Summary: Backport HDFS-2868 (Add number of active transfer threads to the DataNode status) to branch-1 (was: Expose current xcievers count as metric) Backport HDFS-2868 (Add number of active transfer threads to the DataNode status) to branch-1 - Key: HDFS-3647 URL: https://issues.apache.org/jira/browse/HDFS-3647 Project: Hadoop HDFS Issue Type: Improvement Components: data-node, performance Affects Versions: 0.20.2 Reporter: Steve Hoffman Not sure if this is in a newer version of Hadoop, but in CDH3u3 it isn't there. There is a lot of mystery surrounding how large to set dfs.datanode.max.xcievers. Most people say to just up it to 4096, but given that exceeding this will cause an HBase RegionServer shutdown (see Lars' blog post here: http://www.larsgeorge.com/2012/03/hadoop-hbase-and-xceivers.html), it would be nice if we could expose the current count via the built-in metrics framework (most likely under dfs). In this way we could watch it to see if we have it set too high, too low, time to bump it up, etc. Thoughts?
[jira] [Updated] (HDFS-3566) Custom Replication Policy for Azure
[ https://issues.apache.org/jira/browse/HDFS-3566?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sumadhur Reddy Bolli updated HDFS-3566: --- Target Version/s: 1-win (was: 1.1.0) Custom Replication Policy for Azure --- Key: HDFS-3566 URL: https://issues.apache.org/jira/browse/HDFS-3566 Project: Hadoop HDFS Issue Type: Improvement Components: name-node Reporter: Sumadhur Reddy Bolli Azure has logical concepts like fault and upgrade domains. Each fault domain spans multiple upgrade domains and each upgrade domain spans multiple fault domains. Machines are typically spread evenly across both fault and upgrade domains. Fault domain failures are typically catastrophic/unplanned failures, and the possibility of data loss is high. An upgrade domain can be taken down by Azure for maintenance periodically. Each time an upgrade domain is taken down, a small percentage of machines in it (typically 1-2%) are replaced due to disk failures, thus losing data. Assuming the default replication factor of 3, any 3 data nodes going down at the same time would mean potential data loss. So, it is important to have a policy that spreads replicas across both fault and upgrade domains to ensure practically no data loss. The problem here is two-dimensional and the default policy in Hadoop is one-dimensional. This policy would spread the datanodes across at least two fault domains and three upgrade domains to prevent data loss.
[jira] [Commented] (HDFS-3564) Make the replication policy pluggable to allow custom replication policies
[ https://issues.apache.org/jira/browse/HDFS-3564?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13413155#comment-13413155 ] Harsh J commented on HDFS-3564: --- bq. I will re-purpose this JIRA to suggest enhancements to the existing abstraction. Given that HDFS-3649 was just opened for backport work, can you at least re-title the JIRA to fit this re-purpose goal? Avoids confusion for some of us. Thanks! :) Make the replication policy pluggable to allow custom replication policies -- Key: HDFS-3564 URL: https://issues.apache.org/jira/browse/HDFS-3564 Project: Hadoop HDFS Issue Type: Improvement Components: name-node Reporter: Sumadhur Reddy Bolli Original Estimate: 24h Remaining Estimate: 24h
[jira] [Assigned] (HDFS-3647) Backport HDFS-2868 (Add number of active transfer threads to the DataNode status) to branch-1
[ https://issues.apache.org/jira/browse/HDFS-3647?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Harsh J reassigned HDFS-3647: - Assignee: Harsh J Backport HDFS-2868 (Add number of active transfer threads to the DataNode status) to branch-1 - Key: HDFS-3647 URL: https://issues.apache.org/jira/browse/HDFS-3647 Project: Hadoop HDFS Issue Type: Improvement Components: data-node, performance Affects Versions: 0.20.2 Reporter: Steve Hoffman Assignee: Harsh J
[jira] [Commented] (HDFS-3644) OEV should recognize and deal with 0.20.20x opcode versions
[ https://issues.apache.org/jira/browse/HDFS-3644?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13413179#comment-13413179 ] Suresh Srinivas commented on HDFS-3644: --- bq. That's the same proposal here Sorry, that was not clear to me by the title or description. Perhaps we could change them for better clarity. OEV should recognize and deal with 0.20.20x opcode versions --- Key: HDFS-3644 URL: https://issues.apache.org/jira/browse/HDFS-3644 Project: Hadoop HDFS Issue Type: Bug Components: tools Affects Versions: 2.0.0-alpha, 3.0.0 Reporter: Todd Lipcon Priority: Minor
[jira] [Commented] (HDFS-3564) Design enhancements to the pluggable blockplacementpolicy
[ https://issues.apache.org/jira/browse/HDFS-3564?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13413183#comment-13413183 ] Sumadhur Reddy Bolli commented on HDFS-3564: I apologize for the inconvenience. Changed the title. I will update the description or attach a doc with the proposed changes once the 3649 port is complete. Thanks! Design enhancements to the pluggable blockplacementpolicy - Key: HDFS-3564 URL: https://issues.apache.org/jira/browse/HDFS-3564 Project: Hadoop HDFS Issue Type: Improvement Components: name-node Reporter: Sumadhur Reddy Bolli
[jira] [Commented] (HDFS-3564) Design enhancements to the pluggable blockplacementpolicy
[ https://issues.apache.org/jira/browse/HDFS-3564?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13413184#comment-13413184 ] Suresh Srinivas commented on HDFS-3564: --- bq. will need to be sure to make sure this is compatible / well tested Eli, not sure about the compatibility requirements. I think the block placement policy was made InterfaceAudience.Private some time back. It referred to internal classes that were not public. That said, I agree, any enhancement should try to preserve compatibility. Design enhancements to the pluggable blockplacementpolicy - Key: HDFS-3564 URL: https://issues.apache.org/jira/browse/HDFS-3564 Project: Hadoop HDFS Issue Type: Improvement Components: name-node Reporter: Sumadhur Reddy Bolli
[jira] [Commented] (HDFS-2465) Add HDFS support for fadvise readahead and drop-behind
[ https://issues.apache.org/jira/browse/HDFS-2465?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13413190#comment-13413190 ] Suresh Srinivas commented on HDFS-2465: --- This time, going from +! to +1 :-) Add HDFS support for fadvise readahead and drop-behind -- Key: HDFS-2465 URL: https://issues.apache.org/jira/browse/HDFS-2465 Project: Hadoop HDFS Issue Type: Improvement Components: data-node, performance Affects Versions: 0.23.0 Reporter: Todd Lipcon Assignee: Todd Lipcon Fix For: 0.23.0 Attachments: HDFS-2465.branch-1.patch, hdfs-2465.txt, hdfs-2465.txt, hdfs-2465.txt, hdfs-2465.txt This is the HDFS side of HADOOP-7714. The initial implementation is heuristic based and should be considered experimental, as discussed in the parent JIRA. It should be off by default until better heuristics, APIs, and tuning experience is developed.
[jira] [Updated] (HDFS-2465) Add HDFS support for fadvise readahead and drop-behind
[ https://issues.apache.org/jira/browse/HDFS-2465?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Suresh Srinivas updated HDFS-2465: -- Fix Version/s: 1.2.0 I committed the patch to branch-1 Add HDFS support for fadvise readahead and drop-behind -- Key: HDFS-2465 URL: https://issues.apache.org/jira/browse/HDFS-2465 Project: Hadoop HDFS Issue Type: Improvement Components: data-node, performance Affects Versions: 0.23.0 Reporter: Todd Lipcon Assignee: Todd Lipcon Fix For: 0.23.0, 1.2.0 Attachments: HDFS-2465.branch-1.patch, hdfs-2465.txt, hdfs-2465.txt, hdfs-2465.txt, hdfs-2465.txt
[jira] [Updated] (HDFS-3583) Convert remaining tests to Junit4
[ https://issues.apache.org/jira/browse/HDFS-3583?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Wang updated HDFS-3583: -- Attachment: hdfs-3583.patch So I think this is getting about done. The findbugs warnings are all in hdfs-raid, which I didn't touch. The huge diff is because the last Jenkins job didn't run findbugs on hdfs-raid. I blacklisted TestNameNodeMXBean, which should fix it. TestBackupNode and TestRaidNode failed for me on trunk. TestDirectoryScanner worked for me locally. I also manually verified that the number of tests between the last Jenkins run and another recent PreCommit job was the same. Convert remaining tests to Junit4 - Key: HDFS-3583 URL: https://issues.apache.org/jira/browse/HDFS-3583 Project: Hadoop HDFS Issue Type: Improvement Components: test Affects Versions: 2.0.0-alpha Reporter: Eli Collins Assignee: Andrew Wang Labels: newbie Attachments: hdfs-3583.patch, hdfs-3583.patch, junit3to4.sh JUnit4 style tests are easier to debug (eg can use timeouts etc), let's convert the remaining tests over to JUnit4 style.
[jira] [Commented] (HDFS-3645) Improve the way we do detection of a busy DN in the cluster, when choosing it for a block write
[ https://issues.apache.org/jira/browse/HDFS-3645?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13413251#comment-13413251 ] Suresh Srinivas commented on HDFS-3645: --- I was just trying to understand the proposal, especially the part where you say rather than checking against cluster average. The current code is trying to distribute the load among datanodes. It considers both reads and writes as the same cost to datanodes. Perhaps this is not good enough and may need further improvements. Given that block placement is pluggable, other policies could be tried out. In order to try out other policies, one may also add more granular stats - such as the number of readers and writers, the number of readers or writers per disk, etc. Given that, I am not sure the title or the description is clear enough. But we could keep the jira around for such discussions. Improve the way we do detection of a busy DN in the cluster, when choosing it for a block write --- Key: HDFS-3645 URL: https://issues.apache.org/jira/browse/HDFS-3645 Project: Hadoop HDFS Issue Type: Improvement Components: name-node Affects Versions: 2.0.0-alpha Reporter: Harsh J Priority: Minor Right now, I think we do too naive a computation for detecting if a chosen DN target is busy by itself. We currently do {{node.getXceiverCount() > (2.0 * avgLoad)}}. We should improve on this computation with a more realistic measure of whether a DN is really busy by itself or not (rather than checking against the cluster average, where there's a good chance the value can be wrong to compare with, for some cases)
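The heuristic being discussed (with the comparison operator restored) can be modeled in a few lines. This is a toy sketch of the described check, not the NameNode's actual code; the class and method names are invented.

```java
// Toy model of the "busy DN" heuristic described above: a node is
// considered overloaded if its active transfer-thread (xceiver) count
// exceeds twice the cluster-wide average. Names are illustrative only.
class LoadCheck {
    static boolean isOverloaded(int xceiverCount, int[] clusterXceivers) {
        double total = 0;
        for (int x : clusterXceivers) total += x;
        double avg = clusterXceivers.length == 0 ? 0 : total / clusterXceivers.length;
        // The check from the issue description, with the elided '>' restored.
        return xceiverCount > 2.0 * avg;
    }
}
```

The weakness Harsh points at is visible here: on a mostly idle cluster the average is tiny, so a node doing modest but normal work can trip the 2x threshold even though it is not actually busy.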
[jira] [Updated] (HDFS-3610) fuse_dfs: Provide a way to use the default (configured) NN URI
[ https://issues.apache.org/jira/browse/HDFS-3610?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Colin Patrick McCabe updated HDFS-3610: --- Attachment: HDFS-3610.001.patch This patch depends on HDFS-3609. It makes it possible to mount arbitrary URI strings in fuse_dfs. All the existing arguments that worked previously are still supported. Notably, the quirky 'dfs://' as a synonym for 'hdfs://' behavior is still preserved, and you can specify the port via -o server=hdfs://hostname:port or -o server=hdfs://hostname -oport=port fuse_dfs: Provide a way to use the default (configured) NN URI -- Key: HDFS-3610 URL: https://issues.apache.org/jira/browse/HDFS-3610 Project: Hadoop HDFS Issue Type: Bug Affects Versions: 2.0.0-alpha Reporter: Colin Patrick McCabe Assignee: Colin Patrick McCabe Priority: Minor Attachments: HDFS-3610.001.patch It shouldn't be necessary to explicitly spell out the NameNode you want to connect to when launching fuse_dfs. libhdfs can read the configuration files and use the default URI. However, we don't have a command-line option for this in fuse_dfs.
[jira] [Updated] (HDFS-3610) fuse_dfs: Provide a way to use the default (configured) NN URI
[ https://issues.apache.org/jira/browse/HDFS-3610?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Colin Patrick McCabe updated HDFS-3610: --- Status: Patch Available (was: Open) fuse_dfs: Provide a way to use the default (configured) NN URI -- Key: HDFS-3610 URL: https://issues.apache.org/jira/browse/HDFS-3610 Project: Hadoop HDFS Issue Type: Bug Affects Versions: 2.0.0-alpha Reporter: Colin Patrick McCabe Assignee: Colin Patrick McCabe Priority: Minor Attachments: HDFS-3610.001.patch
[jira] [Created] (HDFS-3650) Use MutableQuantiles to provide latency histograms for various operations
Andrew Wang created HDFS-3650: - Summary: Use MutableQuantiles to provide latency histograms for various operations Key: HDFS-3650 URL: https://issues.apache.org/jira/browse/HDFS-3650 Project: Hadoop HDFS Issue Type: Improvement Affects Versions: 2.0.0-alpha Reporter: Andrew Wang Assignee: Andrew Wang Fix For: 2.0.1-alpha MutableQuantiles provide accurate estimation of various percentiles for a stream of data. Many existing metrics reported by a MutableRate would also benefit from having these percentiles; let's add MutableQuantiles where we think it'd be useful.
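To make concrete what a quantile metric reports beyond a MutableRate-style mean, here is a self-contained sketch that computes exact percentiles over a finished window of latency samples. This is purely illustrative: the real MutableQuantiles uses a streaming estimator with bounded error rather than sorting a complete sample list, and `LatencyWindow` is an invented name, not the Hadoop metrics2 API.

```java
import java.util.*;

// Illustration of the kind of data a latency-percentile metric exposes.
// Exact computation over a closed window; a production metric would use
// a streaming estimator to bound memory.
class LatencyWindow {
    private final List<Long> samples = new ArrayList<>();

    void add(long micros) { samples.add(micros); }

    // p in (0, 100]; returns the smallest sample >= the p-th percentile rank.
    long percentile(double p) {
        List<Long> sorted = new ArrayList<>(samples);
        Collections.sort(sorted);
        int idx = (int) Math.ceil(p / 100.0 * sorted.size()) - 1;
        return sorted.get(Math.max(idx, 0));
    }
}
```

A p99 of, say, 40ms against a mean of 2ms is exactly the long-tail signal a MutableRate average hides, which is the motivation stated in the issue.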
[jira] [Updated] (HDFS-3630) Modify TestPersistBlocks to use both flush and hflush
[ https://issues.apache.org/jira/browse/HDFS-3630?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sanjay Radia updated HDFS-3630: --- Resolution: Fixed Target Version/s: 3.0.0 Status: Resolved (was: Patch Available) Modify TestPersistBlocks to use both flush and hflush - Key: HDFS-3630 URL: https://issues.apache.org/jira/browse/HDFS-3630 Project: Hadoop HDFS Issue Type: Bug Affects Versions: 3.0.0 Reporter: Sanjay Radia Assignee: Sanjay Radia Attachments: hdfs3630.patch
[jira] [Updated] (HDFS-3650) Use MutableQuantiles to provide latency histograms for various operations
[ https://issues.apache.org/jira/browse/HDFS-3650?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aaron T. Myers updated HDFS-3650: - Target Version/s: 2.0.1-alpha Fix Version/s: (was: 2.0.1-alpha) Setting the target version instead of the fix version. Please only set the fix version once it's been committed. Use MutableQuantiles to provide latency histograms for various operations - Key: HDFS-3650 URL: https://issues.apache.org/jira/browse/HDFS-3650 Project: Hadoop HDFS Issue Type: Improvement Affects Versions: 2.0.0-alpha Reporter: Andrew Wang Assignee: Andrew Wang
[jira] [Created] (HDFS-3651) optionally, the NameNode should invoke saveNamespace after getting a SIGTERM
Colin Patrick McCabe created HDFS-3651: -- Summary: optionally, the NameNode should invoke saveNamespace after getting a SIGTERM Key: HDFS-3651 URL: https://issues.apache.org/jira/browse/HDFS-3651 Project: Hadoop HDFS Issue Type: Improvement Components: name-node Affects Versions: 2.0.0-alpha Reporter: Colin Patrick McCabe Assignee: Colin Patrick McCabe Priority: Minor It would be nice if the NameNode could be configured so that it did a saveNamespace and then shut down cleanly after receiving a SIGTERM signal. In general, it is a good practice to call saveNamespace when doing an orderly shutdown, to ensure that all of the information in the namespace is on disk. Of course, this should not be necessary if the SecondaryNameNode or StandbyNameNode is operating correctly. However, when there are bugs in these daemons, a saveNamespace can prevent disaster. Currently, we don't catch SIGTERM, but just shut down immediately, without doing any cleanup. Of course, it will always be possible to shut down the NameNode without doing a saveNamespace, simply by sending SIGKILL, which is un-catchable.
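In plain Java, the mechanics described above map onto a JVM shutdown hook: the JVM runs registered hooks on SIGTERM (`kill <pid>`) but cannot on SIGKILL (`kill -9`), matching the catchable/un-catchable distinction in the issue. A minimal sketch follows; `saveNamespace()` here is a hypothetical stand-in, not the NameNode's actual wiring.

```java
// Sketch: run cleanup on orderly JVM shutdown. Shutdown hooks fire on
// SIGTERM but never on SIGKILL, which is why SIGKILL remains an escape
// hatch for skipping the save. saveNamespace() is a placeholder.
class OrderlyShutdown {
    static volatile boolean namespaceSaved = false;

    // Stand-in for persisting the in-memory namespace to disk.
    static void saveNamespace() { namespaceSaved = true; }

    // Register the hook; a config flag would control saveOnExit.
    static Thread install(boolean saveOnExit) {
        Thread hook = new Thread(() -> {
            if (saveOnExit) saveNamespace();
        });
        Runtime.getRuntime().addShutdownHook(hook);
        return hook;
    }
}
```

One design caveat worth noting: a shutdown hook must finish quickly or the process looks hung to init scripts, and a large namespace save can take a while, so a real implementation would want this off by default, exactly as the issue proposes.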
[jira] [Commented] (HDFS-3630) Modify TestPersistBlocks to use both flush and hflush
[ https://issues.apache.org/jira/browse/HDFS-3630?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13413313#comment-13413313 ] Hudson commented on HDFS-3630: -- Integrated in Hadoop-Hdfs-trunk-Commit #2524 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk-Commit/2524/]) HDFS-3630 Modify TestPersistBlocks to use both flush and hflush (sanjay) (Revision 1360991) Result = SUCCESS sradia : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVNview=revrev=1360991 Files : * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestPersistBlocks.java Modify TestPersistBlocks to use both flush and hflush - Key: HDFS-3630 URL: https://issues.apache.org/jira/browse/HDFS-3630 Project: Hadoop HDFS Issue Type: Bug Affects Versions: 3.0.0 Reporter: Sanjay Radia Assignee: Sanjay Radia Attachments: hdfs3630.patch
[jira] [Commented] (HDFS-3630) Modify TestPersistBlocks to use both flush and hflush
[ https://issues.apache.org/jira/browse/HDFS-3630?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13413315#comment-13413315 ] Hudson commented on HDFS-3630: -- Integrated in Hadoop-Common-trunk-Commit #2458 (See [https://builds.apache.org/job/Hadoop-Common-trunk-Commit/2458/]) HDFS-3630 Modify TestPersistBlocks to use both flush and hflush (sanjay) (Revision 1360991) Result = SUCCESS sradia : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVNview=revrev=1360991 Files : * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestPersistBlocks.java Modify TestPersistBlocks to use both flush and hflush - Key: HDFS-3630 URL: https://issues.apache.org/jira/browse/HDFS-3630 Project: Hadoop HDFS Issue Type: Bug Affects Versions: 3.0.0 Reporter: Sanjay Radia Assignee: Sanjay Radia Attachments: hdfs3630.patch
[jira] [Created] (HDFS-3652) 1.x: FSEditLog failure removes the wrong edit stream when storage dirs have same name
Todd Lipcon created HDFS-3652: - Summary: 1.x: FSEditLog failure removes the wrong edit stream when storage dirs have same name Key: HDFS-3652 URL: https://issues.apache.org/jira/browse/HDFS-3652 Project: Hadoop HDFS Issue Type: Bug Components: name-node Affects Versions: 1.0.3, 1.1.0, 1.2.0 Reporter: Todd Lipcon Assignee: Todd Lipcon Priority: Blocker In {{FSEditLog.removeEditsForStorageDir}}, we iterate over the edits streams trying to find the stream corresponding to a given dir. To check equality, we currently use the following condition: {code} File parentDir = getStorageDirForStream(idx); if (parentDir.getName().equals(sd.getRoot().getName())) { {code} ... which is horribly incorrect. If two or more storage dirs happen to have the same terminal path component (eg /data/1/nn and /data/2/nn) then it will pick the wrong stream(s) to remove.
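The flaw is easy to demonstrate with `java.io.File` alone: distinct storage roots can share a terminal path component, so a `getName()`-based comparison conflates them, while comparing full paths does not. The sketch below restates the buggy check next to a safer one; `DirCompare` is an illustrative wrapper, not the actual FSEditLog code or its eventual fix.

```java
import java.io.File;

// Demonstration of the comparison bug described above.
class DirCompare {
    // The buggy check: two dirs compare equal whenever their LAST path
    // components match, e.g. /data/1/nn vs /data/2/nn.
    static boolean sameDirByName(File a, File b) {
        return a.getName().equals(b.getName());
    }

    // A safer comparison over the whole path, which distinguishes
    // /data/1/nn from /data/2/nn.
    static boolean sameDirByPath(File a, File b) {
        return a.getAbsolutePath().equals(b.getAbsolutePath());
    }
}
```

With the name-based check, iterating the streams for `/data/2/nn` matches `/data/1/nn` first and removes the wrong stream, which is the failure mode reproduced in the comments below on this issue.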
[jira] [Commented] (HDFS-3652) 1.x: FSEditLog failure removes the wrong edit stream when storage dirs have same name
[ https://issues.apache.org/jira/browse/HDFS-3652?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13413320#comment-13413320 ] Matt Foley commented on HDFS-3652: -- Urk! Quite a catch. When patch available, please commit to branch-1.0 as well as branch-1.1 and branch-1. 1.x: FSEditLog failure removes the wrong edit stream when storage dirs have same name - Key: HDFS-3652 URL: https://issues.apache.org/jira/browse/HDFS-3652 Project: Hadoop HDFS Issue Type: Bug Components: name-node Affects Versions: 1.0.3, 1.1.0, 1.2.0 Reporter: Todd Lipcon Assignee: Todd Lipcon Priority: Blocker
[jira] [Commented] (HDFS-3652) 1.x: FSEditLog failure removes the wrong edit stream when storage dirs have same name
[ https://issues.apache.org/jira/browse/HDFS-3652?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13413326#comment-13413326 ] Todd Lipcon commented on HDFS-3652: --- This has data-loss implications as well. I am able to reproduce the following: - NN is writing to three dirs: /data/1/nn, /data/2/nn, and /data/3/nn - I modified the NN to inject an IOException when creating edits.new in /data/3/nn, which causes removeEditsForStorageDir to get called inside {{rollEditLog}} - Upon triggering a checkpoint: -- all three logs are closed successfully -- /data/1/nn and /data/2/nn are successfully opened for edits.new -- /data/3/nn throws an IOE which gets caught. This calls {{removeEditsForStorageDir}}, which removes the wrong stream (augmented logging): {code} 12/07/12 16:23:54 INFO namenode.FSNamesystem: Roll Edit Log from 127.0.0.1 12/07/12 16:23:54 INFO namenode.FSNamesystem: Number of transactions: 0 Total time for transactions(ms): 0Number of transactions batched in Syncs: 0 Number of syncs: 0 SyncTimes(ms): 0 0 0 12/07/12 16:23:54 WARN namenode.FSNamesystem: Removing edits stream /tmp/name1/nn/current/edits.new 12/07/12 16:23:54 WARN common.Storage: Removing storage dir /tmp/name3/nn java.io.IOException: Injected fault for /tmp/name3/nn/current/edits.new at org.apache.hadoop.hdfs.server.namenode.FSEditLog$EditLogFileOutputStream.init(FSEditLog.java:146) {code} - The NN is now _only_ writing to /tmp/name2/nn/current/edits.new, but considers both name1 and name2 to be good from a storage-directory standpoint. However, {{/tmp/name1/nn/current/edits.new}} exists as an empty edit log file (just the header and preallocated region of 0xffs) - When {{rollFSImage}} is called, it successfully calls {{close}} only on the name2 log - which truncates it to the correct transaction boundary. 
Then it renames both {{name2/.../edits.new}} and {{name1/.../edits.new}} to {{edits}}, and opens them both for append (assuming they've been truncated to a transaction boundary). - The NN is now writing to name1 and name2, but name1's log looks like this: {code} <valid header> <preallocated bytes of 0xff> <transactions> {code} - Upon the next checkpoint, the 2NN will likely download this log, since it's listed first in the name directory list. Upon doing so, it will see the 0xff at the head of the log and not read any of the edits (which come after all of the 0xffs) - The 2NN then uploads the merged image back to the NN, which blows away the edits file. Thus, its in-memory data has gotten out of sync with the disk data, and the next time a checkpoint occurs or the NN restarts, it will fail. This is not an issue in trunk since the code was largely rewritten by HDFS-1073. The workaround for existing users is simple: rename the directories to eg /data/1/nn1 and /data/2/nn2. The fix is also simple. I will upload the fix this afternoon. 1.x: FSEditLog failure removes the wrong edit stream when storage dirs have same name - Key: HDFS-3652 URL: https://issues.apache.org/jira/browse/HDFS-3652 Project: Hadoop HDFS Issue Type: Bug Components: name-node Affects Versions: 1.0.3, 1.1.0, 1.2.0 Reporter: Todd Lipcon Assignee: Todd Lipcon Priority: Blocker
[jira] [Commented] (HDFS-3610) fuse_dfs: Provide a way to use the default (configured) NN URI
[ https://issues.apache.org/jira/browse/HDFS-3610?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13413331#comment-13413331 ] Hadoop QA commented on HDFS-3610: - -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12536293/HDFS-3610.001.patch against trunk revision . +1 @author. The patch does not contain any @author tags. -1 tests included. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. +1 javac. The applied patch does not increase the total number of javac compiler warnings. +1 javadoc. The javadoc tool did not generate any warning messages. +1 eclipse:eclipse. The patch built with eclipse:eclipse. +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. -1 core tests. The patch failed these unit tests in hadoop-hdfs-project/hadoop-hdfs: org.apache.hadoop.hdfs.server.namenode.TestBackupNode org.apache.hadoop.hdfs.server.common.TestJspHelper +1 contrib tests. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/2808//testReport/ Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/2808//console This message is automatically generated. fuse_dfs: Provide a way to use the default (configured) NN URI -- Key: HDFS-3610 URL: https://issues.apache.org/jira/browse/HDFS-3610 Project: Hadoop HDFS Issue Type: Bug Affects Versions: 2.0.0-alpha Reporter: Colin Patrick McCabe Assignee: Colin Patrick McCabe Priority: Minor Attachments: HDFS-3610.001.patch It shouldn't be necessary to explicitly spell out the NameNode you want to connect to when launching fuse_dfs. libhdfs can read the configuration files and use the default URI.
However, we don't have a command-line option for this in fuse_dfs.
[jira] [Commented] (HDFS-3630) Modify TestPersistBlocks to use both flush and hflush
[ https://issues.apache.org/jira/browse/HDFS-3630?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=1341#comment-1341 ] Hudson commented on HDFS-3630: -- Integrated in Hadoop-Mapreduce-trunk-Commit #2477 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk-Commit/2477/]) HDFS-3630 Modify TestPersistBlocks to use both flush and hflush (sanjay) (Revision 1360991) Result = FAILURE sradia : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1360991 Files : * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestPersistBlocks.java Modify TestPersistBlocks to use both flush and hflush - Key: HDFS-3630 URL: https://issues.apache.org/jira/browse/HDFS-3630 Project: Hadoop HDFS Issue Type: Bug Affects Versions: 3.0.0 Reporter: Sanjay Radia Assignee: Sanjay Radia Attachments: hdfs3630.patch
[jira] [Commented] (HDFS-3609) libhdfs: don't force the URI to look like hdfs://hostname:port
[ https://issues.apache.org/jira/browse/HDFS-3609?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13413334#comment-13413334 ] Eli Collins commented on HDFS-3609: --- Patch looks good. Nit: what you're calling URI prefix/location/protocol type in the comments is called the scheme in URI lingo. Testing? Eg confirmed you can run libhdfs against an HA config (ie one w/o a port) now? The test failures here are obviously unrelated. libhdfs: don't force the URI to look like hdfs://hostname:port -- Key: HDFS-3609 URL: https://issues.apache.org/jira/browse/HDFS-3609 Project: Hadoop HDFS Issue Type: Bug Components: libhdfs Affects Versions: 2.0.0-alpha Reporter: Colin Patrick McCabe Assignee: Colin Patrick McCabe Attachments: HDFS-3609.001.patch Currently, libhdfs forces the URI to look like hdfs://hostname:port. For configurations like HA or federation this is not ideal.
[jira] [Updated] (HDFS-3633) libhdfs: hdfsDelete should pass JNI_FALSE or JNI_TRUE
[ https://issues.apache.org/jira/browse/HDFS-3633?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eli Collins updated HDFS-3633: -- Resolution: Fixed Fix Version/s: 2.0.1-alpha Target Version/s: (was: 2.0.1-alpha) Hadoop Flags: Reviewed Status: Resolved (was: Patch Available) +1 findbugs is unrelated. I've committed this, thanks Colin. libhdfs: hdfsDelete should pass JNI_FALSE or JNI_TRUE - Key: HDFS-3633 URL: https://issues.apache.org/jira/browse/HDFS-3633 Project: Hadoop HDFS Issue Type: Bug Components: libhdfs Affects Versions: 2.0.1-alpha Reporter: Colin Patrick McCabe Assignee: Colin Patrick McCabe Priority: Minor Fix For: 2.0.1-alpha Attachments: HDFS-3633.001.patch In libhdfs in hdfsDelete, the header file says any non-zero argument to hdfsDelete will be interpreted as true. However, the hdfsDelete function does not translate these non-zero values to JNI_FALSE and JNI_TRUE, potentially leading to undefined or JVM-specific behavior.
[jira] [Updated] (HDFS-3652) 1.x: FSEditLog failure removes the wrong edit stream when storage dirs have same name
[ https://issues.apache.org/jira/browse/HDFS-3652?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Todd Lipcon updated HDFS-3652: -- Attachment: hdfs-3652.txt Attached patch is for branch-1. I modified the existing storage dir failure test so that all of the name dirs have the same name, and it started to fail. After fixing the bug, it passes. 1.x: FSEditLog failure removes the wrong edit stream when storage dirs have same name - Key: HDFS-3652 URL: https://issues.apache.org/jira/browse/HDFS-3652 Project: Hadoop HDFS Issue Type: Bug Components: name-node Affects Versions: 1.0.3, 1.1.0, 1.2.0 Reporter: Todd Lipcon Assignee: Todd Lipcon Priority: Blocker Attachments: hdfs-3652.txt In {{FSEditLog.removeEditsForStorageDir}}, we iterate over the edits streams trying to find the stream corresponding to a given dir. To check equality, we currently use the following condition: {code} File parentDir = getStorageDirForStream(idx); if (parentDir.getName().equals(sd.getRoot().getName())) { {code} ... which is horribly incorrect. If two or more storage dirs happen to have the same terminal path component (eg /data/1/nn and /data/2/nn) then it will pick the wrong stream(s) to remove.
[jira] [Commented] (HDFS-3652) 1.x: FSEditLog failure removes the wrong edit stream when storage dirs have same name
[ https://issues.apache.org/jira/browse/HDFS-3652?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13413346#comment-13413346 ] Aaron T. Myers commented on HDFS-3652: -- +1, the patch looks good to me. Great find/fix, Todd. 1.x: FSEditLog failure removes the wrong edit stream when storage dirs have same name - Key: HDFS-3652 URL: https://issues.apache.org/jira/browse/HDFS-3652 Project: Hadoop HDFS Issue Type: Bug Components: name-node Affects Versions: 1.0.3, 1.1.0, 1.2.0 Reporter: Todd Lipcon Assignee: Todd Lipcon Priority: Blocker Attachments: hdfs-3652.txt In {{FSEditLog.removeEditsForStorageDir}}, we iterate over the edits streams trying to find the stream corresponding to a given dir. To check equality, we currently use the following condition: {code} File parentDir = getStorageDirForStream(idx); if (parentDir.getName().equals(sd.getRoot().getName())) { {code} ... which is horribly incorrect. If two or more storage dirs happen to have the same terminal path component (eg /data/1/nn and /data/2/nn) then it will pick the wrong stream(s) to remove.
[jira] [Commented] (HDFS-799) libhdfs must call DetachCurrentThread when a thread is destroyed
[ https://issues.apache.org/jira/browse/HDFS-799?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13413350#comment-13413350 ] Eli Collins commented on HDFS-799: -- +1 (test failure is unrelated). libhdfs must call DetachCurrentThread when a thread is destroyed Key: HDFS-799 URL: https://issues.apache.org/jira/browse/HDFS-799 Project: Hadoop HDFS Issue Type: Improvement Affects Versions: 2.0.0-alpha Reporter: Christian Kunz Assignee: Colin Patrick McCabe Attachments: HDFS-799.001.patch, HDFS-799.003.patch, HDFS-799.004.patch, HDFS-799.005.patch Threads that call AttachCurrentThread in libhdfs and disappear without calling DetachCurrentThread cause a memory leak. Libhdfs should detach the current thread when this thread exits.
[jira] [Updated] (HDFS-799) libhdfs must call DetachCurrentThread when a thread is destroyed
[ https://issues.apache.org/jira/browse/HDFS-799?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eli Collins updated HDFS-799: - Resolution: Fixed Fix Version/s: 2.0.1-alpha Target Version/s: (was: 2.0.1-alpha) Hadoop Flags: Reviewed Status: Resolved (was: Patch Available) I've committed this and merged to branch-2. Thanks Colin. libhdfs must call DetachCurrentThread when a thread is destroyed Key: HDFS-799 URL: https://issues.apache.org/jira/browse/HDFS-799 Project: Hadoop HDFS Issue Type: Improvement Affects Versions: 2.0.0-alpha Reporter: Christian Kunz Assignee: Colin Patrick McCabe Fix For: 2.0.1-alpha Attachments: HDFS-799.001.patch, HDFS-799.003.patch, HDFS-799.004.patch, HDFS-799.005.patch Threads that call AttachCurrentThread in libhdfs and disappear without calling DetachCurrentThread cause a memory leak. Libhdfs should detach the current thread when this thread exits.
[jira] [Updated] (HDFS-3492) fix some misuses of InputStream#skip
[ https://issues.apache.org/jira/browse/HDFS-3492?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eli Collins updated HDFS-3492: -- Attachment: hdfs-3492.txt Patch rebased on trunk. fix some misuses of InputStream#skip Key: HDFS-3492 URL: https://issues.apache.org/jira/browse/HDFS-3492 Project: Hadoop HDFS Issue Type: Bug Reporter: Colin Patrick McCabe Assignee: Colin Patrick McCabe Priority: Minor Attachments: HDFS-3492.001.patch, HDFS-3492.002.patch, hdfs-3492.txt It seems that we have a few cases where programmers are calling InputStream#skip and not handling short skips. Unfortunately, the skip method is documented and implemented so that it doesn't actually skip the requested number of bytes, but simply tries to skip at most that amount of bytes. A better name probably would have been trySkip or similar. It seems like most of the time when the argument to skip is small enough, we'll succeed almost all of the time. This is no doubt an implementation artifact of some of the popular stream implementations. This tends to hide the bug-- however, it is still waiting to emerge at some point if those implementations ever change or if buffer sizes are adjusted, etc. All of these cases can be fixed by calling IOUtils#skipFully to get the behavior that the programmer expects-- i.e., skipping by the specified amount.
[jira] [Commented] (HDFS-3583) Convert remaining tests to Junit4
[ https://issues.apache.org/jira/browse/HDFS-3583?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13413358#comment-13413358 ] Hadoop QA commented on HDFS-3583: - -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12536289/hdfs-3583.patch against trunk revision . +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 259 new or modified test files. +1 javac. The applied patch does not increase the total number of javac compiler warnings. +1 javadoc. The javadoc tool did not generate any warning messages. +1 eclipse:eclipse. The patch built with eclipse:eclipse. -1 findbugs. The patch appears to introduce 31 new Findbugs (version 1.3.9) warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. -1 core tests. The patch failed these unit tests in hadoop-hdfs-project/hadoop-hdfs hadoop-hdfs-project/hadoop-hdfs-httpfs hadoop-hdfs-project/hadoop-hdfs-raid: org.apache.hadoop.hdfs.server.namenode.TestBackupNode org.apache.hadoop.hdfs.server.common.TestJspHelper org.apache.hadoop.raid.TestRaidNode +1 contrib tests. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/2807//testReport/ Findbugs warnings: https://builds.apache.org/job/PreCommit-HDFS-Build/2807//artifact/trunk/patchprocess/newPatchFindbugsWarningshadoop-hdfs-raid.html Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/2807//console This message is automatically generated.
Convert remaining tests to Junit4 - Key: HDFS-3583 URL: https://issues.apache.org/jira/browse/HDFS-3583 Project: Hadoop HDFS Issue Type: Improvement Components: test Affects Versions: 2.0.0-alpha Reporter: Eli Collins Assignee: Andrew Wang Labels: newbie Attachments: hdfs-3583.patch, hdfs-3583.patch, junit3to4.sh JUnit4 style tests are easier to debug (eg can use @Timeout etc), let's convert the remaining tests over to Junit4 style.
[jira] [Created] (HDFS-3653) 1.x: Add a retention period for purged edit logs
Todd Lipcon created HDFS-3653: - Summary: 1.x: Add a retention period for purged edit logs Key: HDFS-3653 URL: https://issues.apache.org/jira/browse/HDFS-3653 Project: Hadoop HDFS Issue Type: Improvement Components: name-node Affects Versions: 1.1.0 Reporter: Todd Lipcon Assignee: Todd Lipcon Occasionally we have a bug which causes something to go wrong with edits files. Even more occasionally the bug is such that the namenode mistakenly deletes an {{edits}} file without merging it into {{fsimage}} properly -- e.g. if the bug mistakenly writes an OP_INVALID at the top of the log. In trunk/2.0 we retain many edit log segments going back in time to be more robust to this kind of error. I'd like to implement something similar (but much simpler) in 1.x, which would be used only by HDFS developers in root-causing or repairing from these rare scenarios: the NN should never directly delete an edit log file. Instead, it should rename the file into some kind of trash directory inside the name dir, and associate it with a timestamp. Then, periodically a separate thread should scan the trash dirs and delete any logs older than a configurable time.
[jira] [Commented] (HDFS-3606) libhdfs: create self-contained unit test
[ https://issues.apache.org/jira/browse/HDFS-3606?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13413359#comment-13413359 ] Eli Collins commented on HDFS-3606: --- The following should be updated now that HDFS-3633 is in right?
{code}
// TODO: Non-recursive delete should fail?
//EXPECT_NONZERO(hdfsDelete(fs, prefix, 0));
{code}
Otherwise looks great. Agree with all of Andy's (excellent) feedback. libhdfs: create self-contained unit test Key: HDFS-3606 URL: https://issues.apache.org/jira/browse/HDFS-3606 Project: Hadoop HDFS Issue Type: Test Components: libhdfs Affects Versions: 2.0.1-alpha Reporter: Colin Patrick McCabe Assignee: Colin Patrick McCabe Priority: Minor Attachments: HDFS-3606.001.patch, HDFS-3606.003.patch, HDFS-3606.004.patch We should have a self-contained unit test for libhdfs and also for FUSE. We do have hdfs_test, but it is not self-contained (it requires a cluster to already be running before it can be used.)
[jira] [Commented] (HDFS-3633) libhdfs: hdfsDelete should pass JNI_FALSE or JNI_TRUE
[ https://issues.apache.org/jira/browse/HDFS-3633?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13413363#comment-13413363 ] Hudson commented on HDFS-3633: -- Integrated in Hadoop-Common-trunk-Commit #2459 (See [https://builds.apache.org/job/Hadoop-Common-trunk-Commit/2459/]) HDFS-3633. libhdfs: hdfsDelete should pass JNI_FALSE or JNI_TRUE. Contributed by Colin Patrick McCabe (Revision 1361005) Result = SUCCESS eli : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1361005 Files : * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/native/hdfs.c libhdfs: hdfsDelete should pass JNI_FALSE or JNI_TRUE - Key: HDFS-3633 URL: https://issues.apache.org/jira/browse/HDFS-3633 Project: Hadoop HDFS Issue Type: Bug Components: libhdfs Affects Versions: 2.0.1-alpha Reporter: Colin Patrick McCabe Assignee: Colin Patrick McCabe Priority: Minor Fix For: 2.0.1-alpha Attachments: HDFS-3633.001.patch In libhdfs in hdfsDelete, the header file says any non-zero argument to hdfsDelete will be interpreted as true. However, the hdfsDelete function does not translate these non-zero values to JNI_FALSE and JNI_TRUE, potentially leading to undefined or JVM-specific behavior.
[jira] [Commented] (HDFS-799) libhdfs must call DetachCurrentThread when a thread is destroyed
[ https://issues.apache.org/jira/browse/HDFS-799?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13413362#comment-13413362 ] Hudson commented on HDFS-799: - Integrated in Hadoop-Common-trunk-Commit #2459 (See [https://builds.apache.org/job/Hadoop-Common-trunk-Commit/2459/]) HDFS-799. libhdfs must call DetachCurrentThread when a thread is destroyed. Contributed by Colin Patrick McCabe (Revision 1361008) Result = SUCCESS eli : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1361008 Files : * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/CMakeLists.txt * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/config.h.cmake * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/native/hdfsJniHelper.c libhdfs must call DetachCurrentThread when a thread is destroyed Key: HDFS-799 URL: https://issues.apache.org/jira/browse/HDFS-799 Project: Hadoop HDFS Issue Type: Improvement Affects Versions: 2.0.0-alpha Reporter: Christian Kunz Assignee: Colin Patrick McCabe Fix For: 2.0.1-alpha Attachments: HDFS-799.001.patch, HDFS-799.003.patch, HDFS-799.004.patch, HDFS-799.005.patch Threads that call AttachCurrentThread in libhdfs and disappear without calling DetachCurrentThread cause a memory leak. Libhdfs should detach the current thread when this thread exits.
[jira] [Commented] (HDFS-3633) libhdfs: hdfsDelete should pass JNI_FALSE or JNI_TRUE
[ https://issues.apache.org/jira/browse/HDFS-3633?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13413367#comment-13413367 ] Hudson commented on HDFS-3633: -- Integrated in Hadoop-Hdfs-trunk-Commit #2525 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk-Commit/2525/]) HDFS-3633. libhdfs: hdfsDelete should pass JNI_FALSE or JNI_TRUE. Contributed by Colin Patrick McCabe (Revision 1361005) Result = SUCCESS eli : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1361005 Files : * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/native/hdfs.c libhdfs: hdfsDelete should pass JNI_FALSE or JNI_TRUE - Key: HDFS-3633 URL: https://issues.apache.org/jira/browse/HDFS-3633 Project: Hadoop HDFS Issue Type: Bug Components: libhdfs Affects Versions: 2.0.1-alpha Reporter: Colin Patrick McCabe Assignee: Colin Patrick McCabe Priority: Minor Fix For: 2.0.1-alpha Attachments: HDFS-3633.001.patch In libhdfs in hdfsDelete, the header file says any non-zero argument to hdfsDelete will be interpreted as true. However, the hdfsDelete function does not translate these non-zero values to JNI_FALSE and JNI_TRUE, potentially leading to undefined or JVM-specific behavior.
[jira] [Commented] (HDFS-799) libhdfs must call DetachCurrentThread when a thread is destroyed
[ https://issues.apache.org/jira/browse/HDFS-799?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13413366#comment-13413366 ] Hudson commented on HDFS-799: - Integrated in Hadoop-Hdfs-trunk-Commit #2525 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk-Commit/2525/]) HDFS-799. libhdfs must call DetachCurrentThread when a thread is destroyed. Contributed by Colin Patrick McCabe (Revision 1361008) Result = SUCCESS eli : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1361008 Files : * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/CMakeLists.txt * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/config.h.cmake * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/native/hdfsJniHelper.c libhdfs must call DetachCurrentThread when a thread is destroyed Key: HDFS-799 URL: https://issues.apache.org/jira/browse/HDFS-799 Project: Hadoop HDFS Issue Type: Improvement Affects Versions: 2.0.0-alpha Reporter: Christian Kunz Assignee: Colin Patrick McCabe Fix For: 2.0.1-alpha Attachments: HDFS-799.001.patch, HDFS-799.003.patch, HDFS-799.004.patch, HDFS-799.005.patch Threads that call AttachCurrentThread in libhdfs and disappear without calling DetachCurrentThread cause a memory leak. Libhdfs should detach the current thread when this thread exits.
[jira] [Created] (HDFS-3654) TestJspHelper#testGetUgi may fail
Eli Collins created HDFS-3654: - Summary: TestJspHelper#testGetUgi may fail Key: HDFS-3654 URL: https://issues.apache.org/jira/browse/HDFS-3654 Project: Hadoop HDFS Issue Type: Improvement Affects Versions: 2.0.1-alpha Reporter: Eli Collins Assignee: Eli Collins Looks like my recent change in HDFS-3639 can occasionally cause this test to fail.
[jira] [Created] (HDFS-3655) datanode recoverRbw could hang sometimes
Ming Ma created HDFS-3655: - Summary: datanode recoverRbw could hang sometimes Key: HDFS-3655 URL: https://issues.apache.org/jira/browse/HDFS-3655 Project: Hadoop HDFS Issue Type: Bug Components: data-node Reporter: Ming Ma Fix For: 0.22.1 This bug seems to apply to 0.22 and hadoop 2.0. I will upload the initial fix done by my colleague Xiaobo Peng shortly (there is some logistics issue being worked on so that he can upload the patch himself later). recoverRbw tries to kill the old writer thread, but it took the lock (FSDataset monitor object) which the old writer thread is waiting on (for example the call to data.getTmpInputStreams).
DataXceiver for client /10.110.3.43:40193 [Receiving block blk_-3037542385914640638_57111747 client=DFSClient_attempt_201206021424_0001_m_000401_0] daemon prio=10 tid=0x7facf8111800 nid=0x6b64 in Object.wait() [0x7facd1ddb000]
   java.lang.Thread.State: WAITING (on object monitor)
        at java.lang.Object.wait(Native Method)
        at java.lang.Thread.join(Thread.java:1186)
        - locked 0x0007856c1200 (a org.apache.hadoop.util.Daemon)
        at java.lang.Thread.join(Thread.java:1239)
        at org.apache.hadoop.hdfs.server.datanode.ReplicaInPipeline.stopWriter(ReplicaInPipeline.java:158)
        at org.apache.hadoop.hdfs.server.datanode.FSDataset.recoverRbw(FSDataset.java:1347)
        - locked 0x0007838398c0 (a org.apache.hadoop.hdfs.server.datanode.FSDataset)
        at org.apache.hadoop.hdfs.server.datanode.BlockReceiver.init(BlockReceiver.java:119)
        at org.apache.hadoop.hdfs.server.datanode.DataXceiver.opWriteBlockInternal(DataXceiver.java:391)
        at org.apache.hadoop.hdfs.server.datanode.DataXceiver.opWriteBlock(DataXceiver.java:327)
        at org.apache.hadoop.hdfs.protocol.DataTransferProtocol$Receiver.opWriteBlock(DataTransferProtocol.java:405)
        at org.apache.hadoop.hdfs.protocol.DataTransferProtocol$Receiver.processOp(DataTransferProtocol.java:344)
        at org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:183)
        at java.lang.Thread.run(Thread.java:662)
[jira] [Commented] (HDFS-3610) fuse_dfs: Provide a way to use the default (configured) NN URI
[ https://issues.apache.org/jira/browse/HDFS-3610?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13413372#comment-13413372 ] Eli Collins commented on HDFS-3610: --- Looks good, since this patch contains HDFS-3609 let's get that one checked in then upload a patch here that's just the delta. TestBackupNode failure is unrelated. TestJspHelper is as well, filed HDFS-3654. fuse_dfs: Provide a way to use the default (configured) NN URI -- Key: HDFS-3610 URL: https://issues.apache.org/jira/browse/HDFS-3610 Project: Hadoop HDFS Issue Type: Bug Affects Versions: 2.0.0-alpha Reporter: Colin Patrick McCabe Assignee: Colin Patrick McCabe Priority: Minor Attachments: HDFS-3610.001.patch It shouldn't be necessary to explicitly spell out the NameNode you want to connect to when launching fuse_dfs. libhdfs can read the configuration files and use the default URI. However, we don't have a command-line option for this in fuse_dfs.
[jira] [Updated] (HDFS-3371) EditLogFileInputStream: be more careful about closing streams when we're done with them.
[ https://issues.apache.org/jira/browse/HDFS-3371?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eli Collins updated HDFS-3371: -- Status: Open (was: Patch Available) EditLogFileInputStream: be more careful about closing streams when we're done with them. Key: HDFS-3371 URL: https://issues.apache.org/jira/browse/HDFS-3371 Project: Hadoop HDFS Issue Type: Bug Reporter: Colin Patrick McCabe Assignee: Colin Patrick McCabe Priority: Minor Attachments: HDFS-3371.001.patch, HDFS-3371.002.patch EditLogFileInputStream#EditLogFileInputStream should be more careful about closing streams when there is an exception thrown. Also, EditLogFileInputStream#close should close all of the streams we opened in the constructor, not just one of them (although the file-backed one is probably the most important).
[jira] [Commented] (HDFS-3270) run valgrind on fuse-dfs, fix any memory leaks
[ https://issues.apache.org/jira/browse/HDFS-3270?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13413377#comment-13413377 ] Eli Collins commented on HDFS-3270: --- Colin, This patch is covered by HDFS-3609, anything left to do here? run valgrind on fuse-dfs, fix any memory leaks -- Key: HDFS-3270 URL: https://issues.apache.org/jira/browse/HDFS-3270 Project: Hadoop HDFS Issue Type: Improvement Reporter: Colin Patrick McCabe Assignee: Colin Patrick McCabe Priority: Minor Attachments: HDFS-3270.001.patch, HDFS-3270.002.patch run valgrind on fuse-dfs, fix any memory leaks
[jira] [Updated] (HDFS-3612) Single namenode image directory config warning can be improved
[ https://issues.apache.org/jira/browse/HDFS-3612?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andy Isaacson updated HDFS-3612: Attachment: hdfs3612-2.txt bq. double space after period I always type double space after period, but I agree it's irrelevant. Fixed. bq. role string Agreed, nice improvement. Attaching new patch. Single namenode image directory config warning can be improved -- Key: HDFS-3612 URL: https://issues.apache.org/jira/browse/HDFS-3612 Project: Hadoop HDFS Issue Type: Improvement Components: name-node Affects Versions: 2.0.0-alpha Reporter: Harsh J Assignee: Andy Isaacson Priority: Trivial Labels: newbie Attachments: hdfs3612-2.txt, hdfs3612.txt Currently, if you configure the NameNode to run with just one dfs.namenode.name.dir directory, it prints: {code} 12/07/08 20:00:22 WARN namenode.FSNamesystem: Only one dfs.namenode.name.dir directory configured , beware data loss!{code} We can improve this in a few ways as it is slightly ambiguous: # Fix punctuation spacing: there's always a space after a punctuation mark but never before one. # Perhaps the message is better printed with a reason why it may cause a scare of data loss. For instance, we could print "Detected a single storage directory in dfs.namenode.name.dir configuration. Beware of data loss due to lack of redundant storage directories." or so. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-3609) libhdfs: don't force the URI to look like hdfs://hostname:port
[ https://issues.apache.org/jira/browse/HDFS-3609?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13413381#comment-13413381 ] Colin Patrick McCabe commented on HDFS-3609: I've verified I can connect without an explicit port, using hdfs_test. There will be an opportunity to add more unit tests once HDFS-3606 is in. libhdfs: don't force the URI to look like hdfs://hostname:port -- Key: HDFS-3609 URL: https://issues.apache.org/jira/browse/HDFS-3609 Project: Hadoop HDFS Issue Type: Bug Components: libhdfs Affects Versions: 2.0.0-alpha Reporter: Colin Patrick McCabe Assignee: Colin Patrick McCabe Attachments: HDFS-3609.001.patch Currently, libhdfs forces the URI to look like hdfs://hostname:port. For configurations like HA or federation this is not ideal. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HDFS-3609) libhdfs: don't force the URI to look like hdfs://hostname:port
[ https://issues.apache.org/jira/browse/HDFS-3609?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Colin Patrick McCabe updated HDFS-3609: --- Attachment: HDFS-3609.002.patch * refer to 'hdfs://' as a 'scheme' rather than a 'prefix' libhdfs: don't force the URI to look like hdfs://hostname:port -- Key: HDFS-3609 URL: https://issues.apache.org/jira/browse/HDFS-3609 Project: Hadoop HDFS Issue Type: Bug Components: libhdfs Affects Versions: 2.0.0-alpha Reporter: Colin Patrick McCabe Assignee: Colin Patrick McCabe Attachments: HDFS-3609.001.patch, HDFS-3609.002.patch Currently, libhdfs forces the URI to look like hdfs://hostname:port. For configurations like HA or federation this is not ideal. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
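The point of HDFS-3609 is that a URI need not carry a hostname:port pair: an HA or federated deployment uses a logical authority such as hdfs://mycluster with no port. A self-contained C sketch of scheme-tolerant handling (the helper names `has_hdfs_scheme` and `hdfs_authority` are hypothetical, not the actual libhdfs internals):

```c
#include <string.h>

/* Return 1 if uri starts with the "hdfs://" scheme, else 0. */
int has_hdfs_scheme(const char *uri)
{
    return strncmp(uri, "hdfs://", strlen("hdfs://")) == 0;
}

/* Yield the authority part (everything after the scheme) without
 * insisting on a ":port" suffix, so logical HA URIs such as
 * "hdfs://mycluster" pass through unchanged, as do bare hosts. */
const char *hdfs_authority(const char *uri)
{
    return has_hdfs_scheme(uri) ? uri + strlen("hdfs://") : uri;
}
```

The resolution of authority to actual namenode addresses is then left to the client configuration rather than forced into the URI syntax.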
[jira] [Updated] (HDFS-3306) fuse_dfs: don't lock release operations
[ https://issues.apache.org/jira/browse/HDFS-3306?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eli Collins updated HDFS-3306: -- Resolution: Fixed Fix Version/s: 2.0.1-alpha Hadoop Flags: Reviewed Status: Resolved (was: Patch Available) Per offline conversation Colin tested this multithreaded. +1 I've committed this and merged to branch-2. fuse_dfs: don't lock release operations --- Key: HDFS-3306 URL: https://issues.apache.org/jira/browse/HDFS-3306 Project: Hadoop HDFS Issue Type: Bug Reporter: Colin Patrick McCabe Assignee: Colin Patrick McCabe Priority: Minor Fix For: 2.0.1-alpha Attachments: HDFS-3306.001.patch There's no need to lock release operations in FUSE, because release can only be called once on a fuse_file_info structure. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-3633) libhdfs: hdfsDelete should pass JNI_FALSE or JNI_TRUE
[ https://issues.apache.org/jira/browse/HDFS-3633?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13413393#comment-13413393 ] Hudson commented on HDFS-3633: -- Integrated in Hadoop-Mapreduce-trunk-Commit #2478 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk-Commit/2478/]) HDFS-3633. libhdfs: hdfsDelete should pass JNI_FALSE or JNI_TRUE. Contributed by Colin Patrick McCabe (Revision 1361005) Result = FAILURE eli : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1361005 Files : * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/native/hdfs.c libhdfs: hdfsDelete should pass JNI_FALSE or JNI_TRUE - Key: HDFS-3633 URL: https://issues.apache.org/jira/browse/HDFS-3633 Project: Hadoop HDFS Issue Type: Bug Components: libhdfs Affects Versions: 2.0.1-alpha Reporter: Colin Patrick McCabe Assignee: Colin Patrick McCabe Priority: Minor Fix For: 2.0.1-alpha Attachments: HDFS-3633.001.patch In libhdfs in hdfsDelete, the header file says any non-zero argument to hdfsDelete will be interpreted as true. However, the hdfsDelete function does not translate these non-zero values to JNI_FALSE and JNI_TRUE, potentially leading to undefined or JVM-specific behavior. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
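The fix amounts to normalizing the caller's int before it reaches the JVM: JNI defines JNI_FALSE as 0 and JNI_TRUE as 1, and passing an arbitrary non-zero value as a jboolean is what risks the JVM-specific behavior described above. A self-contained sketch of the translation (the constants and jboolean typedef are reproduced from jni.h so this compiles standalone; `to_jboolean` is a hypothetical helper name):

```c
/* Values and typedef as defined in jni.h; reproduced here so this
 * sketch is self-contained. */
#define JNI_FALSE 0
#define JNI_TRUE  1
typedef unsigned char jboolean;

/* Collapse any non-zero C int to the well-defined jboolean values,
 * the way hdfsDelete should before invoking FileSystem#delete. */
static jboolean to_jboolean(int flag)
{
    return flag ? JNI_TRUE : JNI_FALSE;
}
```

A call such as hdfsDelete(fs, path, 42) would then pass JNI_TRUE to the JVM instead of the raw 42.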
[jira] [Commented] (HDFS-3306) fuse_dfs: don't lock release operations
[ https://issues.apache.org/jira/browse/HDFS-3306?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13413397#comment-13413397 ] Hudson commented on HDFS-3306: -- Integrated in Hadoop-Common-trunk-Commit #2460 (See [https://builds.apache.org/job/Hadoop-Common-trunk-Commit/2460/]) HDFS-3306. fuse_dfs: don't lock release operations. Contributed by Colin Patrick McCabe (Revision 1361021) Result = SUCCESS eli : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1361021 Files : * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/contrib/fuse-dfs/src/fuse_impls_release.c fuse_dfs: don't lock release operations --- Key: HDFS-3306 URL: https://issues.apache.org/jira/browse/HDFS-3306 Project: Hadoop HDFS Issue Type: Bug Reporter: Colin Patrick McCabe Assignee: Colin Patrick McCabe Priority: Minor Fix For: 2.0.1-alpha Attachments: HDFS-3306.001.patch There's no need to lock release operations in FUSE, because release can only be called once on a fuse_file_info structure. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-3306) fuse_dfs: don't lock release operations
[ https://issues.apache.org/jira/browse/HDFS-3306?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13413400#comment-13413400 ] Hudson commented on HDFS-3306: -- Integrated in Hadoop-Hdfs-trunk-Commit #2526 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk-Commit/2526/]) HDFS-3306. fuse_dfs: don't lock release operations. Contributed by Colin Patrick McCabe (Revision 1361021) Result = SUCCESS eli : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1361021 Files : * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/contrib/fuse-dfs/src/fuse_impls_release.c fuse_dfs: don't lock release operations --- Key: HDFS-3306 URL: https://issues.apache.org/jira/browse/HDFS-3306 Project: Hadoop HDFS Issue Type: Bug Reporter: Colin Patrick McCabe Assignee: Colin Patrick McCabe Priority: Minor Fix For: 2.0.1-alpha Attachments: HDFS-3306.001.patch There's no need to lock release operations in FUSE, because release can only be called once on a fuse_file_info structure. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira