[jira] [Updated] (HDFS-4762) Provide HDFS based NFSv3 and Mountd implementation
[ https://issues.apache.org/jira/browse/HDFS-4762?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Brandon Li updated HDFS-4762: - Attachment: HDFS-4762.patch.4 Provide HDFS based NFSv3 and Mountd implementation -- Key: HDFS-4762 URL: https://issues.apache.org/jira/browse/HDFS-4762 Project: Hadoop HDFS Issue Type: Sub-task Affects Versions: 3.0.0 Reporter: Brandon Li Assignee: Brandon Li Attachments: HDFS-4762.patch, HDFS-4762.patch.2, HDFS-4762.patch.3, HDFS-4762.patch.3, HDFS-4762.patch.4 This is to track the implementation of NFSv3 to HDFS. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-4762) Provide HDFS based NFSv3 and Mountd implementation
[ https://issues.apache.org/jira/browse/HDFS-4762?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13696601#comment-13696601 ] Brandon Li commented on HDFS-4762: -- Uploaded new patch to address Nicholas' comments. Thanks! Provide HDFS based NFSv3 and Mountd implementation -- Key: HDFS-4762 URL: https://issues.apache.org/jira/browse/HDFS-4762 Project: Hadoop HDFS Issue Type: Sub-task Affects Versions: 3.0.0 Reporter: Brandon Li Assignee: Brandon Li Attachments: HDFS-4762.patch, HDFS-4762.patch.2, HDFS-4762.patch.3, HDFS-4762.patch.3, HDFS-4762.patch.4 This is to track the implementation of NFSv3 to HDFS. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-4762) Provide HDFS based NFSv3 and Mountd implementation
[ https://issues.apache.org/jira/browse/HDFS-4762?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13696620#comment-13696620 ] Hadoop QA commented on HDFS-4762: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12590225/HDFS-4762.patch.4 against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 7 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. The javadoc tool did not generate any warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:red}-1 findbugs{color}. The patch appears to introduce 6 new Findbugs (version 1.3.9) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:red}-1 core tests{color}. The patch failed these unit tests in hadoop-common-project/hadoop-nfs hadoop-hdfs-project/hadoop-hdfs-nfs: org.apache.hadoop.hdfs.nfs.nfs3.TestOffsetRange {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/4581//testReport/ Findbugs warnings: https://builds.apache.org/job/PreCommit-HDFS-Build/4581//artifact/trunk/patchprocess/newPatchFindbugsWarningshadoop-hdfs-nfs.html Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/4581//console This message is automatically generated. Provide HDFS based NFSv3 and Mountd implementation -- Key: HDFS-4762 URL: https://issues.apache.org/jira/browse/HDFS-4762 Project: Hadoop HDFS Issue Type: Sub-task Affects Versions: 3.0.0 Reporter: Brandon Li Assignee: Brandon Li Attachments: HDFS-4762.patch, HDFS-4762.patch.2, HDFS-4762.patch.3, HDFS-4762.patch.3, HDFS-4762.patch.4 This is to track the implementation of NFSv3 to HDFS. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HDFS-4946) Allow preferLocalNode in BlockPlacementPolicyDefault to be configurable
[ https://issues.apache.org/jira/browse/HDFS-4946?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] James Kinley updated HDFS-4946: --- Summary: Allow preferLocalNode in BlockPlacementPolicyDefault to be configurable (was: Allow preferLocalNode to be configurable in BlockPlacementPolicyDefault) Allow preferLocalNode in BlockPlacementPolicyDefault to be configurable --- Key: HDFS-4946 URL: https://issues.apache.org/jira/browse/HDFS-4946 Project: Hadoop HDFS Issue Type: Improvement Components: namenode Affects Versions: 2.0.0-alpha Reporter: James Kinley Allow preferLocalNode in BlockPlacementPolicyDefault to be disabled in configuration to prevent a client from writing the first replica of every block (i.e. the entire file) to the local DataNode. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Created] (HDFS-4946) Allow preferLocalNode to be configurable in BlockPlacementPolicyDefault
James Kinley created HDFS-4946: -- Summary: Allow preferLocalNode to be configurable in BlockPlacementPolicyDefault Key: HDFS-4946 URL: https://issues.apache.org/jira/browse/HDFS-4946 Project: Hadoop HDFS Issue Type: Improvement Components: namenode Affects Versions: 2.0.0-alpha Reporter: James Kinley Allow preferLocalNode in BlockPlacementPolicyDefault to be disabled in configuration to prevent a client from writing the first replica of every block (i.e. the entire file) to the local DataNode. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
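A minimal sketch of how such a switch could be wired up, assuming a hypothetical configuration key (the key name below is illustrative only and not part of the proposal):

{code}
import org.apache.hadoop.conf.Configuration;

public class PreferLocalNodeExample {
  // Hypothetical key name used only for illustration; the real name would be
  // decided by the patch for this JIRA.
  static final String PREFER_LOCAL_NODE_KEY = "dfs.block-placement.prefer-local-node";

  public static void main(String[] args) {
    Configuration conf = new Configuration();
    // Disable writing the first replica of every block to the local DataNode.
    conf.setBoolean(PREFER_LOCAL_NODE_KEY, false);

    // BlockPlacementPolicyDefault#initialize could then read the flag like this,
    // defaulting to the current behavior (true) when the key is absent.
    boolean preferLocalNode = conf.getBoolean(PREFER_LOCAL_NODE_KEY, true);
    System.out.println("preferLocalNode = " + preferLocalNode);
  }
}
{code}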
[jira] [Commented] (HDFS-4797) BlockScanInfo does not override equals(..) and hashCode() consistently
[ https://issues.apache.org/jira/browse/HDFS-4797?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13696718#comment-13696718 ] Hudson commented on HDFS-4797: -- Integrated in Hadoop-Yarn-trunk #257 (See [https://builds.apache.org/job/Hadoop-Yarn-trunk/257/]) HDFS-4797. BlockScanInfo does not override equals(..) and hashCode() consistently. (Revision 1498202) Result = SUCCESS szetszwo : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1498202 Files : * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/BlockPoolSliceScanner.java BlockScanInfo does not override equals(..) and hashCode() consistently -- Key: HDFS-4797 URL: https://issues.apache.org/jira/browse/HDFS-4797 Project: Hadoop HDFS Issue Type: Bug Components: datanode Reporter: Tsz Wo (Nicholas), SZE Assignee: Tsz Wo (Nicholas), SZE Priority: Minor Fix For: 2.2.0 Attachments: h4797_20130513b.patch, h4797_20130513.patch In the code below, equals(..) compares lastScanTime but hashCode() is computed using block ID. Therefore, it could have two BlockScanInfo objects which are equal but have two different hash codes. {code} //BlockScanInfo @Override public int hashCode() { return block.hashCode(); } @Override public boolean equals(Object other) { return other instanceof BlockScanInfo && compareTo((BlockScanInfo)other) == 0; } {code} -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
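For context on the contract being fixed here, a small self-contained sketch (not the actual BlockScanInfo class) showing equals(..) and hashCode() both derived from the block identity, so that objects that compare equal always share a hash code while compareTo(..) still orders by scan time:

{code}
public class ScanInfoSketch implements Comparable<ScanInfoSketch> {
  private final long blockId; // stands in for the wrapped Block
  long lastScanTime;

  ScanInfoSketch(long blockId) {
    this.blockId = blockId;
  }

  @Override
  public int compareTo(ScanInfoSketch that) {
    // Order by last scan time, breaking ties by block id so the ordering
    // stays consistent with equals(..).
    int d = Long.compare(lastScanTime, that.lastScanTime);
    return d != 0 ? d : Long.compare(blockId, that.blockId);
  }

  @Override
  public int hashCode() {
    // Same identity as equals(..): two equal objects get the same hash code.
    return Long.hashCode(blockId);
  }

  @Override
  public boolean equals(Object obj) {
    return obj instanceof ScanInfoSketch && blockId == ((ScanInfoSketch) obj).blockId;
  }
}
{code}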
[jira] [Commented] (HDFS-4915) Add config to ZKFC to disable fencing
[ https://issues.apache.org/jira/browse/HDFS-4915?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13696779#comment-13696779 ] Uma Maheswara Rao G commented on HDFS-4915: --- I think this is the same as HDFS-3862, right? Add config to ZKFC to disable fencing - Key: HDFS-4915 URL: https://issues.apache.org/jira/browse/HDFS-4915 Project: Hadoop HDFS Issue Type: Bug Components: ha Affects Versions: 3.0.0 Reporter: Todd Lipcon With QuorumJournalManager, it's not important for the ZKFCs to perform any fencing. We currently work around this by setting the fencer to /bin/true, but the ZKFC still does things like create breadcrumb znodes, etc. It would be simpler to add a config to disable fencing, and then the ZKFC's job would be simpler. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-4945) A Distributed and Cooperative NameNode Cluster for a Highly-Available HDFS
[ https://issues.apache.org/jira/browse/HDFS-4945?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13696782#comment-13696782 ] Uma Maheswara Rao G commented on HDFS-4945: --- Suresh has already asked most of the questions for more clarity on this feature. I have one additional question: {quote} When each fragment has k replicas, the file system can tolerate up to floor(k/2 - 1) faulty NameNodes. {quote} How/where will you manage the metadata for these fragments? Regards, Uma A Distributed and Cooperative NameNode Cluster for a Highly-Available HDFS -- Key: HDFS-4945 URL: https://issues.apache.org/jira/browse/HDFS-4945 Project: Hadoop HDFS Issue Type: New Feature Components: auto-failover Affects Versions: HA branch (HDFS-1623) Reporter: Yonghwan Kim Labels: documentation See the following comment for a detailed description. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-4797) BlockScanInfo does not override equals(..) and hashCode() consistently
[ https://issues.apache.org/jira/browse/HDFS-4797?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13696784#comment-13696784 ] Hudson commented on HDFS-4797: -- Integrated in Hadoop-Hdfs-trunk #1447 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk/1447/]) HDFS-4797. BlockScanInfo does not override equals(..) and hashCode() consistently. (Revision 1498202) Result = FAILURE szetszwo : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1498202 Files : * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/BlockPoolSliceScanner.java BlockScanInfo does not override equals(..) and hashCode() consistently -- Key: HDFS-4797 URL: https://issues.apache.org/jira/browse/HDFS-4797 Project: Hadoop HDFS Issue Type: Bug Components: datanode Reporter: Tsz Wo (Nicholas), SZE Assignee: Tsz Wo (Nicholas), SZE Priority: Minor Fix For: 2.2.0 Attachments: h4797_20130513b.patch, h4797_20130513.patch In the code below, equals(..) compares lastScanTime but hashCode() is computed using block ID. Therefore, it could have two BlockScanInfo objects which are equal but have two different hash codes. {code} //BlockScanInfo @Override public int hashCode() { return block.hashCode(); } @Override public boolean equals(Object other) { return other instanceof BlockScanInfo && compareTo((BlockScanInfo)other) == 0; } {code} -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-4937) ReplicationMonitor can infinite-loop in BlockPlacementPolicyDefault#chooseRandom()
[ https://issues.apache.org/jira/browse/HDFS-4937?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13696797#comment-13696797 ] Uma Maheswara Rao G commented on HDFS-4937: --- Hi Kihwal, you said in the comment that the operator added a large number of new nodes, right? Even then, was it not able to choose at least from them? ReplicationMonitor can infinite-loop in BlockPlacementPolicyDefault#chooseRandom() -- Key: HDFS-4937 URL: https://issues.apache.org/jira/browse/HDFS-4937 Project: Hadoop HDFS Issue Type: Bug Components: namenode Affects Versions: 2.0.4-alpha, 0.23.8 Reporter: Kihwal Lee When a large number of nodes are removed by refreshing node lists, the network topology is updated. If the refresh happens at the right moment, the replication monitor thread may get stuck in the while loop of {{chooseRandom()}}. This is because the cached cluster size is used in the terminal condition check of the loop. This usually happens when a block with a high replication factor is being processed. Since replicas/rack is also calculated beforehand, no node choice may satisfy the goodness criteria if refreshing removed racks. All nodes will end up in the excluded list, but the size will still be less than the cached cluster size, so it will loop infinitely. This was observed in a production environment. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-4797) BlockScanInfo does not override equals(..) and hashCode() consistently
[ https://issues.apache.org/jira/browse/HDFS-4797?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13696834#comment-13696834 ] Hudson commented on HDFS-4797: -- Integrated in Hadoop-Mapreduce-trunk #1474 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1474/]) HDFS-4797. BlockScanInfo does not override equals(..) and hashCode() consistently. (Revision 1498202) Result = SUCCESS szetszwo : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1498202 Files : * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/BlockPoolSliceScanner.java BlockScanInfo does not override equals(..) and hashCode() consistently -- Key: HDFS-4797 URL: https://issues.apache.org/jira/browse/HDFS-4797 Project: Hadoop HDFS Issue Type: Bug Components: datanode Reporter: Tsz Wo (Nicholas), SZE Assignee: Tsz Wo (Nicholas), SZE Priority: Minor Fix For: 2.2.0 Attachments: h4797_20130513b.patch, h4797_20130513.patch In the code below, equals(..) compares lastScanTime but hashCode() is computed using block ID. Therefore, it could have two BlockScanInfo objects which are equal but have two different hash codes. {code} //BlockScanInfo @Override public int hashCode() { return block.hashCode(); } @Override public boolean equals(Object other) { return other instanceof BlockScanInfo && compareTo((BlockScanInfo)other) == 0; } {code} -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-2856) Fix block protocol so that Datanodes don't require root or jsvc
[ https://issues.apache.org/jira/browse/HDFS-2856?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13696838#comment-13696838 ] Daryn Sharp commented on HDFS-2856: --- I haven't digested the whole jira, but want to request more info about: bq. The only potential downside I see is that if we ever pipeline multiple operations over a single connection, then we'd need to renegotiate SASL per operation, because the authorization decision may be different per block I've made some RPCv9 changes to allow the future possibility to multiplex connections. Will multiplexing help with this jira's use case? If so, SASL negotiation per operation should not be necessary as negotiation will occur per virtual stream. Fix block protocol so that Datanodes don't require root or jsvc --- Key: HDFS-2856 URL: https://issues.apache.org/jira/browse/HDFS-2856 Project: Hadoop HDFS Issue Type: Improvement Components: datanode, security Reporter: Owen O'Malley Assignee: Chris Nauroth Attachments: Datanode-Security-Design.pdf, Datanode-Security-Design.pdf, Datanode-Security-Design.pdf Since we send the block tokens unencrypted to the datanode, we currently start the datanode as root using jsvc and get a secure (< 1024) port. If we have the datanode generate a nonce and send it on the connection, and the client sends an hmac of the nonce back instead of the block token, it won't reveal any secrets. Thus, we wouldn't require a secure port and would not require root or jsvc. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
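As a rough illustration of the nonce/HMAC exchange described in this issue, here is a self-contained sketch; it is not the DataTransferProtocol code, and the shared secret is a stand-in for whatever the block token secret manager would provide:

{code}
import java.security.SecureRandom;
import java.util.Arrays;
import javax.crypto.Mac;
import javax.crypto.spec.SecretKeySpec;

public class NonceHmacSketch {
  public static void main(String[] args) throws Exception {
    // Datanode side: generate a nonce and send it to the client in the clear.
    byte[] nonce = new byte[16];
    new SecureRandom().nextBytes(nonce);

    // Stand-in for the secret shared via the block token mechanism.
    byte[] sharedSecret = "example-shared-secret".getBytes("UTF-8");

    // Client side: reply with an HMAC of the nonce instead of the raw block token.
    Mac clientMac = Mac.getInstance("HmacSHA1");
    clientMac.init(new SecretKeySpec(sharedSecret, "HmacSHA1"));
    byte[] clientReply = clientMac.doFinal(nonce);

    // Datanode side: recompute the HMAC with its copy of the secret and compare.
    Mac serverMac = Mac.getInstance("HmacSHA1");
    serverMac.init(new SecretKeySpec(sharedSecret, "HmacSHA1"));
    boolean authorized = Arrays.equals(clientReply, serverMac.doFinal(nonce));
    System.out.println("authorized = " + authorized);
  }
}
{code}

Since only the nonce and its HMAC cross the wire, no secret is exposed even on an unencrypted, unprivileged port, which is the motivation stated in the issue description.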
[jira] [Updated] (HDFS-4696) Branch 0.23 Patch for Block Replication Policy Implementation May Skip Higher-Priority Blocks for Lower-Priority Blocks
[ https://issues.apache.org/jira/browse/HDFS-4696?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Thomas Graves updated HDFS-4696: Target Version/s: 0.23.10 (was: 0.23.9) Branch 0.23 Patch for Block Replication Policy Implementation May Skip Higher-Priority Blocks for Lower-Priority Blocks - Key: HDFS-4696 URL: https://issues.apache.org/jira/browse/HDFS-4696 Project: Hadoop HDFS Issue Type: Bug Affects Versions: 0.23.5 Reporter: Derek Dagit Assignee: Derek Dagit This JIRA tracks the solution to HDFS-4366 for the 0.23 branch. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-4887) TestNNThroughputBenchmark exits abruptly
[ https://issues.apache.org/jira/browse/HDFS-4887?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13696863#comment-13696863 ] Kihwal Lee commented on HDFS-4887: -- bq. For the new patch, do we need to declare checkNSRunning as volatile, since it can be set and retrieved by different threads? The replication monitor thread accesses this variable only once when terminating, so there will be no issue. TestNNThroughputBenchmark exits abruptly Key: HDFS-4887 URL: https://issues.apache.org/jira/browse/HDFS-4887 Project: Hadoop HDFS Issue Type: Bug Components: benchmarks, test Affects Versions: 3.0.0, 2.1.0-beta Reporter: Kihwal Lee Assignee: Kihwal Lee Attachments: HDFS-4887.patch, HDFS-4887.patch After HDFS-4840, TestNNThroughputBenchmark exits in the middle. This is because ReplicationMonitor is being stopped while NN is still running. This is only valid during testing. In normal cases, ReplicationMonitor thread runs all the time once started. In standby or safemode, it just skips calculating DN work. I think NNThroughputBenchmark needs to use ExitUtil to prevent termination, rather than modifying ReplicationMonitor. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
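For readers following the volatile question above: a volatile flag is the usual way to publish a stop signal across threads when the flag is read repeatedly in a loop; the point being made is that a single read at termination does not need it. A generic sketch (not the NNThroughputBenchmark or ReplicationMonitor code):

{code}
public class ShutdownFlagSketch {
  // volatile guarantees the worker thread sees the update without extra locking.
  private volatile boolean running = true;

  void stop() {
    running = false; // set by the controlling thread
  }

  void monitorLoop() {
    while (running) { // read repeatedly by the worker thread
      try {
        Thread.sleep(100); // placeholder for periodic replication work
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt();
        return;
      }
    }
  }

  public static void main(String[] args) throws InterruptedException {
    ShutdownFlagSketch sketch = new ShutdownFlagSketch();
    Thread worker = new Thread(sketch::monitorLoop);
    worker.start();
    Thread.sleep(300);
    sketch.stop();
    worker.join();
  }
}
{code}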
[jira] [Commented] (HDFS-4762) Provide HDFS based NFSv3 and Mountd implementation
[ https://issues.apache.org/jira/browse/HDFS-4762?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13696877#comment-13696877 ] Tsz Wo (Nicholas), SZE commented on HDFS-4762: -- It seems that TestOffsetRange is incorrect: r2 and r4 have overlap but the compareTo(..) method does not allow it. Please also fix the findbugs warnings. Provide HDFS based NFSv3 and Mountd implementation -- Key: HDFS-4762 URL: https://issues.apache.org/jira/browse/HDFS-4762 Project: Hadoop HDFS Issue Type: Sub-task Affects Versions: 3.0.0 Reporter: Brandon Li Assignee: Brandon Li Attachments: HDFS-4762.patch, HDFS-4762.patch.2, HDFS-4762.patch.3, HDFS-4762.patch.3, HDFS-4762.patch.4 This is to track the implementation of NFSv3 to HDFS. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-4888) Refactor and fix FSNamesystem.getTurnOffTip to sanity
[ https://issues.apache.org/jira/browse/HDFS-4888?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13696891#comment-13696891 ] Kihwal Lee commented on HDFS-4888: -- +1 Refactor and fix FSNamesystem.getTurnOffTip to sanity - Key: HDFS-4888 URL: https://issues.apache.org/jira/browse/HDFS-4888 Project: Hadoop HDFS Issue Type: Bug Affects Versions: 3.0.0, 2.0.4-alpha, 0.23.9 Reporter: Ravi Prakash Assignee: Ravi Prakash Attachments: HDFS-4888.patch, HDFS-4888.patch, HDFS-4888.patch e.g. When resources are low, the command to leave safe mode is not printed. This method is unnecessarily complex -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-4937) ReplicationMonitor can infinite-loop in BlockPlacementPolicyDefault#chooseRandom()
[ https://issues.apache.org/jira/browse/HDFS-4937?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13696922#comment-13696922 ] Kihwal Lee commented on HDFS-4937: -- bq. Even then, was it not able to choose at least from them? It couldn't pick enough nodes because the max replicas/rack was already calculated. I think it worked fine for the majority of blocks with 3 replicas, since the cluster had more than 3 racks even after the refresh. The issue was with blocks with many more replicas. But picking enough nodes is just one condition; the other is checking for the exhaustion of candidate nodes. It would have bailed out of the while loop if the cached cluster size were updated inside the loop. To avoid a frequent cluster-size refresh for this rare condition, we can make it update the cached value after {{dfs.replication.max}} iterations, within which most blocks should find all they need. If the NN hits this issue, it will loop {{dfs.replication.max}} times and break out. I prefer this over adding locking, which would slow down normal cases. ReplicationMonitor can infinite-loop in BlockPlacementPolicyDefault#chooseRandom() -- Key: HDFS-4937 URL: https://issues.apache.org/jira/browse/HDFS-4937 Project: Hadoop HDFS Issue Type: Bug Components: namenode Affects Versions: 2.0.4-alpha, 0.23.8 Reporter: Kihwal Lee When a large number of nodes are removed by refreshing node lists, the network topology is updated. If the refresh happens at the right moment, the replication monitor thread may get stuck in the while loop of {{chooseRandom()}}. This is because the cached cluster size is used in the terminal condition check of the loop. This usually happens when a block with a high replication factor is being processed. Since replicas/rack is also calculated beforehand, no node choice may satisfy the goodness criteria if refreshing removed racks. All nodes will end up in the excluded list, but the size will still be less than the cached cluster size, so it will loop infinitely. This was observed in a production environment. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
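A simplified, self-contained sketch of the bounded-retry idea suggested in the comment above (stand-in fields and methods, not the actual BlockPlacementPolicyDefault code): the cached cluster size is re-read only after a bounded number of iterations, so a topology refresh during the loop cannot make it spin forever.

{code}
import java.util.HashSet;
import java.util.Random;
import java.util.Set;

public class ChooseRandomSketch {
  private final Random rand = new Random();
  private volatile int clusterSize = 100; // stand-in for NetworkTopology#getNumOfLeaves()
  private final int maxReplication = 512; // stand-in for dfs.replication.max

  /** Returns a chosen node id, or -1 once the candidate space is exhausted. */
  int chooseRandom(Set<Integer> excluded) {
    int cachedSize = clusterSize;
    int iterations = 0;
    while (excluded.size() < cachedSize) {
      int candidate = rand.nextInt(cachedSize);
      if (!excluded.contains(candidate) && isGoodTarget(candidate)) {
        return candidate;
      }
      excluded.add(candidate);
      // Bound the retries: periodically re-read the (possibly shrunk) cluster
      // size instead of trusting the value cached before the loop started.
      if (++iterations >= maxReplication) {
        cachedSize = clusterSize;
        iterations = 0;
      }
    }
    return -1;
  }

  private boolean isGoodTarget(int candidate) {
    return candidate % 2 == 0; // placeholder goodness check
  }

  public static void main(String[] args) {
    System.out.println(new ChooseRandomSketch().chooseRandom(new HashSet<Integer>()));
  }
}
{code}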
[jira] [Commented] (HDFS-4851) Deadlock in pipeline recovery
[ https://issues.apache.org/jira/browse/HDFS-4851?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13697062#comment-13697062 ] Andrew Wang commented on HDFS-4851: --- Hey Uma, thanks for taking a look! I may not understand your proposal entirely, but I found it pretty complex to interrupt while not holding the lock (see the patch in HDFS-3655 for the general idea). The core issue is that more recovery threads can keep coming in, so even if we interrupt the current old writer, by the time we re-get the FSD lock to rbw.setWriter to ourselves, some other recovery thread might have again come in and we need to interrupt them too. Repeating the stopWriter requires re-doing the precondition checks in the three places we call stopWriter, each of which have different preconditions. Would love if a simpler or better solution is present though, so please let me know if I missed something. Deadlock in pipeline recovery - Key: HDFS-4851 URL: https://issues.apache.org/jira/browse/HDFS-4851 Project: Hadoop HDFS Issue Type: Bug Components: datanode Affects Versions: 3.0.0, 2.0.4-alpha Reporter: Andrew Wang Assignee: Andrew Wang Attachments: hdfs-4851-1.patch Here's a deadlock scenario that cropped up during pipeline recovery, debugged through jstacks. Todd tipped me off to this one. # Pipeline fails, client initiates recovery. We have the old leftover DataXceiver, and a new one doing recovery. # New DataXceiver does {{recoverRbw}}, grabbing the {{FsDatasetImpl}} lock # Old DataXceiver is in {{BlockReceiver#computePartialChunkCrc}}, calls {{FsDatasetImpl#getTmpInputStreams}} and blocks on the {{FsDatasetImpl}} lock. # New DataXceiver {{ReplicaInPipeline#stopWriter}}, interrupting the old DataXceiver and then joining on it. # Boom, deadlock. New DX holds the {{FsDatasetImpl}} lock and is joining on the old DX, which is in turn waiting on the {{FsDatasetImpl}} lock. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-2856) Fix block protocol so that Datanodes don't require root or jsvc
[ https://issues.apache.org/jira/browse/HDFS-2856?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13697123#comment-13697123 ] Chris Nauroth commented on HDFS-2856: - {quote} Will multiplexing help with this jira's use case? {quote} My comment referred to the fact that block-level operations, like readBlock and writeBlock, require a unique authorization decision per block, using a different block access token for each one. If multiple readBlock/writeBlock calls were pipelined over a single connection, then we'd need to check authorization on each one. If authorization for DataTransferProtocol is moving fully to SASL, then this implies to me that we would need to renegotiate SASL at the start of each block-level operation. I don't see a way for multiplexing to help with this problem, because there would still be the problem that we don't know what block the client requested until we start inspecting the front of the message. I haven't followed the RPCv9 changes closely though, so if I'm misunderstanding, please let me know. Thanks, Daryn. Fix block protocol so that Datanodes don't require root or jsvc --- Key: HDFS-2856 URL: https://issues.apache.org/jira/browse/HDFS-2856 Project: Hadoop HDFS Issue Type: Improvement Components: datanode, security Reporter: Owen O'Malley Assignee: Chris Nauroth Attachments: Datanode-Security-Design.pdf, Datanode-Security-Design.pdf, Datanode-Security-Design.pdf Since we send the block tokens unencrypted to the datanode, we currently start the datanode as root using jsvc and get a secure (< 1024) port. If we have the datanode generate a nonce and send it on the connection, and the client sends an hmac of the nonce back instead of the block token, it won't reveal any secrets. Thus, we wouldn't require a secure port and would not require root or jsvc. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-4888) Refactor and fix FSNamesystem.getTurnOffTip to sanity
[ https://issues.apache.org/jira/browse/HDFS-4888?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13697134#comment-13697134 ] Hudson commented on HDFS-4888: -- Integrated in Hadoop-trunk-Commit #4025 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/4025/]) HDFS-4888. Refactor and fix FSNamesystem.getTurnOffTip. Contributed by Ravi Prakash. (Revision 1498665) Result = SUCCESS kihwal : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVNview=revrev=1498665 Files : * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSNamesystem.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestSafeMode.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/ha/TestHASafeMode.java Refactor and fix FSNamesystem.getTurnOffTip to sanity - Key: HDFS-4888 URL: https://issues.apache.org/jira/browse/HDFS-4888 Project: Hadoop HDFS Issue Type: Bug Affects Versions: 3.0.0, 2.0.4-alpha, 0.23.9 Reporter: Ravi Prakash Assignee: Ravi Prakash Attachments: HDFS-4888.patch, HDFS-4888.patch, HDFS-4888.patch e.g. When resources are low, the command to leave safe mode is not printed. This method is unnecessarily complex -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HDFS-4888) Refactor and fix FSNamesystem.getTurnOffTip to sanity
[ https://issues.apache.org/jira/browse/HDFS-4888?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kihwal Lee updated HDFS-4888: - Resolution: Fixed Fix Version/s: 2.1.0-beta 3.0.0 Status: Resolved (was: Patch Available) Refactor and fix FSNamesystem.getTurnOffTip to sanity - Key: HDFS-4888 URL: https://issues.apache.org/jira/browse/HDFS-4888 Project: Hadoop HDFS Issue Type: Bug Affects Versions: 3.0.0, 2.0.4-alpha, 0.23.9 Reporter: Ravi Prakash Assignee: Ravi Prakash Fix For: 3.0.0, 2.1.0-beta Attachments: HDFS-4888.patch, HDFS-4888.patch, HDFS-4888.patch e.g. When resources are low, the command to leave safe mode is not printed. This method is unnecessarily complex -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-4888) Refactor and fix FSNamesystem.getTurnOffTip to sanity
[ https://issues.apache.org/jira/browse/HDFS-4888?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13697150#comment-13697150 ] Kihwal Lee commented on HDFS-4888: -- Thanks for working on the fix, Ravi. I've committed this to trunk, branch-2 and branch-2.1-beta. Refactor and fix FSNamesystem.getTurnOffTip to sanity - Key: HDFS-4888 URL: https://issues.apache.org/jira/browse/HDFS-4888 Project: Hadoop HDFS Issue Type: Bug Affects Versions: 3.0.0, 2.0.4-alpha, 0.23.9 Reporter: Ravi Prakash Assignee: Ravi Prakash Attachments: HDFS-4888.patch, HDFS-4888.patch, HDFS-4888.patch e.g. When resources are low, the command to leave safe mode is not printed. This method is unnecessarily complex -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-4905) Add appendToFile command to hdfs dfs
[ https://issues.apache.org/jira/browse/HDFS-4905?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13697180#comment-13697180 ] Arpit Agarwal commented on HDFS-4905: - Thanks for the great feedback Chris, all reasonable points. I'll post an updated patch soon. Add appendToFile command to hdfs dfs -- Key: HDFS-4905 URL: https://issues.apache.org/jira/browse/HDFS-4905 Project: Hadoop HDFS Issue Type: Bug Components: tools Affects Versions: 3.0.0 Reporter: Arpit Agarwal Assignee: Arpit Agarwal Priority: Minor Attachments: HDFS-4905.patch A hdfs dfs -appendToFile... option would be quite useful for quick testing. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-4940) namenode OOMs under Bigtop's TestCLI
[ https://issues.apache.org/jira/browse/HDFS-4940?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13697268#comment-13697268 ] Suresh Srinivas commented on HDFS-4940: --- bq. Have you checked out the heap dump? Yes. I agree with your assessment based on that alone. bq. I'm still not sure how the test is causing this problem It would be good to get Bigtop results for this to understand what causes this. namenode OOMs under Bigtop's TestCLI Key: HDFS-4940 URL: https://issues.apache.org/jira/browse/HDFS-4940 Project: Hadoop HDFS Issue Type: Bug Affects Versions: 2.1.0-beta Reporter: Roman Shaposhnik Priority: Blocker Fix For: 2.1.0-beta Bigtop's TestCLI, when executed against Hadoop 2.1.0, seems to make it OOM quite reliably regardless of the heap size settings. I'm attaching a heap dump URL. Alternatively, anybody can just take Bigtop's tests, compile them against Hadoop 2.1.0 bits, and try to reproduce it. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HDFS-4762) Provide HDFS based NFSv3 and Mountd implementation
[ https://issues.apache.org/jira/browse/HDFS-4762?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Brandon Li updated HDFS-4762: - Attachment: HDFS-4762.patch.5 Provide HDFS based NFSv3 and Mountd implementation -- Key: HDFS-4762 URL: https://issues.apache.org/jira/browse/HDFS-4762 Project: Hadoop HDFS Issue Type: Sub-task Affects Versions: 3.0.0 Reporter: Brandon Li Assignee: Brandon Li Attachments: HDFS-4762.patch, HDFS-4762.patch.2, HDFS-4762.patch.3, HDFS-4762.patch.3, HDFS-4762.patch.4, HDFS-4762.patch.5 This is to track the implementation of NFSv3 to HDFS. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-4762) Provide HDFS based NFSv3 and Mountd implementation
[ https://issues.apache.org/jira/browse/HDFS-4762?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13697276#comment-13697276 ] Brandon Li commented on HDFS-4762: -- {quote}It seems that TestOffsetRange is incorrect: r2 and r4 have overlap but the compareTo(..) method does not allow it.{quote} Updated the patch to handle overlap outside OffsetRange class. Provide HDFS based NFSv3 and Mountd implementation -- Key: HDFS-4762 URL: https://issues.apache.org/jira/browse/HDFS-4762 Project: Hadoop HDFS Issue Type: Sub-task Affects Versions: 3.0.0 Reporter: Brandon Li Assignee: Brandon Li Attachments: HDFS-4762.patch, HDFS-4762.patch.2, HDFS-4762.patch.3, HDFS-4762.patch.3, HDFS-4762.patch.4, HDFS-4762.patch.5 This is to track the implementation of NFSv3 to HDFS. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-4762) Provide HDFS based NFSv3 and Mountd implementation
[ https://issues.apache.org/jira/browse/HDFS-4762?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13697286#comment-13697286 ] Hadoop QA commented on HDFS-4762: - {color:green}+1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12590316/HDFS-4762.patch.5 against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 7 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. The javadoc tool did not generate any warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 core tests{color}. The patch passed unit tests in hadoop-common-project/hadoop-nfs hadoop-hdfs-project/hadoop-hdfs-nfs. {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/4582//testReport/ Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/4582//console This message is automatically generated. Provide HDFS based NFSv3 and Mountd implementation -- Key: HDFS-4762 URL: https://issues.apache.org/jira/browse/HDFS-4762 Project: Hadoop HDFS Issue Type: Sub-task Affects Versions: 3.0.0 Reporter: Brandon Li Assignee: Brandon Li Attachments: HDFS-4762.patch, HDFS-4762.patch.2, HDFS-4762.patch.3, HDFS-4762.patch.3, HDFS-4762.patch.4, HDFS-4762.patch.5 This is to track the implementation of NFSv3 to HDFS. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-4943) WebHdfsFileSystem does not work when original file path has encoded chars
[ https://issues.apache.org/jira/browse/HDFS-4943?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13697369#comment-13697369 ] Jerry He commented on HDFS-4943: Attached v2 with unit test. WebHdfsFileSystem does not work when original file path has encoded chars -- Key: HDFS-4943 URL: https://issues.apache.org/jira/browse/HDFS-4943 Project: Hadoop HDFS Issue Type: Bug Components: webhdfs Affects Versions: 1.2.0, 1.1.2, 2.0.4-alpha Reporter: Jerry He Priority: Minor Fix For: 2.1.0-beta Attachments: HDFS-4943-trunk.patch, HDFS-4943-trunk-v2.patch In HBase, the WAL (hlog) file name on hdfs is URL encoded. For example, hdtest010%2C60020%2C1371000602151.1371058984668 When we use webhdfs client to access the hlog file via httpfs, it does not work in this case. $ hadoop fs -ls hdfs:///user/biadmin/hbase_hlogs Found 1 items -rw-r--r-- 3 biadmin supergroup 15049470 2013-06-12 10:45 /user/biadmin/hbase_hlogs/hdtest010%2C60020%2C1371000602151.1371058984668 $ hadoop fs -ls hdfs:///user/biadmin/hbase_hlogs/hdtest010%2C60020%2C1371000602151.1371058984668 Found 1 items -rw-r--r-- 3 biadmin supergroup 15049470 2013-06-12 10:45 /user/biadmin/hbase_hlogs/hdtest010%2C60020%2C1371000602151.1371058984668 $ hadoop fs -ls webhdfs://hdtest010:14000/user/biadmin/hbase_hlogs Found 1 items -rw-r--r-- 3 biadmin supergroup 15049470 2013-06-12 10:45 /user/biadmin/hbase_hlogs/hdtest010%2C60020%2C1371000602151.1371058984668 $ $ hadoop fs -ls webhdfs://hdtest010:14000/user/biadmin/hbase_hlogs/hdtest010%2C60020%2C1371000602151.1371058984668 13/06/27 18:36:08 DEBUG web.WebHdfsFileSystem: Original exception is org.apache.hadoop.ipc.RemoteException:java.io.FileNotFoundException:File does not exist: /user/biadmin/hbase_hlogs/hdtest010,60020,1371000602151.1371058984668 at org.apache.hadoop.hdfs.web.JsonUtil.toRemoteException(JsonUtil.java:114) at org.apache.hadoop.hdfs.web.WebHdfsFileSystem.validateResponse(WebHdfsFileSystem.java:299) at org.apache.hadoop.hdfs.web.WebHdfsFileSystem.access$500(WebHdfsFileSystem.java:104) at org.apache.hadoop.hdfs.web.WebHdfsFileSystem$Runner.getResponse(WebHdfsFileSystem.java:641) at org.apache.hadoop.hdfs.web.WebHdfsFileSystem$Runner.run(WebHdfsFileSystem.java:538) at org.apache.hadoop.hdfs.web.WebHdfsFileSystem.run(WebHdfsFileSystem.java:468) at org.apache.hadoop.hdfs.web.WebHdfsFileSystem.getHdfsFileStatus(WebHdfsFileSystem.java:662) at org.apache.hadoop.hdfs.web.WebHdfsFileSystem.getFileStatus(WebHdfsFileSystem.java:673) at org.apache.hadoop.fs.FileSystem.getFileStatus(FileSystem.java:1365) at org.apache.hadoop.fs.FileSystem.globStatusInternal(FileSystem.java:1048) at org.apache.hadoop.fs.FileSystem.globStatus(FileSystem.java:987) at org.apache.hadoop.fs.FileSystem.globStatus(FileSystem.java:965) at org.apache.hadoop.fs.FsShell.ls(FsShell.java:573) at org.apache.hadoop.fs.FsShell.doall(FsShell.java:1571) at org.apache.hadoop.fs.FsShell.run(FsShell.java:1789) at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65) at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:79) at org.apache.hadoop.fs.FsShell.main(FsShell.java:1895) ls: Cannot access webhdfs://hdtest010:14000/user/biadmin/hbase_hlogs/hdtest010%2C60020%2C1371000602151.1371058984668: No such file or directory. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HDFS-4943) WebHdfsFileSystem does not work when original file path has encoded chars
[ https://issues.apache.org/jira/browse/HDFS-4943?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jerry He updated HDFS-4943: --- Attachment: HDFS-4943-trunk-v2.patch WebHdfsFileSystem does not work when original file path has encoded chars -- Key: HDFS-4943 URL: https://issues.apache.org/jira/browse/HDFS-4943 Project: Hadoop HDFS Issue Type: Bug Components: webhdfs Affects Versions: 1.2.0, 1.1.2, 2.0.4-alpha Reporter: Jerry He Priority: Minor Fix For: 2.1.0-beta Attachments: HDFS-4943-trunk.patch, HDFS-4943-trunk-v2.patch In HBase, the WAL (hlog) file name on hdfs is URL encoded. For example, hdtest010%2C60020%2C1371000602151.1371058984668 When we use webhdfs client to access the hlog file via httpfs, it does not work in this case. $ hadoop fs -ls hdfs:///user/biadmin/hbase_hlogs Found 1 items -rw-r--r-- 3 biadmin supergroup 15049470 2013-06-12 10:45 /user/biadmin/hbase_hlogs/hdtest010%2C60020%2C1371000602151.1371058984668 $ hadoop fs -ls hdfs:///user/biadmin/hbase_hlogs/hdtest010%2C60020%2C1371000602151.1371058984668 Found 1 items -rw-r--r-- 3 biadmin supergroup 15049470 2013-06-12 10:45 /user/biadmin/hbase_hlogs/hdtest010%2C60020%2C1371000602151.1371058984668 $ hadoop fs -ls webhdfs://hdtest010:14000/user/biadmin/hbase_hlogs Found 1 items -rw-r--r-- 3 biadmin supergroup 15049470 2013-06-12 10:45 /user/biadmin/hbase_hlogs/hdtest010%2C60020%2C1371000602151.1371058984668 $ $ hadoop fs -ls webhdfs://hdtest010:14000/user/biadmin/hbase_hlogs/hdtest010%2C60020%2C1371000602151.1371058984668 13/06/27 18:36:08 DEBUG web.WebHdfsFileSystem: Original exception is org.apache.hadoop.ipc.RemoteException:java.io.FileNotFoundException:File does not exist: /user/biadmin/hbase_hlogs/hdtest010,60020,1371000602151.1371058984668 at org.apache.hadoop.hdfs.web.JsonUtil.toRemoteException(JsonUtil.java:114) at org.apache.hadoop.hdfs.web.WebHdfsFileSystem.validateResponse(WebHdfsFileSystem.java:299) at org.apache.hadoop.hdfs.web.WebHdfsFileSystem.access$500(WebHdfsFileSystem.java:104) at org.apache.hadoop.hdfs.web.WebHdfsFileSystem$Runner.getResponse(WebHdfsFileSystem.java:641) at org.apache.hadoop.hdfs.web.WebHdfsFileSystem$Runner.run(WebHdfsFileSystem.java:538) at org.apache.hadoop.hdfs.web.WebHdfsFileSystem.run(WebHdfsFileSystem.java:468) at org.apache.hadoop.hdfs.web.WebHdfsFileSystem.getHdfsFileStatus(WebHdfsFileSystem.java:662) at org.apache.hadoop.hdfs.web.WebHdfsFileSystem.getFileStatus(WebHdfsFileSystem.java:673) at org.apache.hadoop.fs.FileSystem.getFileStatus(FileSystem.java:1365) at org.apache.hadoop.fs.FileSystem.globStatusInternal(FileSystem.java:1048) at org.apache.hadoop.fs.FileSystem.globStatus(FileSystem.java:987) at org.apache.hadoop.fs.FileSystem.globStatus(FileSystem.java:965) at org.apache.hadoop.fs.FsShell.ls(FsShell.java:573) at org.apache.hadoop.fs.FsShell.doall(FsShell.java:1571) at org.apache.hadoop.fs.FsShell.run(FsShell.java:1789) at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65) at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:79) at org.apache.hadoop.fs.FsShell.main(FsShell.java:1895) ls: Cannot access webhdfs://hdtest010:14000/user/biadmin/hbase_hlogs/hdtest010%2C60020%2C1371000602151.1371058984668: No such file or directory. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
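To see why a literal %2C in a file name trips up webhdfs access, the percent sign itself has to be re-encoded when the path is embedded in a URL; otherwise the server decodes %2C back to a comma and looks up the wrong name, as in the FileNotFoundException above. A small illustration using the standard java.net.URLEncoder (this is only a demonstration of the encoding issue, not the actual WebHdfsFileSystem fix):

{code}
import java.net.URLEncoder;

public class EncodedPathSketch {
  public static void main(String[] args) throws Exception {
    // The HBase WAL file name already contains percent-encoded commas.
    String fileName = "hdtest010%2C60020%2C1371000602151.1371058984668";

    // Embedding it verbatim lets the server decode %2C to ',' and miss the file.
    System.out.println("naive  : /webhdfs/v1/user/biadmin/hbase_hlogs/" + fileName);

    // Re-encoding the '%' (to %25) preserves the original name after one
    // server-side decode: %252C decodes back to %2C.
    String reEncoded = URLEncoder.encode(fileName, "UTF-8");
    System.out.println("encoded: /webhdfs/v1/user/biadmin/hbase_hlogs/" + reEncoded);
  }
}
{code}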
[jira] [Commented] (HDFS-4940) namenode OOMs under Bigtop's TestCLI
[ https://issues.apache.org/jira/browse/HDFS-4940?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13697371#comment-13697371 ] Roman Shaposhnik commented on HDFS-4940: HADOOP-9676 now allows isolating this down to a few tests. I'll keep you guys posted. namenode OOMs under Bigtop's TestCLI Key: HDFS-4940 URL: https://issues.apache.org/jira/browse/HDFS-4940 Project: Hadoop HDFS Issue Type: Bug Affects Versions: 2.1.0-beta Reporter: Roman Shaposhnik Priority: Blocker Fix For: 2.1.0-beta Bigtop's TestCLI, when executed against Hadoop 2.1.0, seems to make it OOM quite reliably regardless of the heap size settings. I'm attaching a heap dump URL. Alternatively, anybody can just take Bigtop's tests, compile them against Hadoop 2.1.0 bits, and try to reproduce it. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-4943) WebHdfsFileSystem does not work when original file path has encoded chars
[ https://issues.apache.org/jira/browse/HDFS-4943?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13697426#comment-13697426 ] Hadoop QA commented on HDFS-4943: - {color:green}+1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12590344/HDFS-4943-trunk-v2.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 1 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. The javadoc tool did not generate any warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 core tests{color}. The patch passed unit tests in hadoop-hdfs-project/hadoop-hdfs. {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/4583//testReport/ Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/4583//console This message is automatically generated. WebHdfsFileSystem does not work when original file path has encoded chars -- Key: HDFS-4943 URL: https://issues.apache.org/jira/browse/HDFS-4943 Project: Hadoop HDFS Issue Type: Bug Components: webhdfs Affects Versions: 1.2.0, 1.1.2, 2.0.4-alpha Reporter: Jerry He Priority: Minor Fix For: 2.1.0-beta Attachments: HDFS-4943-trunk.patch, HDFS-4943-trunk-v2.patch In HBase, the WAL (hlog) file name on hdfs is URL encoded. For example, hdtest010%2C60020%2C1371000602151.1371058984668 When we use webhdfs client to access the hlog file via httpfs, it does not work in this case. 
$ hadoop fs -ls hdfs:///user/biadmin/hbase_hlogs Found 1 items -rw-r--r-- 3 biadmin supergroup 15049470 2013-06-12 10:45 /user/biadmin/hbase_hlogs/hdtest010%2C60020%2C1371000602151.1371058984668 $ hadoop fs -ls hdfs:///user/biadmin/hbase_hlogs/hdtest010%2C60020%2C1371000602151.1371058984668 Found 1 items -rw-r--r-- 3 biadmin supergroup 15049470 2013-06-12 10:45 /user/biadmin/hbase_hlogs/hdtest010%2C60020%2C1371000602151.1371058984668 $ hadoop fs -ls webhdfs://hdtest010:14000/user/biadmin/hbase_hlogs Found 1 items -rw-r--r-- 3 biadmin supergroup 15049470 2013-06-12 10:45 /user/biadmin/hbase_hlogs/hdtest010%2C60020%2C1371000602151.1371058984668 $ $ hadoop fs -ls webhdfs://hdtest010:14000/user/biadmin/hbase_hlogs/hdtest010%2C60020%2C1371000602151.1371058984668 13/06/27 18:36:08 DEBUG web.WebHdfsFileSystem: Original exception is org.apache.hadoop.ipc.RemoteException:java.io.FileNotFoundException:File does not exist: /user/biadmin/hbase_hlogs/hdtest010,60020,1371000602151.1371058984668 at org.apache.hadoop.hdfs.web.JsonUtil.toRemoteException(JsonUtil.java:114) at org.apache.hadoop.hdfs.web.WebHdfsFileSystem.validateResponse(WebHdfsFileSystem.java:299) at org.apache.hadoop.hdfs.web.WebHdfsFileSystem.access$500(WebHdfsFileSystem.java:104) at org.apache.hadoop.hdfs.web.WebHdfsFileSystem$Runner.getResponse(WebHdfsFileSystem.java:641) at org.apache.hadoop.hdfs.web.WebHdfsFileSystem$Runner.run(WebHdfsFileSystem.java:538) at org.apache.hadoop.hdfs.web.WebHdfsFileSystem.run(WebHdfsFileSystem.java:468) at org.apache.hadoop.hdfs.web.WebHdfsFileSystem.getHdfsFileStatus(WebHdfsFileSystem.java:662) at org.apache.hadoop.hdfs.web.WebHdfsFileSystem.getFileStatus(WebHdfsFileSystem.java:673) at org.apache.hadoop.fs.FileSystem.getFileStatus(FileSystem.java:1365) at org.apache.hadoop.fs.FileSystem.globStatusInternal(FileSystem.java:1048) at org.apache.hadoop.fs.FileSystem.globStatus(FileSystem.java:987) at org.apache.hadoop.fs.FileSystem.globStatus(FileSystem.java:965) at org.apache.hadoop.fs.FsShell.ls(FsShell.java:573) at org.apache.hadoop.fs.FsShell.doall(FsShell.java:1571) at org.apache.hadoop.fs.FsShell.run(FsShell.java:1789) at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65) at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:79) at org.apache.hadoop.fs.FsShell.main(FsShell.java:1895) ls: Cannot access webhdfs://hdtest010:14000/user/biadmin/hbase_hlogs/hdtest010%2C60020%2C1371000602151.1371058984668: No such file or directory. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-4762) Provide HDFS based NFSv3 and Mountd implementation
[ https://issues.apache.org/jira/browse/HDFS-4762?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13697469#comment-13697469 ] Tsz Wo (Nicholas), SZE commented on HDFS-4762: -- - OffsetRange.compareTo(..) only compares min. It considers two ranges equal if they have the same min but different max values. However, OpenFileCtx.pendingWrites uses OffsetRange as a key type. Is it possible to have two writes with the same min? In such a case, the writes will be considered equal and checkRepeatedWriteRequest(..) may be incorrect. If it is possible to have two writes with the same min, I suggest also comparing the max, i.e. {code} private static int compareTo(long left, long right) { if (left < right) { return -1; } else if (left > right) { return 1; } else { return 0; } } @Override public int compareTo(OffsetRange other) { final int d = compareTo(min, other.getMin()); return d != 0? d: compareTo(max, other.getMax()); } {code} BTW, the comment above OffsetRange.compareTo(..) is invalid. - In OpenFileCtx.checkDump(..), {code} try { if (dumpFile.exists()) { throw new RuntimeException("The dump file should not exist: " + dumpFilePath); } dumpOut = new FileOutputStream(dumpFile); if (dumpFile.createNewFile()) { LOG.error("Can't create dump file: " + dumpFilePath); } } catch (IOException e) { LOG.error("Got failure when creating dump stream " + dumpFilePath + " with error: " + e); enabledDump = false; if (dumpOut != null) { try { dumpOut.close(); } catch (IOException e1) { LOG.error("Can't close dump stream " + dumpFilePath + " with error: " + e); } } return; } {code} -* The second if-statement should be if (!dumpFile.createNewFile()). Also, createNewFile() ensures that the file does not exist. So the first if-statement may not be needed. -* Use IOUtils.cleanup(LOG, dumpOut); to close dumpOut. -* Is it okay to return when there is an exception? Should it re-throw the exception? - WriteManager.shutdownAsyncDataService() is not used. Provide HDFS based NFSv3 and Mountd implementation -- Key: HDFS-4762 URL: https://issues.apache.org/jira/browse/HDFS-4762 Project: Hadoop HDFS Issue Type: Sub-task Affects Versions: 3.0.0 Reporter: Brandon Li Assignee: Brandon Li Attachments: HDFS-4762.patch, HDFS-4762.patch.2, HDFS-4762.patch.3, HDFS-4762.patch.3, HDFS-4762.patch.4, HDFS-4762.patch.5 This is to track the implementation of NFSv3 to HDFS. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-4762) Provide HDFS based NFSv3 and Mountd implementation
[ https://issues.apache.org/jira/browse/HDFS-4762?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13697488#comment-13697488 ] Tsz Wo (Nicholas), SZE commented on HDFS-4762: -- - OpenFileCtx.getPendingWrites() is not needed since it is only used inside OpenFileCtx. - offset should not be cast to an int below. {code} // OpenFileCtx.processPerfectOverWrite(..) readCount = fis.read((int) offset, readbuffer, 0, count); {code} -* Use IOUtils.cleanup(..) to close fis. Provide HDFS based NFSv3 and Mountd implementation -- Key: HDFS-4762 URL: https://issues.apache.org/jira/browse/HDFS-4762 Project: Hadoop HDFS Issue Type: Sub-task Affects Versions: 3.0.0 Reporter: Brandon Li Assignee: Brandon Li Attachments: HDFS-4762.patch, HDFS-4762.patch.2, HDFS-4762.patch.3, HDFS-4762.patch.3, HDFS-4762.patch.4, HDFS-4762.patch.5 This is to track the implementation of NFSv3 to HDFS. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HDFS-4372) Track NameNode startup progress
[ https://issues.apache.org/jira/browse/HDFS-4372?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris Nauroth updated HDFS-4372: Attachment: HDFS-4372.4.patch Thanks again, Jing. Here is version 4 of the patch to address your feedback. {quote} 1. For FSImageFormat#save, do we also need to change the counter value when we finish the saving process (the same thing done to the counter in loading)? {quote} Yes, you're right. The new patch has additional changes to fix this in {{FSImageFormat#Saver#save}} and {{FSImageFormat#Saver#saveINode2Image}}. {quote} 2. For FSImageFormat#loadXXX(), looks like the parameter step has not been used. Can we remove it and only add a new parameter counter? {quote} Yes, I had forgotten to clean this up. The new patch fixes this for both the load methods and the save methods. Track NameNode startup progress --- Key: HDFS-4372 URL: https://issues.apache.org/jira/browse/HDFS-4372 Project: Hadoop HDFS Issue Type: Sub-task Components: namenode Affects Versions: 3.0.0 Reporter: Chris Nauroth Assignee: Chris Nauroth Attachments: HDFS-4372.1.patch, HDFS-4372.2.patch, HDFS-4372.3.patch, HDFS-4372.4.patch Track detailed progress information about the steps of NameNode startup to enable display to users. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-4940) namenode OOMs under Bigtop's TestCLI
[ https://issues.apache.org/jira/browse/HDFS-4940?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13697533#comment-13697533 ] Suresh Srinivas commented on HDFS-4940: --- [~rvs] Thanks for the update. I am really interested in seeing what is causing the memory growth. Namenode logs with HADOOP-9676 will help understand the issue. namenode OOMs under Bigtop's TestCLI Key: HDFS-4940 URL: https://issues.apache.org/jira/browse/HDFS-4940 Project: Hadoop HDFS Issue Type: Bug Affects Versions: 2.1.0-beta Reporter: Roman Shaposhnik Priority: Blocker Fix For: 2.1.0-beta Bigtop's TestCLI, when executed against Hadoop 2.1.0, seems to make it OOM quite reliably regardless of the heap size settings. I'm attaching a heap dump URL. Alternatively, anybody can just take Bigtop's tests, compile them against Hadoop 2.1.0 bits, and try to reproduce it. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira