[jira] [Commented] (HDFS-2529) lastDeletedReport should be scoped to BPOfferService, not DN
[ https://issues.apache.org/jira/browse/HDFS-2529?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13171804#comment-13171804 ]

Uma Maheswara Rao G commented on HDFS-2529:

Todd, I think this was already fixed as part of HDFS-2560.

Thanks
Uma

lastDeletedReport should be scoped to BPOfferService, not DN

Key: HDFS-2529
URL: https://issues.apache.org/jira/browse/HDFS-2529
Project: Hadoop HDFS
Issue Type: Bug
Affects Versions: 0.23.0
Reporter: Todd Lipcon

Each BPOfferService separately tracks and reports deleted blocks. But lastDeletedReport is a member variable in DataNode, so deletion reports may not be triggered on the desired interval.
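The description above points at the shape of the fix: make the deletion-report timer per block pool. A minimal sketch, using placeholder field and method names (the actual HDFS-2560 change may differ):
{code}
// Sketch only: lastDeletedReport moves from DataNode into BPOfferService,
// so each block pool keeps its own deletion-report timer. All names here
// are placeholders, not the actual HDFS-2560 code.
class BPOfferServiceSketch {
  private final long deleteReportIntervalMs;
  private long lastDeletedReport = 0; // per block pool, not per DataNode

  BPOfferServiceSketch(long deleteReportIntervalMs) {
    this.deleteReportIntervalMs = deleteReportIntervalMs;
  }

  /** Called from this block pool's service loop. */
  void maybeReportDeletedBlocks() {
    long now = System.currentTimeMillis();
    if (now - lastDeletedReport > deleteReportIntervalMs) {
      sendDeletedBlocksReport(); // placeholder for the RPC to this pool's NN
      lastDeletedReport = now;   // only this pool's timer advances
    }
  }

  private void sendDeletedBlocksReport() { /* report to this pool's NameNode */ }
}
{code}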
[jira] [Updated] (HDFS-69) Improve dfsadmin command line help
[ https://issues.apache.org/jira/browse/HDFS-69?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Harsh J updated HDFS-69:

Target Version/s: 0.24.0, 0.23.1
Affects Version/s: 1.0.0
Status: Patch Available (was: Open)

Improve dfsadmin command line help

Key: HDFS-69
URL: https://issues.apache.org/jira/browse/HDFS-69
Project: Hadoop HDFS
Issue Type: Bug
Affects Versions: 1.0.0
Reporter: Ravi Phulari
Assignee: Harsh J
Priority: Minor
Attachments: HDFS-69.patch

Enhance the dfsadmin command line help to note that "A quota of one forces a directory to remain empty".
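For context, the kind of help-text change being proposed might look like the sketch below; the exact wording lives in HDFS-69.patch, and the string here is illustrative:
{code}
// Illustrative only -- the real wording is in HDFS-69.patch. The point of
// the change: tell the user that a name quota of 1 leaves no room for any
// child, since the directory itself counts against its own quota.
String setQuotaHelp =
    "-setQuota <quota> <dirname>...<dirname>: Set the name quota for each directory.\n"
  + "\t\tThe quota is a hard limit on the number of names in the tree rooted\n"
  + "\t\tat the directory. Note: a quota of one forces a directory to remain\n"
  + "\t\tempty, because the directory itself counts against the quota.";
{code}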
[jira] [Assigned] (HDFS-197) du fails on Cygwin
[ https://issues.apache.org/jira/browse/HDFS-197?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Harsh J reassigned HDFS-197:

Assignee: (was: Harsh J)

(Still lack a Windows setup locally to test this. Unassigning for now so as not to block.)

du fails on Cygwin

Key: HDFS-197
URL: https://issues.apache.org/jira/browse/HDFS-197
Project: Hadoop HDFS
Issue Type: Bug
Environment: Windows + Cygwin
Reporter: Kohsuke Kawaguchi
Attachments: HADOOP-5486

When I try to run a datanode on Windows, I get the following exception:
{noformat}
java.io.IOException: Expecting a line not the end of stream
	at org.apache.hadoop.fs.DU.parseExecResult(DU.java:181)
	at org.apache.hadoop.util.Shell.runCommand(Shell.java:179)
	at org.apache.hadoop.util.Shell.run(Shell.java:134)
	at org.apache.hadoop.fs.DU.<init>(DU.java:53)
	at org.apache.hadoop.fs.DU.<init>(DU.java:63)
	at org.apache.hadoop.hdfs.server.datanode.FSDataset$FSVolume.<init>(FSDataset.java:325)
	at org.apache.hadoop.hdfs.server.datanode.FSDataset.<init>(FSDataset.java:681)
	at org.apache.hadoop.hdfs.server.datanode.DataNode.startDataNode(DataNode.java:291)
	at org.apache.hadoop.hdfs.server.datanode.DataNode.<init>(DataNode.java:205)
	at org.apache.hadoop.hdfs.server.datanode.DataNode.makeInstance(DataNode.java:1238)
	at org.apache.hadoop.hdfs.server.datanode.DataNode.instantiateDataNode(DataNode.java:1193)
{noformat}
This is because Hadoop execs "du -sk C:\tmp\hadoop-SYSTEM\dfs\data" with a Windows path representation, which cygwin du doesn't understand.
{noformat}
C:\hudson>du -sk C:\tmp\hadoop-SYSTEM\dfs\data
du: cannot access `C:\\tmp\\hadoop-SYSTEM\\dfs\\data': No such file or directory
{noformat}
For this to work correctly, Hadoop would have to run cygpath first to get a Unix path representation, then call du.

Also, I had to use the debugger to get this information. Shell.runCommand should catch IOException from parseExecResult and add the buffered stderr to simplify error diagnostics.
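A minimal sketch of the suggested cygpath step; this illustrates the reporter's proposed workaround rather than the attached patch, and the helper name is invented:
{code}
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;

// Sketch: translate the Windows path with `cygpath -u` before handing it
// to du, so "du -sk /cygdrive/c/tmp/..." runs instead of "du -sk C:\tmp\...".
public class CygpathSketch {
  static String toCygwinPath(String windowsPath) throws IOException {
    Process p = new ProcessBuilder("cygpath", "-u", windowsPath).start();
    BufferedReader r =
        new BufferedReader(new InputStreamReader(p.getInputStream()));
    try {
      String unixPath = r.readLine();
      if (unixPath == null) {
        throw new IOException("cygpath produced no output for " + windowsPath);
      }
      return unixPath; // e.g. /cygdrive/c/tmp/hadoop-SYSTEM/dfs/data
    } finally {
      r.close();
    }
  }
}
{code}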
[jira] [Updated] (HDFS-442) dfsthroughput in test.jar throws NPE
[ https://issues.apache.org/jira/browse/HDFS-442?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Harsh J updated HDFS-442:

Target Version/s: 0.24.0, 0.23.1
Fix Version/s: (was: 0.24.0)

Can someone take a look at the trivial patch and review it? Ramya? It should be good for 0.23 as well.

dfsthroughput in test.jar throws NPE

Key: HDFS-442
URL: https://issues.apache.org/jira/browse/HDFS-442
Project: Hadoop HDFS
Issue Type: Bug
Components: test
Affects Versions: 0.20.1
Reporter: Ramya Sunil
Assignee: Harsh J
Priority: Minor
Attachments: HDFS-442.patch

On running "hadoop jar hadoop-test.jar dfsthroughput" OR "hadoop org.apache.hadoop.hdfs.BenchmarkThroughput", we get a NullPointerException. Below is the stacktrace:
{noformat}
Exception in thread "main" java.lang.NullPointerException
	at java.util.Hashtable.put(Hashtable.java:394)
	at java.util.Properties.setProperty(Properties.java:143)
	at java.lang.System.setProperty(System.java:731)
	at org.apache.hadoop.hdfs.BenchmarkThroughput.run(BenchmarkThroughput.java:198)
	at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
	at org.apache.hadoop.hdfs.BenchmarkThroughput.main(BenchmarkThroughput.java:229)
{noformat}
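The trace says BenchmarkThroughput.run passed a null value into System.setProperty (Hashtable rejects null keys and values). A hedged sketch of the likely shape of the fix, assuming the null comes from an unset configuration key; the key names here are illustrative, and the real change is in HDFS-442.patch:
{code}
// Assumption: conf.get(...) returned null because the key is unset in an
// HDFS-only configuration; fall back before calling System.setProperty,
// which throws NullPointerException on a null value.
String localDir = conf.get("mapred.temp.dir");   // illustrative key
if (localDir == null) {
  localDir = conf.get("hadoop.tmp.dir");         // a key that is always set
}
System.setProperty("test.build.data", localDir);
{code}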
[jira] [Commented] (HDFS-2413) Add public APIs for safemode
[ https://issues.apache.org/jira/browse/HDFS-2413?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13171832#comment-13171832 ]

Harsh J commented on HDFS-2413:

[~umamaheswararao] - You may want to report that downstream as well -- I believe they can surely catch SafeModeExceptions or so, and do better?

[~st...@apache.org] - The JMX output at {{/jmx}} carries this state today. Is that insufficient? I'll consider this inclusion as well, after your reply.

Add public APIs for safemode

Key: HDFS-2413
URL: https://issues.apache.org/jira/browse/HDFS-2413
Project: Hadoop HDFS
Issue Type: Improvement
Components: hdfs client
Affects Versions: 0.23.0
Reporter: Todd Lipcon
Assignee: Harsh J
Fix For: 0.24.0

Currently the APIs for safe-mode are part of DistributedFileSystem, which is supposed to be a private interface. However, dependent software often wants to wait until the NN is out of safemode. Though it could poll trying to create a file and catching SafeModeException, we should consider making some of these APIs public.
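For reference, this is roughly what waiting on safemode looks like today through the private interface; a sketch assuming the 0.23-era names (SafeModeAction has lived in FSConstants/HdfsConstants depending on version):
{code}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.hdfs.DistributedFileSystem;
import org.apache.hadoop.hdfs.protocol.HdfsConstants.SafeModeAction;

public class WaitForSafemode {
  public static void main(String[] args) throws Exception {
    FileSystem fs = FileSystem.get(new Configuration());
    // Downcast to the private interface -- exactly what this issue wants
    // dependent software to stop having to do.
    DistributedFileSystem dfs = (DistributedFileSystem) fs;
    while (dfs.setSafeMode(SafeModeAction.SAFEMODE_GET)) {
      Thread.sleep(5000); // poll until the NN leaves safemode
    }
  }
}
{code}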
[jira] [Commented] (HDFS-69) Improve dfsadmin command line help
[ https://issues.apache.org/jira/browse/HDFS-69?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13171849#comment-13171849 ]

Hadoop QA commented on HDFS-69:

-1 overall. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12504394/HDFS-69.patch
against trunk revision .

+1 @author. The patch does not contain any @author tags.
-1 tests included. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch.
-1 javadoc. The javadoc tool appears to have generated 90 warning messages.
+1 javac. The applied patch does not increase the total number of javac compiler warnings.
+1 eclipse:eclipse. The patch built with eclipse:eclipse.
-1 findbugs. The patch appears to cause Findbugs (version 1.3.9) to fail.
+1 release audit. The applied patch does not increase the total number of release audit warnings.
-1 core tests. The patch failed these unit tests:
org.apache.hadoop.hdfs.TestParallelRead
org.apache.hadoop.hdfs.TestCrcCorruption
org.apache.hadoop.hdfs.TestQuota
org.apache.hadoop.hdfs.TestFileAppend3
org.apache.hadoop.hdfs.TestDatanodeConfig
org.apache.hadoop.hdfs.TestDatanodeDeath
org.apache.hadoop.hdfs.security.TestDelegationToken
org.apache.hadoop.hdfs.tools.TestGetGroups
+1 contrib tests. The patch passed contrib unit tests.

Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/1724//testReport/
Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/1724//console

This message is automatically generated.

Improve dfsadmin command line help

Key: HDFS-69
URL: https://issues.apache.org/jira/browse/HDFS-69
Project: Hadoop HDFS
Issue Type: Bug
Affects Versions: 1.0.0
Reporter: Ravi Phulari
Assignee: Harsh J
Priority: Minor
Attachments: HDFS-69.patch

Enhance the dfsadmin command line help to note that "A quota of one forces a directory to remain empty".
[jira] [Commented] (HDFS-2553) BlockPoolSliceScanner spinning in loop
[ https://issues.apache.org/jira/browse/HDFS-2553?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13171853#comment-13171853 ]

Hudson commented on HDFS-2553:

Integrated in Hadoop-Hdfs-trunk #898 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk/898/])
HDFS-2553. Fix BlockPoolSliceScanner spinning in a tight loop. Contributed by Uma Maheswara Rao G.
todd : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1220317
Files :
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/BlockPoolSliceScanner.java

BlockPoolSliceScanner spinning in loop

Key: HDFS-2553
URL: https://issues.apache.org/jira/browse/HDFS-2553
Project: Hadoop HDFS
Issue Type: Bug
Components: data-node
Affects Versions: 0.23.0, 0.24.0
Reporter: Todd Lipcon
Assignee: Uma Maheswara Rao G
Priority: Critical
Fix For: 0.24.0, 0.23.1
Attachments: CPUUsage.jpg, HDFS-2553.patch, HDFS-2553.patch

Playing with trunk, I managed to get a DataNode in a situation where the BlockPoolSliceScanner is spinning in the following loop, using 100% CPU:
{noformat}
	at org.apache.hadoop.hdfs.server.datanode.DataNode$BPOfferService.isAlive(DataNode.java:820)
	at org.apache.hadoop.hdfs.server.datanode.DataNode.isBPServiceAlive(DataNode.java:2962)
	at org.apache.hadoop.hdfs.server.datanode.BlockPoolSliceScanner.scan(BlockPoolSliceScanner.java:625)
	at org.apache.hadoop.hdfs.server.datanode.BlockPoolSliceScanner.scanBlockPoolSlice(BlockPoolSliceScanner.java:614)
	at org.apache.hadoop.hdfs.server.datanode.DataBlockScanner.run(DataBlockScanner.java:95)
{noformat}
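The committed change is in BlockPoolSliceScanner.java above. As a rough illustration of this class of fix only (placeholder method names, assuming the loop simply lacked a back-off when no block was due for verification):
{code}
// Sketch, not the committed patch: a scan loop that backs off when there
// is nothing to verify, instead of re-checking in a tight loop at 100% CPU.
abstract class ScanLoopSketch {
  void scanLoop() throws InterruptedException {
    while (shouldRun() && isBPServiceAlive()) {
      if (hasBlockDueForVerification()) {
        verifyNextBlock();
      } else {
        Thread.sleep(1000); // nothing due yet; yield instead of spinning
      }
    }
  }
  abstract boolean shouldRun();
  abstract boolean isBPServiceAlive();
  abstract boolean hasBlockDueForVerification();
  abstract void verifyNextBlock();
}
{code}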
[jira] [Commented] (HDFS-2700) TestDataNodeMultipleRegistrations is failing in trunk
[ https://issues.apache.org/jira/browse/HDFS-2700?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13171850#comment-13171850 ]

Hudson commented on HDFS-2700:

Integrated in Hadoop-Hdfs-trunk #898 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk/898/])
HDFS-2700. Fix failing TestDataNodeMultipleRegistrations in trunk. Contributed by Uma Maheswara Rao G.
todd : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1220315
Files :
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/BPOfferService.java

TestDataNodeMultipleRegistrations is failing in trunk

Key: HDFS-2700
URL: https://issues.apache.org/jira/browse/HDFS-2700
Project: Hadoop HDFS
Issue Type: Bug
Reporter: Uma Maheswara Rao G
Assignee: Uma Maheswara Rao G
Fix For: 0.24.0
Attachments: HDFS-2700.patch

TestDataNodeMultipleRegistrations has been failing for the last couple of builds: https://builds.apache.org/job/PreCommit-HDFS-Build/lastCompletedBuild/testReport/
[jira] [Commented] (HDFS-2553) BlockPoolSliceScanner spinning in loop
[ https://issues.apache.org/jira/browse/HDFS-2553?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13171856#comment-13171856 ]

Hudson commented on HDFS-2553:

Integrated in Hadoop-Hdfs-0.23-Build #111 (See [https://builds.apache.org/job/Hadoop-Hdfs-0.23-Build/111/])
HDFS-2553. Fix BlockPoolSliceScanner spinning in a tight loop. Contributed by Uma Maheswara Rao G.
todd : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1220316
Files :
* /hadoop/common/branches/branch-0.23/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* /hadoop/common/branches/branch-0.23/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/BlockPoolSliceScanner.java

BlockPoolSliceScanner spinning in loop

Key: HDFS-2553
URL: https://issues.apache.org/jira/browse/HDFS-2553
Project: Hadoop HDFS
Issue Type: Bug
Components: data-node
Affects Versions: 0.23.0, 0.24.0
Reporter: Todd Lipcon
Assignee: Uma Maheswara Rao G
Priority: Critical
Fix For: 0.24.0, 0.23.1
Attachments: CPUUsage.jpg, HDFS-2553.patch, HDFS-2553.patch

Playing with trunk, I managed to get a DataNode in a situation where the BlockPoolSliceScanner is spinning in the following loop, using 100% CPU:
{noformat}
	at org.apache.hadoop.hdfs.server.datanode.DataNode$BPOfferService.isAlive(DataNode.java:820)
	at org.apache.hadoop.hdfs.server.datanode.DataNode.isBPServiceAlive(DataNode.java:2962)
	at org.apache.hadoop.hdfs.server.datanode.BlockPoolSliceScanner.scan(BlockPoolSliceScanner.java:625)
	at org.apache.hadoop.hdfs.server.datanode.BlockPoolSliceScanner.scanBlockPoolSlice(BlockPoolSliceScanner.java:614)
	at org.apache.hadoop.hdfs.server.datanode.DataBlockScanner.run(DataBlockScanner.java:95)
{noformat}
[jira] [Commented] (HDFS-2553) BlockPoolSliceScanner spinning in loop
[ https://issues.apache.org/jira/browse/HDFS-2553?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13171863#comment-13171863 ]

Hudson commented on HDFS-2553:

Integrated in Hadoop-Mapreduce-0.23-Build #131 (See [https://builds.apache.org/job/Hadoop-Mapreduce-0.23-Build/131/])
HDFS-2553. Fix BlockPoolSliceScanner spinning in a tight loop. Contributed by Uma Maheswara Rao G.
todd : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1220316
Files :
* /hadoop/common/branches/branch-0.23/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* /hadoop/common/branches/branch-0.23/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/BlockPoolSliceScanner.java

BlockPoolSliceScanner spinning in loop

Key: HDFS-2553
URL: https://issues.apache.org/jira/browse/HDFS-2553
Project: Hadoop HDFS
Issue Type: Bug
Components: data-node
Affects Versions: 0.23.0, 0.24.0
Reporter: Todd Lipcon
Assignee: Uma Maheswara Rao G
Priority: Critical
Fix For: 0.24.0, 0.23.1
Attachments: CPUUsage.jpg, HDFS-2553.patch, HDFS-2553.patch

Playing with trunk, I managed to get a DataNode in a situation where the BlockPoolSliceScanner is spinning in the following loop, using 100% CPU:
{noformat}
	at org.apache.hadoop.hdfs.server.datanode.DataNode$BPOfferService.isAlive(DataNode.java:820)
	at org.apache.hadoop.hdfs.server.datanode.DataNode.isBPServiceAlive(DataNode.java:2962)
	at org.apache.hadoop.hdfs.server.datanode.BlockPoolSliceScanner.scan(BlockPoolSliceScanner.java:625)
	at org.apache.hadoop.hdfs.server.datanode.BlockPoolSliceScanner.scanBlockPoolSlice(BlockPoolSliceScanner.java:614)
	at org.apache.hadoop.hdfs.server.datanode.DataBlockScanner.run(DataBlockScanner.java:95)
{noformat}
[jira] [Commented] (HDFS-1526) Dfs client name for a map/reduce task should have some randomness
[ https://issues.apache.org/jira/browse/HDFS-1526?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13171865#comment-13171865 ]

Harsh J commented on HDFS-1526:

Noticed this today while poking around with 0.23.0: if it's not mapreduce, labelling it as 'NONMAPREDUCE' only makes it harder to grep, because there's still a 'MAPREDUCE' in it. It's a nitpick (because IDs don't carry that string), but perhaps you may consider switching to something more 'REGULAR'?

Dfs client name for a map/reduce task should have some randomness

Key: HDFS-1526
URL: https://issues.apache.org/jira/browse/HDFS-1526
Project: Hadoop HDFS
Issue Type: Bug
Components: hdfs client
Reporter: Hairong Kuang
Assignee: Hairong Kuang
Fix For: 0.23.0
Attachments: clientName.patch, randClientId1.patch, randClientId2.patch, randClientId3.patch

Fsck shows one of the files in our dfs cluster is corrupt.
{noformat}
/bin/hadoop fsck aFile -files -blocks -locations
aFile: 4633 bytes, 2 block(s):
aFile: CORRUPT block blk_-4597378336099313975
OK
0. blk_-4597378336099313975_2284630101 len=0 repl=3 [...]
1. blk_5024052590403223424_2284630107 len=4633 repl=3 [...]
Status: CORRUPT
{noformat}
On disk, these two blocks are of the same size and have the same content. It turns out the writer of the file is a multi-threaded map task, and each thread may write to the same file. One possible interleaving of two threads might make this happen:
[T1: create aFile] [T2: delete aFile] [T2: create aFile] [T1: addBlock 0 to aFile] [T2: addBlock 1 to aFile] ...
Because T1 and T2 have the same client name, which is the map task id, the above interactions could complete without any lease exception, eventually leading to a corrupt file. To solve the problem, a mapreduce task's client name could be formed by its task id followed by a random number.
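A minimal sketch of the proposed naming scheme; the exact format used by the randClientId*.patch attachments may differ, and the helper and prefix here are illustrative:
{code}
import java.util.Random;

// Sketch: derive the DFS client name from the task attempt id plus a
// random component, so two threads (or task instances) sharing a task id
// no longer look like the same lease holder to the NameNode.
public class ClientNameSketch {
  private static final Random RAND = new Random();

  static String newClientName(String taskAttemptId) {
    // e.g. DFSClient_attempt_201112180000_0001_m_000000_0_-1234567890
    return "DFSClient_" + taskAttemptId + "_" + RAND.nextInt();
  }
}
{code}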
[jira] [Commented] (HDFS-2700) TestDataNodeMultipleRegistrations is failing in trunk
[ https://issues.apache.org/jira/browse/HDFS-2700?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13171867#comment-13171867 ]

Hudson commented on HDFS-2700:

Integrated in Hadoop-Mapreduce-trunk #931 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk/931/])
HDFS-2700. Fix failing TestDataNodeMultipleRegistrations in trunk. Contributed by Uma Maheswara Rao G.
todd : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1220315
Files :
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/BPOfferService.java

TestDataNodeMultipleRegistrations is failing in trunk

Key: HDFS-2700
URL: https://issues.apache.org/jira/browse/HDFS-2700
Project: Hadoop HDFS
Issue Type: Bug
Reporter: Uma Maheswara Rao G
Assignee: Uma Maheswara Rao G
Fix For: 0.24.0
Attachments: HDFS-2700.patch

TestDataNodeMultipleRegistrations has been failing for the last couple of builds: https://builds.apache.org/job/PreCommit-HDFS-Build/lastCompletedBuild/testReport/
[jira] [Commented] (HDFS-2553) BlockPoolSliceScanner spinning in loop
[ https://issues.apache.org/jira/browse/HDFS-2553?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13171870#comment-13171870 ]

Hudson commented on HDFS-2553:

Integrated in Hadoop-Mapreduce-trunk #931 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk/931/])
HDFS-2553. Fix BlockPoolSliceScanner spinning in a tight loop. Contributed by Uma Maheswara Rao G.
todd : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1220317
Files :
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/BlockPoolSliceScanner.java

BlockPoolSliceScanner spinning in loop

Key: HDFS-2553
URL: https://issues.apache.org/jira/browse/HDFS-2553
Project: Hadoop HDFS
Issue Type: Bug
Components: data-node
Affects Versions: 0.23.0, 0.24.0
Reporter: Todd Lipcon
Assignee: Uma Maheswara Rao G
Priority: Critical
Fix For: 0.24.0, 0.23.1
Attachments: CPUUsage.jpg, HDFS-2553.patch, HDFS-2553.patch

Playing with trunk, I managed to get a DataNode in a situation where the BlockPoolSliceScanner is spinning in the following loop, using 100% CPU:
{noformat}
	at org.apache.hadoop.hdfs.server.datanode.DataNode$BPOfferService.isAlive(DataNode.java:820)
	at org.apache.hadoop.hdfs.server.datanode.DataNode.isBPServiceAlive(DataNode.java:2962)
	at org.apache.hadoop.hdfs.server.datanode.BlockPoolSliceScanner.scan(BlockPoolSliceScanner.java:625)
	at org.apache.hadoop.hdfs.server.datanode.BlockPoolSliceScanner.scanBlockPoolSlice(BlockPoolSliceScanner.java:614)
	at org.apache.hadoop.hdfs.server.datanode.DataBlockScanner.run(DataBlockScanner.java:95)
{noformat}
[jira] [Assigned] (HDFS-2007) Backupnode downloading image/edits from Namenode at every checkpoint ..
[ https://issues.apache.org/jira/browse/HDFS-2007?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

SreeHari reassigned HDFS-2007:

Assignee: SreeHari

Backupnode downloading image/edits from Namenode at every checkpoint ..

Key: HDFS-2007
URL: https://issues.apache.org/jira/browse/HDFS-2007
Project: Hadoop HDFS
Issue Type: Bug
Components: name-node
Reporter: SreeHari
Assignee: SreeHari
Priority: Minor

After the fix for HDFS-903 (md5 verification of fsimage), the Backupnode downloads the image and edits files from the Namenode every time, since a difference in checkpoint time is always maintained between the Namenode and Backupnode. This happens because the Namenode resets its checkpoint time after every checkpoint (we ignore renewCheckpointTime and pass true explicitly to rollFSImage during endCheckpoint), while the Backupnode sets its checkpoint time to whatever it got from the Namenode during startCheckpoint(). Thus, the checkpoint times will differ at the next checkpoint and will cause the image to be downloaded again.
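The description pins the bug to one hard-coded argument. A sketch of the implied correction, assuming a simplified endCheckpoint signature (the real method lives in the NameNode's checkpoint path and takes more parameters):
{code}
// Sketch: honor the caller's renewCheckpointTime flag rather than
// hard-coding true, so the NN's checkpoint time does not run ahead of the
// BackupNode's even when the two images are in sync.
class CheckpointSketch {
  void endCheckpoint(boolean renewCheckpointTime) {
    rollFSImage(renewCheckpointTime); // was effectively: rollFSImage(true)
  }
  void rollFSImage(boolean renewCheckpointTime) { /* roll image, maybe renew time */ }
}
{code}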
[jira] [Commented] (HDFS-2698) BackupNode is downloading image from NameNode for every checkpoint
[ https://issues.apache.org/jira/browse/HDFS-2698?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13171873#comment-13171873 ]

SreeHari commented on HDFS-2698:

Isn't this the same as [https://issues.apache.org/jira/browse/HDFS-2007]?

BackupNode is downloading image from NameNode for every checkpoint

Key: HDFS-2698
URL: https://issues.apache.org/jira/browse/HDFS-2698
Project: Hadoop HDFS
Issue Type: Bug
Components: name-node
Affects Versions: 0.22.0
Reporter: Konstantin Shvachko
Assignee: Konstantin Shvachko
Attachments: rollFSImage.patch, rollFSImage.patch

BackupNode can make periodic checkpoints without downloading image and edits files from the NameNode, by just saving the namespace to local disks. This is not happening because the NN renews its checkpoint time after every checkpoint, thus making its image ahead of the BN's even though they are in sync.
[jira] [Updated] (HDFS-2668) Incorrect assertion in BlockManager when block report arrives shortly after invalidation decision
[ https://issues.apache.org/jira/browse/HDFS-2668?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Uma Maheswara Rao G updated HDFS-2668:

Attachment: TestToReproduceHDFS-2668.patch

Incorrect assertion in BlockManager when block report arrives shortly after invalidation decision

Key: HDFS-2668
URL: https://issues.apache.org/jira/browse/HDFS-2668
Project: Hadoop HDFS
Issue Type: Bug
Components: name-node
Affects Versions: 0.23.0
Reporter: Todd Lipcon
Attachments: TestToReproduceHDFS-2668.patch

I haven't written a test case to verify this yet, but I believe the following assertion is incorrect:
{code}
// Ignore replicas already scheduled to be removed from the DN
if(invalidateBlocks.contains(dn.getStorageID(), block)) {
  assert storedBlock.findDatanode(dn) < 0 : "Block " + block
    + " in recentInvalidatesSet should not appear in DN " + dn;
{code}
The problem is that, when a block is invalidated due to over-replication, it is not immediately removed from the block map. So, if a block report arrives just after a block has been marked as invalidated, but before the block is actually deleted, I think this assertion will trigger incorrectly.
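In other words, during the window between the invalidation decision and the DN executing the deletion, a replica can legitimately appear in both the block map and invalidateBlocks. A sketch of race-tolerant handling, assuming the fix is simply to stop asserting (the eventual patch may do more):
{code}
// Sketch: a replica pending invalidation may still be reported by the DN
// until the deletion actually executes, so skip it instead of asserting
// that it is already gone from the block map.
if (invalidateBlocks.contains(dn.getStorageID(), block)) {
  return storedBlock; // ignore; the scheduled deletion will catch up
}
{code}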
[jira] [Updated] (HDFS-2668) Incorrect assertion in BlockManager when block report arrives shortly after invalidation decision
[ https://issues.apache.org/jira/browse/HDFS-2668?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Uma Maheswara Rao G updated HDFS-2668:

Status: Patch Available (was: Open)

Attached the test patch, which should reproduce the issue. I will remove the wrong assertion in BlockManager with the next patch.

Incorrect assertion in BlockManager when block report arrives shortly after invalidation decision

Key: HDFS-2668
URL: https://issues.apache.org/jira/browse/HDFS-2668
Project: Hadoop HDFS
Issue Type: Bug
Components: name-node
Affects Versions: 0.23.0
Reporter: Todd Lipcon
Attachments: TestToReproduceHDFS-2668.patch

I haven't written a test case to verify this yet, but I believe the following assertion is incorrect:
{code}
// Ignore replicas already scheduled to be removed from the DN
if(invalidateBlocks.contains(dn.getStorageID(), block)) {
  assert storedBlock.findDatanode(dn) < 0 : "Block " + block
    + " in recentInvalidatesSet should not appear in DN " + dn;
{code}
The problem is that, when a block is invalidated due to over-replication, it is not immediately removed from the block map. So, if a block report arrives just after a block has been marked as invalidated, but before the block is actually deleted, I think this assertion will trigger incorrectly.
[jira] [Commented] (HDFS-2668) Incorrect assertion in BlockManager when block report arrives shortly after invalidation decision
[ https://issues.apache.org/jira/browse/HDFS-2668?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13171904#comment-13171904 ]

Hadoop QA commented on HDFS-2668:

-1 overall. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12507837/TestToReproduceHDFS-2668.patch
against trunk revision .

+1 @author. The patch does not contain any @author tags.
+1 tests included. The patch appears to include 4 new or modified tests.
-1 javadoc. The javadoc tool appears to have generated 90 warning messages.
+1 javac. The applied patch does not increase the total number of javac compiler warnings.
+1 eclipse:eclipse. The patch built with eclipse:eclipse.
-1 findbugs. The patch appears to cause Findbugs (version 1.3.9) to fail.
+1 release audit. The applied patch does not increase the total number of release audit warnings.
-1 core tests. The patch failed these unit tests:
org.apache.hadoop.hdfs.security.token.block.TestBlockToken
org.apache.hadoop.hdfs.TestFileAppend2
org.apache.hadoop.hdfs.web.TestFSMainOperationsWebHdfs
org.apache.hadoop.hdfs.security.TestDelegationToken
org.apache.hadoop.hdfs.web.TestWebHdfsFileSystemContract
+1 contrib tests. The patch passed contrib unit tests.

Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/1725//testReport/
Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/1725//console

This message is automatically generated.

Incorrect assertion in BlockManager when block report arrives shortly after invalidation decision

Key: HDFS-2668
URL: https://issues.apache.org/jira/browse/HDFS-2668
Project: Hadoop HDFS
Issue Type: Bug
Components: name-node
Affects Versions: 0.23.0
Reporter: Todd Lipcon
Attachments: TestToReproduceHDFS-2668.patch

I haven't written a test case to verify this yet, but I believe the following assertion is incorrect:
{code}
// Ignore replicas already scheduled to be removed from the DN
if(invalidateBlocks.contains(dn.getStorageID(), block)) {
  assert storedBlock.findDatanode(dn) < 0 : "Block " + block
    + " in recentInvalidatesSet should not appear in DN " + dn;
{code}
The problem is that, when a block is invalidated due to over-replication, it is not immediately removed from the block map. So, if a block report arrives just after a block has been marked as invalidated, but before the block is actually deleted, I think this assertion will trigger incorrectly.
[jira] [Commented] (HDFS-2658) HttpFS introduced 70 javadoc warnings
[ https://issues.apache.org/jira/browse/HDFS-2658?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13171907#comment-13171907 ]

Eli Collins commented on HDFS-2658:

+1 thanks Tucu

HttpFS introduced 70 javadoc warnings

Key: HDFS-2658
URL: https://issues.apache.org/jira/browse/HDFS-2658
Project: Hadoop HDFS
Issue Type: Bug
Affects Versions: 0.24.0, 0.23.1
Reporter: Eli Collins
Assignee: Alejandro Abdelnur
Fix For: 0.24.0, 0.23.1
Attachments: HDFS-2658.patch

{noformat}
hadoop1 (trunk)$ grep warning javadoc.txt | grep -c httpfs
70
{noformat}
[jira] [Commented] (HDFS-2668) Incorrect assertion in BlockManager when block report arrives shortly after invalidation decision
[ https://issues.apache.org/jira/browse/HDFS-2668?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13171910#comment-13171910 ]

Uma Maheswara Rao G commented on HDFS-2668:

To make sure of the issue, I just added a throw of RuntimeException in BlockManager#processReportedBlock:
{code}
// Ignore replicas already scheduled to be removed from the DN
if(invalidateBlocks.contains(dn.getStorageID(), block)) {
  assert storedBlock.findDatanode(dn) < 0 : "Block " + block
    + " in recentInvalidatesSet should not appear in DN " + dn;
  if(storedBlock.findDatanode(dn) >= 0)
    throw new RuntimeException("Block already added into invalidateBlocks. "
      + "But still this block associated with DN storedBlock.findDatanode(dn) = "
      + storedBlock.findDatanode(dn));
  return storedBlock;
}
{code}
After this I ran the above attached test. Below are the logs that prove the issue:
{noformat}
2011-12-18 23:02:42,066 INFO FSNamesystem.audit (FSNamesystem.java:logAuditEvent(220)) - ugi=uma (auth:SIMPLE) ip=/127.0.0.1 cmd=open src=/tmp/testBadBlockReportOnTransfer/file1 dst=null perm=null
All blocks of file /tmp/testBadBlockReportOnTransfer/file1 verified to have replication factor 3
2011-12-18 23:02:42,073 INFO blockmanagement.BlockManager (BlockManager.java:setReplication(1814)) - Decreasing replication from 3 to 1 for /tmp/testBadBlockReportOnTransfer/file1
2011-12-18 23:02:42,073 INFO hdfs.StateChange (InvalidateBlocks.java:add(77)) - BLOCK* InvalidateBlocks: add blk_5137102758256792519_1001 to 127.0.0.1:54432
2011-12-18 23:02:42,073 INFO hdfs.StateChange (BlockManager.java:chooseExcessReplicates(1954)) - BLOCK* chooseExcessReplicates: (127.0.0.1:54432, blk_5137102758256792519_1001) is added to recentInvalidateSets
2011-12-18 23:02:42,073 INFO hdfs.StateChange (InvalidateBlocks.java:add(77)) - BLOCK* InvalidateBlocks: add blk_5137102758256792519_1001 to 127.0.0.1:54418
2011-12-18 23:02:42,073 INFO hdfs.StateChange (BlockManager.java:chooseExcessReplicates(1954)) - BLOCK* chooseExcessReplicates: (127.0.0.1:54418, blk_5137102758256792519_1001) is added to recentInvalidateSets
2011-12-18 23:02:42,076 INFO FSNamesystem.audit (FSNamesystem.java:logAuditEvent(220)) - ugi=uma (auth:SIMPLE) ip=/127.0.0.1 cmd=setReplication src=/tmp/testBadBlockReportOnTransfer/file1 dst=null perm=null
...
2011-12-18 23:02:43,343 WARN datanode.DataNode (BPOfferService.java:offerService(537)) - RemoteException in offerService
java.lang.RuntimeException: java.lang.RuntimeException: Block already added into invalidateBlocks. But still this block associated with DN storedBlock.findDatanode(dn) = 1
	at org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.processReportedBlock(BlockManager.java:1498)
	at org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.reportDiff(BlockManager.java:1418)
	at org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.processReport(BlockManager.java:1328)
	at org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.processReport(BlockManager.java:1303)
	at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.blockReport(NameNodeRpcServer.java:847)
	at org.apache.hadoop.hdfs.protocolPB.DatanodeProtocolServerSideTranslatorPB.blockReport(DatanodeProtocolServerSideTranslatorPB.java:130)
	at org.apache.hadoop.hdfs.protocol.proto.DatanodeProtocolProtos$DatanodeProtocolService$2.callBlockingMethod(DatanodeProtocolProtos.java:16189)
	at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:417)
	at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:834)
	at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1605)
	at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1)
	at java.security.AccessController.doPrivileged(Native Method)
{noformat}

Incorrect assertion in BlockManager when block report arrives shortly after invalidation decision

Key: HDFS-2668
URL: https://issues.apache.org/jira/browse/HDFS-2668
Project: Hadoop HDFS
Issue Type: Bug
Components: name-node
Affects Versions: 0.23.0
Reporter: Todd Lipcon
Attachments: TestToReproduceHDFS-2668.patch

I haven't written a test case to verify this yet, but I believe the following assertion is incorrect:
{code}
// Ignore replicas already scheduled to be removed from the DN
if(invalidateBlocks.contains(dn.getStorageID(), block)) {
  assert storedBlock.findDatanode(dn) < 0 : "Block " + block
    + " in recentInvalidatesSet should not appear in DN " + dn;
{code}
The problem is that, when a block is invalidated due to over-replication, it is not immediately removed from the block map. So, if a block report arrives just after a block has been marked as invalidated, but before the block is actually deleted, I think this assertion will trigger incorrectly.
[jira] [Commented] (HDFS-2335) DataNodeCluster and NNStorage always pull fresh entropy
[ https://issues.apache.org/jira/browse/HDFS-2335?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13171911#comment-13171911 ]

Uma Maheswara Rao G commented on HDFS-2335:

Test failures are unrelated to this patch!

DataNodeCluster and NNStorage always pull fresh entropy

Key: HDFS-2335
URL: https://issues.apache.org/jira/browse/HDFS-2335
Project: Hadoop HDFS
Issue Type: Improvement
Components: data-node, name-node
Affects Versions: 0.23.0, 0.24.0, 1.0.0
Reporter: Eli Collins
Assignee: Uma Maheswara Rao G
Attachments: HDFS-2335.patch, HDFS-2335.patch

Jira for giving DataNodeCluster and NNStorage the same treatment as HDFS-1835; they're not truly cryptographic uses either. We should also factor this out to a utility method -- it seems the three uses are slightly different, e.g. one uses DFSUtil.getRandom and another creates a new Random object.
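For reference, a DFSUtil.getRandom-style consolidation looks roughly like the sketch below; this assumes the patch follows the existing DFSUtil pattern, and the class and method names here are placeholders:
{code}
import java.util.Random;

// Sketch of a shared, thread-local Random for non-cryptographic uses
// (IDs, test data), replacing per-call Random/SecureRandom constructions
// that pull fresh entropy every time.
public class RandomUtilSketch {
  private static final ThreadLocal<Random> RANDOM = new ThreadLocal<Random>() {
    @Override
    protected Random initialValue() {
      return new Random();
    }
  };

  /** Cheap randomness; NOT for keys, tokens, or anything cryptographic. */
  public static Random getRandom() {
    return RANDOM.get();
  }
}
{code}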
[jira] [Updated] (HDFS-2668) Incorrect assertion in BlockManager when block report arrives shortly after invalidation decision
[ https://issues.apache.org/jira/browse/HDFS-2668?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Uma Maheswara Rao G updated HDFS-2668:

Status: Open (was: Patch Available)

Incorrect assertion in BlockManager when block report arrives shortly after invalidation decision

Key: HDFS-2668
URL: https://issues.apache.org/jira/browse/HDFS-2668
Project: Hadoop HDFS
Issue Type: Bug
Components: name-node
Affects Versions: 0.23.0
Reporter: Todd Lipcon
Attachments: TestToReproduceHDFS-2668.patch

I haven't written a test case to verify this yet, but I believe the following assertion is incorrect:
{code}
// Ignore replicas already scheduled to be removed from the DN
if(invalidateBlocks.contains(dn.getStorageID(), block)) {
  assert storedBlock.findDatanode(dn) < 0 : "Block " + block
    + " in recentInvalidatesSet should not appear in DN " + dn;
{code}
The problem is that, when a block is invalidated due to over-replication, it is not immediately removed from the block map. So, if a block report arrives just after a block has been marked as invalidated, but before the block is actually deleted, I think this assertion will trigger incorrectly.
[jira] [Updated] (HDFS-2701) Cleanup FS* processIOError methods
[ https://issues.apache.org/jira/browse/HDFS-2701?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Eli Collins updated HDFS-2701:

Attachment: hdfs-2701.txt

Thanks for the review Todd. Updated patch attached.

#1 Agree. I've done this in HDFS-2702; I was trying to keep this change to just cleanup/refactoring (the current crazy behavior is actually what causes HDFS-2702!).
#2 Good catch. Fixed.
#3-5 Done.

Wrt testing, see my comment in HDFS-2702. The short answer is that aside from the existing tests, which are clean, I've done manual testing (failing storage dirs and checkpointing) for 2701-2703 and am working on a unit test that will cover storage dir failures and removal.

Cleanup FS* processIOError methods

Key: HDFS-2701
URL: https://issues.apache.org/jira/browse/HDFS-2701
Project: Hadoop HDFS
Issue Type: Improvement
Components: name-node
Affects Versions: 1.0.0
Reporter: Eli Collins
Assignee: Eli Collins
Attachments: hdfs-2701.txt, hdfs-2701.txt, hdfs-2701.txt, hdfs-2701.txt

Let's rename the various processIOError methods to be more descriptive. The current code makes it difficult to identify and reason about bug fixes. While we're at it, let's remove "Fatal" from the "Unable to sync the edit log" log message, since it's not actually a fatal error (this is confusing to users). And the 2NN "Checkpoint done" message should be info, not a warning (also confusing to users). Thanks to HDFS-1073 these issues don't exist on trunk or 23.
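To make the renaming concrete, a hypothetical before/after; the real names are in hdfs-2701.txt, and everything below is illustrative:
{code}
// Before: overloads that all hide what actually happens to a failed dir.
//   void processIOError(int index);
//   void processIOError(ArrayList<StorageDirectory> dirs, boolean propagate);
// After (illustrative names only): one descriptive method per behavior.
abstract class EditLogSketch {
  abstract void disableStorageDirForEdits(int dirIndex);
  abstract void removeStorageDirs(java.util.List<java.io.File> failedDirs);
}
{code}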
[jira] [Commented] (HDFS-2335) DataNodeCluster and NNStorage always pull fresh entropy
[ https://issues.apache.org/jira/browse/HDFS-2335?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13171915#comment-13171915 ]

Eli Collins commented on HDFS-2335:

+1

DataNodeCluster and NNStorage always pull fresh entropy

Key: HDFS-2335
URL: https://issues.apache.org/jira/browse/HDFS-2335
Project: Hadoop HDFS
Issue Type: Improvement
Components: data-node, name-node
Affects Versions: 0.23.0, 0.24.0, 1.0.0
Reporter: Eli Collins
Assignee: Uma Maheswara Rao G
Attachments: HDFS-2335.patch, HDFS-2335.patch

Jira for giving DataNodeCluster and NNStorage the same treatment as HDFS-1835; they're not truly cryptographic uses either. We should also factor this out to a utility method -- it seems the three uses are slightly different, e.g. one uses DFSUtil.getRandom and another creates a new Random object.
[jira] [Commented] (HDFS-2657) TestHttpFSServer and TestServerWebApp are failing on trunk
[ https://issues.apache.org/jira/browse/HDFS-2657?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13171917#comment-13171917 ]

Eli Collins commented on HDFS-2657:

Yea, every trunk build for the last couple of days has failed these. TestHttpFSServer fails the same assert and TestServerWebApp gets the following NPE.
{noformat}
org.apache.hadoop.lib.servlet.TestServerWebApp.lifecycle
Failing for the past 5 builds (Since #894) Took 11 ms.

Stacktrace

java.lang.NullPointerException
	at java.util.Properties$LineReader.readLine(Properties.java:418)
	at java.util.Properties.load0(Properties.java:337)
	at java.util.Properties.load(Properties.java:325)
	at org.apache.hadoop.lib.server.Server.init(Server.java:348)
	at org.apache.hadoop.lib.servlet.ServerWebApp.contextInitialized(ServerWebApp.java:142)
	at org.apache.hadoop.lib.servlet.TestServerWebApp.__CLR3_0_2sd9si72uk(TestServerWebApp.java:56)
	at org.apache.hadoop.lib.servlet.TestServerWebApp.lifecycle(TestServerWebApp.java:46)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
	at java.lang.reflect.Method.invoke(Method.java:597)
	at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:44)
	at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:15)
	at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:41)
	at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:20)
	at org.apache.hadoop.test.TestDirHelper$1.evaluate(TestDirHelper.java:108)
	at org.apache.hadoop.test.TestJettyHelper$1.evaluate(TestJettyHelper.java:51)
	at org.apache.hadoop.test.TestExceptionHelper$1.evaluate(TestExceptionHelper.java:41)
	at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:76)
	at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:50)
	at org.junit.runners.ParentRunner$3.run(ParentRunner.java:193)
	at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:52)
	at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:191)
	at org.junit.runners.ParentRunner.access$000(ParentRunner.java:42)
	at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:184)
	at org.junit.runners.ParentRunner.run(ParentRunner.java:236)
	at org.apache.maven.surefire.junit4.JUnit4TestSet.execute(JUnit4TestSet.java:53)
	at org.apache.maven.surefire.junit4.JUnit4Provider.executeTestSet(JUnit4Provider.java:123)
	at org.apache.maven.surefire.junit4.JUnit4Provider.invoke(JUnit4Provider.java:104)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
	at java.lang.reflect.Method.invoke(Method.java:597)
	at org.apache.maven.surefire.util.ReflectionUtils.invokeMethodWithArray(ReflectionUtils.java:164)
	at org.apache.maven.surefire.booter.ProviderFactory$ProviderProxy.invoke(ProviderFactory.java:110)
	at org.apache.maven.surefire.booter.SurefireStarter.invokeProvider(SurefireStarter.java:175)
	at org.apache.maven.surefire.booter.SurefireStarter.runSuitesInProcessWhenForked(SurefireStarter.java:81)
	at org.apache.maven.surefire.booter.ForkedBooter.main(ForkedBooter.java:68)

Standard Output

test.properties : NONE
test.dir        : /home/jenkins/jenkins-slave/workspace/Hadoop-Hdfs-trunk/trunk/hadoop-hdfs-project/hadoop-hdfs-httpfs/target/testdir
{noformat}

TestHttpFSServer and TestServerWebApp are failing on trunk

Key: HDFS-2657
URL: https://issues.apache.org/jira/browse/HDFS-2657
Project: Hadoop HDFS
Issue Type: Bug
Reporter: Eli Collins
Assignee: Alejandro Abdelnur

org.apache.hadoop.fs.http.server.TestHttpFSServer.instrumentation
org.apache.hadoop.lib.servlet.TestServerWebApp.lifecycle
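The NPE comes from Properties.load being handed a null stream; java.util.Properties offers no guard of its own. A sketch of the kind of check the loading path needs -- the resource name and helper are illustrative, not the HttpFS code:
{code}
import java.io.IOException;
import java.io.InputStream;
import java.util.Properties;

// Sketch: Properties.load NPEs when the classloader lookup returns null,
// so the missing-resource case needs an explicit check with a clear error.
public class PropsLoaderSketch {
  static Properties loadResource(ClassLoader cl, String name) throws IOException {
    InputStream is = cl.getResourceAsStream(name);
    if (is == null) {
      throw new IOException("Could not load properties from classpath: " + name);
    }
    try {
      Properties props = new Properties();
      props.load(is);
      return props;
    } finally {
      is.close();
    }
  }
}
{code}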
[jira] [Updated] (HDFS-2335) DataNodeCluster and NNStorage always pull fresh entropy
[ https://issues.apache.org/jira/browse/HDFS-2335?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Eli Collins updated HDFS-2335:

Resolution: Fixed
Target Version/s: 0.23.1 (was: 0.24.0)
Hadoop Flags: Reviewed
Status: Resolved (was: Patch Available)

I've committed this and merged to 23. Thanks Uma!

DataNodeCluster and NNStorage always pull fresh entropy

Key: HDFS-2335
URL: https://issues.apache.org/jira/browse/HDFS-2335
Project: Hadoop HDFS
Issue Type: Improvement
Components: data-node, name-node
Affects Versions: 0.23.0, 0.24.0, 1.0.0
Reporter: Eli Collins
Assignee: Uma Maheswara Rao G
Attachments: HDFS-2335.patch, HDFS-2335.patch

Jira for giving DataNodeCluster and NNStorage the same treatment as HDFS-1835; they're not truly cryptographic uses either. We should also factor this out to a utility method -- it seems the three uses are slightly different, e.g. one uses DFSUtil.getRandom and another creates a new Random object.
[jira] [Commented] (HDFS-2335) DataNodeCluster and NNStorage always pull fresh entropy
[ https://issues.apache.org/jira/browse/HDFS-2335?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13171920#comment-13171920 ]

Hudson commented on HDFS-2335:

Integrated in Hadoop-Common-trunk-Commit #1451 (See [https://builds.apache.org/job/Hadoop-Common-trunk-Commit/1451/])
HDFS-2335. DataNodeCluster and NNStorage always pull fresh entropy. Contributed by Uma Maheswara Rao G
eli : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1220510
Files :
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/DFSUtil.java
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/DataNode.java
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/NNStorage.java
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/DataNodeCluster.java

DataNodeCluster and NNStorage always pull fresh entropy

Key: HDFS-2335
URL: https://issues.apache.org/jira/browse/HDFS-2335
Project: Hadoop HDFS
Issue Type: Improvement
Components: data-node, name-node
Affects Versions: 0.23.0, 0.24.0, 1.0.0
Reporter: Eli Collins
Assignee: Uma Maheswara Rao G
Attachments: HDFS-2335.patch, HDFS-2335.patch

Jira for giving DataNodeCluster and NNStorage the same treatment as HDFS-1835; they're not truly cryptographic uses either. We should also factor this out to a utility method -- it seems the three uses are slightly different, e.g. one uses DFSUtil.getRandom and another creates a new Random object.
[jira] [Commented] (HDFS-2335) DataNodeCluster and NNStorage always pull fresh entropy
[ https://issues.apache.org/jira/browse/HDFS-2335?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13171924#comment-13171924 ]

Hudson commented on HDFS-2335:

Integrated in Hadoop-Hdfs-trunk-Commit #1524 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk-Commit/1524/])
HDFS-2335. DataNodeCluster and NNStorage always pull fresh entropy. Contributed by Uma Maheswara Rao G
eli : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1220510
Files :
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/DFSUtil.java
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/DataNode.java
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/NNStorage.java
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/DataNodeCluster.java

DataNodeCluster and NNStorage always pull fresh entropy

Key: HDFS-2335
URL: https://issues.apache.org/jira/browse/HDFS-2335
Project: Hadoop HDFS
Issue Type: Improvement
Components: data-node, name-node
Affects Versions: 0.23.0, 0.24.0, 1.0.0
Reporter: Eli Collins
Assignee: Uma Maheswara Rao G
Attachments: HDFS-2335.patch, HDFS-2335.patch

Jira for giving DataNodeCluster and NNStorage the same treatment as HDFS-1835; they're not truly cryptographic uses either. We should also factor this out to a utility method -- it seems the three uses are slightly different, e.g. one uses DFSUtil.getRandom and another creates a new Random object.
[jira] [Commented] (HDFS-2335) DataNodeCluster and NNStorage always pull fresh entropy
[ https://issues.apache.org/jira/browse/HDFS-2335?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13171925#comment-13171925 ]

Hudson commented on HDFS-2335:

Integrated in Hadoop-Hdfs-0.23-Commit #294 (See [https://builds.apache.org/job/Hadoop-Hdfs-0.23-Commit/294/])
HDFS-2335. svn merge -c 1220510 from trunk
eli : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1220513
Files :
* /hadoop/common/branches/branch-0.23
* /hadoop/common/branches/branch-0.23/hadoop-common-project
* /hadoop/common/branches/branch-0.23/hadoop-common-project/hadoop-auth
* /hadoop/common/branches/branch-0.23/hadoop-common-project/hadoop-common
* /hadoop/common/branches/branch-0.23/hadoop-common-project/hadoop-common/CHANGES.txt
* /hadoop/common/branches/branch-0.23/hadoop-common-project/hadoop-common/src/main/docs
* /hadoop/common/branches/branch-0.23/hadoop-common-project/hadoop-common/src/main/java
* /hadoop/common/branches/branch-0.23/hadoop-common-project/hadoop-common/src/test/core
* /hadoop/common/branches/branch-0.23/hadoop-hdfs-project/hadoop-hdfs
* /hadoop/common/branches/branch-0.23/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* /hadoop/common/branches/branch-0.23/hadoop-hdfs-project/hadoop-hdfs/src/main/java
* /hadoop/common/branches/branch-0.23/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/DFSUtil.java
* /hadoop/common/branches/branch-0.23/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/DataNode.java
* /hadoop/common/branches/branch-0.23/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/NNStorage.java
* /hadoop/common/branches/branch-0.23/hadoop-hdfs-project/hadoop-hdfs/src/main/native
* /hadoop/common/branches/branch-0.23/hadoop-hdfs-project/hadoop-hdfs/src/main/webapps/datanode
* /hadoop/common/branches/branch-0.23/hadoop-hdfs-project/hadoop-hdfs/src/main/webapps/hdfs
* /hadoop/common/branches/branch-0.23/hadoop-hdfs-project/hadoop-hdfs/src/main/webapps/secondary
* /hadoop/common/branches/branch-0.23/hadoop-hdfs-project/hadoop-hdfs/src/test/hdfs
* /hadoop/common/branches/branch-0.23/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/DataNodeCluster.java
* /hadoop/common/branches/branch-0.23/hadoop-mapreduce-project
* /hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/.gitignore
* /hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/CHANGES.txt
* /hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/bin
* /hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/conf
* /hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/resources/mapred-default.xml
* /hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/hadoop-mapreduce-examples
* /hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/hadoop-yarn/hadoop-yarn-site
* /hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/hadoop-yarn/hadoop-yarn-site/src/site/apt
* /hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/src/c++
* /hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/src/contrib
* /hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/src/contrib/block_forensics
* /hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/src/contrib/build-contrib.xml
* /hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/src/contrib/build.xml
* /hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/src/contrib/data_join
* /hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/src/contrib/eclipse-plugin
* /hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/src/contrib/index
* /hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/src/contrib/vaidya
* /hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/src/examples
* /hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/src/java
* /hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/src/test/mapred
* /hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/src/test/mapred/org/apache/hadoop/fs
* /hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/src/test/mapred/org/apache/hadoop/hdfs
* /hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/src/test/mapred/org/apache/hadoop/io/FileBench.java
* /hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/src/test/mapred/org/apache/hadoop/io/TestSequenceFileMergeProgress.java
* /hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/src/test/mapred/org/apache/hadoop/ipc
* /hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/src/test/mapred/org/apache/hadoop/security/authorize/TestServiceLevelAuthorization.java
* /hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/src/test/mapred/org/apache/hadoop/test/MapredTestDriver.java
* /hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/src/webapps/job
*
[jira] [Commented] (HDFS-2335) DataNodeCluster and NNStorage always pull fresh entropy
[ https://issues.apache.org/jira/browse/HDFS-2335?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13171926#comment-13171926 ] Hudson commented on HDFS-2335: -- Integrated in Hadoop-Common-0.23-Commit #305 (See [https://builds.apache.org/job/Hadoop-Common-0.23-Commit/305/]) HDFS-2335. svn merge -c 1220510 from trunk eli : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVNview=revrev=1220513 Files : * /hadoop/common/branches/branch-0.23 * /hadoop/common/branches/branch-0.23/hadoop-common-project * /hadoop/common/branches/branch-0.23/hadoop-common-project/hadoop-auth * /hadoop/common/branches/branch-0.23/hadoop-common-project/hadoop-common * /hadoop/common/branches/branch-0.23/hadoop-common-project/hadoop-common/CHANGES.txt * /hadoop/common/branches/branch-0.23/hadoop-common-project/hadoop-common/src/main/docs * /hadoop/common/branches/branch-0.23/hadoop-common-project/hadoop-common/src/main/java * /hadoop/common/branches/branch-0.23/hadoop-common-project/hadoop-common/src/test/core * /hadoop/common/branches/branch-0.23/hadoop-hdfs-project/hadoop-hdfs * /hadoop/common/branches/branch-0.23/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * /hadoop/common/branches/branch-0.23/hadoop-hdfs-project/hadoop-hdfs/src/main/java * /hadoop/common/branches/branch-0.23/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/DFSUtil.java * /hadoop/common/branches/branch-0.23/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/DataNode.java * /hadoop/common/branches/branch-0.23/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/NNStorage.java * /hadoop/common/branches/branch-0.23/hadoop-hdfs-project/hadoop-hdfs/src/main/native * /hadoop/common/branches/branch-0.23/hadoop-hdfs-project/hadoop-hdfs/src/main/webapps/datanode * /hadoop/common/branches/branch-0.23/hadoop-hdfs-project/hadoop-hdfs/src/main/webapps/hdfs * /hadoop/common/branches/branch-0.23/hadoop-hdfs-project/hadoop-hdfs/src/main/webapps/secondary * /hadoop/common/branches/branch-0.23/hadoop-hdfs-project/hadoop-hdfs/src/test/hdfs * /hadoop/common/branches/branch-0.23/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/DataNodeCluster.java * /hadoop/common/branches/branch-0.23/hadoop-mapreduce-project * /hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/.gitignore * /hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/CHANGES.txt * /hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/bin * /hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/conf * /hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/resources/mapred-default.xml * /hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/hadoop-mapreduce-examples * /hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/hadoop-yarn/hadoop-yarn-site * /hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/hadoop-yarn/hadoop-yarn-site/src/site/apt * /hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/src/c++ * /hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/src/contrib * /hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/src/contrib/block_forensics * /hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/src/contrib/build-contrib.xml * /hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/src/contrib/build.xml * /hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/src/contrib/data_join * 
/hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/src/contrib/eclipse-plugin * /hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/src/contrib/index * /hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/src/contrib/vaidya * /hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/src/examples * /hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/src/java * /hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/src/test/mapred * /hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/src/test/mapred/org/apache/hadoop/fs * /hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/src/test/mapred/org/apache/hadoop/hdfs * /hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/src/test/mapred/org/apache/hadoop/io/FileBench.java * /hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/src/test/mapred/org/apache/hadoop/io/TestSequenceFileMergeProgress.java * /hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/src/test/mapred/org/apache/hadoop/ipc * /hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/src/test/mapred/org/apache/hadoop/security/authorize/TestServiceLevelAuthorization.java * /hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/src/test/mapred/org/apache/hadoop/test/MapredTestDriver.java * /hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/src/webapps/job
[jira] [Commented] (HDFS-2335) DataNodeCluster and NNStorage always pull fresh entropy
[ https://issues.apache.org/jira/browse/HDFS-2335?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13171927#comment-13171927 ] Hudson commented on HDFS-2335: -- Integrated in Hadoop-Mapreduce-trunk-Commit #1474 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk-Commit/1474/]) HDFS-2335. DataNodeCluster and NNStorage always pull fresh entropy. Contributed by Uma Maheswara Rao G eli : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVNview=revrev=1220510 Files : * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/DFSUtil.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/DataNode.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/NNStorage.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/DataNodeCluster.java DataNodeCluster and NNStorage always pull fresh entropy --- Key: HDFS-2335 URL: https://issues.apache.org/jira/browse/HDFS-2335 Project: Hadoop HDFS Issue Type: Improvement Components: data-node, name-node Affects Versions: 0.23.0, 0.24.0, 1.0.0 Reporter: Eli Collins Assignee: Uma Maheswara Rao G Attachments: HDFS-2335.patch, HDFS-2335.patch Jira for giving DataNodeCluster and NNStorage the same treatment as HDFS-1835. They're not truly cryptographic uses either. We should also factor this out into a utility method; the three uses seem slightly different, eg one uses DFSUtil.getRandom and another creates a new Random object. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
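The pattern behind this fix: reuse a cached, non-blocking PRNG for non-cryptographic uses instead of constructing a fresh entropy source (e.g. a new SecureRandom) on each call. A minimal sketch of a DFSUtil.getRandom-style helper; the class name and layout are illustrative, not the committed code:
{code}
import java.util.Random;

public final class RandomUtilSketch {
  // One lazily created PRNG per thread; avoids contended locking and,
  // unlike pulling fresh entropy per call, never blocks on the OS pool.
  private static final ThreadLocal<Random> RANDOM = new ThreadLocal<Random>() {
    @Override
    protected Random initialValue() {
      return new Random();
    }
  };

  private RandomUtilSketch() {}

  /** Shared generator for non-cryptographic uses (test data, temp names). */
  public static Random getRandom() {
    return RANDOM.get();
  }
}
{code}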
[jira] [Commented] (HDFS-2335) DataNodeCluster and NNStorage always pull fresh entropy
[ https://issues.apache.org/jira/browse/HDFS-2335?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13171929#comment-13171929 ] Hudson commented on HDFS-2335: -- Integrated in Hadoop-Mapreduce-0.23-Commit #316 (See [https://builds.apache.org/job/Hadoop-Mapreduce-0.23-Commit/316/]) HDFS-2335. svn merge -c 1220510 from trunk eli : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVNview=revrev=1220513 Files : * /hadoop/common/branches/branch-0.23 * /hadoop/common/branches/branch-0.23/hadoop-common-project * /hadoop/common/branches/branch-0.23/hadoop-common-project/hadoop-auth * /hadoop/common/branches/branch-0.23/hadoop-common-project/hadoop-common * /hadoop/common/branches/branch-0.23/hadoop-common-project/hadoop-common/CHANGES.txt * /hadoop/common/branches/branch-0.23/hadoop-common-project/hadoop-common/src/main/docs * /hadoop/common/branches/branch-0.23/hadoop-common-project/hadoop-common/src/main/java * /hadoop/common/branches/branch-0.23/hadoop-common-project/hadoop-common/src/test/core * /hadoop/common/branches/branch-0.23/hadoop-hdfs-project/hadoop-hdfs * /hadoop/common/branches/branch-0.23/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * /hadoop/common/branches/branch-0.23/hadoop-hdfs-project/hadoop-hdfs/src/main/java * /hadoop/common/branches/branch-0.23/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/DFSUtil.java * /hadoop/common/branches/branch-0.23/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/DataNode.java * /hadoop/common/branches/branch-0.23/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/NNStorage.java * /hadoop/common/branches/branch-0.23/hadoop-hdfs-project/hadoop-hdfs/src/main/native * /hadoop/common/branches/branch-0.23/hadoop-hdfs-project/hadoop-hdfs/src/main/webapps/datanode * /hadoop/common/branches/branch-0.23/hadoop-hdfs-project/hadoop-hdfs/src/main/webapps/hdfs * /hadoop/common/branches/branch-0.23/hadoop-hdfs-project/hadoop-hdfs/src/main/webapps/secondary * /hadoop/common/branches/branch-0.23/hadoop-hdfs-project/hadoop-hdfs/src/test/hdfs * /hadoop/common/branches/branch-0.23/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/DataNodeCluster.java * /hadoop/common/branches/branch-0.23/hadoop-mapreduce-project * /hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/.gitignore * /hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/CHANGES.txt * /hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/bin * /hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/conf * /hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/resources/mapred-default.xml * /hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/hadoop-mapreduce-examples * /hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/hadoop-yarn/hadoop-yarn-site * /hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/hadoop-yarn/hadoop-yarn-site/src/site/apt * /hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/src/c++ * /hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/src/contrib * /hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/src/contrib/block_forensics * /hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/src/contrib/build-contrib.xml * /hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/src/contrib/build.xml * /hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/src/contrib/data_join * 
/hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/src/contrib/eclipse-plugin * /hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/src/contrib/index * /hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/src/contrib/vaidya * /hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/src/examples * /hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/src/java * /hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/src/test/mapred * /hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/src/test/mapred/org/apache/hadoop/fs * /hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/src/test/mapred/org/apache/hadoop/hdfs * /hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/src/test/mapred/org/apache/hadoop/io/FileBench.java * /hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/src/test/mapred/org/apache/hadoop/io/TestSequenceFileMergeProgress.java * /hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/src/test/mapred/org/apache/hadoop/ipc * /hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/src/test/mapred/org/apache/hadoop/security/authorize/TestServiceLevelAuthorization.java * /hadoop/common/branches/branch-0.23/hadoop-mapreduce-project/src/test/mapred/org/apache/hadoop/test/MapredTestDriver.java *
[jira] [Updated] (HDFS-2702) A single failed name dir can cause the NN to exit
[ https://issues.apache.org/jira/browse/HDFS-2702?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eli Collins updated HDFS-2702: -- Attachment: hdfs-2702.txt Updated patch with new test class that covers: #1 The NN doesn't exit as long as it has a valid storage dir #2 The NN exits when it no longer has a valid storage dir #3 The removed storage dirs list is updated (fails w/o HDFS-2703) A single failed name dir can cause the NN to exit -- Key: HDFS-2702 URL: https://issues.apache.org/jira/browse/HDFS-2702 Project: Hadoop HDFS Issue Type: Bug Affects Versions: 1.0.0 Reporter: Eli Collins Assignee: Eli Collins Priority: Critical Attachments: hdfs-2702.txt, hdfs-2702.txt There's a bug in FSEditLog#rollEditLog which results in the NN process exiting if a single name dir has failed. Here's the relevant code: {code} close() // So editStreams.size() is 0 foreach edits dir { .. eStream = new ... // Might get an IOE here editStreams.add(eStream); } catch (IOException ioe) { removeEditsForStorageDir(sd); // exits if editStreams.size() <= 1 } {code} If we get an IOException before we've added two edit streams to the list we'll exit, eg if there's an error processing the 1st name dir we'll exit even if there are 4 valid name dirs. The fix is to move the checking out of removeEditsForStorageDir (nee processIOError) or modify it so it can be disabled in some cases, eg here where we don't yet know how many streams are valid. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
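The shape of the fix described above, as a simplified sketch: the no-streams check runs once after all dirs have been tried, rather than inside the removal helper. Names like exitIfNoStreams mirror the discussion; the surrounding class is hypothetical, not the actual branch-1 FSEditLog:
{code}
import java.io.IOException;
import java.io.OutputStream;
import java.util.ArrayList;
import java.util.List;

class EditLogRollSketch {
  private final List<OutputStream> editStreams = new ArrayList<OutputStream>();

  void rollEditLog(List<String> editsDirs) {
    editStreams.clear();                    // close() emptied the list
    for (String dir : editsDirs) {
      try {
        editStreams.add(openStream(dir));   // may throw IOE
      } catch (IOException ioe) {
        // Drop only the failed dir; do NOT exit while the list is still
        // being rebuilt, or one bad dir aborts the roll with good dirs left.
        removeEditsForStorageDir(dir);
      }
    }
    exitIfNoStreams();                      // check once, after all dirs tried
  }

  private void exitIfNoStreams() {
    if (editStreams.isEmpty()) {
      throw new IllegalStateException("No usable edits directories remain");
    }
  }

  private void removeEditsForStorageDir(String dir) { /* bookkeeping only */ }

  private OutputStream openStream(String dir) throws IOException {
    return new java.io.ByteArrayOutputStream(); // stand-in for a real stream
  }
}
{code}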
[jira] [Commented] (HDFS-2699) Store data and checksums together in block file
[ https://issues.apache.org/jira/browse/HDFS-2699?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13171951#comment-13171951 ] dhruba borthakur commented on HDFS-2699: Thanks for your comments Scott, Andrew, Todd and Allen. Scott: most of our hbase production clusters have io.bytes.per.checksum set to 4096 (instead of 512) Allen: One can put crcs on a logging device, e.g. bookkeeper perhaps? But at the end of the day, each random io from an hdfs file will consume two disk iops (one on the hdfs block storage and one from the logging device), is it not? Won't it be optimal to inline crc and data? If we decide to implement inline crc, can we make hdfs support two different data formats and not do any automatic data format upgrade for existing data? pre-existing data can remain in the older format while newly created files will have data in the new inline-data-and-crc format. What do people think about this idea? Store data and checksums together in block file --- Key: HDFS-2699 URL: https://issues.apache.org/jira/browse/HDFS-2699 Project: Hadoop HDFS Issue Type: Improvement Reporter: dhruba borthakur Assignee: dhruba borthakur The current implementation of HDFS stores the data in one block file and the metadata(checksum) in another block file. This means that every read from HDFS actually consumes two disk iops, one to the datafile and one to the checksum file. This is a major problem for scaling HBase, because HBase is usually bottlenecked on the number of random disk iops that the storage-hardware offers. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
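One way to read the dual-format proposal: readers dispatch on a per-block layout version, so old blocks keep the separate checksum file and only newly written blocks use the inline layout. A hedged sketch; the version constants are invented for illustration:
{code}
import java.io.DataInputStream;
import java.io.IOException;

final class BlockLayoutSketch {
  // Hypothetical layout codes, not the real on-disk constants.
  static final short SEPARATE_CRC_FILE = 1;   // pre-existing blocks
  static final short INLINE_DATA_AND_CRC = 2; // newly created files

  /** Decide how to read a block from the leading version field of its metadata. */
  static boolean hasInlineChecksums(DataInputStream metaIn) throws IOException {
    short version = metaIn.readShort();
    switch (version) {
      case SEPARATE_CRC_FILE:   return false;
      case INLINE_DATA_AND_CRC: return true;
      default: throw new IOException("Unknown block layout version: " + version);
    }
  }
}
{code}
Since the version travels with each block, no bulk upgrade of existing data is needed, which is exactly the compatibility property asked about above.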
[jira] [Updated] (HDFS-2702) A single failed name dir can cause the NN to exit
[ https://issues.apache.org/jira/browse/HDFS-2702?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eli Collins updated HDFS-2702: -- Attachment: hdfs-2702.txt Slightly updated patch. I made FSEditLog#logEdit throw an AssertionError (rather than just assert) so we stop the NN if there's a bug where we forget to remove an edit stream after we notice a failed directory. This should never fire, but could if we introduced a bug where eg we missed a call to removeEdits. Updated the test to check that we can't log an edit if there are no streams. A single failed name dir can cause the NN to exit -- Key: HDFS-2702 URL: https://issues.apache.org/jira/browse/HDFS-2702 Project: Hadoop HDFS Issue Type: Bug Affects Versions: 1.0.0 Reporter: Eli Collins Assignee: Eli Collins Priority: Critical Attachments: hdfs-2702.txt, hdfs-2702.txt, hdfs-2702.txt There's a bug in FSEditLog#rollEditLog which results in the NN process exiting if a single name dir has failed. Here's the relevant code: {code} close() // So editStreams.size() is 0 foreach edits dir { .. eStream = new ... // Might get an IOE here editStreams.add(eStream); } catch (IOException ioe) { removeEditsForStorageDir(sd); // exits if editStreams.size() <= 1 } {code} If we get an IOException before we've added two edit streams to the list we'll exit, eg if there's an error processing the 1st name dir we'll exit even if there are 4 valid name dirs. The fix is to move the checking out of removeEditsForStorageDir (nee processIOError) or modify it so it can be disabled in some cases, eg here where we don't yet know how many streams are valid. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-2662) Namenode should log warning message when trying to start on a unformmatted system
[ https://issues.apache.org/jira/browse/HDFS-2662?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13171954#comment-13171954 ] William McNeill commented on HDFS-2662: --- I verified that if I delete the DFS directory (/tmp/hadoop-williammcneill/dfs on my machine) and run start-dfs.sh I get just the SCDynamicStore error message in the DFS log, but the namenode is not running. If I then run hadoop namenode -format and re-run start-dfs.sh the log file is the same--just the SCDynamicStore error message--but the namenode is now running and HDFS operations work. I'll double-check the Kerberos workaround discussed in Jira 7489, but I think that's an unrelated issue since you see the warning message regardless of whether the namenode is running. Namenode should log warning message when trying to start on a unformmatted system - Key: HDFS-2662 URL: https://issues.apache.org/jira/browse/HDFS-2662 Project: Hadoop HDFS Issue Type: Improvement Components: name-node Affects Versions: 0.20.203.0 Environment: Single-node cluster on OS X 10.7 (Lion) Reporter: William McNeill Priority: Minor Labels: format, logging, namenode When you try to start the namenode for a system that does not have a formatted DFS, it fails silently without any indication that the lack of formatting was the problem. I tried to run start-dfs.sh on a single-node cluster with an unformatted HDFS. The namenode failed to start, but generated no warning messages, and its log was empty. After running hadoop namenode -format everything worked. Details in this thread: http://mail-archives.apache.org/mod_mbox/hadoop-common-user/201112.mbox/%3CCAN9z%2BopAn-t_f3FRC%3DDtV0n0ysoKd3Fek-fJPb68PMThiPooKg%40mail.gmail.com%3E This is a difficult problem to diagnose because the namenode gives you no feedback. It would be better if it printed an error message to its log file. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
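The requested behavior amounts to an explicit formatted-state check with a loud log line before the namenode gives up. A rough sketch, assuming a storage dir counts as formatted when its current/VERSION file exists; class name, paths, and log wording are illustrative:
{code}
import java.io.File;
import java.util.logging.Logger;

final class FormatCheckSketch {
  private static final Logger LOG = Logger.getLogger("NameNode");

  /** Warn explicitly instead of failing silently on an unformatted dfs.name.dir. */
  static boolean checkFormatted(File nameDir) {
    File version = new File(new File(nameDir, "current"), "VERSION");
    if (!version.isFile()) {
      LOG.warning("Storage directory " + nameDir
          + " does not appear to be formatted."
          + " Run 'hadoop namenode -format' and restart.");
      return false;
    }
    return true;
  }
}
{code}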
[jira] [Commented] (HDFS-2662) Namenode should log warning message when trying to start on a unformmatted system
[ https://issues.apache.org/jira/browse/HDFS-2662?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13171955#comment-13171955 ] William McNeill commented on HDFS-2662: --- I have no idea why part of the above comment has strikethrough formatting. Namenode should log warning message when trying to start on a unformmatted system - Key: HDFS-2662 URL: https://issues.apache.org/jira/browse/HDFS-2662 Project: Hadoop HDFS Issue Type: Improvement Components: name-node Affects Versions: 0.20.203.0 Environment: Single-node cluster on OS X 10.7 (Lion) Reporter: William McNeill Priority: Minor Labels: format, logging, namenode When you try to start the namenode for a system that does not have a formatted DFS, it fails silently without any indication that the lack of formatting was the problem. I tried to run start-dfs.sh on a single-node cluster with an unformatted HDFS. The namenode failed to start, but generated no warning messages, and its log was empty. After running hadoop namenode -format everything worked. Details in this thread: http://mail-archives.apache.org/mod_mbox/hadoop-common-user/201112.mbox/%3CCAN9z%2BopAn-t_f3FRC%3DDtV0n0ysoKd3Fek-fJPb68PMThiPooKg%40mail.gmail.com%3E This is a difficult problem to diagnose because the namenode gives you no feedback. It would be better if it printed an error message to its log file. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-2699) Store data and checksums together in block file
[ https://issues.apache.org/jira/browse/HDFS-2699?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13171961#comment-13171961 ] Todd Lipcon commented on HDFS-2699: --- The idea of introducing the new format as a backward-compatible option sounds good to me. That's what we did for the CRC32C checksums - new files are written with that checksum algorithm but old files continue to operate with the old one. Store data and checksums together in block file --- Key: HDFS-2699 URL: https://issues.apache.org/jira/browse/HDFS-2699 Project: Hadoop HDFS Issue Type: Improvement Reporter: dhruba borthakur Assignee: dhruba borthakur The current implementation of HDFS stores the data in one block file and the metadata(checksum) in another block file. This means that every read from HDFS actually consumes two disk iops, one to the datafile and one to the checksum file. This is a major problem for scaling HBase, because HBase is usually bottlenecked on the number of random disk iops that the storage-hardware offers. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-2662) Namenode should log warning message when trying to start on a unformmatted system
[ https://issues.apache.org/jira/browse/HDFS-2662?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13171962#comment-13171962 ] William McNeill commented on HDFS-2662: --- I tried adding the parameters from 7489 to my hdfs-site.xml configuration file and saw the exact same behavior. I still get the SCDynamicStore warning, and still no logged error about the DFS directories not existing when they haven't been properly formatted. So I can't get the 7489 workaround to work, but I still suspect that's an unrelated issue. Namenode should log warning message when trying to start on a unformmatted system - Key: HDFS-2662 URL: https://issues.apache.org/jira/browse/HDFS-2662 Project: Hadoop HDFS Issue Type: Improvement Components: name-node Affects Versions: 0.20.203.0 Environment: Single-node cluster on OS X 10.7 (Lion) Reporter: William McNeill Priority: Minor Labels: format, logging, namenode When you try to start the namenode for a system that does not have a formatted DFS, it fails silently without any indication that the lack of formatting was the problem. I tried to run start-dfs.sh on a single-node cluster with an unformatted HDFS. The namenode failed to start, but generated no warning messages, and its log was empty. After running hadoop namenode -format everything worked. Details in this thread: http://mail-archives.apache.org/mod_mbox/hadoop-common-user/201112.mbox/%3CCAN9z%2BopAn-t_f3FRC%3DDtV0n0ysoKd3Fek-fJPb68PMThiPooKg%40mail.gmail.com%3E This is a difficult problem to diagnose because the namenode gives you no feedback. It would be better if it printed an error message to its log file. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-2699) Store data and checksums together in block file
[ https://issues.apache.org/jira/browse/HDFS-2699?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13171970#comment-13171970 ] M. C. Srivas commented on HDFS-2699: A couple of observations: a. If you want to eventually support random-IO, then a block size of 4096 is too large for the CRC, as it will cause a read-modify-write cycle on the entire 4K. 512 bytes reduces this overhead. b. Can the value of the variable io.bytes.per.checksum be transferred from the *-site.xml file into the file-properties at the NN at the time of file creation? If someone messes around with it, old files will still work as before. Store data and checksums together in block file --- Key: HDFS-2699 URL: https://issues.apache.org/jira/browse/HDFS-2699 Project: Hadoop HDFS Issue Type: Improvement Reporter: dhruba borthakur Assignee: dhruba borthakur The current implementation of HDFS stores the data in one block file and the metadata(checksum) in another block file. This means that every read from HDFS actually consumes two disk iops, one to the datafile and one to the checksum file. This is a major problem for scaling HBase, because HBase is usually bottlenecked on the number of random disk iops that the storage-hardware offers. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-2699) Store data and checksums together in block file
[ https://issues.apache.org/jira/browse/HDFS-2699?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13171976#comment-13171976 ] dhruba borthakur commented on HDFS-2699: Thanks Srivas for your comments. bq. a block size of 4096 is too large for the CRC The hbase block size is 16K, the hdfs checksum size is 4K, and the hdfs block size is 256 MB. Which one are you referring to here? Can you please explain the read-modify-write cycle? HDFS does mostly large sequential writes (no overwrites). bq. io.bytes.per.checksum be transferred from the *-site.xml It is already stored in the datanode meta file associated with each block. Different hdfs files in the same hdfs cluster can have different io.bytes.per.checksum values. Store data and checksums together in block file --- Key: HDFS-2699 URL: https://issues.apache.org/jira/browse/HDFS-2699 Project: Hadoop HDFS Issue Type: Improvement Reporter: dhruba borthakur Assignee: dhruba borthakur The current implementation of HDFS stores the data in one block file and the metadata(checksum) in another block file. This means that every read from HDFS actually consumes two disk iops, one to the datafile and one to the checksum file. This is a major problem for scaling HBase, because HBase is usually bottlenecked on the number of random disk iops that the storage-hardware offers. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
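dhruba's point that the chunk size travels with each block can be illustrated by a simplified reader of a block's meta file header; the field order below (layout version, checksum type, bytes per checksum) follows the general shape of the format but is a sketch, not the exact on-disk encoding:
{code}
import java.io.DataInputStream;
import java.io.FileInputStream;
import java.io.IOException;

final class MetaHeaderSketch {
  /** Per-block checksum chunk size, as recorded when the block was written. */
  static int bytesPerChecksum(String metaFile) throws IOException {
    DataInputStream in = new DataInputStream(new FileInputStream(metaFile));
    try {
      in.readShort();      // meta file layout version
      in.readByte();       // checksum type, e.g. CRC32
      return in.readInt(); // io.bytes.per.checksum at write time
    } finally {
      in.close();
    }
  }
}
{code}
Because the value is read back from the block itself, changing the site-wide setting later leaves old files readable, which is the behavior noted above.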
[jira] [Commented] (HDFS-2699) Store data and checksums together in block file
[ https://issues.apache.org/jira/browse/HDFS-2699?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13171997#comment-13171997 ] M. C. Srivas commented on HDFS-2699: @dhruba: bq. a block size of 4096 is too large for the CRC The hbase block size is 16K, the hdfs checksum size is 4K, and the hdfs block size is 256 MB. Which one are you referring to here? Can you please explain the read-modify-write cycle? HDFS does mostly large sequential writes (no overwrites). The CRC block size (that is, the contiguous region of the file that a CRC covers). Modifying any portion of that region will require that the entire data for the region be read in, the CRC recomputed for that entire region, and the entire region written out again. Note that it also introduces a new failure mode ... data that was previously written safely a long time ago could now be deemed corrupt since the CRC is no longer good due to a minor modification during an append. The failure scenario is as follows: 1. A thread writes to a file and closes it. Let's say the file length is 9K. There are 3 CRCs embedded inline -- one for 0-4K, one for 4K-8K, and one for 8K-9K. Call the last one CRC3. 2. An append happens a few days later to extend the file from 9K to 11K. CRC3 is now recomputed for the 3K-sized region spanning offsets 8K-11K and written out as CRC3-new. But there is a crash, and the entire 3K is not all written out cleanly (CRC3-new and some data is written out before the crash -- all 3 copies crash and recover). 3. A subsequent read on the region 8K-9K now fails with a CRC error ... even though the write was stable and used to succeed before. If this file was the HBase WAL, wouldn't this result in data loss? Store data and checksums together in block file --- Key: HDFS-2699 URL: https://issues.apache.org/jira/browse/HDFS-2699 Project: Hadoop HDFS Issue Type: Improvement Reporter: dhruba borthakur Assignee: dhruba borthakur The current implementation of HDFS stores the data in one block file and the metadata(checksum) in another block file. This means that every read from HDFS actually consumes two disk iops, one to the datafile and one to the checksum file. This is a major problem for scaling HBase, because HBase is usually bottlenecked on the number of random disk iops that the storage-hardware offers. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
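To make the offsets in this scenario concrete: with 4K chunks, the 9K-to-11K append touches only the third CRC region (8K-12K), which is exactly the CRC3/CRC3-new window that is vulnerable if the writer crashes mid-rewrite. A tiny illustrative helper:
{code}
final class CrcRegionSketch {
  static final int CHUNK = 4096;

  /** Indices of the first and last CRC regions covered by [off, off+len). */
  static long[] regionsTouched(long off, long len) {
    return new long[] { off / CHUNK, (off + len - 1) / CHUNK };
  }

  public static void main(String[] args) {
    long[] r = regionsTouched(9 * 1024, 2 * 1024); // append 2K at offset 9K
    System.out.println("regions " + r[0] + ".." + r[1]); // prints: regions 2..2
    // Region 2 spans 8K-12K, so previously stable bytes at 8K-9K are
    // re-covered by the recomputed CRC3-new during the append.
  }
}
{code}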
[jira] [Commented] (HDFS-2699) Store data and checksums together in block file
[ https://issues.apache.org/jira/browse/HDFS-2699?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13172006#comment-13172006 ] Todd Lipcon commented on HDFS-2699: --- bq. Modifying any portion of that region will require that the entire data for the region be read in, and the CRC recomputed for that entire region and the entire region written out again But the cost of random-reading 4K is essentially the same as the cost of reading 512 bytes. Once you seek to the offset, the data transfer time is insignificant. Plus, given the 4KB page size used by Linux, all IO is already at this granularity. bq. An append happens a few days later to extend the file from 9K to 11K. CRC3 is now recomputed for the 3K-sized region spanning offsets 8K-11K and written out as CRC3-new. But there is a crash... This is an existing issue regardless of whether the checksums are interleaved or separate. The current solution is that we allow a checksum error on the last checksum chunk of a file in the case that it's being recovered after a crash -- iirc only in the case that _all_ replicas have this issue. If there is any valid replica, then we use that and truncate/rollback the other files to the sync boundary. Store data and checksums together in block file --- Key: HDFS-2699 URL: https://issues.apache.org/jira/browse/HDFS-2699 Project: Hadoop HDFS Issue Type: Improvement Reporter: dhruba borthakur Assignee: dhruba borthakur The current implementation of HDFS stores the data in one block file and the metadata(checksum) in another block file. This means that every read from HDFS actually consumes two disk iops, one to the datafile and one to the checksum file. This is a major problem for scaling HBase, because HBase is usually bottlenecked on the number of random disk iops that the storage-hardware offers. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
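A hedged sketch of the recovery rule Todd describes: a checksum failure is forgivable only on the trailing chunk, and only when no replica has a clean copy; otherwise the valid replica wins and the others are truncated back to the sync boundary. Method and parameter names here are invented for illustration:
{code}
final class RecoverySketch {
  /**
   * Whether a checksum error may be tolerated during post-crash block
   * recovery. Mid-block corruption is never forgiven.
   */
  static boolean tolerateChecksumError(boolean isTrailingChunk,
                                       boolean allReplicasCorrupt) {
    return isTrailingChunk && allReplicasCorrupt;
  }
}
{code}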
[jira] [Commented] (HDFS-2701) Cleanup FS* processIOError methods
[ https://issues.apache.org/jira/browse/HDFS-2701?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13172023#comment-13172023 ] Todd Lipcon commented on HDFS-2701: --- in open(), if all of them fail to open, we'll have no edits streams... is that taken care of by 2702? in removeEditsForStorageDir, I think there might be a bug with the following sequence: - dir holding both edits and image fails - restoreFailedStorage is called so it is added back to the list for image operations, but edit logs haven't rolled yet, so it's not in editStreams - it fails again, so removeEditsForStorageDir is called with a dir that doesn't have any open stream. In that case, exitIfInvalidStreams() would exit even though nothing is getting removed. I guess this is taken care of by HDFS-2702? If the answer to both of the above is yes, then +1 :) Cleanup FS* processIOError methods -- Key: HDFS-2701 URL: https://issues.apache.org/jira/browse/HDFS-2701 Project: Hadoop HDFS Issue Type: Improvement Components: name-node Affects Versions: 1.0.0 Reporter: Eli Collins Assignee: Eli Collins Attachments: hdfs-2701.txt, hdfs-2701.txt, hdfs-2701.txt, hdfs-2701.txt Let's rename the various processIOError methods to be more descriptive. The current code makes it difficult to identify and reason about bug fixes. While we're at it let's remove Fatal from the Unable to sync the edit log log since it's not actually a fatal error (this is confusing to users). And 2NN Checkpoint done should be info, not a warning (also confusing to users). Thanks to HDFS-1073 these issues don't exist on trunk or 23. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-2703) removedStorageDirs is not updated everywhere we remove a storage dir
[ https://issues.apache.org/jira/browse/HDFS-2703?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13172024#comment-13172024 ] Todd Lipcon commented on HDFS-2703: --- +1 removedStorageDirs is not updated everywhere we remove a storage dir Key: HDFS-2703 URL: https://issues.apache.org/jira/browse/HDFS-2703 Project: Hadoop HDFS Issue Type: Bug Components: name-node Affects Versions: 1.0.0 Reporter: Eli Collins Assignee: Eli Collins Attachments: hdfs-2703.txt There are a number of places (FSEditLog#open, purgeEditLog, and rollEditLog) where we remove a storage directory but don't add it to the removedStorageDirs list. This means a storage dir may have been removed but we don't see it in the log or Web UI. This doesn't affect trunk/23 since the code there is totally different. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-2702) A single failed name dir can cause the NN to exit
[ https://issues.apache.org/jira/browse/HDFS-2702?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13172025#comment-13172025 ] Todd Lipcon commented on HDFS-2702: --- - in {{fatalExit}}, can you change it to: {code} FSNamesystem.LOG.fatal(msg, new Exception(msg)); {code} so that we get a stacktrace in the logs? - in {{exitIfNoStreams}} use {{isEmpty}} instead of comparing {{size() == 0}} - rather than an {{if...throw AssertionError}} maybe just use the {{Preconditions.checkState}} function from guava? Or is guava not in branch-1 yet? (can't remember) - instead of calling {{exitIfNoStreams}} everywhere, maybe {{removeEditsForStorageDir}} can just call it whenever it removes one? Otherwise looks good. A single failed name dir can cause the NN to exit -- Key: HDFS-2702 URL: https://issues.apache.org/jira/browse/HDFS-2702 Project: Hadoop HDFS Issue Type: Bug Affects Versions: 1.0.0 Reporter: Eli Collins Assignee: Eli Collins Priority: Critical Attachments: hdfs-2702.txt, hdfs-2702.txt, hdfs-2702.txt There's a bug in FSEditLog#rollEditLog which results in the NN process exiting if a single name dir has failed. Here's the relevant code: {code} close() // So editStreams.size() is 0 foreach edits dir { .. eStream = new ... // Might get an IOE here editStreams.add(eStream); } catch (IOException ioe) { removeEditsForStorageDir(sd); // exits if editStreams.size() <= 1 } {code} If we get an IOException before we've added two edit streams to the list we'll exit, eg if there's an error processing the 1st name dir we'll exit even if there are 4 valid name dirs. The fix is to move the checking out of removeEditsForStorageDir (nee processIOError) or modify it so it can be disabled in some cases, eg here where we don't yet know how many streams are valid. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
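Taken together, the review notes above suggest roughly this shape (plain-Java stand-ins; the Guava call would be Preconditions.checkState if the branch carries Guava):
{code}
import java.util.List;

final class ReviewSketch {
  static void fatalExit(String msg) {
    // Log with a synthetic exception so the call site's stack trace
    // appears in the NN log before the process goes down.
    new Exception(msg).printStackTrace(); // stand-in for LOG.fatal(msg, e)
    Runtime.getRuntime().halt(1);
  }

  static void exitIfNoStreams(List<?> editStreams) {
    if (editStreams.isEmpty()) {          // isEmpty() over size() == 0
      fatalExit("No edit streams remain");
    }
  }

  static void checkState(boolean ok, String msg) {
    // Stand-in for Guava's Preconditions.checkState on Guava-less branches.
    if (!ok) {
      throw new IllegalStateException(msg);
    }
  }
}
{code}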
[jira] [Commented] (HDFS-2679) Add interface to query current state to HAServiceProtocol
[ https://issues.apache.org/jira/browse/HDFS-2679?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13172032#comment-13172032 ] Todd Lipcon commented on HDFS-2679: --- +1. I'll commit this momentarily Add interface to query current state to HAServiceProtocol -- Key: HDFS-2679 URL: https://issues.apache.org/jira/browse/HDFS-2679 Project: Hadoop HDFS Issue Type: Sub-task Affects Versions: HA branch (HDFS-1623) Reporter: Eli Collins Assignee: Eli Collins Attachments: hdfs-2679.txt, hdfs-2679.txt, hdfs-2679.txt, hdfs-2679.txt, hdfs-2679.txt Let's add an interface to HAServiceProtocol to query the current state of a NameNode for use by the CLI (HAAdmin) and Web UI (HDFS-2677). This essentially makes the names active and standby from ACTIVE_STATE and STANDBY_STATE public interfaces, which IMO seems reasonable. Unlike the other APIs we should be able to use the interface even when HA is not enabled (as by default a non-HA NN is active). -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Resolved] (HDFS-2679) Add interface to query current state to HAServiceProtocol
[ https://issues.apache.org/jira/browse/HDFS-2679?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Todd Lipcon resolved HDFS-2679. --- Resolution: Fixed Fix Version/s: HA branch (HDFS-1623) Hadoop Flags: Reviewed Add interface to query current state to HAServiceProtocol -- Key: HDFS-2679 URL: https://issues.apache.org/jira/browse/HDFS-2679 Project: Hadoop HDFS Issue Type: Sub-task Affects Versions: HA branch (HDFS-1623) Reporter: Eli Collins Assignee: Eli Collins Fix For: HA branch (HDFS-1623) Attachments: hdfs-2679.txt, hdfs-2679.txt, hdfs-2679.txt, hdfs-2679.txt, hdfs-2679.txt Let's add an interface to HAServiceProtocol to query the current state of a NameNode for use by the CLI (HAAdmin) and Web UI (HDFS-2677). This essentially makes the names active and standby from ACTIVE_STATE and STANDBY_STATE public interfaces, which IMO seems reasonable. Unlike the other APIs we should be able to use the interface even when HA is not enabled (as by default a non-HA NN is active). -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-2677) HA: Web UI should indicate the NN state
[ https://issues.apache.org/jira/browse/HDFS-2677?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13172034#comment-13172034 ] Todd Lipcon commented on HDFS-2677: --- +1, will commit momentarily HA: Web UI should indicate the NN state --- Key: HDFS-2677 URL: https://issues.apache.org/jira/browse/HDFS-2677 Project: Hadoop HDFS Issue Type: Sub-task Components: ha Affects Versions: HA branch (HDFS-1623) Reporter: Eli Collins Assignee: Eli Collins Fix For: HA branch (HDFS-1623) Attachments: hdfs-2677.txt, hdfs-2677.txt, hdfs-2677.txt, hdfs-2677.txt The DFS web UI should indicate whether it's an active or standby. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Resolved] (HDFS-2677) HA: Web UI should indicate the NN state
[ https://issues.apache.org/jira/browse/HDFS-2677?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Todd Lipcon resolved HDFS-2677. --- Resolution: Fixed Fix Version/s: HA branch (HDFS-1623) Hadoop Flags: Reviewed HA: Web UI should indicate the NN state --- Key: HDFS-2677 URL: https://issues.apache.org/jira/browse/HDFS-2677 Project: Hadoop HDFS Issue Type: Sub-task Components: ha Affects Versions: HA branch (HDFS-1623) Reporter: Eli Collins Assignee: Eli Collins Fix For: HA branch (HDFS-1623) Attachments: hdfs-2677.txt, hdfs-2677.txt, hdfs-2677.txt, hdfs-2677.txt The DFS web UI should indicate whether it's an active or standby. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Resolved] (HDFS-1108) Log newly allocated blocks
[ https://issues.apache.org/jira/browse/HDFS-1108?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Todd Lipcon resolved HDFS-1108. --- Resolution: Duplicate Resolving this one as duplicate since it got incorporated into HDFS-2602 Log newly allocated blocks -- Key: HDFS-1108 URL: https://issues.apache.org/jira/browse/HDFS-1108 Project: Hadoop HDFS Issue Type: Sub-task Components: name-node Reporter: dhruba borthakur Assignee: Todd Lipcon Fix For: HA branch (HDFS-1623) Attachments: HDFS-1108.patch, hdfs-1108-habranch.txt, hdfs-1108-habranch.txt, hdfs-1108-habranch.txt, hdfs-1108-habranch.txt, hdfs-1108-habranch.txt, hdfs-1108.txt The current HDFS design says that newly allocated blocks for a file are not persisted in the NN transaction log when the block is allocated. Instead, a hflush() or a close() on the file persists the blocks into the transaction log. It would be nice if we can immediately persist newly allocated blocks (as soon as they are allocated) for specific files. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HDFS-2291) HA: Checkpointing in an HA setup
[ https://issues.apache.org/jira/browse/HDFS-2291?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Todd Lipcon updated HDFS-2291: -- Component/s: ha HA: Checkpointing in an HA setup Key: HDFS-2291 URL: https://issues.apache.org/jira/browse/HDFS-2291 Project: Hadoop HDFS Issue Type: Sub-task Components: ha, name-node Affects Versions: HA branch (HDFS-1623) Reporter: Aaron T. Myers Assignee: Todd Lipcon Fix For: HA branch (HDFS-1623) We obviously need to create checkpoints when HA is enabled. One thought is to use a third, dedicated checkpointing node in addition to the active and standby nodes. Another option would be to make the standby capable of also performing the function of checkpointing. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-2394) Add tests for Namenode active standby states
[ https://issues.apache.org/jira/browse/HDFS-2394?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13172038#comment-13172038 ] Todd Lipcon commented on HDFS-2394: --- Hey Suresh, I think this issue has been superseded by some of the other tests that have recently gone into the branch. Would you agree, or do you have some tests you're planning to contribute under this JIRA? Add tests for Namenode active standby states Key: HDFS-2394 URL: https://issues.apache.org/jira/browse/HDFS-2394 Project: Hadoop HDFS Issue Type: Sub-task Components: ha, name-node, test Affects Versions: 0.24.0 Reporter: Suresh Srinivas Assignee: Suresh Srinivas -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HDFS-2394) Add tests for Namenode active standby states
[ https://issues.apache.org/jira/browse/HDFS-2394?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Todd Lipcon updated HDFS-2394: -- Component/s: test ha Add tests for Namenode active standby states Key: HDFS-2394 URL: https://issues.apache.org/jira/browse/HDFS-2394 Project: Hadoop HDFS Issue Type: Sub-task Components: ha, name-node, test Affects Versions: 0.24.0 Reporter: Suresh Srinivas Assignee: Suresh Srinivas -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-2704) NameNodeResouceChecker#checkAvailableResources should check for inodes
[ https://issues.apache.org/jira/browse/HDFS-2704?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13172046#comment-13172046 ] SreeHari commented on HDFS-2704: Will give a patch for this. NameNodeResouceChecker#checkAvailableResources should check for inodes -- Key: HDFS-2704 URL: https://issues.apache.org/jira/browse/HDFS-2704 Project: Hadoop HDFS Issue Type: Improvement Components: name-node Affects Versions: 0.24.0 Reporter: Eli Collins NameNodeResouceChecker#checkAvailableResources currently just checks for free space. However inodes are also a file system resource that needs to be available (you can run out of inodes but have free space). -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
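The standard Java file APIs used for the free-space check expose bytes but not inodes, so an inode check likely needs platform help. One illustrative, Unix-only approach is to shell out to df -i and parse the free-inodes column (column position assumes Linux coreutils output; sketch only):
{code}
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;

final class InodeCheckSketch {
  /** Free inodes on the filesystem holding 'path'; Linux 'df -i' layout assumed. */
  static long freeInodes(String path) throws IOException, InterruptedException {
    Process p = new ProcessBuilder("df", "-i", path).start();
    BufferedReader r =
        new BufferedReader(new InputStreamReader(p.getInputStream()));
    try {
      r.readLine();                            // skip the header row
      String[] cols = r.readLine().trim().split("\\s+");
      p.waitFor();
      return Long.parseLong(cols[3]);          // IFree column
    } finally {
      r.close();
    }
  }
}
{code}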
[jira] [Assigned] (HDFS-2704) NameNodeResouceChecker#checkAvailableResources should check for inodes
[ https://issues.apache.org/jira/browse/HDFS-2704?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] SreeHari reassigned HDFS-2704: -- Assignee: SreeHari NameNodeResouceChecker#checkAvailableResources should check for inodes -- Key: HDFS-2704 URL: https://issues.apache.org/jira/browse/HDFS-2704 Project: Hadoop HDFS Issue Type: Improvement Components: name-node Affects Versions: 0.24.0 Reporter: Eli Collins Assignee: SreeHari NameNodeResouceChecker#checkAvailableResources currently just checks for free space. However inodes are also a file system resource that needs to be available (you can run out of inodes but have free space). -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-2699) Store data and checksums together in block file
[ https://issues.apache.org/jira/browse/HDFS-2699?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13172059#comment-13172059 ] M. C. Srivas commented on HDFS-2699: @Todd: no one is arguing that putting the CRC inline is not beneficial wrt seek time. Recalculating a CRC over a 4K block is substantially slower than over a 512-byte block (256 bytes vs 2K on average is a 10x factor). Imagine appending continuously to the HBase WAL with the 128-byte records that you mentioned in another thread ... the CPU burn will be much worse with 4K CRC blocks. Secondly, the disk manufacturers guarantee only 512-byte atomicity on disk. Linux doing a 4K block write guarantees almost nothing wrt atomicity of that 4K write to disk. On a crash, unless you are running some sort of RAID or data-journal, there is a likelihood of the 4K block that's in-flight getting corrupted. Store data and checksums together in block file --- Key: HDFS-2699 URL: https://issues.apache.org/jira/browse/HDFS-2699 Project: Hadoop HDFS Issue Type: Improvement Reporter: dhruba borthakur Assignee: dhruba borthakur The current implementation of HDFS stores the data in one block file and the metadata(checksum) in another block file. This means that every read from HDFS actually consumes two disk iops, one to the datafile and one to the checksum file. This is a major problem for scaling HBase, because HBase is usually bottlenecked on the number of random disk iops that the storage-hardware offers. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
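The CPU-cost argument above is simple arithmetic: each small append re-reads and re-CRCs the partially filled tail chunk, which averages half a chunk, so the recompute cost scales linearly with chunk size (2048 vs 256 bytes works out to 8x; the 10x above is a round figure). A trivial check:
{code}
final class CrcCostSketch {
  public static void main(String[] args) {
    for (int chunk : new int[] { 512, 4096 }) {
      // Average bytes re-CRC'd per small append: half the tail chunk.
      System.out.println(chunk + "-byte chunks: ~" + (chunk / 2)
          + " bytes recomputed per append");
    }
  }
}
{code}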