[jira] Created: (HDFS-770) SocketTimeoutException: timeout while waiting for channel to be ready for read
SocketTimeoutException: timeout while waiting for channel to be ready for read
------------------------------------------------------------------------------
Key: HDFS-770
URL: https://issues.apache.org/jira/browse/HDFS-770
Project: Hadoop HDFS
Issue Type: Bug
Components: contrib/libhdfs, data-node, hdfs client, name-node
Affects Versions: 0.20.1
Environment: Ubuntu Linux 8.04
Reporter: Leon Mergen
Attachments: client.txt, datanode.txt, namenode.txt

We're having issues with timeouts occurring in our client: for some reason, a timeout of 63000 milliseconds is triggered while writing HDFS data. Since we currently have a single-server setup, this results in our client terminating with an "All datanodes are bad" IOException. We run all services, including the client, on that single server, so it cannot be a network error. The load on the client was extremely low during this period: only a few kilobytes a minute were being written around the time the error occurred. After browsing a bit online, a lot of people suggest setting dfs.datanode.socket.write.timeout to 0 as a workaround. Given the low load on our system during this period, however, I believe this is a real error and a timeout that should not be occurring. I have attached three logs: namenode, datanode and client. This could be related to http://issues.apache.org/jira/browse/HDFS-693. Any pointers on how I can assist in resolving this issue would be greatly appreciated.

-- This message is automatically generated by JIRA. You can reply to this email to add a comment to the issue online.
[jira] Updated: (HDFS-770) SocketTimeoutException: timeout while waiting for channel to be ready for read
[ https://issues.apache.org/jira/browse/HDFS-770?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Leon Mergen updated HDFS-770:
Attachment: client.txt, namenode.txt, datanode.txt
[jira] Updated: (HDFS-763) DataBlockScanner reporting of bad blocks is slightly misleading
[ https://issues.apache.org/jira/browse/HDFS-763?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] dhruba borthakur updated HDFS-763:
Status: Open (was: Patch Available)

DataBlockScanner reporting of bad blocks is slightly misleading
---------------------------------------------------------------
Key: HDFS-763
URL: https://issues.apache.org/jira/browse/HDFS-763
Project: Hadoop HDFS
Issue Type: Bug
Components: data-node
Affects Versions: 0.20.1
Reporter: dhruba borthakur
Assignee: dhruba borthakur
Attachments: scanErrors.txt, scanErrors.txt

The Datanode generates a report of the periodic block scanning that verifies CRCs. It reports something like the following:

Scans since restart : 192266
Scan errors since restart : 33
Transient scan errors : 0

The statement that there were 33 errors is slightly misleading because these are not CRC mismatches; rather, the block was being deleted when the CRC verification was about to happen. I propose that DataBlockScanner.totalScanErrors not be updated if dataset.getFile(block) is null, i.e. the block has been deleted from the datanode.
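The proposed change amounts to a null guard before the counter update. The following is a standalone sketch of that logic under simplifying assumptions; it models the dataset as a plain map and is not the actual DataBlockScanner code:

```java
import java.io.File;
import java.util.HashMap;
import java.util.Map;

public class ScanErrorSketch {
    // Simplified stand-in for FSDataset: maps block ids to on-disk files.
    static Map<Long, File> dataset = new HashMap<>();
    static long totalScanErrors = 0;

    // Proposed behaviour: a verification failure only counts as a scan
    // error if the block still exists; a block deleted mid-scan is not
    // a real CRC failure and is not counted.
    static void recordScanError(long blockId) {
        if (dataset.get(blockId) == null) {
            return; // block already deleted from the datanode: skip
        }
        totalScanErrors++;
    }

    public static void main(String[] args) {
        dataset.put(1L, new File("blk_1"));
        recordScanError(1L); // existing block: counted
        recordScanError(2L); // deleted/unknown block: ignored
        System.out.println("totalScanErrors = " + totalScanErrors);
    }
}
```

With this guard, the "Scan errors since restart" line of the report would only reflect genuine verification failures.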
[jira] Updated: (HDFS-763) DataBlockScanner reporting of bad blocks is slightly misleading
[ https://issues.apache.org/jira/browse/HDFS-763?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] dhruba borthakur updated HDFS-763:
Status: Patch Available (was: Open). Trigger HadoopQA tests.
[jira] Commented: (HDFS-94) The Heap Size in HDFS web ui may not be accurate
[ https://issues.apache.org/jira/browse/HDFS-94?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12777438#action_12777438 ] dhruba borthakur commented on HDFS-94:

Currently, the code uses:
{quote}
long totalMemory = Runtime.getRuntime().totalMemory();
long maxMemory = Runtime.getRuntime().maxMemory();
long used = (totalMemory * 100)/maxMemory;
{quote}
Is it better to use:
{quote}
MemoryMXBean memoryMXBean = ManagementFactory.getMemoryMXBean();
MemoryUsage status = memoryMXBean.getHeapMemoryUsage();
usedMemory = status.getUsed();
maxMemory = status.getMax();
{quote}

The Heap Size in HDFS web ui may not be accurate
------------------------------------------------
Key: HDFS-94
URL: https://issues.apache.org/jira/browse/HDFS-94
Project: Hadoop HDFS
Issue Type: Bug
Reporter: Tsz Wo (Nicholas), SZE

It seems that the Heap Size shown in the HDFS web UI is not accurate. It keeps showing 100% usage, e.g.:
{noformat}
Heap Size is 10.01 GB / 10.01 GB (100%)
{noformat}
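The two snippets quoted in the comment measure different things: Runtime.totalMemory() is the currently committed heap, so totalMemory/maxMemory can sit near 100% even when little memory is actually in use, while MemoryMXBean reports bytes actually used. A small standalone program (not HDFS code) that prints both ratios side by side:

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryMXBean;
import java.lang.management.MemoryUsage;

public class HeapReport {
    // Percentage as computed by the web UI code under discussion:
    // committed heap vs. the -Xmx ceiling.
    static long committedPercent() {
        long totalMemory = Runtime.getRuntime().totalMemory();
        long maxMemory = Runtime.getRuntime().maxMemory();
        return (totalMemory * 100) / maxMemory;
    }

    // Percentage based on actually-used heap, as proposed in the comment.
    static long usedPercent() {
        MemoryMXBean memoryMXBean = ManagementFactory.getMemoryMXBean();
        MemoryUsage status = memoryMXBean.getHeapMemoryUsage();
        long used = status.getUsed();
        long max = status.getMax(); // can be -1 if the max is undefined
        return max > 0 ? (used * 100) / max : -1;
    }

    public static void main(String[] args) {
        System.out.println("committed/max: " + committedPercent() + "%");
        System.out.println("used/max:      " + usedPercent() + "%");
    }
}
```

Running this under a JVM started with -Xms equal to -Xmx makes the discrepancy obvious: committed/max reads 100% regardless of actual usage.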
[jira] Commented: (HDFS-763) DataBlockScanner reporting of bad blocks is slightly misleading
[ https://issues.apache.org/jira/browse/HDFS-763?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12777441#action_12777441 ] Hadoop QA commented on HDFS-763:

-1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12424762/scanErrors.txt against trunk revision 835752.
+1 @author. The patch does not contain any @author tags.
-1 tests included. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch.
-1 patch. The patch command could not apply the patch.
Console output: http://hudson.zones.apache.org/hudson/job/Hdfs-Patch-h5.grid.sp2.yahoo.net/109/console
This message is automatically generated.
[jira] Updated: (HDFS-763) DataBlockScanner reporting of bad blocks is slightly misleading
[ https://issues.apache.org/jira/browse/HDFS-763?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] dhruba borthakur updated HDFS-763:
Attachment: scanErrors.txt
[jira] Updated: (HDFS-763) DataBlockScanner reporting of bad blocks is slightly misleading
[ https://issues.apache.org/jira/browse/HDFS-763?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] dhruba borthakur updated HDFS-763:
Status: Patch Available (was: Open). Trigger HadoopQA.
[jira] Updated: (HDFS-763) DataBlockScanner reporting of bad blocks is slightly misleading
[ https://issues.apache.org/jira/browse/HDFS-763?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] dhruba borthakur updated HDFS-763:
Status: Open (was: Patch Available)
[jira] Commented: (HDFS-763) DataBlockScanner reporting of bad blocks is slightly misleading
[ https://issues.apache.org/jira/browse/HDFS-763?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12777471#action_12777471 ] Hadoop QA commented on HDFS-763:

-1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12424830/scanErrors.txt against trunk revision 835752.
+1 @author. The patch does not contain any @author tags.
-1 tests included. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch.
+1 javadoc. The javadoc tool did not generate any warning messages.
+1 javac. The applied patch does not increase the total number of javac compiler warnings.
+1 findbugs. The patch does not introduce any new Findbugs warnings.
+1 release audit. The applied patch does not increase the total number of release audit warnings.
-1 core tests. The patch failed core unit tests.
+1 contrib tests. The patch passed contrib unit tests.
Test results: http://hudson.zones.apache.org/hudson/job/Hdfs-Patch-h5.grid.sp2.yahoo.net/110/testReport/
Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Hdfs-Patch-h5.grid.sp2.yahoo.net/110/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Checkstyle results: http://hudson.zones.apache.org/hudson/job/Hdfs-Patch-h5.grid.sp2.yahoo.net/110/artifact/trunk/build/test/checkstyle-errors.html
Console output: http://hudson.zones.apache.org/hudson/job/Hdfs-Patch-h5.grid.sp2.yahoo.net/110/console
This message is automatically generated.
[jira] Commented: (HDFS-769) test-c++-libhdfs constantly fails
[ https://issues.apache.org/jira/browse/HDFS-769?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12777473#action_12777473 ] dhruba borthakur commented on HDFS-769:
I will try to take a look at this one.

test-c++-libhdfs constantly fails
---------------------------------
Key: HDFS-769
URL: https://issues.apache.org/jira/browse/HDFS-769
Project: Hadoop HDFS
Issue Type: Bug
Components: test
Affects Versions: 0.22.0
Reporter: Konstantin Boudnik

Execution of {{test-c++-libhdfs}} always fails. Running
{noformat}
% ant test-c++-libhdfs -Dcompile.c++=yes -Dlibhdfs=yes
{noformat}
fails with the following diagnostic:
{noformat}
test-c++-libhdfs:
[mkdir] Created dir: /homes/xxx/work/Hdfs.trunk/build/test/libhdfs
...
[exec] /homes/xxx/work/Hdfs.trunk/src/c++/libhdfs/tests/test-libhdfs.sh
[exec]
[exec] LIB_JVM_DIR = /usr/java/latest/jre/lib/i386/server
[exec]
[exec] /homes/xxx/work/Hdfs.trunk/src/c++/libhdfs/tests/test-libhdfs.sh: line 118: /homes/xxx/work/Hdfs.trunk/bin/hadoop: No such file or directory
[exec] CLASSPATH=/homes/xxx/work/Hdfs.trunk/src/c++/libhdfs/tests/conf:/homes/xxx/work/Hdfs.trunk/conf:/homes/xxx/work/Hdfs.trunk/src/c++/libhdfs/tests/conf:/homes/cot
[exec] Exception in thread "main" java.lang.NoClassDefFoundError: org/apache/hadoop/conf/Configuration
[exec] Can't construct instance of class org.apache.hadoop.conf.Configuration
[exec] Oops! Failed to connect to hdfs!
[exec] exiting with 255
[exec] /homes/xxx/work/Hdfs.trunk/src/c++/libhdfs/tests/test-libhdfs.sh: line 126: /homes/xxx/work/Hdfs.trunk/bin/hadoop-daemon.sh: No such file or directory
[exec] make: *** [test] Error 255
{noformat}
[jira] Commented: (HDFS-630) In DFSOutputStream.nextBlockOutputStream(), the client can exclude specific datanodes when locating the next block.
[ https://issues.apache.org/jira/browse/HDFS-630?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12777500#action_12777500 ] Cosmin Lehene commented on HDFS-630:

stack: I can't reproduce it on 0.21. I did find it in the NN log before upgrading the HBase jar to the patched hdfs.
{noformat}
java.io.IOException: Cannot complete block: block has not been COMMITTED by the client
at org.apache.hadoop.hdfs.server.namenode.BlockInfoUnderConstruction.convertToCompleteBlock(BlockInfoUnderConstruction.java:158)
at org.apache.hadoop.hdfs.server.namenode.BlockManager.completeBlock(BlockManager.java:288)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:1243)
at org.apache.hadoop.hdfs.server.namenode.NameNode.addBlock(NameNode.java:637)
at org.apache.hadoop.hdfs.server.namenode.NameNode.addBlock(NameNode.java:621)
at sun.reflect.GeneratedMethodAccessor48.invoke(Unknown Source)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:516)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:964)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:960)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:396)
at org.apache.hadoop.ipc.Server$Handler.run(Server.java:958)
{noformat}
I should point out that org.apache.hadoop.hdfs.server.namenode.NameNode.addBlock(NameNode.java:621) means it was called from an unpatched DFSClient that uses the old NameNode interface. Line 621 is:
{noformat}
return addBlock(src, clientName, null, null);
{noformat}
which is part of:
{noformat}
@Override
public LocatedBlock addBlock(String src, String clientName, Block previous) throws IOException {
  return addBlock(src, clientName, null, null);
}
{noformat}
This is different from your stack trace http://pastie.org/695936, which calls the complete() method. However, could you search for the same error while adding a new block with addBlock() (like mine)? If you find it, you could figure out the entry point in the NameNode, and if it's line 621 you might have an unpatched DFSClient. Even with an unpatched DFSClient, though, I have yet to figure out why it would cause this. Perhaps I should get a better understanding of the cause of the exception. So far, from the code comments in BlockInfoUnderConstruction, I gather that either the state of the block (the generation stamp and the length) has not been committed by the client, or it does not have at least a minimal number of replicas reported from data-nodes.

In DFSOutputStream.nextBlockOutputStream(), the client can exclude specific datanodes when locating the next block.
-------------------------------------------------------------------------------------------------------------------
Key: HDFS-630
URL: https://issues.apache.org/jira/browse/HDFS-630
Project: Hadoop HDFS
Issue Type: New Feature
Components: hdfs client
Affects Versions: 0.21.0
Reporter: Ruyue Ma
Assignee: Ruyue Ma
Priority: Minor
Fix For: 0.21.0
Attachments: 0001-Fix-HDFS-630-for-0.21.patch, HDFS-630.patch

Created from HDFS-200. If, during a write, the dfsclient sees that a block replica location for a newly allocated block is not connectable, it re-requests the NN for a fresh set of replica locations for the block. It tries this dfs.client.block.write.retries times (default 3), sleeping 6 seconds between each retry (see DFSClient.nextBlockOutputStream). This setting works well when you have a reasonably sized cluster; if you have few datanodes in the cluster, every retry may pick the dead datanode and the above logic bails out. Our solution: when getting block locations from the namenode, we give the NN the excluded datanodes. The list of dead datanodes is only for one block allocation.
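The retry behaviour HDFS-630 proposes (accumulating nodes that failed to connect and passing them back to the namenode) can be modeled with a small standalone sketch. This is not the actual DFSClient code; chooseTarget and allocateBlock are hypothetical stand-ins that only illustrate why excluding dead nodes fixes the small-cluster case:

```java
import java.util.ArrayList;
import java.util.List;

public class ExcludeRetrySketch {
    static final int MAX_RETRIES = 3; // mirrors dfs.client.block.write.retries

    // Hypothetical stand-in for the namenode's target selection:
    // returns the first live node not on the excluded list.
    static String chooseTarget(List<String> live, List<String> excluded) {
        for (String node : live) {
            if (!excluded.contains(node)) return node;
        }
        return null; // no usable datanode left
    }

    // Client-side loop: on connect failure, add the node to the excluded
    // list and ask again, instead of repeatedly receiving the same dead
    // node back (which is what happens without the patch on tiny clusters).
    static String allocateBlock(List<String> live, List<String> dead) {
        List<String> excluded = new ArrayList<>();
        for (int i = 0; i <= MAX_RETRIES; i++) {
            String target = chooseTarget(live, excluded);
            if (target == null) break;
            if (!dead.contains(target)) return target; // connect succeeded
            excluded.add(target); // HDFS-630: tell the NN to skip this node
        }
        return null;
    }

    public static void main(String[] args) {
        List<String> live = List.of("dn1", "dn2");
        List<String> dead = List.of("dn1");
        // dn1 fails to connect, gets excluded, and dn2 is chosen next.
        System.out.println(allocateBlock(live, dead));
    }
}
```

The exclusion list is scoped to a single block allocation, matching the issue description.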
[jira] Created: (HDFS-771) Jetty crashes: MiniDFSCluster supplies incorrect port number to the NameNode
Jetty crashes: MiniDFSCluster supplies incorrect port number to the NameNode
----------------------------------------------------------------------------
Key: HDFS-771
URL: https://issues.apache.org/jira/browse/HDFS-771
Project: Hadoop HDFS
Issue Type: Bug
Components: test
Affects Versions: 0.22.0
Environment: Apache Hudson build machine
Reporter: Konstantin Boudnik

In an execution of the tests, the following exception was thrown:
{noformat}
Error Message

port out of range:-1

Stacktrace

java.lang.IllegalArgumentException: port out of range:-1
at java.net.InetSocketAddress.<init>(InetSocketAddress.java:118)
at org.apache.hadoop.hdfs.server.namenode.NameNode.startHttpServer(NameNode.java:371)
at org.apache.hadoop.hdfs.server.namenode.NameNode.activate(NameNode.java:313)
at org.apache.hadoop.hdfs.server.namenode.NameNode.initialize(NameNode.java:304)
at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:410)
at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:404)
at org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:1211)
at org.apache.hadoop.hdfs.MiniDFSCluster.<init>(MiniDFSCluster.java:287)
at org.apache.hadoop.hdfs.MiniDFSCluster.<init>(MiniDFSCluster.java:131)
at org.apache.hadoop.hdfs.server.namenode.TestEditLog.testEditLog(TestEditLog.java:92)
{noformat}
[jira] Updated: (HDFS-771) Jetty crashes: MiniDFSCluster supplies incorrect port number to the NameNode
[ https://issues.apache.org/jira/browse/HDFS-771?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Konstantin Boudnik updated HDFS-771:
Attachment: testEditLog.html (full log of the test execution)
[jira] Commented: (HDFS-771) Jetty crashes: MiniDFSCluster supplies incorrect port number to the NameNode
[ https://issues.apache.org/jira/browse/HDFS-771?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12777544#action_12777544 ] Konstantin Boudnik commented on HDFS-771:
The environment and all details are available from the [Hudson build|http://hudson.zones.apache.org/hudson/view/Hadoop/job/Hadoop-Hdfs-trunk-Commit/109/testReport/org.apache.hadoop.hdfs.server.namenode/TestEditLog/testEditLog/].
[jira] Commented: (HDFS-763) DataBlockScanner reporting of bad blocks is slightly misleading
[ https://issues.apache.org/jira/browse/HDFS-763?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12777605#action_12777605 ] Raghu Angadi commented on HDFS-763:
+1. totalErrors shown in 'blockScannerReport' now matches the number of verification failures, rather than all errors seen.
[jira] Commented: (HDFS-733) TestBlockReport fails intermittently
[ https://issues.apache.org/jira/browse/HDFS-733?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12777610#action_12777610 ] Konstantin Boudnik commented on HDFS-733:
I've run the patched test on the Hudson hardware a few times and everything seems to be all right; no failures were seen. I'm going to commit this shortly.

TestBlockReport fails intermittently
------------------------------------
Key: HDFS-733
URL: https://issues.apache.org/jira/browse/HDFS-733
Project: Hadoop HDFS
Issue Type: Bug
Affects Versions: 0.21.0
Reporter: Suresh Srinivas
Assignee: Konstantin Boudnik
Fix For: 0.21.0, 0.22.0
Attachments: HDFS-733.2.patch, HDFS-733.patch, HDFS-733.patch, HDFS-733.patch, HDFS-733.patch

Details at http://hudson.zones.apache.org/hudson/job/Hdfs-Patch-h2.grid.sp2.yahoo.net/58/testReport/
[jira] Commented: (HDFS-770) SocketTimeoutException: timeout while waiting for channel to be ready for read
[ https://issues.apache.org/jira/browse/HDFS-770?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12777612#action_12777612 ] Raghu Angadi commented on HDFS-770:

From the datanode log:
{noformat}
2009-11-13 06:18:21,965 DEBUG org.apache.hadoop.ipc.RPC: Call: sendHeartbeat 14
2009-11-13 06:19:38,081 DEBUG org.apache.hadoop.ipc.Client: IPC Client (47) connection to dfs.hadoop.tsukku.solatis/127.0.0.1:9000 from hadoop: closed
{noformat}
Note that there is no activity on the DataNode for 77 seconds. There are a number of possibilities, a common one being GC, though we haven't seen GC take this long on a DN. Assuming the DN went to sleep for some reason, the rest of the behaviour is expected. If you do expect such delays, what you need to increase is the read timeout for the responder thread in DFSOutputStream (there is a config for the generic read timeout that applies to sockets in many contexts).
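Raghu's suggestion above (raise the read timeout rather than disable write timeouts) maps to hdfs-site.xml settings. The fragment below is a sketch: the values are illustrative, and while dfs.datanode.socket.write.timeout is named earlier in this thread, dfs.socket.timeout as the key for the generic socket read timeout in 0.20-era configs is an assumption to verify against your version's defaults.

```xml
<!-- hdfs-site.xml: raise socket timeouts instead of disabling them.
     Values in milliseconds; both are illustrative, not recommendations. -->
<property>
  <name>dfs.socket.timeout</name>
  <value>120000</value> <!-- generic socket read timeout (assumed key) -->
</property>
<property>
  <name>dfs.datanode.socket.write.timeout</name>
  <value>120000</value> <!-- write-side timeout; 0 disables it entirely -->
</property>
```

Setting the write timeout to 0, as discussed in the issue description, removes the safety net entirely; raising both limits keeps a bound on genuinely stuck connections.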
[jira] Updated: (HDFS-733) TestBlockReport fails intermittently
[ https://issues.apache.org/jira/browse/HDFS-733?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Konstantin Boudnik updated HDFS-733:
Resolution: Fixed
Status: Resolved (was: Patch Available)
The fix is committed to the trunk and to the branch 0.21.
[jira] Commented: (HDFS-733) TestBlockReport fails intermittently
[ https://issues.apache.org/jira/browse/HDFS-733?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12777627#action_12777627 ] Hudson commented on HDFS-733: - Integrated in Hadoop-Hdfs-trunk-Commit #110 (See [http://hudson.zones.apache.org/hudson/job/Hadoop-Hdfs-trunk-Commit/110/]) . TestBlockReport fails intermittently (cos) -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (HDFS-767) Job failure due to BlockMissingException
[ https://issues.apache.org/jira/browse/HDFS-767?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12777635#action_12777635 ] Todd Lipcon commented on HDFS-767: -- Hi Ning, Sounds good - your formula seems to make sense. If you can add a few lines of comments around the formula (or a pointer to this JIRA), I think that would be helpful to make sure people looking at the code down the line will understand where it came from. Additionally, I think making the 3000 parameter a configuration variable (even if an undocumented one) would be swell. Job failure due to BlockMissingException Key: HDFS-767 URL: https://issues.apache.org/jira/browse/HDFS-767 Project: Hadoop HDFS Issue Type: Improvement Reporter: Ning Zhang If a block is requested by too many mappers/reducers (say, 3000) at the same time, a BlockMissingException is thrown because the number of threads accessing the same block at the same time exceeds the upper limit (I think 256 by default). The DFSClient will catch that exception and retry 3 times after waiting for 3 seconds. Since the wait time is a fixed value, a lot of clients will retry at about the same time and a large portion of them get another failure. After 3 retries, about 256*4 = 1024 clients have got the block. If the number of clients is more than that, the job will fail. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
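The failure mode above comes from every client sleeping exactly 3000 ms and retrying in lock-step. A minimal sketch of a jittered, widening retry delay that such a configurable parameter could feed into -- the helper names are hypothetical, not the actual DFSClient code; only the 3000 ms base comes from the report:

```java
import java.util.Random;

public class BackoffSketch {
    // Hypothetical helper: a randomized delay so clients retrying the same
    // block do not all wake up at once. The delay always waits at least the
    // base, plus a random amount drawn from an exponentially growing window.
    static long retryDelayMillis(long baseMillis, int attempt, Random rng) {
        long window = baseMillis * (1L << attempt);          // 3s, 6s, 12s, ...
        return baseMillis + (long) (rng.nextDouble() * window);
    }

    public static void main(String[] args) {
        Random rng = new Random();
        for (int attempt = 0; attempt < 3; attempt++) {
            System.out.println("attempt " + attempt + " -> sleep "
                    + retryDelayMillis(3000, attempt, rng) + " ms");
        }
    }
}
```

With the base made configurable (as suggested above), the 3000 literal disappears from the code entirely.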
[jira] Commented: (HDFS-641) Move all of the benchmarks and tests that depend on mapreduce to mapreduce
[ https://issues.apache.org/jira/browse/HDFS-641?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12777647#action_12777647 ] Hudson commented on HDFS-641: - Integrated in Hadoop-Mapreduce-trunk-Commit #118 (See [http://hudson.zones.apache.org/hudson/job/Hadoop-Mapreduce-trunk-Commit/118/]) . Move all of the components that depend on map/reduce to map/reduce. (omalley) Move all of the benchmarks and tests that depend on mapreduce to mapreduce -- Key: HDFS-641 URL: https://issues.apache.org/jira/browse/HDFS-641 Project: Hadoop HDFS Issue Type: Bug Components: test Affects Versions: 0.20.2 Reporter: Owen O'Malley Assignee: Owen O'Malley Priority: Blocker Fix For: 0.21.0 Currently, we have a bad cycle where to build hdfs you need to test mapreduce and iterate once. This is broken. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (HDFS-706) Intermittent failures in TestFiHFlush
[ https://issues.apache.org/jira/browse/HDFS-706?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tsz Wo (Nicholas), SZE updated HDFS-706: Assignee: Konstantin Boudnik Hadoop Flags: [Reviewed] Status: Patch Available (was: Open) +1 patch looks good. Intermittent failures in TestFiHFlush - Key: HDFS-706 URL: https://issues.apache.org/jira/browse/HDFS-706 Project: Hadoop HDFS Issue Type: Bug Reporter: Konstantin Boudnik Assignee: Konstantin Boudnik Attachments: HDFS-706.patch, TEST-org.apache.hadoop.hdfs.TestHFlush.txt Running tests on a Linux box I've started seeing intermittent failures among TestFiHFlush test cases. It turns out that occasional failures are observed on my laptop running BSD -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (HDFS-763) DataBlockScanner reporting of bad blocks is slightly misleading
[ https://issues.apache.org/jira/browse/HDFS-763?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=1215#action_1215 ] Raghu Angadi commented on HDFS-763: --- I don't think this needs an extra unit test. The stat affected here is only for display purposes and is not related to the stats reported to stats servers like Simon. DataBlockScanner reporting of bad blocks is slightly misleading --- Key: HDFS-763 URL: https://issues.apache.org/jira/browse/HDFS-763 Project: Hadoop HDFS Issue Type: Bug Components: data-node Affects Versions: 0.20.1 Reporter: dhruba borthakur Assignee: dhruba borthakur Attachments: scanErrors.txt, scanErrors.txt, scanErrors.txt The Datanode generates a report of the periodic block scanning that verifies CRCs. It reports something like the following:
{noformat}
Scans since restart       : 192266
Scan errors since restart : 33
Transient scan errors     : 0
{noformat}
The statement saying that there were 33 errors is slightly misleading because these are not CRC mismatches; rather, the block was being deleted when the CRC verification was about to happen. I propose that DataBlockScanner.totalScanErrors is not updated if dataset.getFile(block) is null, i.e. the block is now deleted from the datanode. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (HDFS-94) The Heap Size in HDFS web ui may not be accurate
[ https://issues.apache.org/jira/browse/HDFS-94?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=1222#action_1222 ] Tsz Wo (Nicholas), SZE commented on HDFS-94: It makes sense to replace the current code with MemoryMXBean since it provides more information. I think it is better to show more numbers like non-heap usage, init, used, committed, max, etc. The Heap Size in HDFS web ui may not be accurate -- Key: HDFS-94 URL: https://issues.apache.org/jira/browse/HDFS-94 Project: Hadoop HDFS Issue Type: Bug Reporter: Tsz Wo (Nicholas), SZE It seems that the Heap Size shown in the HDFS web UI is not accurate. It keeps showing 100% usage. e.g. {noformat} Heap Size is 10.01 GB / 10.01 GB (100%) {noformat} -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
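For reference, the numbers Nicholas mentions are all exposed by the standard JDK management API. A minimal standalone sketch (not the NameNode UI code) of reading heap and non-heap usage from MemoryMXBean:

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryMXBean;
import java.lang.management.MemoryUsage;

public class HeapReport {
    public static void main(String[] args) {
        MemoryMXBean bean = ManagementFactory.getMemoryMXBean();
        MemoryUsage heap = bean.getHeapMemoryUsage();
        MemoryUsage nonHeap = bean.getNonHeapMemoryUsage();
        // 'used' is live data, 'committed' is what the JVM has actually
        // reserved, 'max' is the -Xmx ceiling. Displaying used/max avoids
        // the misleading "100%" that comes from comparing committed to max.
        System.out.printf("Heap: init=%d used=%d committed=%d max=%d%n",
                heap.getInit(), heap.getUsed(), heap.getCommitted(), heap.getMax());
        System.out.printf("Non-heap: used=%d committed=%d%n",
                nonHeap.getUsed(), nonHeap.getCommitted());
    }
}
```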
[jira] Commented: (HDFS-706) Intermittent failures in TestFiHFlush
[ https://issues.apache.org/jira/browse/HDFS-706?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=1224#action_1224 ] Hadoop QA commented on HDFS-706: -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12424811/HDFS-706.patch against trunk revision 835958.
+1 @author. The patch does not contain any @author tags.
+1 tests included. The patch appears to include 9 new or modified tests.
+1 javadoc. The javadoc tool did not generate any warning messages.
+1 javac. The applied patch does not increase the total number of javac compiler warnings.
+1 findbugs. The patch does not introduce any new Findbugs warnings.
+1 release audit. The applied patch does not increase the total number of release audit warnings.
-1 core tests. The patch failed core unit tests.
+1 contrib tests. The patch passed contrib unit tests.
Test results: http://hudson.zones.apache.org/hudson/job/Hdfs-Patch-h5.grid.sp2.yahoo.net/111/testReport/
Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Hdfs-Patch-h5.grid.sp2.yahoo.net/111/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Checkstyle results: http://hudson.zones.apache.org/hudson/job/Hdfs-Patch-h5.grid.sp2.yahoo.net/111/artifact/trunk/build/test/checkstyle-errors.html
Console output: http://hudson.zones.apache.org/hudson/job/Hdfs-Patch-h5.grid.sp2.yahoo.net/111/console
This message is automatically generated. Intermittent failures in TestFiHFlush - Key: HDFS-706 URL: https://issues.apache.org/jira/browse/HDFS-706 Project: Hadoop HDFS Issue Type: Bug Reporter: Konstantin Boudnik Assignee: Konstantin Boudnik Attachments: HDFS-706.patch, HDFS-706.patch, TEST-org.apache.hadoop.hdfs.TestHFlush.txt Running tests on a Linux box I've started seeing intermittent failures among TestFiHFlush test cases. It turns out that occasional failures are observed on my laptop running BSD -- This message is automatically generated by JIRA. 
- You can reply to this email to add a comment to the issue online.
[jira] Commented: (HDFS-706) Intermittent failures in TestFiHFlush
[ https://issues.apache.org/jira/browse/HDFS-706?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=1235#action_1235 ] Konstantin Boudnik commented on HDFS-706: - The test failure is irrelevant. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (HDFS-772) DFSClient.getFileChecksum(..) computes file md5 with extra padding
[ https://issues.apache.org/jira/browse/HDFS-772?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tsz Wo (Nicholas), SZE updated HDFS-772: Attachment: h772_20091113.patch h772_20091113.patch: use md5out.getLength() to limit the data. DFSClient.getFileChecksum(..) computes file md5 with extra padding -- Key: HDFS-772 URL: https://issues.apache.org/jira/browse/HDFS-772 Project: Hadoop HDFS Issue Type: Bug Components: hdfs client Affects Versions: 0.20.1 Reporter: Tsz Wo (Nicholas), SZE Attachments: h772_20091113.patch
{code}
//DFSClient.getFileChecksum(..)
final MD5Hash fileMD5 = MD5Hash.digest(md5out.getData());
{code}
The fileMD5 is computed over the entire byte array returned by md5out.getData(). However, the data are valid only up to md5out.getLength(). Therefore, the current implementation computes fileMD5 with extra padding. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
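The bug pattern is easy to reproduce with the JDK's own MessageDigest: hashing a growable buffer's whole backing array gives a different digest than hashing only its valid prefix. A standalone sketch -- the buffer contents and sizes here are made up for illustration; only the getData()/getLength() pattern comes from the report:

```java
import java.security.MessageDigest;
import java.util.Arrays;

public class Md5Padding {
    // MD5 of the whole backing array (what the buggy code effectively does).
    static byte[] md5Whole(byte[] buf) throws Exception {
        return MessageDigest.getInstance("MD5").digest(buf);
    }

    // MD5 limited to the valid prefix (the fix: honour md5out.getLength()).
    static byte[] md5Prefix(byte[] buf, int len) throws Exception {
        MessageDigest md5 = MessageDigest.getInstance("MD5");
        md5.update(buf, 0, len);
        return md5.digest();
    }

    public static void main(String[] args) throws Exception {
        // A growable output buffer usually has a backing array larger than
        // the bytes actually written -- like md5out.getData() (64 bytes here)
        // versus md5out.getLength() (14 bytes of real data, zero-padded tail).
        byte[] valid = "block md5 list".getBytes("UTF-8");
        byte[] backing = Arrays.copyOf(valid, 64);
        System.out.println(Arrays.equals(
                md5Whole(backing), md5Prefix(backing, valid.length)));
        // prints false: the zero padding changed the digest
    }
}
```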
[jira] Updated: (HDFS-758) Improve reporting of progress of decommissioning
[ https://issues.apache.org/jira/browse/HDFS-758?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jitendra Nath Pandey updated HDFS-758: -- Attachment: HDFS-758.1.patch Improve reporting of progress of decommissioning Key: HDFS-758 URL: https://issues.apache.org/jira/browse/HDFS-758 Project: Hadoop HDFS Issue Type: Improvement Reporter: Jitendra Nath Pandey Attachments: HDFS-758.1.patch -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Resolved: (HDFS-641) Move all of the benchmarks and tests that depend on mapreduce to mapreduce
[ https://issues.apache.org/jira/browse/HDFS-641?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Owen O'Malley resolved HDFS-641. Resolution: Fixed I just committed this. Move all of the benchmarks and tests that depend on mapreduce to mapreduce -- Key: HDFS-641 URL: https://issues.apache.org/jira/browse/HDFS-641 Project: Hadoop HDFS Issue Type: Bug Components: test Affects Versions: 0.20.2 Reporter: Owen O'Malley Assignee: Owen O'Malley Priority: Blocker Fix For: 0.21.0 Currently, we have a bad cycle where to build hdfs you need to test mapreduce and iterate once. This is broken. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (HDFS-596) Memory leak in libhdfs: hdfsFreeFileInfo() in libhdfs does not free memory for mOwner and mGroup
[ https://issues.apache.org/jira/browse/HDFS-596?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=1265#action_1265 ] Christian Kunz commented on HDFS-596: - We just hit this bug big time. Our applications ran out of memory. We will have to apply this patch ourselves. Why was this issue not declared a blocker? This is a *serious* memory issue introduced between hadoop-0.18 and hadoop-0.20. Memory leak in libhdfs: hdfsFreeFileInfo() in libhdfs does not free memory for mOwner and mGroup Key: HDFS-596 URL: https://issues.apache.org/jira/browse/HDFS-596 Project: Hadoop HDFS Issue Type: Bug Components: contrib/fuse-dfs Affects Versions: 0.20.1 Environment: Linux hadoop-001 2.6.28-14-server #47-Ubuntu SMP Sat Jul 25 01:18:34 UTC 2009 i686 GNU/Linux. Namenode with 1GB memory. Reporter: Zhang Bingjun Priority: Critical Fix For: 0.20.2 Attachments: HDFS-596.patch Original Estimate: 0.5h Remaining Estimate: 0.5h This bug affects fuse-dfs severely. In my test, about 1GB of memory was exhausted and the fuse-dfs mount directory was disconnected after writing 14000 files. This bug is related to the memory leak problem of this issue: http://issues.apache.org/jira/browse/HDFS-420. The bug can be fixed very easily. In function hdfsFreeFileInfo() in file hdfs.c (under c++/libhdfs/) change the code block:
{code}
//Free the mName
int i;
for (i = 0; i < numEntries; ++i) {
    if (hdfsFileInfo[i].mName) {
        free(hdfsFileInfo[i].mName);
    }
}
{code}
into:
{code}
// free mName, mOwner and mGroup
int i;
for (i = 0; i < numEntries; ++i) {
    if (hdfsFileInfo[i].mName) {
        free(hdfsFileInfo[i].mName);
    }
    if (hdfsFileInfo[i].mOwner) {
        free(hdfsFileInfo[i].mOwner);
    }
    if (hdfsFileInfo[i].mGroup) {
        free(hdfsFileInfo[i].mGroup);
    }
}
{code}
I am new to Jira and haven't figured out a way to generate a .patch file yet. Could anyone help me do that so that others can commit the changes into the code base. Thanks! -- This message is automatically generated by JIRA.
- You can reply to this email to add a comment to the issue online.
[jira] Updated: (HDFS-596) Memory leak in libhdfs: hdfsFreeFileInfo() in libhdfs does not free memory for mOwner and mGroup
[ https://issues.apache.org/jira/browse/HDFS-596?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Christian Kunz updated HDFS-596: Priority: Blocker (was: Critical) Memory leak in libhdfs: hdfsFreeFileInfo() in libhdfs does not free memory for mOwner and mGroup Key: HDFS-596 URL: https://issues.apache.org/jira/browse/HDFS-596 Project: Hadoop HDFS Issue Type: Bug Components: contrib/fuse-dfs Affects Versions: 0.20.1 Environment: Linux hadoop-001 2.6.28-14-server #47-Ubuntu SMP Sat Jul 25 01:18:34 UTC 2009 i686 GNU/Linux. Namenode with 1GB memory. Reporter: Zhang Bingjun Priority: Blocker Fix For: 0.20.2 Attachments: HDFS-596.patch Original Estimate: 0.5h Remaining Estimate: 0.5h This bug affects fuse-dfs severely. In my test, about 1GB of memory was exhausted and the fuse-dfs mount directory was disconnected after writing 14000 files. This bug is related to the memory leak problem of this issue: http://issues.apache.org/jira/browse/HDFS-420. The bug can be fixed very easily. In function hdfsFreeFileInfo() in file hdfs.c (under c++/libhdfs/) change the code block:
{code}
//Free the mName
int i;
for (i = 0; i < numEntries; ++i) {
    if (hdfsFileInfo[i].mName) {
        free(hdfsFileInfo[i].mName);
    }
}
{code}
into:
{code}
// free mName, mOwner and mGroup
int i;
for (i = 0; i < numEntries; ++i) {
    if (hdfsFileInfo[i].mName) {
        free(hdfsFileInfo[i].mName);
    }
    if (hdfsFileInfo[i].mOwner) {
        free(hdfsFileInfo[i].mOwner);
    }
    if (hdfsFileInfo[i].mGroup) {
        free(hdfsFileInfo[i].mGroup);
    }
}
{code}
I am new to Jira and haven't figured out a way to generate a .patch file yet. Could anyone help me do that so that others can commit the changes into the code base. Thanks! -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (HDFS-758) Improve reporting of progress of decommissioning
[ https://issues.apache.org/jira/browse/HDFS-758?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jitendra Nath Pandey updated HDFS-758: -- Attachment: HDFS-758.2.patch Improve reporting of progress of decommissioning Key: HDFS-758 URL: https://issues.apache.org/jira/browse/HDFS-758 Project: Hadoop HDFS Issue Type: Improvement Reporter: Jitendra Nath Pandey Assignee: Jitendra Nath Pandey Attachments: HDFS-758.1.patch, HDFS-758.2.patch -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (HDFS-771) NameNode's HttpServer can't instantiate InetSocketAddress: IllegalArgumentException is thrown
[ https://issues.apache.org/jira/browse/HDFS-771?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Konstantin Boudnik updated HDFS-771: Component/s: (was: test) name-node Priority: Blocker (was: Major) Tags: regression Summary: NameNode's HttpServer can't instantiate InetSocketAddress: IllegalArgumentException is thrown (was: Jetty crashes: MiniDFSCluster supplies incorrect port number to the NameNode) This issue seems to be a regression (or rather the result of an incomplete fix of HADOOP-4744). The problem is surfacing quite often in Pig (a few times per week), so I'm raising the priority to Blocker, because all components are affected by this issue. NameNode's HttpServer can't instantiate InetSocketAddress: IllegalArgumentException is thrown - Key: HDFS-771 URL: https://issues.apache.org/jira/browse/HDFS-771 Project: Hadoop HDFS Issue Type: Bug Components: name-node Affects Versions: 0.22.0 Environment: Apache Hudson build machine Reporter: Konstantin Boudnik Priority: Blocker Attachments: testEditLog.html In a test execution the following exception has been thrown:
{noformat}
Error Message

port out of range:-1

Stacktrace

java.lang.IllegalArgumentException: port out of range:-1
	at java.net.InetSocketAddress.<init>(InetSocketAddress.java:118)
	at org.apache.hadoop.hdfs.server.namenode.NameNode.startHttpServer(NameNode.java:371)
	at org.apache.hadoop.hdfs.server.namenode.NameNode.activate(NameNode.java:313)
	at org.apache.hadoop.hdfs.server.namenode.NameNode.initialize(NameNode.java:304)
	at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:410)
	at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:404)
	at org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:1211)
	at org.apache.hadoop.hdfs.MiniDFSCluster.<init>(MiniDFSCluster.java:287)
	at org.apache.hadoop.hdfs.MiniDFSCluster.<init>(MiniDFSCluster.java:131)
	at org.apache.hadoop.hdfs.server.namenode.TestEditLog.testEditLog(TestEditLog.java:92)
{noformat}
-- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (HDFS-771) NameNode's HttpServer can't instantiate InetSocketAddress: IllegalArgumentException is thrown
[ https://issues.apache.org/jira/browse/HDFS-771?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12777806#action_12777806 ] Konstantin Boudnik commented on HDFS-771: - After all, it seems to be a race condition in Jetty, e.g. (NameNode:367):
{noformat}
this.httpServer.start();
{noformat}
Appropriate log file:
{noformat}
2009-11-13 07:02:04,605 INFO http.HttpServer (HttpServer.java:start(432)) - Port returned by webServer.getConnectors()[0].getLocalPort() before open() is -1. Opening the listener on 0
2009-11-13 07:02:04,606 INFO http.HttpServer (HttpServer.java:start(437)) - listener.getLocalPort() returned 37817 webServer.getConnectors()[0].getLocalPort() returned 37817
2009-11-13 07:02:04,607 INFO http.HttpServer (HttpServer.java:start(470)) - Jetty bound to port 37817
2009-11-13 07:02:04,607 INFO mortbay.log (?:invoke0(?)) - jetty-6.1.14
2009-11-13 07:03:04,231 INFO mortbay.log (?:invoke0(?)) - Started selectchannelconnec...@localhost:37817
{noformat}
And then this code is executed (NameNode:370-371):
{noformat}
// The web-server port can be ephemeral... ensure we have the correct info
infoPort = this.httpServer.getPort();
this.httpAddress = new InetSocketAddress(infoHost, infoPort);
{noformat}
and {{this.httpServer.getPort();}} returns -1 as the infoPort value. I'll try to work out a minimal test case to reproduce this problem; however, it might be hard.
-- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
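One way around the race described above is to poll until the connector reports a real port instead of reading it once while the listener may still be opening. This is a sketch of the idea only -- PortSource is a stand-in interface, not Jetty's or HttpServer's actual API:

```java
public class PortWait {
    interface PortSource { int getPort(); }  // stand-in for the Jetty connector

    // Hypothetical helper: poll until the connector reports a real port,
    // failing if it does not open within the timeout.
    static int waitForPort(PortSource src, long timeoutMillis) throws InterruptedException {
        long deadline = System.currentTimeMillis() + timeoutMillis;
        int port;
        while ((port = src.getPort()) == -1) {
            if (System.currentTimeMillis() > deadline) {
                throw new IllegalStateException("listener did not open in time");
            }
            Thread.sleep(10);  // back off briefly before re-checking
        }
        return port;
    }

    public static void main(String[] args) throws InterruptedException {
        long start = System.currentTimeMillis();
        // Simulated connector that, like the log above, reports -1 until
        // the listener has actually opened (~50 ms here).
        PortSource src = () -> System.currentTimeMillis() - start < 50 ? -1 : 37817;
        System.out.println(waitForPort(src, 1000)); // prints 37817
    }
}
```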
[jira] Commented: (HDFS-755) Read multiple checksum chunks at once in DFSInputStream
[ https://issues.apache.org/jira/browse/HDFS-755?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12777815#action_12777815 ] Eli Collins commented on HDFS-755: -- Hey Todd -- patch looks great. Did you test w/o checksums enabled? Read multiple checksum chunks at once in DFSInputStream --- Key: HDFS-755 URL: https://issues.apache.org/jira/browse/HDFS-755 Project: Hadoop HDFS Issue Type: Improvement Components: hdfs client Affects Versions: 0.22.0 Reporter: Todd Lipcon Assignee: Todd Lipcon Attachments: hdfs-755.txt, hdfs-755.txt, hdfs-755.txt, hdfs-755.txt HADOOP-3205 adds the ability for FSInputChecker subclasses to read multiple checksum chunks in a single call to readChunk. This is the HDFS-side use of that new feature. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (HDFS-718) configuration parameter to prevent accidental formatting of HDFS filesystem
[ https://issues.apache.org/jira/browse/HDFS-718?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12777827#action_12777827 ] dhruba borthakur commented on HDFS-718: --- Thanks Nicholas, Todd and Allen for the comments. Todd: The idea of the proposed configuration is to ensure that *no* script can format this namenode, however hard it may try. The main purpose is to avoid adding the -y option. This is for the paranoid administrator who, for sure, never wants *any* script to format this namenode. configuration parameter to prevent accidental formatting of HDFS filesystem --- Key: HDFS-718 URL: https://issues.apache.org/jira/browse/HDFS-718 Project: Hadoop HDFS Issue Type: Improvement Components: name-node Affects Versions: 0.22.0 Environment: Any Reporter: Andrew Ryan Assignee: Andrew Ryan Priority: Minor Attachments: HDFS-718.patch.txt Currently, any time the NameNode is not running, an HDFS filesystem will accept the 'format' command and will duly format itself. There are those of us who have multi-PB HDFS filesystems who are really quite uncomfortable with this behavior. There is Y/N confirmation in the format command, but if the formatter genuinely believes themselves to be doing the right thing, the filesystem will be formatted. This patch adds a configuration parameter to the namenode, dfs.namenode.support.allowformat, which defaults to true, the current behavior: always allow formatting if the NameNode is down or some other process is not holding the namenode lock. But if dfs.namenode.support.allowformat is set to false, the NameNode will not allow itself to be formatted until this config parameter is changed to true. The general idea is that for production HDFS filesystems, the user would format the HDFS once, then set dfs.namenode.support.allowformat to false for all time. The attached patch was generated against trunk and +1's on my test machine. We have a 0.20 version that we are using in our cluster as well. 
-- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
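Under the proposed patch, a production cluster would lock out formatting with something like the following in hdfs-site.xml (the property name is taken from the description above; this is a sketch of the proposed behavior, not of committed code):

```xml
<!-- hdfs-site.xml: refuse 'hadoop namenode -format' on this node,
     per the proposed dfs.namenode.support.allowformat parameter -->
<property>
  <name>dfs.namenode.support.allowformat</name>
  <value>false</value>
</property>
```

Flipping the value back to true would be the deliberate, config-level step required before any format can succeed.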
[jira] Commented: (HDFS-756) libhdfs unit tests do not run
[ https://issues.apache.org/jira/browse/HDFS-756?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12777833#action_12777833 ] Konstantin Boudnik commented on HDFS-756: - I'd suggest raising the priority on this, because it makes the full build ({{ant test}}) fail all the time. libhdfs unit tests do not run -- Key: HDFS-756 URL: https://issues.apache.org/jira/browse/HDFS-756 Project: Hadoop HDFS Issue Type: Bug Components: contrib/libhdfs Reporter: dhruba borthakur Assignee: Eli Collins Fix For: 0.22.0 The libhdfs unit tests (ant test-c++-libhdfs -Dislibhdfs=1) do not run yet because the scripts are in the common subproject. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.