[jira] Reopened: (HDFS-721) ERROR Block blk_XXX_1030 already exists in state RBW and thus cannot be created
[ https://issues.apache.org/jira/browse/HDFS-721?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Tsz Wo (Nicholas), SZE reopened HDFS-721:
-----------------------------------------

> ... will I reopen so this issue catches a patch to change log level?

Sure. Let's reopen this for changing the log level.

> ERROR Block blk_XXX_1030 already exists in state RBW and thus cannot be created
> --------------------------------------------------------------------------------
>
>                 Key: HDFS-721
>                 URL: https://issues.apache.org/jira/browse/HDFS-721
>             Project: Hadoop HDFS
>          Issue Type: Bug
>    Affects Versions: 0.21.0
>         Environment: dfs.support.append=true
>                      Current branch-0.21 of hdfs, mapreduce, and common. Here is svn info:
>                      URL: https://svn.apache.org/repos/asf/hadoop/hdfs/branches/branch-0.21
>                      Repository Root: https://svn.apache.org/repos/asf
>                      Repository UUID: 13f79535-47bb-0310-9956-ffa450edef68
>                      Revision: 827883
>                      Node Kind: directory
>                      Schedule: normal
>                      Last Changed Author: szetszwo
>                      Last Changed Rev: 826906
>                      Last Changed Date: 2009-10-20 00:16:25 + (Tue, 20 Oct 2009)
>            Reporter: stack
>
> Running some loading tests against hdfs branch-0.21 I got the following:
> {code}
> 2009-10-21 04:57:10,770 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Receiving block blk_6345892463926159834_1030 src: /XX.XX.XX.141:53112 dest: /XX.XX.XX.140:51010
> 2009-10-21 04:57:10,771 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: writeBlock blk_6345892463926159834_1030 received exception org.apache.hadoop.hdfs.server.datanode.ReplicaAlreadyExistsException: Block blk_6345892463926159834_1030 already exists in state RBW and thus cannot be created.
> 2009-10-21 04:57:10,771 ERROR org.apache.hadoop.hdfs.server.datanode.DataNode: DatanodeRegistration(XX.XX.XX.140:51010, storageID=DS-1292310101-XX.XX.XX.140-51010-1256100924816, infoPort=51075, ipcPort=51020):DataXceiver
> org.apache.hadoop.hdfs.server.datanode.ReplicaAlreadyExistsException: Block blk_6345892463926159834_1030 already exists in state RBW and thus cannot be created.
> 	at org.apache.hadoop.hdfs.server.datanode.FSDataset.createTemporary(FSDataset.java:1324)
> 	at org.apache.hadoop.hdfs.server.datanode.BlockReceiver.<init>(BlockReceiver.java:98)
> 	at org.apache.hadoop.hdfs.server.datanode.DataXceiver.opWriteBlock(DataXceiver.java:258)
> 	at org.apache.hadoop.hdfs.protocol.DataTransferProtocol$Receiver.opWriteBlock(DataTransferProtocol.java:382)
> 	at org.apache.hadoop.hdfs.protocol.DataTransferProtocol$Receiver.processOp(DataTransferProtocol.java:323)
> 	at org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:111)
> 	at java.lang.Thread.run(Thread.java:619)
> {code}
> On the sender side:
> {code}
> 2009-10-21 04:57:10,740 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: DatanodeRegistration(XX.XX.XX.141:51010, storageID=DS-1870884070-XX.XX.XX.141-51010-1256100925196, infoPort=51075, ipcPort=51020) Starting thread to transfer block blk_6345892463926159834_1030 to XX.XX.XX.140:51010
> 2009-10-21 04:57:10,770 WARN org.apache.hadoop.hdfs.server.datanode.DataNode: DatanodeRegistration(XX.XX.XX.141:51010, storageID=DS-1870884070-XX.XX.XX.141-51010-1256100925196, infoPort=51075, ipcPort=51020):Failed to transfer blk_6345892463926159834_1030 to XX.XX.XX.140:51010 got java.net.SocketException: Original Exception : java.io.IOException: Connection reset by peer
> 	at sun.nio.ch.FileChannelImpl.transferTo0(Native Method)
> 	at sun.nio.ch.FileChannelImpl.transferToDirectly(FileChannelImpl.java:415)
> 	at sun.nio.ch.FileChannelImpl.transferTo(FileChannelImpl.java:516)
> 	at org.apache.hadoop.net.SocketOutputStream.transferToFully(SocketOutputStream.java:199)
> 	at org.apache.hadoop.hdfs.server.datanode.BlockSender.sendChunks(BlockSender.java:346)
> 	at org.apache.hadoop.hdfs.server.datanode.BlockSender.sendBlock(BlockSender.java:434)
> 	at org.apache.hadoop.hdfs.server.datanode.DataNode$DataTransfer.run(DataNode.java:1262)
> 	at java.lang.Thread.run(Thread.java:619)
> Caused by: java.io.IOException: Connection reset by peer
> 	... 8 more
> {code}
> The block sequence number, 1030, is one more than that in issue HDFS-720 (same test run, but about 8 seconds between the errors).

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
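For readers landing here from the log message alone: the guard that throws ReplicaAlreadyExistsException fires when a datanode is asked to create a replica for a block it already holds in RBW (replica-being-written) state. The sketch below is a minimal, hypothetical rendering of that guard; the class and field names are assumptions, not the actual FSDataset internals.

{code}
import java.io.IOException;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical stand-ins for the real HDFS types; names are assumptions.
class ReplicaAlreadyExistsException extends IOException {
    ReplicaAlreadyExistsException(String msg) { super(msg); }
}

class MiniDataset {
    // Replicas currently being written (state RBW), keyed by block ID.
    private final Map<Long, String> rbwReplicas = new ConcurrentHashMap<Long, String>();

    // Mirrors the guard at the top of the stack trace: if a replica for
    // this block already exists in RBW state, a second one is rejected.
    synchronized void createTemporary(long blockId) throws IOException {
        if (rbwReplicas.containsKey(blockId)) {
            throw new ReplicaAlreadyExistsException("Block blk_" + blockId
                + " already exists in state RBW and thus cannot be created.");
        }
        rbwReplicas.put(blockId, "RBW");
    }
}
{code}

In this report the rejected creation comes from a replication transfer racing against the client pipeline that had already created the RBW replica, so the refusal is expected; that is why the issue was resolved Invalid ("Working as designed", below) and reopened here only to change the log level.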
[jira] Created: (HDFS-724) Pipeline close hangs if one of the datanodes is not responsive.
Pipeline close hangs if one of the datanodes is not responsive.
---------------------------------------------------------------

                 Key: HDFS-724
                 URL: https://issues.apache.org/jira/browse/HDFS-724
             Project: Hadoop HDFS
          Issue Type: Bug
          Components: data-node, hdfs client
            Reporter: Tsz Wo (Nicholas), SZE


In the new pipeline design, pipeline close is implemented by sending an additional empty packet. If one of the datanodes does not respond to this empty packet, the pipeline hangs. It seems that there is no timeout.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
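The missing-timeout observation is the crux: if the ack for the final empty packet is read from a socket with no read timeout, an unresponsive datanode blocks the close forever. Below is a minimal sketch of the difference; the host, port, and ack format are placeholders, not the real DataTransferProtocol.

{code}
import java.io.DataInputStream;
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;

// Sketch only; not the actual DFSClient pipeline code.
public class AckTimeoutSketch {

    static long readCloseAck(Socket s, int timeoutMillis) throws IOException {
        // An SO_TIMEOUT of 0 means "block forever" -- the hang this issue
        // describes. Any positive value bounds the wait instead.
        s.setSoTimeout(timeoutMillis);
        DataInputStream in = new DataInputStream(s.getInputStream());
        return in.readLong(); // throws SocketTimeoutException when the timeout fires
    }

    public static void main(String[] args) throws IOException {
        Socket s = new Socket();
        try {
            s.connect(new InetSocketAddress("datanode.example", 51010), 5000);
            long ackSeqno = readCloseAck(s, 60000); // bounded wait, no hang
            System.out.println("ack seqno: " + ackSeqno);
        } finally {
            s.close();
        }
    }
}
{code}

With the default timeout of zero, readLong() blocks indefinitely, which matches the hang described above; a positive timeout turns the hang into an exception the client can act on.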
[jira] Resolved: (HDFS-721) ERROR Block blk_XXX_1030 already exists in state RBW and thus cannot be created
[ https://issues.apache.org/jira/browse/HDFS-721?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

stack resolved HDFS-721.
------------------------

    Resolution: Invalid

Working as designed. Closing.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
[jira] Created: (HDFS-723) Deadlock in DFSClient#DFSOutputStream
Deadlock in DFSClient#DFSOutputStream
-------------------------------------

                 Key: HDFS-723
                 URL: https://issues.apache.org/jira/browse/HDFS-723
             Project: Hadoop HDFS
          Issue Type: Bug
            Reporter: Hairong Kuang
            Assignee: Hairong Kuang
            Priority: Blocker
             Fix For: 0.21.0


While I was running some append-related tests, I hit this deadlock:

Found one Java-level deadlock:
==============================
"Thread-3":
  waiting to lock monitor 0x00012ee044f0 (object 0x000107a0ded0, a org.apache.hadoop.hdfs.DFSClient$DFSOutputStream),
  which is held by "main"
"main":
  waiting to lock monitor 0x00012eeb71a8 (object 0x0001082b0748, a org.apache.hadoop.hdfs.DFSClient$LeaseChecker),
  which is held by "Thread-3"

Java stack information for the threads listed above:
====================================================
"Thread-3":
	at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.close(DFSClient.java:3582)
	- waiting to lock <0x000107a0ded0> (a org.apache.hadoop.hdfs.DFSClient$DFSOutputStream)
	at org.apache.hadoop.hdfs.DFSClient$LeaseChecker.close(DFSClient.java:1175)
	- locked <0x0001082b0748> (a org.apache.hadoop.hdfs.DFSClient$LeaseChecker)
	at org.apache.hadoop.hdfs.DFSClient.close(DFSClient.java:306)
	- locked <0x00010824d640> (a org.apache.hadoop.hdfs.DFSClient)
	at org.apache.hadoop.hdfs.DistributedFileSystem.close(DistributedFileSystem.java:325)
	at org.apache.hadoop.fs.FileSystem$Cache.closeAll(FileSystem.java:1835)
	- locked <0x000107a77ec8> (a org.apache.hadoop.fs.FileSystem$Cache)
	at org.apache.hadoop.fs.FileSystem$Cache$ClientFinalizer.run(FileSystem.java:1851)
	- locked <0x0001079daa00> (a org.apache.hadoop.fs.FileSystem$Cache$ClientFinalizer)
"main":
	at org.apache.hadoop.hdfs.DFSClient$LeaseChecker.remove(DFSClient.java:1151)
	- waiting to lock <0x0001082b0748> (a org.apache.hadoop.hdfs.DFSClient$LeaseChecker)
	at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.close(DFSClient.java:3609)
	- locked <0x000107a0ded0> (a org.apache.hadoop.hdfs.DFSClient$DFSOutputStream)
	at org.apache.hadoop.fs.FSDataOutputStream$PositionCache.close(FSDataOutputStream.java:61)
	at org.apache.hadoop.fs.FSDataOutputStream.close(FSDataOutputStream.java:86)
	at org.apache.hadoop.hdfs.TestFileAppend4.testAppend(TestFileAppend4.java:99)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
	at java.lang.reflect.Method.invoke(Method.java:597)
	at junit.framework.TestCase.runTest(TestCase.java:168)
	at junit.framework.TestCase.runBare(TestCase.java:134)
	at junit.framework.TestResult$1.protect(TestResult.java:110)
	at junit.framework.TestResult.runProtected(TestResult.java:128)
	at junit.framework.TestResult.run(TestResult.java:113)
	at junit.framework.TestCase.run(TestCase.java:124)
	at junit.framework.TestSuite.runTest(TestSuite.java:232)
	at junit.framework.TestSuite.run(TestSuite.java:227)
	at junit.extensions.TestDecorator.basicRun(TestDecorator.java:24)
	at junit.extensions.TestSetup$1.protect(TestSetup.java:23)
	at junit.framework.TestResult.runProtected(TestResult.java:128)
	at junit.extensions.TestSetup.run(TestSetup.java:27)
	at org.apache.tools.ant.taskdefs.optional.junit.JUnitTestRunner.run(JUnitTestRunner.java:420)
	at org.apache.tools.ant.taskdefs.optional.junit.JUnitTestRunner.launch(JUnitTestRunner.java:911)
	at org.apache.tools.ant.taskdefs.optional.junit.JUnitTestRunner.main(JUnitTestRunner.java:768)

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
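The two threads take the same pair of monitors in opposite order: "main" holds the DFSOutputStream and waits for the LeaseChecker, while the shutdown hook "Thread-3" holds the LeaseChecker and waits for the DFSOutputStream. A self-contained sketch of that inversion, with placeholder objects rather than the real DFSClient internals:

{code}
// Reproduces the lock-order inversion from the trace; the two plain
// Objects stand in for DFSOutputStream and LeaseChecker.
public class DeadlockSketch {
    static final Object stream = new Object();       // plays DFSOutputStream
    static final Object leaseChecker = new Object(); // plays LeaseChecker

    public static void main(String[] args) {
        // Like "main" in the trace: stream first, then leaseChecker.
        Thread closer = new Thread(new Runnable() {
            public void run() {
                synchronized (stream) {
                    pause();
                    synchronized (leaseChecker) { /* remove lease */ }
                }
            }
        });
        // Like "Thread-3": leaseChecker first, then stream.
        Thread finalizer = new Thread(new Runnable() {
            public void run() {
                synchronized (leaseChecker) {
                    pause();
                    synchronized (stream) { /* close stream */ }
                }
            }
        });
        closer.start();
        finalizer.start(); // with the pauses, both threads block forever
    }

    static void pause() {
        try { Thread.sleep(100); } catch (InterruptedException ignored) { }
    }
}
{code}

The standard fix is to make every path acquire the two locks in a single agreed order, or to release one lock before taking the other.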
[jira] Created: (HDFS-722) The pointcut callCreateBlockWriteStream in FSDatasetAspects is broken
The pointcut callCreateBlockWriteStream in FSDatasetAspects is broken
---------------------------------------------------------------------

                 Key: HDFS-722
                 URL: https://issues.apache.org/jira/browse/HDFS-722
             Project: Hadoop HDFS
          Issue Type: Bug
          Components: test
            Reporter: Tsz Wo (Nicholas), SZE
             Fix For: 0.21.0, 0.22.0


HDFS-679 changed the signature of createStreams(), so the callCreateBlockWriteStream pointcut defined in FSDatasetAspects is no longer valid.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
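For context on why a signature change breaks a pointcut: an AspectJ call pattern matches methods by signature, so a pattern written against the old createStreams() parameter list silently stops matching after HDFS-679. A hedged AspectJ sketch follows; the parameter lists are illustrative assumptions, not the actual FSDatasetAspects source.

{code}
// AspectJ (code-style aspect); signatures below are illustrative only.
public aspect PointcutSketch {

    // Brittle form: matches one exact parameter list, so any change to
    // the createStreams() signature makes this pattern match nothing.
    pointcut brittleCallCreateStreams():
        call(* createStreams(boolean));

    // Tolerant form: ".." matches any parameter list, so the advice
    // keeps firing across signature changes (at some cost in precision).
    pointcut callCreateBlockWriteStream():
        call(* createStreams(..));

    before() : callCreateBlockWriteStream() {
        System.err.println("createStreams() invoked: " + thisJoinPoint);
    }
}
{code}

Pinning the exact signature keeps a pointcut precise but means it must be updated in lockstep with the method it names, which is the maintenance this issue calls for.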