[jira] Reopened: (HDFS-721) ERROR Block blk_XXX_1030 already exists in state RBW and thus cannot be created

2009-10-21 Thread Tsz Wo (Nicholas), SZE (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-721?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tsz Wo (Nicholas), SZE reopened HDFS-721:
-


> ... shall I reopen this so the issue catches a patch to change the log level?

Sure.  Let's reopen this for changing the log level.
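
For reference, here is a minimal, self-contained sketch of the kind of change under discussion: treat the "already exists in state RBW" case as an expected race and log it at INFO (or WARN) rather than ERROR. The class, method, and logger below are illustrative stand-ins, not the actual DataXceiver code or the eventual patch.

{code}
import java.util.logging.Level;
import java.util.logging.Logger;

// Illustrative sketch only -- not the actual DataXceiver/DataNode logging code.
public class RbwLogLevelSketch {
    private static final Logger LOG = Logger.getLogger("DataXceiverSketch");

    /** Stand-in for org.apache.hadoop.hdfs.server.datanode.ReplicaAlreadyExistsException. */
    static class ReplicaAlreadyExistsException extends Exception {
        ReplicaAlreadyExistsException(String msg) { super(msg); }
    }

    /** Stand-in for the write-block handling path on the receiving datanode. */
    static void writeBlock(String block, boolean alreadyRbw) {
        try {
            if (alreadyRbw) {
                throw new ReplicaAlreadyExistsException("Block " + block
                    + " already exists in state RBW and thus cannot be created.");
            }
            LOG.info("Receiving block " + block);
        } catch (ReplicaAlreadyExistsException e) {
            // Expected when a replication transfer races with the client pipeline:
            // log below ERROR and let the sender see a failed transfer status.
            LOG.log(Level.INFO, "Ignoring duplicate transfer of " + block + ": " + e.getMessage());
        }
    }

    public static void main(String[] args) {
        writeBlock("blk_6345892463926159834_1030", true);
    }
}
{code}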

> ERROR Block blk_XXX_1030 already exists in state RBW and thus cannot be 
> created
> ---
>
> Key: HDFS-721
> URL: https://issues.apache.org/jira/browse/HDFS-721
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 0.21.0
> Environment: dfs.support.append=true
> Current branch-0.21 of hdfs, mapreduce, and common. Here is svn info:
> URL: https://svn.apache.org/repos/asf/hadoop/hdfs/branches/branch-0.21
> Repository Root: https://svn.apache.org/repos/asf
> Repository UUID: 13f79535-47bb-0310-9956-ffa450edef68
> Revision: 827883
> Node Kind: directory
> Schedule: normal
> Last Changed Author: szetszwo
> Last Changed Rev: 826906
> Last Changed Date: 2009-10-20 00:16:25 + (Tue, 20 Oct 2009)
>Reporter: stack
>
> Running some loading tests against hdfs branch-0.21 I got the following:
> {code}
> 2009-10-21 04:57:10,770 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: 
> Receiving block blk_6345892463926159834_1030 src: /XX.XX.XX.141:53112 dest: 
> /XX.XX.XX.140:51010
> 2009-10-21 04:57:10,771 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: 
> writeBlock blk_6345892463926159834_1030 received exception 
> org.apache.hadoop.hdfs.server.datanode.ReplicaAlreadyExistsException: Block 
> blk_6345892463926159834_1030 already exists in state RBW and thus cannot be 
> created.
> 2009-10-21 04:57:10,771 ERROR 
> org.apache.hadoop.hdfs.server.datanode.DataNode: 
> DatanodeRegistration(XX.XX.XX.140:51010, 
> storageID=DS-1292310101-XX.XX.XX.140-51010-1256100924816, infoPort=51075, 
> ipcPort=51020):DataXceiver
> org.apache.hadoop.hdfs.server.datanode.ReplicaAlreadyExistsException: Block 
> blk_6345892463926159834_1030 already exists in state RBW and thus cannot be 
> created.
> at 
> org.apache.hadoop.hdfs.server.datanode.FSDataset.createTemporary(FSDataset.java:1324)
> at 
> org.apache.hadoop.hdfs.server.datanode.BlockReceiver.<init>(BlockReceiver.java:98)
> at 
> org.apache.hadoop.hdfs.server.datanode.DataXceiver.opWriteBlock(DataXceiver.java:258)
> at 
> org.apache.hadoop.hdfs.protocol.DataTransferProtocol$Receiver.opWriteBlock(DataTransferProtocol.java:382)
> at 
> org.apache.hadoop.hdfs.protocol.DataTransferProtocol$Receiver.processOp(DataTransferProtocol.java:323)
> at 
> org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:111)
> at java.lang.Thread.run(Thread.java:619)
> {code}
> On the sender side:
> {code}
> 2009-10-21 04:57:10,740 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: 
> DatanodeRegistration(XX.XX.XX.141:51010, 
> storageID=DS-1870884070-XX.XX.XX.141-51010-1256100925196, infoPort=51075, 
> ipcPort=51020) Starting thread to transfer block blk_6345892463926159834_1030 
> to XX.XX.XX.140:51010
> 2009-10-21 04:57:10,770 WARN org.apache.hadoop.hdfs.server.datanode.DataNode: 
> DatanodeRegistration(XX.XX.XX.141:51010, 
> storageID=DS-1870884070-XX.XX.XX.141-51010-1256100925196, infoPort=51075, 
> ipcPort=51020):Failed to transfer blk_6345892463926159834_1030 to 
> XX.XX.XX.140:51010 got java.net.SocketException: Original Exception : 
> java.io.IOException: Connection reset by peer
> at sun.nio.ch.FileChannelImpl.transferTo0(Native Method)
> at sun.nio.ch.FileChannelImpl.transferToDirectly(FileChannelImpl.java:415)
> at sun.nio.ch.FileChannelImpl.transferTo(FileChannelImpl.java:516)
> at 
> org.apache.hadoop.net.SocketOutputStream.transferToFully(SocketOutputStream.java:199)
> at 
> org.apache.hadoop.hdfs.server.datanode.BlockSender.sendChunks(BlockSender.java:346)
> at 
> org.apache.hadoop.hdfs.server.datanode.BlockSender.sendBlock(BlockSender.java:434)
> at 
> org.apache.hadoop.hdfs.server.datanode.DataNode$DataTransfer.run(DataNode.java:1262)
> at java.lang.Thread.run(Thread.java:619)
> Caused by: java.io.IOException: Connection reset by peer
> ... 8 more
> {code}
> The block sequence number, 1030, is one more than the one in issue HDFS-720 
> (same test run, but about 8 seconds between the errors).
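
To make the error above concrete, here is a minimal, self-contained sketch of the kind of guard that produces it (illustrative only, not the actual FSDataset.createTemporary code): the datanode tracks replicas in a map, and a request to create a new RBW replica for a block that is already present is rejected, which is exactly what happens when a replication transfer races with the client write pipeline.

{code}
import java.util.HashMap;
import java.util.Map;

// Illustrative sketch only -- not the actual FSDataset code.
public class ReplicaMapSketch {
    enum ReplicaState { RBW, FINALIZED }

    private final Map<Long, ReplicaState> replicaMap = new HashMap<>();

    /** Create an RBW replica for blockId, failing if any replica already exists. */
    void createTemporary(long blockId) {
        ReplicaState existing = replicaMap.get(blockId);
        if (existing != null) {
            throw new IllegalStateException("Block blk_" + blockId
                + " already exists in state " + existing + " and thus cannot be created.");
        }
        replicaMap.put(blockId, ReplicaState.RBW);
    }

    public static void main(String[] args) {
        ReplicaMapSketch dataset = new ReplicaMapSketch();
        dataset.createTemporary(6345892463926159834L);     // client pipeline creates the RBW replica
        try {
            dataset.createTemporary(6345892463926159834L); // racing replication transfer is rejected
        } catch (IllegalStateException e) {
            System.out.println(e.getMessage());
        }
    }
}
{code}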

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Created: (HDFS-724) Pipeline close hangs if one of the datanodes is not responsive.

2009-10-21 Thread Tsz Wo (Nicholas), SZE (JIRA)
Pipeline close hangs if one of the datanodes is not responsive.
--

 Key: HDFS-724
 URL: https://issues.apache.org/jira/browse/HDFS-724
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: data-node, hdfs client
Reporter: Tsz Wo (Nicholas), SZE


In the new pipeline design, pipeline close is implemented by sending an 
additional empty packet.  If one of the datanodes does not respond to this 
empty packet, the pipeline hangs.  It seems that there is no timeout.
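
A minimal, self-contained sketch of the missing piece (illustrative only; the names and structure are not the DFSClient code): bound the wait for the ack of the empty close packet so that an unresponsive datanode produces a timeout error instead of an indefinite hang.

{code}
import java.io.IOException;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;

// Illustrative sketch only -- not the actual DFSClient pipeline code.
public class PipelineCloseTimeoutSketch {
    // Acks from downstream datanodes would be enqueued here by a responder thread.
    private final LinkedBlockingQueue<String> ackQueue = new LinkedBlockingQueue<>();

    /** Send the empty close packet (not modeled here) and wait for its ack with a bound. */
    void closePipeline(long timeoutMillis) throws IOException, InterruptedException {
        String ack = ackQueue.poll(timeoutMillis, TimeUnit.MILLISECONDS);
        if (ack == null) {
            // Surface a timeout so the client can run pipeline recovery instead of hanging.
            throw new IOException("Timed out waiting for ack of the close packet after "
                + timeoutMillis + " ms");
        }
    }

    public static void main(String[] args) throws Exception {
        PipelineCloseTimeoutSketch pipeline = new PipelineCloseTimeoutSketch();
        try {
            pipeline.closePipeline(1000);   // no datanode ever acks in this sketch
        } catch (IOException e) {
            System.out.println(e.getMessage());
        }
    }
}
{code}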

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Resolved: (HDFS-721) ERROR Block blk_XXX_1030 already exists in state RBW and thus cannot be created

2009-10-21 Thread stack (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-721?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

stack resolved HDFS-721.


Resolution: Invalid

Working as designed.  Closing.

> ERROR Block blk_XXX_1030 already exists in state RBW and thus cannot be 
> created
> ---
>
> Key: HDFS-721
> URL: https://issues.apache.org/jira/browse/HDFS-721
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 0.21.0
> Environment: dfs.support.append=true
> Current branch-0.21 of hdfs, mapreduce, and common. Here is svn info:
> URL: https://svn.apache.org/repos/asf/hadoop/hdfs/branches/branch-0.21
> Repository Root: https://svn.apache.org/repos/asf
> Repository UUID: 13f79535-47bb-0310-9956-ffa450edef68
> Revision: 827883
> Node Kind: directory
> Schedule: normal
> Last Changed Author: szetszwo
> Last Changed Rev: 826906
> Last Changed Date: 2009-10-20 00:16:25 + (Tue, 20 Oct 2009)
>Reporter: stack
>
> Running some loading tests against hdfs branch-0.21 I got the following:
> {code}
> 2009-10-21 04:57:10,770 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: 
> Receiving block blk_6345892463926159834_1030 src: /XX.XX.XX.141:53112 dest: 
> /XX.XX.XX.140:51010
> 2009-10-21 04:57:10,771 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: 
> writeBlock blk_6345892463926159834_1030 received exception 
> org.apache.hadoop.hdfs.server.datanode.ReplicaAlreadyExistsException: Block 
> blk_6345892463926159834_1030 already exists in state RBW and thus cannot be 
> created.
> 2009-10-21 04:57:10,771 ERROR 
> org.apache.hadoop.hdfs.server.datanode.DataNode: 
> DatanodeRegistration(XX.XX.XX.140:51010, 
> storageID=DS-1292310101-XX.XX.XX.140-51010-1256100924816, infoPort=51075, 
> ipcPort=51020):DataXceiver
> org.apache.hadoop.hdfs.server.datanode.ReplicaAlreadyExistsException: Block 
> blk_6345892463926159834_1030 already exists in state RBW and thus cannot be 
> created.
> at 
> org.apache.hadoop.hdfs.server.datanode.FSDataset.createTemporary(FSDataset.java:1324)
> at 
> org.apache.hadoop.hdfs.server.datanode.BlockReceiver.<init>(BlockReceiver.java:98)
> at 
> org.apache.hadoop.hdfs.server.datanode.DataXceiver.opWriteBlock(DataXceiver.java:258)
> at 
> org.apache.hadoop.hdfs.protocol.DataTransferProtocol$Receiver.opWriteBlock(DataTransferProtocol.java:382)
> at 
> org.apache.hadoop.hdfs.protocol.DataTransferProtocol$Receiver.processOp(DataTransferProtocol.java:323)
> at 
> org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:111)
> at java.lang.Thread.run(Thread.java:619)
> {code}
> On the sender side:
> {code}
> 2009-10-21 04:57:10,740 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: 
> DatanodeRegistration(XX.XX.XX.141:51010, 
> storageID=DS-1870884070-XX.XX.XX.141-51010-1256100925196, infoPort=51075, 
> ipcPort=51020) Starting thread to transfer block blk_6345892463926159834_1030 
> to XX.XX.XX.140:51010
> 2009-10-21 04:57:10,770 WARN org.apache.hadoop.hdfs.server.datanode.DataNode: 
> DatanodeRegistration(XX.XX.XX.141:51010, 
> storageID=DS-1870884070-XX.XX.XX.141-51010-1256100925196, infoPort=51075, 
> ipcPort=51020):Failed to transfer blk_6345892463926159834_1030 to 
> XX.XX.XX.140:51010 got java.net.SocketException: Original Exception : 
> java.io.IOException: Connection reset by peer
> at sun.nio.ch.FileChannelImpl.transferTo0(Native Method)
> at sun.nio.ch.FileChannelImpl.transferToDirectly(FileChannelImpl.java:415)
> at sun.nio.ch.FileChannelImpl.transferTo(FileChannelImpl.java:516)
> at 
> org.apache.hadoop.net.SocketOutputStream.transferToFully(SocketOutputStream.java:199)
> at 
> org.apache.hadoop.hdfs.server.datanode.BlockSender.sendChunks(BlockSender.java:346)
> at 
> org.apache.hadoop.hdfs.server.datanode.BlockSender.sendBlock(BlockSender.java:434)
> at 
> org.apache.hadoop.hdfs.server.datanode.DataNode$DataTransfer.run(DataNode.java:1262)
> at java.lang.Thread.run(Thread.java:619)
> Caused by: java.io.IOException: Connection reset by peer
> ... 8 more
> {code}
> The block sequence number, 1030, is one more than the one in issue HDFS-720 
> (same test run, but about 8 seconds between the errors).

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Created: (HDFS-723) Deadlock in DFSClient#DFSOutputStream

2009-10-21 Thread Hairong Kuang (JIRA)
Deadlock in DFSClient#DFSOutputStream
-

 Key: HDFS-723
 URL: https://issues.apache.org/jira/browse/HDFS-723
 Project: Hadoop HDFS
  Issue Type: Bug
Reporter: Hairong Kuang
Assignee: Hairong Kuang
Priority: Blocker
 Fix For: 0.21.0


While I was running some append-related tests, I hit this deadlock:

Found one Java-level deadlock:
=
"Thread-3":
  waiting to lock monitor 0x00012ee044f0 (object 0x000107a0ded0, a 
org.apache.hadoop.hdfs.DFSClient$DFSOutputStream),
  which is held by "main"
"main":
  waiting to lock monitor 0x00012eeb71a8 (object 0x0001082b0748, a 
org.apache.hadoop.hdfs.DFSClient$LeaseChecker),
  which is held by "Thread-3"

Java stack information for the threads listed above:
===
"Thread-3":
at 
org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.close(DFSClient.java:3582)
- waiting to lock <0x000107a0ded0> (a 
org.apache.hadoop.hdfs.DFSClient$DFSOutputStream)
at 
org.apache.hadoop.hdfs.DFSClient$LeaseChecker.close(DFSClient.java:1175)
- locked <0x0001082b0748> (a 
org.apache.hadoop.hdfs.DFSClient$LeaseChecker)
at org.apache.hadoop.hdfs.DFSClient.close(DFSClient.java:306)
- locked <0x00010824d640> (a org.apache.hadoop.hdfs.DFSClient)
- waiting to lock <0x000107a0ded0> (a 
org.apache.hadoop.hdfs.DFSClient$DFSOutputStream)
at 
org.apache.hadoop.hdfs.DFSClient$LeaseChecker.close(DFSClient.java:1175)
- locked <0x0001082b0748> (a 
org.apache.hadoop.hdfs.DFSClient$LeaseChecker)
at org.apache.hadoop.hdfs.DFSClient.close(DFSClient.java:306)
- locked <0x00010824d640> (a org.apache.hadoop.hdfs.DFSClient)
at 
org.apache.hadoop.hdfs.DistributedFileSystem.close(DistributedFileSystem.java:325)
at org.apache.hadoop.fs.FileSystem$Cache.closeAll(FileSystem.java:1835)
- locked <0x000107a77ec8> (a org.apache.hadoop.fs.FileSystem$Cache)
at 
org.apache.hadoop.fs.FileSystem$Cache$ClientFinalizer.run(FileSystem.java:1851)
- locked <0x0001079daa00> (a 
org.apache.hadoop.fs.FileSystem$Cache$ClientFinalizer)
"main":
at 
org.apache.hadoop.hdfs.DFSClient$LeaseChecker.remove(DFSClient.java:1151)
- waiting to lock <0x0001082b0748> (a 
org.apache.hadoop.hdfs.DFSClient$LeaseChecker)
at 
org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.close(DFSClient.java:3609)
- locked <0x000107a0ded0> (a 
org.apache.hadoop.hdfs.DFSClient$DFSOutputStream)
at 
org.apache.hadoop.fs.FSDataOutputStream$PositionCache.close(FSDataOutputStream.java:61)
at 
org.apache.hadoop.fs.FSDataOutputStream.close(FSDataOutputStream.java:86)
at 
org.apache.hadoop.hdfs.TestFileAppend4.testAppend(TestFileAppend4.java:99)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at junit.framework.TestCase.runTest(TestCase.java:168)
at junit.framework.TestCase.runBare(TestCase.java:134)
at junit.framework.TestResult$1.protect(TestResult.java:110)
at junit.framework.TestResult.runProtected(TestResult.java:128)
at junit.framework.TestResult.run(TestResult.java:113)
at junit.framework.TestCase.run(TestCase.java:124)
at junit.framework.TestSuite.runTest(TestSuite.java:232)
at junit.framework.TestSuite.run(TestSuite.java:227)
at junit.extensions.TestDecorator.basicRun(TestDecorator.java:24)
at junit.extensions.TestSetup$1.protect(TestSetup.java:23)
at junit.framework.TestResult.runProtected(TestResult.java:128)
at junit.extensions.TestSetup.run(TestSetup.java:27)
at 
org.apache.tools.ant.taskdefs.optional.junit.JUnitTestRunner.run(JUnitTestRunner.java:420)
at 
org.apache.tools.ant.taskdefs.optional.junit.JUnitTestRunner.launch(JUnitTestRunner.java:911)
at 
org.apache.tools.ant.taskdefs.optional.junit.JUnitTestRunner.main(JUnitTestRunner.java:768)
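
The report above is a classic lock-ordering inversion: the ClientFinalizer thread locks the LeaseChecker and then waits for the DFSOutputStream monitor, while the main thread, inside DFSOutputStream.close(), holds the stream monitor and waits for the LeaseChecker. A minimal, self-contained reproduction of the same shape (stand-in locks, not the DFSClient code) is below; note that it deadlocks by design and never exits.

{code}
// Illustrative reproduction of the lock-ordering inversion described above.
// The two objects stand in for the LeaseChecker and DFSOutputStream monitors.
public class LockOrderDeadlockSketch {
    private static final Object leaseChecker = new Object();
    private static final Object outputStream = new Object();

    public static void main(String[] args) {
        Thread closer = new Thread(() -> {           // like DFSOutputStream.close() on "main"
            synchronized (outputStream) {
                pause();
                synchronized (leaseChecker) {        // waits for the lock held by "finalizer"
                    System.out.println("closer finished");
                }
            }
        });
        Thread finalizer = new Thread(() -> {        // like the ClientFinalizer shutdown hook
            synchronized (leaseChecker) {
                pause();
                synchronized (outputStream) {        // waits for the lock held by "closer"
                    System.out.println("finalizer finished");
                }
            }
        });
        closer.start();
        finalizer.start();                           // the two threads now deadlock
    }

    private static void pause() {
        try { Thread.sleep(100); } catch (InterruptedException ignored) { }
    }
}
{code}

The usual remedies are to acquire the two monitors in one consistent order everywhere, or to perform the lease-checker update without holding the stream's own monitor.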


-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Created: (HDFS-722) The pointcut callCreateBlockWriteStream in FSDatasetAspects is broken

2009-10-21 Thread Tsz Wo (Nicholas), SZE (JIRA)
The pointcut callCreateBlockWriteStream in FSDatasetAspects is broken
-

 Key: HDFS-722
 URL: https://issues.apache.org/jira/browse/HDFS-722
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: test
Reporter: Tsz Wo (Nicholas), SZE
 Fix For: 0.21.0, 0.22.0


HDFS-679 changed the signature of createStreams().  So the 
callCreateBlockWriteStream pointcut defined in FSDatasetAspects is no longer 
valid.
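
For context, a call() pointcut matches a specific method signature, so once HDFS-679 changed the signature of createStreams() the pointcut silently stops matching any join point and its advice never fires. The annotation-style sketch below illustrates the failure mode and one way to repair it; the FakeDataset class and its signatures are hypothetical stand-ins, not the real FSDatasetInterface or FSDatasetAspects code.

{code}
import org.aspectj.lang.annotation.Aspect;
import org.aspectj.lang.annotation.Before;
import org.aspectj.lang.annotation.Pointcut;

// Hypothetical stand-in for the class whose method signature changed.
class FakeDataset {
    // Suppose the old signature was createStreams() and a new parameter was added.
    public Object createStreams(boolean isCreate) {
        return new Object();
    }
}

// Illustrative aspect only -- not the real FSDatasetAspects.
@Aspect
public class CreateStreamsPointcutSketch {

    // Written against the OLD no-argument signature: after the change it matches
    // no join point, so the advice silently stops firing -- the "broken pointcut".
    @Pointcut("call(* FakeDataset.createStreams())")
    public void callCreateBlockWriteStreamOld() {}

    // One way to repair it: match any argument list with '..'
    // (or restate the new signature explicitly).
    @Pointcut("call(* FakeDataset.createStreams(..))")
    public void callCreateBlockWriteStream() {}

    @Before("callCreateBlockWriteStream()")
    public void logCall() {
        System.out.println("createStreams(..) call intercepted");
    }
}
{code}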

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.