[ 
https://issues.apache.org/jira/browse/HADOOP-9564?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13657707#comment-13657707
 ] 

Jin Feng commented on HADOOP-9564:
----------------------------------

This is captured from our test output. It seems the DataBlockScanner is really 
slow to come up with verification results, even though the data/files generated 
in the tests are minimal; a config sketch follows the log below.

{noformat}
13/05/04 06:50:55 INFO hdfs.StateChange: BLOCK* NameSystem.allocateBlock: 
<our_test_file_name.lzo>-f773a37f-3dac-4337-a6cb-004fb94c1d31. 
blk_-8485988660073681466_1002
13/05/04 06:50:55 INFO datanode.DataNode: Receiving block 
blk_-8485988660073681466_1002 src: /127.0.0.1:42563 dest: /127.0.0.1:35830
13/05/04 06:50:55 INFO DataNode.clienttrace: src: /127.0.0.1:42563, dest: 
/127.0.0.1:35830, bytes: 303, op: HDFS_WRITE, cliID: DFSClient_-854844208, 
offset: 0, srvID: DS-1070312150-10.35.8.106-35830-1367650255272, blockid: 
blk_-8485988660073681466_1002, duration: 778000
13/05/04 06:50:55 INFO datanode.DataNode: PacketResponder 0 for block 
blk_-8485988660073681466_1002 terminating
13/05/04 06:50:55 INFO hdfs.StateChange: BLOCK* NameSystem.addStoredBlock: 
blockMap updated: 127.0.0.1:35830 is added to blk_-8485988660073681466_1002 
size 303
13/05/04 06:52:48 INFO datanode.DataBlockScanner: Verification succeeded for 
blk_-8485988660073681466_1002
13/05/04 07:00:40 INFO datanode.DataBlockScanner: Verification succeeded for 
blk_586310994067086116_1001
{noformat}
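
In case it helps, here is a minimal sketch of how one might shorten the block 
scanner period in the test cluster's configuration. This is only an assumption 
on my part: dfs.datanode.scan.period.hours is the standard property for the scan 
interval, but I haven't verified that lowering it changes how soon the first 
"Verification succeeded" line shows up.

{code}
// Illustrative fragment only, not our actual test code.
Configuration conf = new Configuration();
// Assumption: shorten the DataBlockScanner period from its default
// (504 hours, i.e. three weeks) to one hour for the test cluster.
conf.setInt("dfs.datanode.scan.period.hours", 1);
MiniDFSCluster cluster = new MiniDFSCluster(conf, 1, true, null);
{code}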


Could this be related to this bug: HADOOP-4584?
                
> DFSClient$DFSOutputStream.closeInternal locks up waiting for namenode.complete
> ------------------------------------------------------------------------------
>
>                 Key: HADOOP-9564
>                 URL: https://issues.apache.org/jira/browse/HADOOP-9564
>             Project: Hadoop Common
>          Issue Type: Bug
>          Components: fs
>            Reporter: Jin Feng
>            Priority: Minor
>
> Hi,
> Our component uses FileSystem.copyFromLocalFile to copy a local file to an HDFS 
> cluster. It works fine in our production environment. Its integration tests 
> used to run fine on our dev's local Mac laptop until recently (exact point in 
> time unknown), when our tests started to freeze up very frequently with this stack:
> {code}
>    java.lang.Thread.State: WAITING (parking)
>       at sun.misc.Unsafe.park(Native Method)
>       - parking to wait for  <0x0000000152f41378> (a 
> java.util.concurrent.FutureTask$Sync)
>       at java.util.concurrent.locks.LockSupport.park(LockSupport.java:186)
>       at 
> java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:834)
>       at 
> java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireSharedInterruptibly(AbstractQueuedSynchronizer.java:994)
>       at 
> java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireSharedInterruptibly(AbstractQueuedSynchronizer.java:1303)
>       at java.util.concurrent.FutureTask$Sync.innerGet(FutureTask.java:248)
>       at java.util.concurrent.FutureTask.get(FutureTask.java:111)
>       at org.apache.hadoop.ipc.Client$Connection.sendParam(Client.java:790)
>       - locked <0x000000014f568720> (a java.lang.Object)
>       at org.apache.hadoop.ipc.Client.call(Client.java:1080)
>       at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:226)
>       at $Proxy37.complete(Unknown Source)
>       at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>       at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
>       at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>       at java.lang.reflect.Method.invoke(Method.java:601)
>       at 
> org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:82)
>       at 
> org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:59)
>       at $Proxy37.complete(Unknown Source)
>       at 
> org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.closeInternal(DFSClient.java:3566)
>       - locked <0x0000000152f3f658> (a 
> org.apache.hadoop.hdfs.DFSClient$DFSOutputStream)
>       at 
> org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.close(DFSClient.java:3481)
>       at 
> org.apache.hadoop.fs.FSDataOutputStream$PositionCache.close(FSDataOutputStream.java:61)
>       at 
> org.apache.hadoop.fs.FSDataOutputStream.close(FSDataOutputStream.java:86)
>       at org.apache.hadoop.io.IOUtils.copyBytes(IOUtils.java:59)
>       at org.apache.hadoop.io.IOUtils.copyBytes(IOUtils.java:89)
>       at org.apache.hadoop.fs.FileUtil.copy(FileUtil.java:224)
>       at 
> org.apache.hadoop.fs.FileSystem.copyFromLocalFile(FileSystem.java:1295)
>         ....
>         ....
> {code}
> Our version is 0.20.2.cdh3u2-t1.
> In the test suite we use org.apache.hadoop.hdfs.MiniDFSCluster; a sketch of the 
> copy path is below. I've searched around but couldn't find anything that 
> resembles this symptom. Any help is really appreciated!
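> For reference, the copy boils down to something like the following. This is an 
> illustrative sketch only, not our actual code; the class name, paths, and 
> single-datanode setup are made up for this report:
> {code}
> import org.apache.hadoop.conf.Configuration;
> import org.apache.hadoop.fs.FileSystem;
> import org.apache.hadoop.fs.Path;
> import org.apache.hadoop.hdfs.MiniDFSCluster;
>
> public class CopyHangRepro {
>   public static void main(String[] args) throws Exception {
>     Configuration conf = new Configuration();
>     // Single-datanode test cluster, freshly formatted (illustrative setup).
>     MiniDFSCluster cluster = new MiniDFSCluster(conf, 1, true, null);
>     cluster.waitActive();
>     FileSystem fs = cluster.getFileSystem();
>     try {
>       // The freeze shows up when the output stream opened inside
>       // copyFromLocalFile is closed and waits on namenode.complete.
>       fs.copyFromLocalFile(new Path("/tmp/our_test_file.lzo"),    // hypothetical local path
>                            new Path("/test/our_test_file.lzo"));  // hypothetical HDFS path
>     } finally {
>       fs.close();
>       cluster.shutdown();
>     }
>   }
> }
> {code}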

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
