[ https://issues.apache.org/jira/browse/HDFS-8160?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14510716#comment-14510716 ]

Steve Loughran commented on HDFS-8160:
--------------------------------------

Nothing obvious springs to mind. What happens if you kill that first DN?

> Long delays when calling hdfsOpenFile()
> ---------------------------------------
>
>                 Key: HDFS-8160
>                 URL: https://issues.apache.org/jira/browse/HDFS-8160
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: libhdfs
>    Affects Versions: 2.5.2
>         Environment: 3-node Apache Hadoop 2.5.2 cluster running on Ubuntu 14.04
> dfshealth overview:
> Security is off.
> Safemode is off.
> 8 files and directories, 9 blocks = 17 total filesystem object(s).
> Heap Memory used 45.78 MB of 90.5 MB Heap Memory. Max Heap Memory is 889 MB.
> Non Heap Memory used 36.3 MB of 70.44 MB Committed Non Heap Memory. Max Non Heap Memory is 130 MB.
> Configured Capacity:  118.02 GB
> DFS Used:     2.77 GB
> Non DFS Used: 12.19 GB
> DFS Remaining:        103.06 GB
> DFS Used%:    2.35%
> DFS Remaining%:       87.32%
> Block Pool Used:      2.77 GB
> Block Pool Used%:     2.35%
> DataNodes usages% (Min/Median/Max/stdDev):    2.35% / 2.35% / 2.35% / 0.00%
> Live Nodes    3 (Decommissioned: 0)
> Dead Nodes    0 (Decommissioned: 0)
> Decommissioning Nodes 0
> Number of Under-Replicated Blocks     0
> Number of Blocks Pending Deletion     0
> Datanode Information
> In operation
> Node  Last contact    Admin State     Capacity        Used    Non DFS Used    Remaining       Blocks  Block pool used Failed Volumes  Version
> hadoop252-3 (x.x.x.10:50010)  1       In Service      39.34 GB        944.85 MB       3.63 GB 34.79 GB        9       944.85 MB (2.35%)       0       2.5.2
> hadoop252-1 (x.x.x.8:50010)   0       In Service      39.34 GB        944.85 MB       4.94 GB 33.48 GB        9       944.85 MB (2.35%)       0       2.5.2
> hadoop252-2 (x.x.x.9:50010)   1       In Service      39.34 GB        944.85 MB       3.63 GB 34.79 GB        9       944.85 MB (2.35%)       0       2.5.2
> java version "1.7.0_76"
> Java(TM) SE Runtime Environment (build 1.7.0_76-b13)
> Java HotSpot(TM) 64-Bit Server VM (build 24.76-b04, mixed mode)
>            Reporter: Rod
>
> Calling hdfsOpenFile() on a file residing on the target 3-node Hadoop cluster (described in detail in the Environment section) blocks for a long time (several minutes). The delay appears to scale with the size of the target file: opening a file of 852483361 bytes took 121 seconds, while a file of 15458 bytes took less than a second.
> During the delay, the following stack trace is written to standard out:
> 2015-04-16 10:32:13,943 WARN  [main] hdfs.BlockReaderFactory (BlockReaderFactory.java:getRemoteBlockReaderFromTcp(693)) - I/O error constructing remote block reader.
> org.apache.hadoop.net.ConnectTimeoutException: 60000 millis timeout while waiting for channel to be ready for connect. ch : java.nio.channels.SocketChannel[connection-pending remote=/10.40.8.10:50010]
>       at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:533)
>       at org.apache.hadoop.hdfs.DFSClient.newConnectedPeer(DFSClient.java:3101)
>       at org.apache.hadoop.hdfs.BlockReaderFactory.nextTcpPeer(BlockReaderFactory.java:755)
>       at org.apache.hadoop.hdfs.BlockReaderFactory.getRemoteBlockReaderFromTcp(BlockReaderFactory.java:670)
>       at org.apache.hadoop.hdfs.BlockReaderFactory.build(BlockReaderFactory.java:337)
>       at org.apache.hadoop.hdfs.DFSInputStream.blockSeekTo(DFSInputStream.java:576)
>       at org.apache.hadoop.hdfs.DFSInputStream.readWithStrategy(DFSInputStream.java:800)
>       at org.apache.hadoop.hdfs.DFSInputStream.read(DFSInputStream.java:854)
>       at org.apache.hadoop.fs.FSDataInputStream.read(FSDataInputStream.java:143)
> 2015-04-16 10:32:13,946 WARN  [main] hdfs.DFSClient (DFSInputStream.java:blockSeekTo(612)) - Failed to connect to /10.40.8.10:50010 for block, add to deadNodes and continue. org.apache.hadoop.net.ConnectTimeoutException: 60000 millis timeout while waiting for channel to be ready for connect. ch : java.nio.channels.SocketChannel[connection-pending remote=/10.40.8.10:50010]
> org.apache.hadoop.net.ConnectTimeoutException: 60000 millis timeout while waiting for channel to be ready for connect. ch : java.nio.channels.SocketChannel[connection-pending remote=/10.40.8.10:50010]
>       at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:533)
>       at org.apache.hadoop.hdfs.DFSClient.newConnectedPeer(DFSClient.java:3101)
>       at org.apache.hadoop.hdfs.BlockReaderFactory.nextTcpPeer(BlockReaderFactory.java:755)
>       at org.apache.hadoop.hdfs.BlockReaderFactory.getRemoteBlockReaderFromTcp(BlockReaderFactory.java:670)
>       at org.apache.hadoop.hdfs.BlockReaderFactory.build(BlockReaderFactory.java:337)
>       at org.apache.hadoop.hdfs.DFSInputStream.blockSeekTo(DFSInputStream.java:576)
>       at org.apache.hadoop.hdfs.DFSInputStream.readWithStrategy(DFSInputStream.java:800)
>       at org.apache.hadoop.hdfs.DFSInputStream.read(DFSInputStream.java:854)
>       at org.apache.hadoop.fs.FSDataInputStream.read(FSDataInputStream.java:143)
> I have also seen similar delays and the same stack trace when executing dfs CLI commands on those same files (dfs -cat, dfs -tail, etc.).
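
A note on the numbers in the trace: the 60000 millis figure matches the HDFS client's default socket/connect timeout, and a 121-second open is consistent with roughly two consecutive 60-second timeouts against the unreachable DataNode (10.40.8.10:50010) before the client gives up and moves on. As a diagnostic experiment (not a fix for the underlying connectivity problem), the client-side timeout can be lowered in hdfs-site.xml via the standard dfs.client.socket-timeout key so dead nodes are abandoned faster; a sketch, with the 5000 ms value chosen arbitrarily for illustration:

```xml
<!-- hdfs-site.xml on the client machine: shorten the per-DataNode
     socket timeout from the 60000 ms default so connections to an
     unreachable DataNode fail fast. Diagnostic only; if the client
     cannot reach 10.40.8.10:50010, that reachability problem is the
     real issue to fix. -->
<property>
  <name>dfs.client.socket-timeout</name>
  <value>5000</value>
</property>
```

If opens become fast with this setting while the WARNs persist, that points at network reachability from the client to that DataNode's data-transfer port rather than at libhdfs itself.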



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)