[ https://issues.apache.org/jira/browse/HBASE-16212?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15383514#comment-15383514 ]

Zhihua Deng commented on HBASE-16212:
-------------------------------------

Thanks [~stack]. From the logging, it appears that different threads share the 
same DFSInputStream instance, for example 'defaultRpcServer.handler=7' 
(handler7) and 'defaultRpcServer.handler=4' (handler4). The first thread 
prefetches the next block header and caches it in per-thread state. When 
handler4 comes along, it first checks whether the cached header offset equals 
the block's starting offset; unfortunately the two numbers are unequal 
(-1 != offset). Handler4 knows nothing about the block header, even though the 
header has already been prefetched by handler7. Handler4 therefore has to seek 
the input stream back to the block's starting offset to obtain the header, but 
the stream has already been read past that point by 33 bytes (the header size). 
So a new connection to the datanode is created and the older one is closed. 
When the datanode then writes to the closed channel, a socket exception is 
raised. When this happens frequently, the datanode is flooded with the log 
message described in the issue.
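To make the mechanism concrete, here is a minimal sketch (not the actual HDFS source; the class, fields, and constants are hypothetical) of the seek-back behavior: one handler's header prefetch advances the shared stream position 33 bytes past the block start, so the next handler's seek to the block start finds pos > targetPos and must drop the connection and open a new one.

```java
// Hypothetical illustration of the shared-stream seek-back problem.
// Not HDFS code: SeekBackSketch, pos, reconnects, and HEADER_SIZE are
// names invented for this sketch.
public class SeekBackSketch {
    static final int HEADER_SIZE = 33;  // header size over-read, per the log

    long pos;        // current position of the shared input stream
    int reconnects;  // count of new datanode connections forced by seek-back

    void seek(long targetPos) {
        if (pos > targetPos) {
            // Stream was already read past the target (e.g. by another
            // handler's header prefetch): the existing datanode connection
            // cannot rewind, so it is closed and a new one is created.
            reconnects++;
        }
        pos = targetPos;
    }

    public static void main(String[] args) {
        SeekBackSketch in = new SeekBackSketch();
        long blockStart = 111506843L;      // targetPos from the log above
        in.pos = blockStart + HEADER_SIZE; // handler7 prefetched the header (pos 111506876)
        in.seek(blockStart);               // handler4 seeks back to the block start
        System.out.println(in.reconnects); // one extra datanode connection per occurrence
    }
}
```

In the real DFSInputStream the closed connection is what later causes the datanode-side socket exception when it writes to the dead channel.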

> Many connections to datanode are created when doing a large scan 
> -----------------------------------------------------------------
>
>                 Key: HBASE-16212
>                 URL: https://issues.apache.org/jira/browse/HBASE-16212
>             Project: HBase
>          Issue Type: Improvement
>    Affects Versions: 1.1.2
>            Reporter: Zhihua Deng
>         Attachments: HBASE-16212.patch, HBASE-16212.v2.patch, 
> regionserver-dfsinputstream.log
>
>
> As described in https://issues.apache.org/jira/browse/HDFS-8659, the datanode 
> logs the same message repeatedly. After adding logging to DFSInputStream, the 
> output looks like this:
> 2016-07-10 21:31:42,147 INFO  
> [B.defaultRpcServer.handler=22,queue=1,port=16020] hdfs.DFSClient: 
> DFSClient_NONMAPREDUCE_1984924661_1 seek 
> DatanodeInfoWithStorage[10.130.1.29:50010,DS-086bc494-d862-470c-86e8-9cb7929985c6,DISK]
>  for BP-360285305-10.130.1.11-1444619256876:blk_1109360829_35627143. pos: 
> 111506876, targetPos: 111506843
>  ...
> As the pos of this input stream is larger than targetPos (the position being 
> sought to), a new connection to the datanode is created and the older one is 
> closed as a consequence. When such wrong seek operations are frequent, the 
> datanode's block scanner info message spams the logs, and many connections to 
> the same datanode are created.
> hadoop version: 2.7.1



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)