[ https://issues.apache.org/jira/browse/HBASE-16212?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15383158#comment-15383158 ]
stack commented on HBASE-16212:
-------------------------------

Do you think we'll connect to the DN less frequently with this patch in place? Thanks.

> Many connections to datanode are created when doing a large scan
> -----------------------------------------------------------------
>
>                 Key: HBASE-16212
>                 URL: https://issues.apache.org/jira/browse/HBASE-16212
>             Project: HBase
>          Issue Type: Improvement
>    Affects Versions: 1.1.2
>            Reporter: Zhihua Deng
>         Attachments: HBASE-16212.patch, HBASE-16212.v2.patch,
>                      regionserver-dfsinputstream.log
>
>
> As described in https://issues.apache.org/jira/browse/HDFS-8659, the datanode
> is suffering from logging the same message repeatedly. After adding logging
> to DFSInputStream, it outputs the following:
> 2016-07-10 21:31:42,147 INFO
> [B.defaultRpcServer.handler=22,queue=1,port=16020] hdfs.DFSClient:
> DFSClient_NONMAPREDUCE_1984924661_1 seek
> DatanodeInfoWithStorage[10.130.1.29:50010,DS-086bc494-d862-470c-86e8-9cb7929985c6,DISK]
> for BP-360285305-10.130.1.11-1444619256876:blk_1109360829_35627143. pos:
> 111506876, targetPos: 111506843
> ...
> Because the pos of this input stream is larger than targetPos (the position
> being sought), a new connection to the datanode is created and the older one
> is closed as a consequence. When the number of these wrong seek operations is
> large, the datanode's block scanner info message spams the logs, and many
> connections to the same datanode are created.
> hadoop version: 2.7.1
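
The reconnect behavior in the description can be pictured with a toy model. This is only a sketch under stated assumptions: the class SeekReconnectSketch and its methods are hypothetical and are not the real DFSInputStream API, and it ignores any limit the real client may place on forward skips. It only shows why a seek with pos > targetPos costs a new connection while a forward seek does not.

```java
/** Toy model of a positioned block reader; NOT the real DFSInputStream. */
public class SeekReconnectSketch {
    private long pos;          // current offset in the open stream
    private int connections;   // how many times we "connected" to the datanode
    private boolean connected; // whether a connection is currently open

    /** Move to targetPos, reconnecting only when skipping forward is impossible. */
    public void seek(long targetPos) {
        if (connected && targetPos >= pos) {
            // Forward seek: the existing connection can skip ahead.
            pos = targetPos;
            return;
        }
        // Backward seek (pos > targetPos) or no connection yet:
        // the old connection is dropped and a new one is opened at targetPos.
        connected = true;
        connections++;
        pos = targetPos;
    }

    /** Simulate reading len bytes, which advances pos. */
    public void read(int len) {
        pos += len;
    }

    public int connections() {
        return connections;
    }

    public static void main(String[] args) {
        SeekReconnectSketch in = new SeekReconnectSketch();
        in.seek(111506843L); // first connection
        in.read(33);         // pos is now 111506876, as in the log line
        in.seek(111506843L); // pos > targetPos: reconnect
        System.out.println("connections = " + in.connections());
    }
}
```

Running main prints "connections = 2": the second seek targets an offset 33 bytes behind the current position, so the model opens a second connection, which mirrors the pos/targetPos pair in the log above.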