Avi Zrachya created HBASE-9393:
----------------------------------

             Summary: HBase does not close a closed socket, resulting in many CLOSE_WAIT
                 Key: HBASE-9393
                 URL: https://issues.apache.org/jira/browse/HBASE-9393
             Project: HBase
          Issue Type: Bug
    Affects Versions: 0.94.2
         Environment: CentOS 6.4 - 7 regionservers/datanodes, 8 TB per node, 7279 regions
            Reporter: Avi Zrachya

HBase does not close dead connections to the datanode. This results in over 60K sockets in CLOSE_WAIT, and at some point HBase cannot connect to the datanode because too many sockets are mapped from one host to another on the same port.

The example below shows a low CLOSE_WAIT count because we had to restart HBase to solve the problem; over time the count increases to 60-100K sockets in CLOSE_WAIT.

[root@hd2-region3 ~]# netstat -nap |grep CLOSE_WAIT |grep 21592 |wc -l
13156
[root@hd2-region3 ~]# ps -ef |grep 21592
root     17255 17219  0 12:26 pts/0    00:00:00 grep 21592
hbase    21592     1 17 Aug29 ?        03:29:06 /usr/java/jdk1.6.0_26/bin/java -XX:OnOutOfMemoryError=kill -9 %p -Xmx8000m -ea -XX:+UseConcMarkSweepGC -XX:+CMSIncrementalMode -Dhbase.log.dir=/var/log/hbase -Dhbase.log.file=hbase-hbase-regionserver-hd2-region3.swnet.corp.log ...
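
To see which remote endpoint the CLOSE_WAIT sockets accumulate against, the netstat output can be grouped by foreign address instead of only counted. The following is a minimal sketch in Python, not part of the original report. It assumes the usual Linux net-tools column layout for `netstat -nap` TCP rows (proto, recv-q, send-q, local address, foreign address, state, PID/program). The function name `close_wait_by_remote` and the sample addresses and ports are hypothetical and were not taken from the affected cluster.

```python
from collections import Counter


def close_wait_by_remote(netstat_output, pid):
    """Count CLOSE_WAIT sockets owned by `pid`, grouped by remote address:port."""
    counts = Counter()
    for line in netstat_output.splitlines():
        fields = line.split()
        # Assumed TCP row layout:
        # proto recv-q send-q local-addr foreign-addr state pid/program
        if len(fields) < 7 or not fields[0].startswith("tcp"):
            continue
        state, owner = fields[5], fields[6]
        if state == "CLOSE_WAIT" and owner.split("/")[0] == str(pid):
            counts[fields[4]] += 1
    return counts


# Hypothetical sample rows for illustration only (made-up addresses and ports).
SAMPLE = """\
tcp        0      0 10.0.0.3:41000    10.0.0.5:50010    CLOSE_WAIT  21592/java
tcp        1      0 10.0.0.3:41001    10.0.0.5:50010    CLOSE_WAIT  21592/java
tcp        0      0 10.0.0.3:41002    10.0.0.6:50010    ESTABLISHED 21592/java
tcp        0      0 10.0.0.3:41003    10.0.0.5:50010    CLOSE_WAIT  999/other
"""

if __name__ == "__main__":
    # Print the remote endpoints with the most CLOSE_WAIT sockets first.
    for remote, n in close_wait_by_remote(SAMPLE, 21592).most_common():
        print(remote, n)
```

On a live host, the same function could be fed the captured text of `netstat -nap` in place of `SAMPLE`. If nearly all entries point at a single datanode address and port, that would be consistent with the behavior described above.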