On Thu, Apr 17, 2014 at 6:51 AM, Hansi Klose <hansi.kl...@web.de> wrote:
> Hi,
>
> we use a script to take snapshots on a regular basis and delete old ones.
>
> We noticed that the web interface of the hbase master was not working
> any more because of too many open files.
>
> The master reaches its open file limit of 32768.
>
> When I ran lsof I saw that there were a lot of TCP CLOSE_WAIT handles open
> with the regionserver as target.
>
> On the regionserver there is just one connection to the hbase master.
>
> I can see that the count of CLOSE_WAIT handles grows each time
> I take a snapshot. When I delete one, nothing changes.
> Each time I take a snapshot there are 20 - 30 new CLOSE_WAIT handles.
>
> Why does the master not close the handles? Is there a parameter
> with a timeout we can use?
>
> We use hbase 0.94.2-cdh4.2.0.
>

Does
https://issues.apache.org/jira/browse/HBASE-9393?jql=text%20~%20%22CLOSE_WAIT%22
help? In particular, what happens if you up the socket cache as suggested at
the end of the issue?

St.Ack
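To confirm which peer the leaked handles point at before and after a snapshot, it may help to tally the CLOSE_WAIT sockets by remote endpoint. The following is a minimal sketch, not something from this thread: it assumes the usual `lsof -nP -iTCP` layout in which the last two columns are `local->remote` and `(STATE)`, and the names `count_close_wait` and `close_wait_for_pid` are hypothetical helpers.

```python
import subprocess
from collections import Counter


def count_close_wait(lsof_output: str) -> Counter:
    """Count CLOSE_WAIT sockets per remote host:port in lsof text output.

    Assumes each TCP row ends with 'local->remote (STATE)'.
    """
    counts = Counter()
    for line in lsof_output.splitlines():
        fields = line.split()
        # Skip the header, blank lines, and sockets in any other state.
        if len(fields) < 2 or fields[-1] != "(CLOSE_WAIT)":
            continue
        name = fields[-2]
        if "->" not in name:
            continue
        remote = name.split("->", 1)[1]
        counts[remote] += 1
    return counts


def close_wait_for_pid(pid: int) -> Counter:
    """Run lsof for one process (e.g. the master) and tally its CLOSE_WAIT sockets."""
    # -n/-P: no host or port name lookups; -a: AND the -p and -i filters.
    result = subprocess.run(
        ["lsof", "-nP", "-a", "-p", str(pid), "-iTCP"],
        capture_output=True,
        text=True,
    )
    return count_close_wait(result.stdout)
```

Calling `close_wait_for_pid(<master pid>)` before and after taking a snapshot and comparing the two counters would show which remote port the new handles belong to, which in turn indicates whether the socket cache discussed in the issue is the likely source.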