Hi Anthony,

>- What OS for client and server?
Server: CentOS release 6.10
Client: Windows 10 and CentOS 6.*

Both durable and non-durable clients connect to the cache. Durable
clients connect from Windows 10, and non-durable clients connect from
backend CentOS servers. The server cache has no custom networking
code or libraries; it is purely a Geode cache.

>- What is the scenario?  Is it “normal” operation or is the client or
>server killed?

The server is not killed and runs continuously. Clients are also
normally connected, but they may lose their connections and reconnect.
The issue is definitely caused by client-side connections to the
server, but I have not been able to isolate any specific exception in
the Geode log.

>- Does netstat give you any additional information about the sockets?  Are any 
>in TIME_WAIT status?

I ran "netstat -altnup"; all TCP connections for the server process
are in either the LISTEN or ESTABLISHED state.
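
In case it is useful for comparing snapshots, here is a rough sketch of how the states could be tallied from saved netstat output. It assumes Python 3 and the usual Linux column layout of "netstat -altnup" (proto, recv-q, send-q, local, foreign, state, pid/program); the function name is my own and not part of any tool.

```python
from collections import Counter


def tcp_states_for_pid(netstat_output: str, pid: int) -> Counter:
    """Tally TCP connection states for one process.

    Assumes the Linux "netstat -altnup" layout, where a TCP row has
    seven or more whitespace-separated fields and the seventh starts
    with "<pid>/". Header, UDP, and other rows are skipped.
    """
    states = Counter()
    prefix = f"{pid}/"
    for line in netstat_output.splitlines():
        fields = line.split()
        if (
            len(fields) >= 7
            and fields[0].startswith("tcp")
            and fields[6].startswith(prefix)
        ):
            states[fields[5]] += 1
    return states
```

For example, saving the output with "netstat -altnup > snapshot.txt" and calling tcp_states_for_pid(open("snapshot.txt").read(), 344) would give a per-state count for the server process.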

>- Do you have a reproducible test case?

Unfortunately, no. I'm trying to work out from the Geode logs what
could be causing the issue.

>- Do you have a tcpdump of the socket?

No, but I'll try it. I'm not sure it will help, though. What should I
look for? Unfortunately, I don't know which connection causes the
problem.

>- Are you seeing the sockets clean up over time or do they persist until a 
>reboot?
They seem to persist until the JVM is restarted. Forcing a GC has no
effect on their count, which slowly creeps up.
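
To put numbers on "slowly creeps up", the count could be sampled periodically with something like the sketch below. It assumes Python 3.7+ and lsof on the server's PATH; the function names, the 60-second interval, and the sample count are my own choices for illustration, not anything from Geode.

```python
import subprocess
import time

# Text lsof prints for sockets it cannot match to a protocol.
MARKER = "can't identify protocol"


def count_unidentified(lsof_output: str) -> int:
    """Count lsof output lines that carry the marker text."""
    return sum(1 for line in lsof_output.splitlines() if MARKER in line)


def sample(pid: int) -> int:
    """Run "lsof -p <pid>" once and count the flagged sockets."""
    result = subprocess.run(
        ["lsof", "-p", str(pid)],
        capture_output=True,
        text=True,
        check=False,  # lsof may exit non-zero while still printing rows
    )
    return count_unidentified(result.stdout)


def monitor(pid: int, interval_s: float = 60.0, samples: int = 10) -> None:
    """Print a timestamped count every interval_s seconds."""
    for _ in range(samples):
        print(time.strftime("%Y-%m-%d %H:%M:%S"), sample(pid), flush=True)
        time.sleep(interval_s)
```

Calling monitor(344) would print one timestamped count per minute for the server process, which could then be lined up against client disconnect and reconnect messages in the Geode log.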

On Tue, Nov 16, 2021 at 12:44 PM Anthony Baker <bak...@vmware.com> wrote:
>
> Hi, thanks for this report.  Some questions to help us help you—
>
> - What OS for client and server?
> - Are you seeing the sockets clean up over time or do they persist until a 
> reboot?
> - Does netstat give you any additional information about the sockets?  Are 
> any in TIME_WAIT status?
> - Do you have a tcpdump of the socket?
> - What is the scenario?  Is it “normal” operation or is the client or server 
> killed?
> - Do you have a reproducible test case?
>
> Thanks,
> Anthony
>
>
>
>
> > On Nov 16, 2021, at 9:28 AM, Leon Finker <leon...@gmail.com> wrote:
> >
> > Hi,
> >
> > We observe in our geode (1.14 - same before as well in 1.13) cache
> > server (that supports durable client sessions) an increase in half
> > opened sockets. It seems there is a socket leak. Could someone
> > recommend how to track the leak down? It's not obvious where it's
> > leaking...I can only suspect AcceptorImpl.run and where it only
> > handles IOException. but I wasn't able to reproduce it in debugger
> > yet...
> >
> > lsof -p 344|grep "can't"
> >
> > java 344   user  133u  sock                0,6       0t0 115956017
> > can't identify protocol
> > java 344   user  142u  sock                0,6       0t0 113361870
> > can't identify protocol
> > java 344   user  143u  sock                0,6       0t0 111979650
> > can't identify protocol
> > java 344   user  156u  sock                0,6       0t0 117202529
> > can't identify protocol
> > java 344   user  178u  sock                0,6       0t0 113357568
> > can't identify protocol
> > ...
> >
> > lsof -p 344|grep "can't"|wc -l
> > 934
> >
> > Thank you
>
