subashkc1 <[email protected]> wrote:
> Hi Eric,
> 
> Here is the trace from accept4 upto SIGKILL
> 
> 46585 16:32:39 accept4(11, {sa_family=AF_INET, sin_port=htons(59496), 
> sin_addr=inet_addr("127.0.0.1")}, [128 => 16], SOCK_CLOEXEC) = 9
> 46585 16:32:39 recvfrom(9, 0x56012fa88ca0, 16384, MSG_DONTWAIT, NULL, NULL) = 
> -1 EAGAIN (Resource temporarily unavailable)
> 46585 16:32:39 getpid()                 = 46585
> 46585 16:32:39 ppoll([{fd=9, events=POLLIN}], 1, NULL, NULL, 8 <unfinished 
> ...>
> 46601 16:33:38 <... ppoll resumed>)     = 0 (Timeout)
> 46601 16:33:38 getpid()                 = 46585
> 46601 16:33:38 read(3, 0x7fc1327506a0, 8) = -1 EAGAIN (Resource temporarily 
> unavailable)
> 46601 16:33:38 getpid()                 = 46585
> 46601 16:33:38 sched_yield()            = 0
> 46601 16:33:38 ppoll([{fd=3, events=POLLIN}], 1, {tv_sec=60, tv_nsec=0}, 
> NULL, 8 <unfinished ...>) = ?
> 46586 16:33:40 <... futex resumed>)     = ?
> 46585 16:33:40 <... ppoll resumed> <unfinished ...>) = ?
> 46601 16:33:40 +++ killed by SIGKILL +++
> 46586 16:33:40 +++ killed by SIGKILL +++
> 46585 16:33:40 +++ killed by SIGKILL +++

OK, thread ID 46585 looks like the worker thread, and
46601 looks like the Ruby timer thread.

FD=9 is the client socket, so it looks like you have a client
that's opening a connection and not sending anything: recvfrom()
fails with EAGAIN and the worker then sits in ppoll() on FD=9
until the SIGKILL arrives.  You'd need to track down why you
have a client opening a connection like that.

You can use `ss -tp' to dump the TCP sockets and the processes
using them.  That lets you find the process on the other end of
the accepted socket, which is the culprit client.

In the trace above, you'd look for the one which pairs with
127.0.0.1:59496 (based on what I see in the accept4 line);
but the port changes with every connection.

So something like: `ss -tp | grep 127.0.0.1:59496' should show
both the unicorn worker and the local client process.
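
If you'd rather script that lookup than eyeball the grep output,
here is a rough Python sketch.  It assumes the usual `ss -tnp'
column layout (State, Recv-Q, Send-Q, local address, peer address,
process info); the helper names find_endpoint() and run_ss() are
made up for illustration, and you may need root for ss to show
processes owned by other users.

```python
import subprocess


def find_endpoint(ss_output, endpoint):
    """Return (local, peer, process) for every socket touching endpoint.

    Assumes `ss -tnp` style lines:
    State Recv-Q Send-Q Local:Port Peer:Port [Process]
    """
    matches = []
    for line in ss_output.splitlines():
        fields = line.split()
        if len(fields) < 5:
            continue  # blank or malformed line
        local, peer = fields[3], fields[4]
        # Exact match on either side; the header line never matches.
        if endpoint in (local, peer):
            process = " ".join(fields[5:])  # empty if ss could not see the owner
            matches.append((local, peer, process))
    return matches


def run_ss():
    """Run `ss -tnp` (-t TCP, -n numeric ports, -p owning process)."""
    result = subprocess.run(
        ["ss", "-tnp"], capture_output=True, text=True, check=True
    )
    return result.stdout
```

Calling find_endpoint(run_ss(), "127.0.0.1:59496") while the
connection is still open should return one entry for each end of
the connection: the unicorn worker and the local client.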

> I've uploaded the whole strace dump into codeshare if you need
> the whole file, https://codeshare.io/3AdLy6.

It's not viewable to me since it requires JS; but shouldn't be
needed.  The 16:32:xx => 16:33:xx range was enough.
