On Mon, Jan 5, 2015 at 2:10 PM, Jeremy Evans <[email protected]> wrote:
> Note that just because multiple children have the same file descriptor
> integer for the connection as the parent did before disconnect, does not
> mean that the socket is shared.  Most operating systems will use the lowest
> unused integer for a new file descriptor, so if the parent closes the
> connection, multiple children are forked, and each child opens a new
> connection, the same integer will be used for all children, even though the
> file descriptors themselves are independent.

Oh, good to know. Thank you. So matching descriptor numbers are not
proof of shared file descriptors. I guess I should look into why the
connections suddenly disconnect, instead of how the FDs got shared.
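
A minimal sketch of the descriptor-number reuse described above, assuming
a Unix-like system that hands out the lowest unused integer. The helper
name and the file names are made up for illustration, and it uses plain
files in one process rather than database sockets in forked children:

```ruby
require 'tmpdir'

# Hypothetical helper: open one file, close it, open a different file,
# and report the descriptor number that each open received.
def fd_reuse_demo
  Dir.mktmpdir do |dir|
    first_path  = File.join(dir, 'first.txt')
    second_path = File.join(dir, 'second.txt')
    File.write(first_path, 'first')
    File.write(second_path, 'second')

    first = File.open(first_path)
    first_fd = first.fileno
    first_content = first.read
    first.close                      # frees the descriptor number

    second = File.open(second_path)  # a new, unrelated descriptor
    second_fd = second.fileno
    second_content = second.read
    second.close

    { first_fd: first_fd, second_fd: second_fd,
      first_content: first_content, second_content: second_content }
  end
end

result = fd_reuse_demo
puts "first fd: #{result[:first_fd]}, second fd: #{result[:second_fd]}"
```

The two numbers normally match even though the descriptors refer to
different files, so an identical integer says nothing about sharing.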

> My first guess is somewhere in your app (or in one of your app's
> dependencies), something is calling fork, then calling exit (instead of
> Process.exit!).  Sometimes this is done to get cheap background jobs.  In
> ruby, this will cause all of the sockets in the child process to be
> disconnected.  Since the parent shares the database connection sockets with
> the child, this also disconnects the connection in the parent (this parent
> would be the unicorn worker process, not the unicorn master process).

I was excited when reading this, and I just searched for /\bfork\b/
through all the dependencies. No luck though :( All the calls to fork
were in tests, so I believe they are never called in production.

Guess the lesson here is I should be very careful with fork and at_exit.
I was not aware of this behaviour.
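
A minimal sketch of the difference between exit and Process.exit! in a
forked child, assuming a platform where fork is available. The helper
name is hypothetical, and a pipe stands in for the database socket: the
child's at_exit handler plays the role of the cleanup that would
otherwise touch the shared connection.

```ruby
# Hypothetical helper: fork a child that registers an at_exit handler,
# then leaves via exit or Process.exit!. Returns true if the handler ran.
def child_ran_at_exit?(hard_exit)
  reader, writer = IO.pipe

  pid = fork do
    reader.close
    # Stand-in for cleanup code that would act on an inherited socket.
    at_exit do
      writer.write('cleanup ran')
      writer.close
    end
    if hard_exit
      Process.exit!(0)  # skips at_exit handlers
    else
      exit(0)           # runs at_exit handlers in the child
    end
  end

  writer.close          # parent keeps only the read end
  Process.wait(pid)
  output = reader.read  # empty string if the child wrote nothing
  reader.close
  output == 'cleanup ran'
end

puts "exit  ran cleanup: #{child_ran_at_exit?(false)}"
puts "exit! ran cleanup: #{child_ran_at_exit?(true)}"
```

Handlers registered in the parent before the fork are inherited by the
child as well, which is why a plain exit in a forked child can have side
effects on resources the parent is still using.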

> If that isn't the case, unless you are able to come up with a reliable way
> to reproduce the error, it's going to be pretty hard to debug. As a shot in
> the dark, you could try using pg 0.17.1 instead of pg 0.18.0 and see if that
> has any effect.

I just downgraded to pg 0.17.1, and no luck either :(
Too bad I can't downgrade Rails in order to test this...
It reproduces pretty reliably *in production*: I just restart
the server and the errors show up. I have never managed to
reproduce it locally or on the staging server, though, so I
guess I can only debug this in production. Would it be a good
idea to tell Sequel to reconnect in the meantime, so that at
least people are not getting 500 error pages while I debug?

The other thing I am wondering: we're also using ActiveRecord,
so why didn't we see the same errors there?

> Thanks,
> Jeremy

Thank you!
