On Sunday, January 4, 2015 3:33:58 PM UTC-8, Lin Jen-Shin wrote:
>
> Hi,
>
> This is bizarre, I really can't figure out what's going on...
> We're running Unicorn with preload_app=true, in before_fork we have:
>
> Model.db.disconnect
>
> and the database is connected via:
>
> Model.db = Sequel.connect(DATABASE_CONFIG)
>
> I believe this should be alright, and indeed it's alright before we
> upgraded to Rails 4.1.8. We were running Rails 3.2.21, and
> we never saw this kind of errors. But after upgraded to Rails 4.1.8,
> we almost always see those weird connection errors
> *a few minutes after restarted*. Here's some examples:
>
> PQconsumeInput() SSL SYSCALL error: EOF detected
> PQconsumeInput() SSL error: decryption failed or bad record mac
>
> I don't understand how could this be related to Rails at all?
> I also tried to verify the file descriptors with process id, and it turns
> out,
> well, indeed different processes was sharing the same file descriptor?
> I am seeing the same number reported by different process. The FD
> was recorded with:
>
> Model.db.pool.hold{ |c| @fd = c.socket }
>

Note that multiple children having the same file descriptor integer for the
connection as the parent had before disconnecting does not mean that the
socket is shared. Most operating systems use the lowest unused integer for a
new file descriptor, so if the parent closes the connection, multiple
children are forked, and each child opens a new connection, the same integer
will be used in all children, even though the file descriptors themselves
are independent.

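
To illustrate the point about descriptor numbers, here is a minimal Ruby
sketch. It does not use Sequel or pg: File::NULL stands in for a database
connection, and the names probe, parent_fd and child_fds are made up for the
example. It assumes a Unix-like system where fork is available.

```ruby
# File::NULL stands in for a database connection (assumption for this sketch).
# Create the reporting pipes first so they do not affect the numbering below.
pipes = 2.times.map { IO.pipe }

# The parent opens a descriptor and closes it again, like a pool disconnect.
probe = File.open(File::NULL)
parent_fd = probe.fileno
probe.close

# Each child opens its own, independent descriptor and reports its number.
pids = pipes.map do |_reader, writer|
  fork do
    fresh = File.open(File::NULL)
    writer.puts(fresh.fileno)
    writer.close
    exit!(0) # leave the child without running at_exit handlers
  end
end
pids.each { |pid| Process.wait(pid) }

# Collect the number each child reported.
child_fds = pipes.map do |reader, writer|
  writer.close
  fd = reader.gets.to_i
  reader.close
  fd
end

puts "parent had fd #{parent_fd}, children got #{child_fds.inspect}"
```

On a typical Linux or macOS system both children should report the same
number, usually the one the parent just released, even though each child
opened the file separately after the fork.
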
> However, *I cannot reproduce this on local*, and on production, it looks
> like happening like random within the first few minutes after restarted.
> After awhile, this would no longer happen. And it's not always showing
> the same FD for different processes, only occasionally.
>
> I also verified that the master process didn't hold any connection by
> checking:
>
> Model.db.pool.size
>
> It's always 0 on master and it's also 0 after_fork. I am running out of
> ideas... Any hints would be appreciated.
>
> ruby 2.1.5
> sequel 4.18.0 (doesn't seem to matter)
> rails 4.1.8 (not sure how this would be related)
> unicorn 4.8.3 (don't think this matters, either)
> pg 0.18.0 (could be this??)
>

My first guess is that somewhere in your app (or in one of your app's
dependencies), something is calling fork, then calling exit (instead of
Process.exit!). Sometimes this is done to get cheap background jobs. In
ruby, exit runs at_exit handlers and finalizers (Process.exit! skips them),
and this will cause all of the sockets in the child process to be
disconnected. Since the parent shares the database connection sockets with
the child, this also disconnects the connection in the parent (this parent
would be the unicorn worker process, not the unicorn master process).

If that isn't the case, unless you are able to come up with a reliable way
to reproduce the error, it's going to be pretty hard to debug. As a shot in
the dark, you could try using pg 0.17.1 instead of pg 0.18.0 and see if
that has any effect.

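
For reference, the difference between exit and exit! in a forked child can
be sketched as below. This is only an illustration, not code from your app:
the at_exit block is a hypothetical stand-in for whatever cleanup a library
registers before the fork, and soft_pid, hard_pid and ran_cleanup are names
made up for the example. It assumes a Unix-like system where fork is
available.

```ruby
parent_pid = Process.pid
reader, writer = IO.pipe

# Hypothetical stand-in for cleanup registered before forking. A forked child
# inherits this handler; here it only reports which child ran it.
at_exit do
  writer.puts(Process.pid) if Process.pid != parent_pid
end

soft_pid = fork { exit(0) }   # runs inherited at_exit handlers
hard_pid = fork { exit!(0) }  # skips at_exit handlers and finalizers
[soft_pid, hard_pid].each { |pid| Process.wait(pid) }

# Read back the pids of the children that ran the cleanup handler.
writer.close
ran_cleanup = reader.read.split.map(&:to_i)
reader.close

puts "cleanup ran in: #{ran_cleanup.inspect} (exit child was #{soft_pid})"
```

Only the child that called exit should show up in ran_cleanup. If the
inherited cleanup touched a socket shared with the parent, that is where the
parent's connection would get broken.
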
Thanks,
Jeremy