+1 to fix Oslo's service module any ways, irrespective of this bug. +1 to "The db library shouldn't be concerned with whether or not it's in a forked process -- that's not its job"
-- dims On Fri, Feb 20, 2015 at 10:17 AM, Doug Hellmann <d...@doughellmann.com> wrote: > > > On Thu, Feb 19, 2015, at 09:45 PM, Mike Bayer wrote: >> >> >> Doug Hellmann <d...@doughellmann.com> wrote: >> >> >> 5) Allow this sort of connection sharing to continue for a deprecation >> >> period with apppropriate logging, then make it a hard failure. >> >> >> >> This would provide services time to find and fix any sharing problems >> >> they might have, but would delay the timeframe for a final fix. >> >> >> >> 6-ish) Fix oslo-incubator service.py to close all file descriptors after >> >> forking. >> >> >> > >> > I'm not sure why 6 is "slower", can someone elaborate on that? >> >> So, option “A”, they call engine.dispose() the moment they’re in a fork, >> the activity upon requesting a connection from the pool is: look in pool, >> no connections present, create a connection and return it. > > This feels like something we could do in the service manager base class, > maybe by adding a "post fork" hook or something. > > Josh's patch to forcibly close all file descriptors may be something > else we want, but if we can reset open connections "cleanly" when we > know how, that feels better than relying on detecting broken sockets. > >> >> Option “5”, the way the patch is right now to auto-invalidate on >> detection of new fork, the activity upon requesting a connection is from >> the pool is: look in pool, connection present, check that os.pid() >> matches what we’ve associated with the connection record, if not, we >> raise an exception indicating “invalid”, this is immediately caught, sets >> the connection record as “invalid”, the connection record them >> immediately disposes that file descriptor, makes a new connection and >> returns that. >> >> Option “6”, the new fork starts, the activity upon requesting a >> connection from the pool is: look in pool, connection present, perform >> the oslo.db “ping” event, ping event emits “SELECT 1” to the MySQLdb >> driver, driver attempts to emit this statement on the socket, socket >> communication fails, MySQLdb converts to an exception, exception is >> raised, SQLAlchemy catches the exception, sends it to a parser to >> determine the nature of the exception, we see that it’s a “disconnect” >> exception, we set the “invalidate” flag on the exception, we re-raise, >> oslo.db’s exc_filters then catch the exception, more string parsers get >> involved, we determine we need to raise an oslo.db.DBDisconnect >> exception, we raise that, the “SELECT 1” ping handler catches that, we >> then emit “SELECT 1” again so that it reconnects, we then hit the >> connection record that’s in “invalid” state so it knows to reconnect, it >> reconnects and the “SELECT 1” continues on the new connection and we >> start up. >> >> So essentially option “5” (the way the gerrit is right now) has a subset >> of the components of “6”; “6” has the additional steps of: emit a doomed >> statement on the closed socket, then when it fails raise / catch / parse >> / reraise / catch / parse / reraise that exception. Option “5” just >> has, check the pid, raise / catch an exception. >> >> IMO the two options are: “5”, check the pid and recover or “3” make it a >> hard failure. > > And I don't think we want the database library doing anything with this > case at all. Recovery code is tricky, and often prevents valid use cases > (perhaps the parent *meant* for the child to reuse the open connection > and isn't going to continue using it so there won't be a conflict). > > The bug here is in the way the application, using Oslo's service module, > is forking. We should fix the service module to make it possible to fork > correctly, and to have that be the default behavior. The db library > shouldn't be concerned with whether or not it's in a forked process -- > that's not its job. > > Doug > >> >> __________________________________________________________________________ >> OpenStack Development Mailing List (not for usage questions) >> Unsubscribe: >> openstack-dev-requ...@lists.openstack.org?subject:unsubscribe >> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev > > __________________________________________________________________________ > OpenStack Development Mailing List (not for usage questions) > Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe > http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev -- Davanum Srinivas :: https://twitter.com/dims __________________________________________________________________________ OpenStack Development Mailing List (not for usage questions) Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev