I believe sqlalchemy already tracks the session state, though I haven't seen anything (in the docs or code that I've touched) suggesting it knows how to be selective in issuing ROLLBACK. It would be a good feature, though I still think the sql_idle_timeout should be lowered ;)
-Deva On Wed, Jul 11, 2012 at 1:43 PM, Mark Gius <m...@markgius.com> wrote: > I would also love to see these changes applied. > > With regards to the bugs around not issuing a commit or rollback, is it > possible to have sqlachemy track whether or not a transaction starts and > only issue a rollback when a session is handed back with an open > transaction on it? Seems like a useful defensive measure. > > Mark > > On Wed, Jul 11, 2012 at 12:46 PM, Jay Pipes <jaypi...@gmail.com> wrote: > >> +1 to all your ideas below, Devananda. >> >> On 07/11/2012 01:33 PM, Devananda van der Veen wrote: >> > Hi all, >> > >> > I've been taking a look at the way Nova uses its MySQL database. Having >> > done MySQL performance audits for years as a consultant, a few things >> > jumped out right away at me. First is the way that SQLAlchemy is >> > wrapping nearly every query in an unnecessary "ping check" and rollback, >> > eg.: >> > >> > select 1; >> > select ... from ... where ...; >> > rollback; >> > select 1; >> > update ... where ...; >> > commit; >> > rollback; >> > >> > >> > You can find a complete sample here: >> http://paste.openstack.org/show/18731/ >> > >> > I think I understand the reason for both the "select 1" and the >> > "rollback" statements. However, in the interest of performance for >> > large-scale deployments, I feel pretty strongly that they need to go >> away. >> > >> > As I see it, there are three factors here. >> > >> > (I) Most of the code in db/sqlalchemy/api.py abstracts a "unit of work" >> > to a very low level, generally a single database read or write, and does >> > not currently support transactions spanning multiple consistent writes. >> > This is why "select 1" and "rollback" appear around almost every query >> > -- most functions in api.py is checking out a session object, doing one >> > or two queries, and then releasing the session. This actually creates a >> > much larger issue about transactional atomicity for larger operations, >> > such as the open bug about network creation here, and is probably better >> > for another discussion. >> > https://bugs.launchpad.net/nova/+bug/755138. >> > >> > (II) Connections are tested by the MySQLPingListener function, using >> > "SELECT 1" statements, every time a connection / session is checked out >> > of the pool. Since connections are usually OK, this adds overhead >> > unnecessarily. It would be more efficient to handle the errors when >> > connections aren't OK. I've opened a bug with a description of one >> > possible way to fix this issue. There are probably other viable >> > solutions as well. >> > https://bugs.launchpad.net/nova/+bug/1007027 >> > >> > My understanding is that this was implemented to prevent "Database has >> > gone away" errors from occurring every time that the database is >> > restarted, or when connections are closed by the database after being >> > idle for long periods. In my opinion, a better solution to these >> > problems is to: >> > * wrap queries in retry logic, which will catch disconnect errors and >> > attempt to reconnect to the database. I have a patch for this, if folks >> > are interested. >> > * set Nova's sql_idle_timeout to less than MySQL's wait_timeout, and set >> > them both to a reasonably short interval (eg, 1 minute), rather than the >> > default (which I think is 8 hours). >> > >> > (III) Transaction state is reset when connections are returned to the >> > pool, even if no transaction was in progress, or the >> > transaction-in-progress already committed. This is completely wasteful, >> > and easy to disable. >> > https://bugs.launchpad.net/nova/+bug/1007038 >> > >> > Caveat here is that this reset-on-return functionality *is* useful when >> > code doesn't properly clean up its own transaction state. When I turned >> > it off, it exposed a few bugs, which I haven't tracked down yet. >> > Lowering the sql_idle_timeout will provide the same "solution" as the >> > current reset-on-return behavior in that it will prevent long-running >> > idle transactions that tie up resources unnecessarily. >> > >> > >> > In summary, I'd like to see Nova stop spamming the database with "SELECT >> > 1" and "ROLLBACK", and think this should be pretty easy to do. Testing >> > these two changes in my devstack seems to work fine, but I'd like to >> > know what others think before I propose a changeset with this kind of >> > potential impact. >> > >> > Cheers, >> > Devananda >> > >> > >> > _______________________________________________ >> > Mailing list: https://launchpad.net/~openstack >> > Post to : openstack@lists.launchpad.net >> > Unsubscribe : https://launchpad.net/~openstack >> > More help : https://help.launchpad.net/ListHelp >> > >> >> >> _______________________________________________ >> Mailing list: https://launchpad.net/~openstack >> Post to : openstack@lists.launchpad.net >> Unsubscribe : https://launchpad.net/~openstack >> More help : https://help.launchpad.net/ListHelp >> > > > _______________________________________________ > Mailing list: https://launchpad.net/~openstack > Post to : openstack@lists.launchpad.net > Unsubscribe : https://launchpad.net/~openstack > More help : https://help.launchpad.net/ListHelp > >
_______________________________________________ Mailing list: https://launchpad.net/~openstack Post to : openstack@lists.launchpad.net Unsubscribe : https://launchpad.net/~openstack More help : https://help.launchpad.net/ListHelp