On Jul 8, 2013, at 20:34, Jamie Lennox <jlen...@redhat.com> wrote: > On Mon, 2013-07-08 at 21:55 -0400, Adam Young wrote: >> >> Tokens are, for the most part, immutable. Once they are written, they >> don't change except if they get revoked. This is a fairly rare >> occurance, but it does happen. >> >> Deleting tokens based on age should be fairly straight forward, and >> locks should not need to be held for a significant amount of time. >> >> My guess, however, is that the problem is SQL Alchemy: >> >> query = session.query(TokenModel) >> query = query.filter(TokenModel.expires < timeutils.utcnow()) >> query.delete(synchronize_session=False) >> >> If it is doing a fetch and then the delete, then the rows would be >> held for a short period of time. >> >> Direct SQL might be a better approach: prepare a statement: >> >> "delete from token where expires < $1" > > Sqlalchemy already generates this statement. > >> and then bind and execute in one command. >> >> However, it seems to me that the conflict detection is the problem. I >> don't know if there is a way to state "ignore any future queries that >> would match this criteria." It does seem to me that even doing this >> degree of conflict detection is somewhat violating the principal of >> Isolation. >> >> There might be an approach using table partitioning as well, where you >> only write to partition one, and delete from partition 2, and then >> swap. >> >> >> >> >> >> On 07/08/2013 09:13 PM, Robert Collins wrote: >> >>> On 9 July 2013 12:32, Adam Young <ayo...@redhat.com> wrote: >>> >>> * I am asking about MySQL.. presumably a "real" >>> database. >>> I have to admit I am a bit of a Postgresql Bigot. I don't >>> really consider MySQL a real database, althought it has >>> improved a lot over the years. I am not up to speed >>> on"InnoDB's gap locking behavior" but it is not something I >>> would expect to be a problem in Postgresql. >>> >>> PostgreSQL has similar but different characteristics, particular the >>> latest iteration of isolation behaviour where locks are held on *the >>> result of a query*, not on 'specific rows returned' - the difference >>> being that adding a new row that matches the query for rows to >>> delete, would encounter a conflict. You also need to delete small >>> numbers of rows at a time, though the reason in the plumbing is >>> different. There are some nasty interlocks you can cause with very >>> large deletes and autovacuum too - if you trigger deadlock detection >>> it still takes /minutes/ to detect and cleanup, whereas we want >>> sub-second liveness. >>> >>> once every second would be strange indeed. I would think >>> maybe once every five minutes or so. Schedule your clean up >>> IAW your deployment and usage. >>> >>> 5m intervals exacerbate the issue until it's solved. If the cleanup >>> deletes no more than (say) 1000 rows per iteration, it could run >>> every 5 minutes but when run keep going until the db is cleaned. >>> >>> Deleting a chunk of tokens in bulk would be preferable to >>> doing client side iteration, I can;t see how that would not >>> be the case. >>> >>> right, so I think Clint prefers that too, the question is how to get >>> sqlalchemy to output the appropriate sql for postgresql and mysql, >>> which is different. > > I'm not experienced with large databases but i wrote the token_flush so > i'm interested. What am i missing that we can't just add a --limit > parameter to the command line tool so "keystone-manage token_flush > --limit=1000" which is (as was mentioned) deleting 1000 in a command. > > SA will be able to generate a nested query like: > "delete from token where id in (select id from token limit 1000)" > for all databases. What sort of hit is something nested like that? Then > you tweak the limit and the frequency but i would suggest 1000 every 5 > minutes should result in a net negative token count and wouldn't come > close to the locks. >
That is exactly what I tried. MySQL doesn't allow LIMIT on sub queries which feed into the IN() construct. :-( MySQL does have an SQL extension which allows LIMIT in the DELETE statement, but this presents the challenge I originally was asking about. _______________________________________________ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev