gizli wrote: > > Hi guys, > > I have a couple of questions about a recent concurrency test I did on > SA. Someone in our team ran into a very intermittent issue where a > certain operation he tried doing failed by ConcurrentUpdateError. I > wrote a simple thread that can do query/update/insert/commit/rollback > operations on demand and created two of those to create specific race > conditions manually.. I found out some interesting stuff: > > If I use sqlite, two threads cannot operate on the same ORM object. > E.g.: > > t1.update(obj) > t2.update(obj) > t1.commit() > t2.commit() > > t2.update(obj) would fail with an error saying "Database is locked".
sqlite has a coarse grained locking scheme. You can obtain better control over it using the tips at http://www.sqlalchemy.org/trac/wiki/DatabaseNotes#SQLite > > If I use mysql to do the same thing it goes through with no problems. mysql has row locking (if youre using InnoDB tables) so contention occurs at a much finer grained level. > The only time I could reproduce the concurrent update error is like > this: > > t1.update(obj) > t2.delete(obj) > t2.commit() > t1.commit() > > The last line will throw the error saying num updated rows is 0 where > it was expecting 1. My first question is: Can this concurrent update > error arise in any other situation? SQLA's checks for this condition only occur with regards to rows which are expected to exist that no longer exist. The practice is a very minimal form of what is known as "optimistic locking", which you can read about here: http://en.wikipedia.org/wiki/Optimistic_concurrency_control SQLA offers a more robust form of "optimistic locking" if you employ the usage of a version ID column (not too different from the timestamp approach described on that wikipedia page). > I also tried the following: > > t1.delete(obj) > t2.delete(obj) > t2.commit() > t1.commit() > > This also fails with the same error saying the num deleted rows is 0 > when it was expecting 1. My second question is: Should t1's > transaction go through? I would think so. why is that ? what happens if you go to your command line and try to "rm" a file that doesn't exist ? should "rm" silently fail ? > I understand why an update > to a deleted row fails the entire transaction but I do not understand > the motivation for failing an entire transaction because a row I > wanted to delete is already deleted. because your program is assuming that a certain row exists which does not. operating on stale data is never a good idea - any number of changes may have occurred between the time you have loaded the row and the time you have deleted the row. > This behavior is, as far as I can see, embedded in SA logic. So no > matter which database I use, I will see this right? you won't when using DBAPIs that do not offer a reliable count of rows updated or deleted. If you'd like to force the entire thing off, do this: engine = create_engine(<url>) engine.dialect.supports_sane_rowcount = False engine.dialect.supports_sane_multi_rowcount = False > The final question is a design question. In a typical web application > that can have 100's of simultaneous users submitting a handful of > tasks, how do we avoid this kind of concurrency issues? > Each request > would be handled with a separate thread meaning if any of these > threads try modifying the same db object, we have a potential problem. > Do we need exclusive locks for this kind of situation? Does SA allow > us to do that? most "social media" types of web applications involve each user issuing UPDATE/DELETE statements to distinct groups of rows which are "owned" by each user. I.e. you're only updating your own comments, your own messages, your own address books, etc. So an optimistic system works very well for that. If OTOH your system is more like a CMS or document management system where many users may be affecting the same row, you have to put more thought into it. Trac uses a locking scheme kind of like the one in the wikipedia page, using timestamps that indicate "user X has a 'lock' on row A". For example if you try to update a ticket in Trac, and someone else has changed the ticket since you last loaded it, a timestamp check fails and you get a message stating that "the ticket has been changed since you last loaded it". You can also use a "pessimistic" system of locking rows. The "standard" pessimistic system may involve rows are literally locked at the DB level using "SELECT..FOR UPDATE", as well as high levels of transaction isolation (in Postgresql this is SERIAL isolation...I'm not as familiar with how MySQL's works). However, you still have to accommodate contention situations explicitly - user 1 displays record A, user 2 displays record A, user 1 issues a POST to delete it, user 2 issues a POST to delete it as well, user 2's request goes to re-load the row to check for deletion and its no longer there, so the app would have to "smooth over" that situation (saying "already deleted" for example). Another approach which is the most heavyhanded of all uses explicit "lock" controls on the application, so that any user has to "lock" a document before modifying, and everyone can see on their screens "this document locked by user X". A "lock" record is set within the database marking the row as under the management of a specific user (again there's often a timeout so that it expires). This is the kind of thing you see if you work in a law firm or newsroom similar with big document management systems. there's a lot of possibilities and you have to tailor one that is specific to the kind of data and userbase your app deals with. --~--~---------~--~----~------------~-------~--~----~ You received this message because you are subscribed to the Google Groups "sqlalchemy" group. To post to this group, send email to sqlalchemy@googlegroups.com To unsubscribe from this group, send email to sqlalchemy+unsubscr...@googlegroups.com For more options, visit this group at http://groups.google.com/group/sqlalchemy?hl=en -~----------~----~----~----~------~----~------~--~---