I'm seeing errors on a fairly high-traffic site which I believe are due to a
race condition in QuerySet.get_or_create (in django.db.models.query). Our
production servers are running Django 1.2.5, but I don't see any changes to
that code in trunk that would affect this. (I'm happy to construct a test
case against trunk, but I'm posting here first in case this is already a
recognized bug, or an error on my part.)

If two requests make the same call to get_or_create() at roughly the same
time, against a database server using the REPEATABLE READ isolation level,
then I believe the following sequence of events can occur:

1. Process 1 enters into a transaction as part of the default view
middleware.
2. Process 1 calls QuerySet.get(**lookup), no result is returned.
(DoesNotExist is raised)
----
3. Process 2 enters into a transaction as part of the default view
middleware.
4. Process 2 calls QuerySet.get(**lookup), no result is
returned. (DoesNotExist is raised)
5. Process 2 calls transaction.savepoint
6. Process 2 saves a new object
7. Process 2 commits and returns the object
----
8. Process 1 calls transaction.savepoint
9. Process 1 tries to save a new object; this blocks on a lock before step 7,
above, and fails with an IntegrityError after step 7 commits
10. Process 1 rolls back to the savepoint, *but does not leave the outer
transaction*
11. Process 1 calls QuerySet.get(**lookup), again, *but because we're still
in the outer transaction, this returns nothing*
12. Process 1 raises an IntegrityError, rather than returning the new object.
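
The steps above can be replayed against a toy in-memory model. This is only
a sketch under stated assumptions: ToyDatabase, ToyTransaction and replay are
hypothetical names, not Django or database APIs. The model assumes reads see
a snapshot frozen at the transaction's first read, and that the unique check
on insert is made against the latest committed rows. Locking and savepoints
are not modelled, so the blocking in step 9 is skipped.

```python
class DoesNotExist(Exception):
    """Stand-in for a model's DoesNotExist."""


class IntegrityError(Exception):
    """Stand-in for the database's unique-constraint error."""


class ToyDatabase:
    """Holds committed keys only; every key is treated as unique."""

    def __init__(self):
        self.committed = set()


class ToyTransaction:
    """Toy REPEATABLE READ: the snapshot is frozen at the first read."""

    def __init__(self, db):
        self.db = db
        self.snapshot = None
        self.pending = set()

    def get(self, key):
        if self.snapshot is None:
            # Assumption: the read view is established by the first read.
            self.snapshot = set(self.db.committed)
        if key in self.snapshot or key in self.pending:
            return key
        raise DoesNotExist(key)

    def insert(self, key):
        # Assumption: uniqueness is checked against committed data,
        # not against this transaction's snapshot.
        if key in self.db.committed or key in self.pending:
            raise IntegrityError(key)
        self.pending.add(key)

    def commit(self):
        self.db.committed |= self.pending
        self.pending = set()
        self.snapshot = None


def replay():
    """Run the interleaving from the numbered list and record what happens."""
    db = ToyDatabase()
    p1, p2 = ToyTransaction(db), ToyTransaction(db)
    events = []
    try:  # steps 1-2
        p1.get("k")
    except DoesNotExist:
        events.append("p1 first get: miss")
    try:  # steps 3-4
        p2.get("k")
    except DoesNotExist:
        events.append("p2 get: miss")
    p2.insert("k")  # steps 5-6
    p2.commit()  # step 7
    events.append("p2 committed")
    try:  # steps 8-9
        p1.insert("k")
    except IntegrityError:
        events.append("p1 insert: IntegrityError")
    try:  # steps 10-11: same transaction, same snapshot
        p1.get("k")
    except DoesNotExist:
        events.append("p1 second get: miss")
    return events


for line in replay():
    print(line)
```

In this model, Process 1's second get misses for the same reason as its
first: the snapshot was taken before Process 2 committed, and nothing in the
sequence replaces it.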

Process 1 fails because it performed the initial read inside the
transaction, but before the savepoint. In fact, I believe that within a
single transaction it is impossible for the initial self.get() and the
self.get() in the exception handler to return different results.

Some SQL-shell testing shows that it's possible for this to work, as long as
we set the savepoint before the initial read. That way, when we catch an
IntegrityError and roll back to the savepoint, the lock is released, and
Process 1 can actually see the object committed by Process 2.
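
A minimal sketch of that reordering, for illustration only. The function
get_or_create_savepoint_first and the ScriptedBackend class are hypothetical;
the backend's method names simply mirror Django's transaction.savepoint,
savepoint_commit and savepoint_rollback. The stand-in is scripted to behave
the way the SQL-shell testing suggests (the row becomes visible after rolling
back to the early savepoint), so it shows the control flow and does not prove
anything about a real database.

```python
class DoesNotExist(Exception):
    """Stand-in for a model's DoesNotExist."""


class IntegrityError(Exception):
    """Stand-in for the database's IntegrityError."""


def get_or_create_savepoint_first(backend, **lookup):
    """Set the savepoint before the initial read, then get, create, re-get."""
    sid = backend.savepoint()  # taken BEFORE the first read
    try:
        obj = backend.get(**lookup)
    except DoesNotExist:
        pass
    else:
        backend.savepoint_commit(sid)
        return obj, False
    try:
        obj = backend.create(**lookup)
    except IntegrityError:
        # Roll back past the initial read, then look again.
        backend.savepoint_rollback(sid)
        return backend.get(**lookup), False
    backend.savepoint_commit(sid)
    return obj, True


class ScriptedBackend:
    """Hypothetical backend that loses the race to another process."""

    def __init__(self):
        self.calls = []
        self.visible = False

    def savepoint(self):
        self.calls.append("savepoint")
        return 1

    def savepoint_commit(self, sid):
        self.calls.append("savepoint_commit")

    def savepoint_rollback(self, sid):
        self.calls.append("savepoint_rollback")
        # Scripted assumption: the other process's row is visible from here on.
        self.visible = True

    def get(self, **lookup):
        self.calls.append("get")
        if not self.visible:
            raise DoesNotExist(lookup)
        return dict(lookup)

    def create(self, **lookup):
        self.calls.append("create")
        # The other process has already committed the same row.
        raise IntegrityError(lookup)


backend = ScriptedBackend()
result = get_or_create_savepoint_first(backend, name="x")
print(result)
print(backend.calls)
```

The only change from the ordering in the numbered list is that the savepoint
is created before the first get rather than just before the save.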

I expect to open a ticket for this, unless someone can tell me "you're
doing it wrong", or point me to another ticket. (I've scanned the Trac
database, but didn't see anything identical. Ticket 15507 touches on this,
but won't actually do anything to solve it.)

-- 
Regards,
Ian Clelland
<clell...@gmail.com>
