[sqlalchemy] Practical ways for dealing with concurrency problems?

Russell Warren Tue, 09 Nov 2010 20:51:28 -0800

I've got several questions about dealing with concurrency issues in
SQLAlchemy.  To try and help with the asking, I'll use a small example
similar to a situation I've got in an SQLAlchemy application.  In this
application, there is a table that tracks objects and their order/
position, something like below...


Name   Pos
----   ---
ObjA    1
ObjB    2
ObjC    3

And this can change to the following with a new object inserted
"before" (position-wise) ObjB...

Name   Pos
----   ---
ObjA    1
ObjB    3
ObjC    4
NewObj  2

Some sample code is whipped up to do this is at the bottom of this
post.

Worth noting is that my application is currently database-agnostic
(SQLAlchemy enabling this is a great feature), and I'd like to keep it
that way if at all possible.

My questions revolve around concurrency issues because the insert
operation can potentially update all rows in the table.  It seems that
this is very vulnerable to concurrency problems since each transaction
will only update its own rows that existed at the start of the
transaction.  If a parallel transaction is started doing a similar
insertion, a record can be potentially get a duplicate position
number, which would destroy the application logic.

I'm much more of an application person than a DB person, so to solve
this my immediate thinking is to simply do it all at the application
layer and use a thread lock on the table updating function.  All table
access is routed through the application (which uses scoped sessions)
so this would certainly "work", but I'm certain that those more
familiar with sqla and the DB layer would likely cringe at that
approach.

I really want to learn more about dealing with concurrency at the
database level through sqlalchemy.  What is a good way to deal with
this in sqla?

One possibility seems to be to get it all done in one atomic commit.
But how can I use the ORM (together with the expression language?) to
increment by 1 all positions where the position value is > than the
position for a given object name?  For a single value, the docs show
me that can get the proper SQL emitted for a record with something
like:

obj.position = obj.__table__.c.position + 1

where obj is one of my mapped objects.  But how do I do it for
multiple rows and not just the one obj?  Also - is going through
__table__ the most direct way for achieving this?

After the application layer blocking, the next most brutal thing that
comes to mind is locking the entire table during the operation to
avoid problems.  I'm not clear this will work since it seems a read
lock is needed, and I don't yet know if you can actually do that.  It
may be irrelevant, since I certainly do not see any way of doing table
locks through sqla at the moment. If locking were an option, it seems
it would require direct (potentialy DB specifc) SQL statement
execution through session.execute() or conn.execute().  Is that
correct?

Something that I've seen referenced a few times for dealing with
concurrency in sqlalchemy is to use "with_lockmode()" on a query to
lock the relevant rows.  However - I'm wary of this for a few
reasons:  First is that I don't think it would work in my example case
because read blocking is needed (I think).  Second is that I can't
actually figure out what with_lockmode actually does.  Digging in the
code I see the lock options are "read", "update", "update_nowait".
Short of finding this in query.py, where are the docs for these
options??  The third bit of wariness is uncertain support for
different dialects.  Digging through my 'SQL in a Nutshell' I see
varying DB support for "SELECT FOR UPDATE" syntax (which is what
with_lockmode seems to employ?) across various RDMS so I'd rather stay
away from it if possible.  What is the best way to figure out the
database dialect support for each of these options, short of trying
with all potential database types?

Although I'm certainly interested in specific ideas on dealing with
something similar to the table update described, any links or tips
anyone has for dealing with concurrency in SQLAlchemy would be
appreciated.  This includes any expansions anyone has on how the UOW
pattern and ORM transactions alone help with some forms of
concurrency, as I know they do to some extent.

Thanks,
Russ

## sample code with concurrency issue...

from sqlalchemy.ext.declarative import declarative_base
from sqlalchemy import Table, Column, Integer, String, MetaData
from sqlalchemy import create_engine
from pprint import pprint
engine = create_engine('sqlite:///:memory:', echo = False)
metadata = MetaData()

from sqlalchemy.orm import sessionmaker
Session = sessionmaker(bind = engine)

Base = declarative_base()
class OrderedObject(Base):
    __tablename__ = 'objects'

    id = Column(Integer, primary_key = True)
    name = Column(String)
    position = Column(Integer)

    def __init__(self, name, name_to_bump = ""):
        self._ormSess = Session() #does embedding this scare anyone?
        self.name = name
        if name_to_bump:
            self._InsertBefore(name_to_bump)
        else:
            #Count below may be inaccurate with concurrent users!
            newPos = self._ormSess.query(OrderedObject).count() + 1
            self.position = newPos
        self._ormSess.add(self)
        self._ormSess.commit()

    def _InsertBefore(self, name_to_bump):
        #This (potentially) modifies all rows in the table.  Need to
watch out
        #for concurrency issues...
        self._ormSess = Session()
        allObjects = self._ormSess.query(OrderedObject).all()
        pos = 0
        for obj in allObjects:
            pos += 1
            if obj.name == name_to_bump:
                self.position = pos
                pos += 1
            obj.position = pos

    def __repr__(self):
        return "<Object('%s' at %d)>" % (self.name, self.position)

Base.metadata.create_all(engine)

sess = Session()
initialObjects = [OrderedObject("Object_%s" % chr(65 + i)) for i in
range(5)]
print "Initial Objects..."
print "------------------"
pprint(sess.query(OrderedObject).order_by(OrderedObject.position).all())

OrderedObject("BEFORE_B", "Object_B")
print "\nBumped Objects..."
print "------------------"
pprint(sess.query(OrderedObject).order_by(OrderedObject.position).all())

-- 
You received this message because you are subscribed to the Google Groups 
"sqlalchemy" group.
To post to this group, send email to sqlalch...@googlegroups.com.
To unsubscribe from this group, send email to 
sqlalchemy+unsubscr...@googlegroups.com.
For more options, visit this group at 
http://groups.google.com/group/sqlalchemy?hl=en.

[sqlalchemy] Practical ways for dealing with concurrency problems?

Reply via email to