When you have concurrent transactions going on, the database should be responsible for dealing with whatever locking is required. The optimistic approach I've outlined doesn't require any locking, but you could also do this using an explicit table lock; the syntax and availability of explicit table locks are backend-specific.
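For reference, a rough sketch of that explicit-lock alternative on PostgreSQL (untested; it assumes the Foo model from the original post below, that its table is named "foo", and that SHARE ROW EXCLUSIVE is a strong enough mode for this case -- other backends spell table locks differently or don't offer them at all):

    from sqlalchemy import text

    # Take a table-level lock for the rest of the transaction, so no other
    # transaction can insert the same (bar, baz, qux) row between our check
    # and our INSERT.  PostgreSQL-specific syntax.
    session.execute(text("LOCK TABLE foo IN SHARE ROW EXCLUSIVE MODE"))

    item = session.query(Foo).filter_by(bar=1, baz=2, qux=3).first()
    if item is None:
        item = Foo(bar=1, baz=2, qux=3)
        session.add(item)

    session.commit()  # commit (or rollback) releases the table lock

The trade-off versus the optimistic SAVEPOINT pattern quoted below is that every get-or-create serializes on the table lock, which is simpler to reason about but reduces concurrency.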
On Aug 27, 2013, at 12:09 PM, herzaso <herz...@gmail.com> wrote:

> Thanks Michael,
> I will try that and let you know if it solved my issue.
> BTW: Is there a lock mechanism for such conditions?
>
> Thanks,
> Ofir
>
> On Tuesday, August 27, 2013 6:40:03 PM UTC+3, Michael Bayer wrote:
> I'm not a fan of catching integrity errors; I prefer to try to make sure
> they aren't going to happen, or, if they are, that they aren't a normal
> occurrence and the system is such that the particular operation can just
> fail (of course it depends on what it is). A problem with catching the
> integrity error due to concurrent, conflicting operations is that,
> depending on backend and isolation level, you can't be totally sure when
> the error is going to get raised (e.g. serializable isolation vs. non).
> Also, on a backend like PostgreSQL, the database can't recover the
> transaction after an integrity error unless you used a savepoint.
>
> But here you're doing the "concurrent transactions need row identity X"
> case, so maybe it is appropriate here. Here is a rough idea of a
> transactional pattern for that, noting this isn't tested:
>
>     try:
>         my_object = Session.query(MyClass).filter(....).one()
>     except NoResultFound:
>         try:
>             with Session.begin_nested():  # run inside a SAVEPOINT
>                 my_object = MyClass(...)
>                 Session.add(my_object)
>                 Session.flush()
>         except IntegrityError:
>             # another transaction inserted the row first; load that one
>             my_object = Session.query(MyClass).filter(....).one()
>
> On Aug 27, 2013, at 11:13 AM, herzaso <her...@gmail.com> wrote:
>
>> Suppose we are looking at a race condition, do you also think this
>> should be handled by catching the IntegrityError?
>> If so, what should I do? Only flush and do the operation again?
>>
>> On Tuesday, August 27, 2013 5:42:23 PM UTC+3, Michael Bayer wrote:
>> The word "occasional" is very meaningful. It usually suggests race
>> conditions. Then with the word "tornado", the Bayesian filters are
>> strongly leaning towards "race condition" at that point :).
>>
>> If an error is occurring only under volume, then you have to revisit
>> where race conditions can occur.
>>
>> On Aug 27, 2013, at 10:32 AM, herzaso <her...@gmail.com> wrote:
>>
>>> I'm running a Tornado server without redundancy (only one process;
>>> requests can arrive at the same time but will be handled one at a time).
>>> I do agree that for large volumes, catching the IntegrityError would be
>>> better, but currently I am handling a single request at a time and I
>>> want to fix this problem before I move on ...
>>>
>>> On Tuesday, August 27, 2013 5:24:07 PM UTC+3, Simon King wrote:
>>> On Tue, Aug 27, 2013 at 2:31 PM, herzaso <her...@gmail.com> wrote:
>>> > On Tuesday, August 27, 2013 3:55:50 PM UTC+3, Simon King wrote:
>>> >>
>>> >> On Tue, Aug 27, 2013 at 1:40 PM, herzaso <her...@gmail.com> wrote:
>>> >> > I have a model with an ID column set as the primary key, though
>>> >> > I'd like to be able to identify records by 3 other columns.
>>> >> > For this situation, I've added a classmethod that will fetch the
>>> >> > record if found, or create a new record if not.
>>> >> > The problem I'm having is that every once in a while, I get an
>>> >> > IntegrityError trying to flush a change:
>>> >> >
>>> >> >     class Foo(Base):
>>> >> >         __table_args__ = (sa.UniqueConstraint('bar', 'baz', 'qux'),)
>>> >> >
>>> >> >         id = sa.Column(Identifier,
>>> >> >                        sa.Sequence('%s_id_seq' % __tablename__),
>>> >> >                        nullable=False, primary_key=True)
>>> >> >         bar = sa.Column(sa.BigInteger)
>>> >> >         baz = sa.Column(sa.BigInteger)
>>> >> >         qux = sa.Column(sa.BigInteger)
>>> >> >         a1 = sa.Column(sa.BigInteger)
>>> >> >         a2 = sa.Column(sa.BigInteger)
>>> >> >
>>> >> >         @classmethod
>>> >> >         def get(cls, bar=None, baz=None, qux=None, **kwargs):
>>> >> >             item = session.query(cls).\
>>> >> >                 filter(cls.bar == bar).\
>>> >> >                 filter(cls.baz == baz).\
>>> >> >                 filter(cls.qux == qux).\
>>> >> >                 first()
>>> >> >
>>> >> >             if item:
>>> >> >                 for k, v in kwargs.iteritems():
>>> >> >                     if getattr(item, k) != v:
>>> >> >                         setattr(item, k, v)
>>> >> >             else:
>>> >> >                 item = cls(bar=bar, baz=baz, qux=qux, **kwargs)
>>> >> >
>>> >> >             return item
>>> >> >
>>> >> > This is the code I use to add/update records:
>>> >> >
>>> >> >     foo = Foo.get(**item)
>>> >> >     session.merge(foo)
>>> >> >
>>> >> > I've been struggling with this problem for some time now, and would
>>> >> > appreciate any help ...
>>> >>
>>> >> I'm not sure of the exact problem, but there are a couple of things
>>> >> that you could investigate.
>>> >>
>>> >> Firstly, session.merge returns a copy of the object, rather than
>>> >> adding the object that you supplied into the session. See
>>> >> http://docs.sqlalchemy.org/en/rel_0_8/orm/session.html#merging for
>>> >> details.
>>> >>
>>> >> Secondly, your "get" method sometimes returns objects that are
>>> >> already part of the session (if they were in the database), and
>>> >> sometimes objects that are not in the session. It would probably be
>>> >> more consistent to always return objects that are part of the
>>> >> session, by putting "session.add(item)" in your "else" clause. This
>>> >> would get rid of the need for session.merge(). (If you want to be
>>> >> able to use "get" with non-global sessions, pass the session as a
>>> >> parameter.)
>>> >>
>>> >> Finally, if your session isn't auto-flushing, it would be possible
>>> >> for you to call "get" twice with the same parameters and get 2
>>> >> different objects back.
>>> >>
>>> >> You may want to look at the UniqueObject recipe in the wiki:
>>> >> http://www.sqlalchemy.org/trac/wiki/UsageRecipes/UniqueObject
>>> >>
>>> > Hi Simon,
>>> > Thanks for the fast reply.
>>> >
>>> > I tried adding session.add(item) and session.flush() in the else
>>> > clause in the past, but that didn't solve my problem.
>>> > I didn't, however, remove the merge; do you think that might be the
>>> > problem?
>>> >
>>> > Regarding the flush, this code is part of an API server where a
>>> > scoped_session is committed after each change. I haven't changed the
>>> > autoflush parameter, and as I understand it the default value is
>>> > True, which causes a flush before each commit or query.
>>> >
>>> > As for the UniqueObject recipe, thanks! Amazing that I never found it
>>> > while searching for a cure. As I see it, it basically does the same ...
>>> >
>>> > I never managed to reproduce this bug in my development environment;
>>> > it only happens in my production environment.
>>> > Do you suppose adding a session.add and removing the merge will solve
>>> > this issue?
>>> >
>>> > Thanks,
>>> > Ofir
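For reference, Simon's suggestion applied to the get() from the original post might look something like this (untested sketch; the session is passed in as a parameter rather than taken from a global, and new objects are added to the session inside get(), so the caller no longer needs merge()):

    @classmethod
    def get(cls, session, bar=None, baz=None, qux=None, **kwargs):
        item = session.query(cls).\
            filter(cls.bar == bar).\
            filter(cls.baz == baz).\
            filter(cls.qux == qux).\
            first()

        if item:
            for k, v in kwargs.iteritems():
                if getattr(item, k) != v:
                    setattr(item, k, v)
        else:
            item = cls(bar=bar, baz=baz, qux=qux, **kwargs)
            session.add(item)  # new objects always go into the session here

        return item

    # caller: no session.merge() needed any more
    foo = Foo.get(session, **item)

Note that this only removes the inconsistency around merge(); it does not, by itself, prevent the duplicate-insert race between concurrent requests that the rest of the thread discusses.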
>>> It's difficult to say without knowing more about your system. For
>>> example, does your production system get multiple concurrent API
>>> requests, or are they serialised? If 2 requests can come in at
>>> approximately the same time and are handled by 2 different threads (or
>>> processes), then it is easy to imagine that the first handler will
>>> check the database, find that an entry doesn't exist, and create it.
>>> But before it flushes the change to the database (or even after it
>>> flushes, but before it commits, depending on your transaction
>>> isolation), the second handler will check for the same object, find it
>>> missing, and so create it.
>>>
>>> To track down problems like this, you could ensure that your
>>> development environment has the same thread/process behaviour as the
>>> production environment, then try submitting multiple concurrent
>>> requests to it. If you add "time.sleep" statements somewhere between
>>> the creation of the object and the commit of the transaction, you will
>>> probably find it easier to trigger.
>>>
>>> To actually fix the problem, you could choose to only handle a single
>>> request at a time (fine if you don't expect a high volume of
>>> requests). If that's not acceptable, you could catch the
>>> IntegrityError and then re-process the request.
>>>
>>> Hope that helps,
>>>
>>> Simon
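A rough sketch of the "catch the IntegrityError and re-process the request" approach Simon describes, assuming a hypothetical handle_request() wrapper around the add/update code from the original post and the revised get() sketched above (untested):

    from sqlalchemy.exc import IntegrityError

    def handle_request(session, data):
        # hypothetical wrapper: get-or-create the row, then commit
        foo = Foo.get(session, **data)
        session.commit()

    try:
        handle_request(session, data)
    except IntegrityError:
        # a concurrent request inserted the same (bar, baz, qux) row first;
        # the failed transaction must be rolled back before retrying, and
        # the second pass will then find the existing row instead of inserting
        session.rollback()
        handle_request(session, data)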