Re: [sqlalchemy] Occasional IntegrityError when identifying model not by its ID

Simon King Thu, 05 Sep 2013 03:14:27 -0700

(Remember, I still know next to nothing about Tornado, so this is pure
assumption on my part)


By default, SQLAlchemy scoped sessions use thread-local variables to
ensure that each thread gets its own session. Tornado's asynchronous
features process multiple requests "simultaneously" on the same
thread. So consider this set of steps

1. Request 1 comes in for your "special" asynchronous action. Since
this is the first request, a new thread-local session gets created.

2. Your special action does whatever it is that causes it to pause and
yield control back to Tornado

3. Request 2 comes in for a normal synchronous action. Tornado handles
it on the same thread that processed Request 1. Since the session is
still open from Request 1, it gets re-used, including whatever state
was set up in Request 1.

4. At the end of Request 2, you call session.commit/rollback/close

5. Tornado hands control back to the Request 1 handler, which now
checks out a new session (because its previous one has been closed).
Any state that Request 1 had previously put in the session will have
been committed or rolled back at the end of Request 2

Putting a "session.close" at the beginning of every request won't
really fix the problem. It just means that in the example above,
Request 2 would throw away whatever Request 1 was doing.

I don't think you can use threadlocal scoped sessions with
asynchronous tornado code. You could use a different scoping mechanism
(see the scope_func parameter to scoped_session at
http://docs.sqlalchemy.org/en/rel_0_8/orm/session.html#sqlalchemy.orm.scoping.scoped_session),
or you could get rid of scoped sessions altogether, and instead create
a session at the beginning of a request and pass it to your handler
functions (or attach it to the Request object).

Simon


On Thu, Sep 5, 2013 at 8:08 AM, herzaso <herz...@gmail.com> wrote:
> Thanks Simon,
> I have one action that is a bit "different" and might be a culprit. All
> other actions are normal with no special asynchronous stuff,
> but before I try to track down this condition, do you think a simple
> session.close at the beginning of each request will do the job?
>
> Thanks,
> Ofir
>
>
> On Tuesday, September 3, 2013 3:54:16 PM UTC+3, Simon King wrote:
>>
>> I don't honestly know, but if this were my project I would be trying
>> very hard to ensure that both (session.commit or session.rollback)
>> followed by session.close were being called at the end of every single
>> request.
>>
>> Are you using any of Tornado's asynchronous capabilities (eg.
>>
>> http://www.tornadoweb.org/en/stable/overview.html#non-blocking-asynchronous-requests)?
>> If so, I don't imagine they would interact very well with a scoped
>> session.
>>
>> On Tue, Sep 3, 2013 at 1:40 PM, Ofir Herzas <her...@gmail.com> wrote:
>> > I have a gateway class (RequestHandler) that all requests get through.
>> > In the handler, I have a try ... finally block, and the session.close
>> > call
>> > is in the finally clause.
>> > Let's say that I have a situation where I don't close the session ...
>> > could
>> > that lead to this issue assuming that I commit changes at the end of
>> > each
>> > request?
>> >
>> > Thanks,
>> > Ofir
>> >
>> > -----Original Message-----
>> > From: sqlal...@googlegroups.com [mailto:sqlal...@googlegroups.com] On
>> > Behalf Of Simon King
>> > Sent: Tuesday, September 03, 2013 3:28 PM
>> > To: sqlal...@googlegroups.com
>> > Subject: Re: [sqlalchemy] Occasional IntegrityError when identifying
>> > model
>> > not by its ID
>> >
>> > OK, I agree that doesn't sound like a race as such. It sounds more like
>> > some
>> > connection state is being shared between requests.
>> >
>> > I haven't used Tornado before and don't know much about it. Its main
>> > feature
>> > is that it uses asynchronous IO, unlike most other python web servers
>> > that
>> > use threads for request handling. How do you make sure that a single
>> > request
>> > uses a single SQLAlchemy session, and that the session gets closed
>> > properly
>> > at the end of the request?
>> >
>> > On Tue, Sep 3, 2013 at 1:08 PM, Ofir Herzas <her...@gmail.com> wrote:
>> >> I Agree, but when there is very little load on the system (e.g. 5
>> >> logged-in
>> >> users)  what are the odds that the same user will get the same error
>> >> 10 times?
>> >>
>> >> On 3 Sep, 2013 3:05 PM, "Simon King" <si...@simonking.org.uk> wrote:
>> >>>
>> >>> Race conditions can happen at any time, not just when the system is
>> >>> under heavy load. You only need 2 requests to arrive at approximately
>> >>> the same time to trigger a race condition.
>> >>>
>> >>> On Tue, Sep 3, 2013 at 12:49 PM, herzaso <her...@gmail.com> wrote:
>> >>> > Another important thing - I don't think it's a race condition - the
>> >>> > load on the system isn't that high and users getting this error
>> >>> > continue to get it all the time
>> >>> >
>> >>> >
>> >>> > On Tuesday, September 3, 2013 2:47:26 PM UTC+3, herzaso wrote:
>> >>> >>
>> >>> >> I've added session.close in my get method just to see if it solves
>> >>> >> the issue and it doesn't!
>> >>> >> I'm so frustrated with this issue... not only do I feel helpless,
>> >>> >> I don't have any clue on how to get around it ...
>> >>> >>
>> >>> >> What if I make the change without the session? Would the session
>> >>> >> pick up the changes on its first query?
>> >>> >>
>> >>> >> On Monday, September 2, 2013 3:58:02 PM UTC+3, Simon King wrote:
>> >>> >>>
>> >>> >>> I'm no expert on isolation levels - I was just trying to help you
>> >>> >>> understand what the problem was :-)
>> >>> >>>
>> >>> >>> Fixing it really depends on how your application is supposed to
>> >>> >>> work in the face of concurrent requests. For this specific part
>> >>> >>> of the application, you probably want to be able to see the Foo
>> >>> >>> object that was created by the other transaction. Reducing the
>> >>> >>> transaction isolation is probably the easiest way to do that, but
>> >>> >>> might have knock-on effects in your application, so you ought to
>> >>> >>> think carefully before doing it.
>> >>> >>>
>> >>> >>> The alternative is to discard the existing session state when you
>> >>> >>> get into this situation (via session.close) and start a new
>> > transaction.
>> >>> >>> However, it wouldn't be appropriate to do this inside your "get"
>> >>> >>> method - "session lifecycle" operations like this really belong
>> >>> >>> at an outer scope, so making a change like this may require a
>> >>> >>> certain amount of restructuring.
>> >>> >>>
>> >>> >>> Basically, dealing with concurrent operations is hard, and
>> >>> >>> SQLAlchemy isn't going to magically make it any easier I'm afraid.
>> >>> >>>
>> >>> >>> Simon
>> >>> >>>
>> >>> >>> On Mon, Sep 2, 2013 at 1:40 PM, herzaso <her...@gmail.com> wrote:
>> >>> >>> > I'm sorry, it was a misunderstanding on my part regarding the
>> >>> >>> > transactions.
>> >>> >>> > So what are you saying? that I should replace the transaction
>> >>> >>> > isolation level?
>> >>> >>> >
>> >>> >>> >
>> >>> >>> > On Monday, September 2, 2013 3:29:25 PM UTC+3, Simon King wrote:
>> >>> >>> >>
>> >>> >>> >> What exactly do you mean by not using transactions? The
>> >>> >>> >> Session always works within a transaction:
>> >>> >>> >>
>> >>> >>> >>
>> >>> >>> >>
>> >>> >>> >>
>> >>> >>> >> http://docs.sqlalchemy.org/en/rel_0_8/orm/session.html#managin
>> >>> >>> >> g-transactions
>> >>> >>> >>
>> >>> >>> >> I assume you are also using InnoDB tables.
>> >>> >>> >>
>> >>> >>> >> On Mon, Sep 2, 2013 at 1:19 PM, herzaso <her...@gmail.com>
>> >>> >>> >> wrote:
>> >>> >>> >> > I do have it set as REPEATABLE READ.
>> >>> >>> >> > However, I don't use transactions in sqlalchemy
>> >>> >>> >> >
>> >>> >>> >> >
>> >>> >>> >> > On Monday, September 2, 2013 3:08:58 PM UTC+3, Simon King
>> >>> >>> >> > wrote:
>> >>> >>> >> >>
>> >>> >>> >> >> Do you know what transaction isolation level you are running
>> > at?
>> >>> >>> >> >> The
>> >>> >>> >> >> default apparently is "REPEATABLE READ":
>> >>> >>> >> >>
>> >>> >>> >> >>
>> >>> >>> >> >>
>> >>> >>> >> >>
>> >>> >>> >> >>
>> >>> >>> >> >> http://dev.mysql.com/doc/refman/5.6/en/set-transaction.html
>> >>> >>> >> >> #isolevel_repeatable-read
>> >>> >>> >> >>
>> >>> >>> >> >> The important sentence in that link is:
>> >>> >>> >> >>
>> >>> >>> >> >>   All consistent reads within the same transaction read the
>> >>> >>> >> >> snapshot established by the first read
>> >>> >>> >> >>
>> >>> >>> >> >> When you query the database for the first time, to see if
>> >>> >>> >> >> the entity already exists, you are setting that initial
>> >>> >>> >> >> snapshot. If you run the same query again (such as in your
>> >>> >>> >> >> exception handler), you will get the same results, whether
>> >>> >>> >> >> or not another connection has inserted a matching row in
>> >>> >>> >> >> the meantime.
>> >>> >>> >> >>
>> >>> >>> >> >> Simon
>> >>> >>> >> >>
>> >>> >>> >> >> On Mon, Sep 2, 2013 at 12:54 PM, herzaso <her...@gmail.com>
>> >>> >>> >> >> wrote:
>> >>> >>> >> >> > I'm not sure what to make of the results:
>> >>> >>> >> >> > On the first connection, I ran BEGIN and INSERT and both
>> >>> >>> >> >> > were successful, but when I tried the INSERT statement on
>> >>> >>> >> >> > the second connection, I got "ERROR
>> >>> >>> >> >> > 1205 (HY000): Lock wait timeout exceeded; try restarting
>> >>> >>> >> >> > transaction".
>> >>> >>> >> >> > Running the same query on the first connection produced
>> >>> >>> >> >> > the required result which is "ERROR 1062 (23000):
>> >>> >>> >> >> > Duplicate entry"
>> >>> >>> >> >> > After the ROLLBACK on the first connection, the INSERT
>> >>> >>> >> >> > statement worked well on the second connection
>> >>> >>> >> >> >
>> >>> >>> >> >> > Regarding your second remark, the answer is yes, the
>> >>> >>> >> >> > error was due to the unique constraint on those columns
>> >>> >>> >> >> >
>> >>> >>> >> >> > BTW: I'm working on MySQL
>> >>> >>> >> >> >
>> >>> >>> >> >> > On Monday, September 2, 2013 1:31:12 PM UTC+3, Simon King
>> >>> >>> >> >> > wrote:
>> >>> >>> >> >> >>
>> >>> >>> >> >> >> I don't really know the answer, but I'd be interested in
>> >>> >>> >> >> >> the results of this experiment:
>> >>> >>> >> >> >>
>> >>> >>> >> >> >> Forget about SQLAlchemy for the moment, and start 2
>> >>> >>> >> >> >> plain SQL connections to your database. In the first,
>> >>> >>> >> >> >> type something like the
>> >>> >>> >> >> >> following:
>> >>> >>> >> >> >>
>> >>> >>> >> >> >> BEGIN;
>> >>> >>> >> >> >> INSERT foo(bar, baz, qux) VALUES(1, 1, 1);
>> >>> >>> >> >> >>
>> >>> >>> >> >> >> Now in the second connection do the same. I assume it'll
>> >>> >>> >> >> >> fail because of the duplicate values.
>> >>> >>> >> >> >>
>> >>> >>> >> >> >> Now in the first connection issue a "ROLLBACK". You
>> >>> >>> >> >> >> should now be in a state where no matching row exists in
>> >>> >>> >> >> >> the database, even though you received an error about
>> >>> >>> >> >> >> constraint violations.
>> >>> >>> >> >> >>
>> >>> >>> >> >> >> The results you see may be different, depending on your
>> >>> >>> >> >> >> transaction isolation level. (It may be that you don't
>> >>> >>> >> >> >> get the constraint violation at all until you try to
>> >>> >>> >> >> >> commit the second connection).
>> >>> >>> >> >> >>
>> >>> >>> >> >> >> Another thing you could look at: are you sure that the
>> >>> >>> >> >> >> error you are getting is due to the unique constraint on
>> >>> >>> >> >> >> bar/baz/qux, and not some other constraint in the
>> >>> >>> >> >> >> database?
>> >>> >>> >> >> >>
>> >>> >>> >> >> >> Simon
>> >>> >>> >> >> >>
>> >>> >>> >> >> >> On Mon, Sep 2, 2013 at 8:45 AM, herzaso
>> >>> >>> >> >> >> <her...@gmail.com>
>> >>> >>> >> >> >> wrote:
>> >>> >>> >> >> >> > I'm afraid it didn't solve my problem.
>> >>> >>> >> >> >> >
>> >>> >>> >> >> >> > Here is my updated method:
>> >>> >>> >> >> >> >     @classmethod
>> >>> >>> >> >> >> >     def get(cls, bar=None, baz=None, qux=None,
>> >>> >>> >> >> >> > **kwargs):
>> >>> >>> >> >> >> >         query = session.query(cls).\
>> >>> >>> >> >> >> >             filter(cls.bar == bar).\
>> >>> >>> >> >> >> >             filter(cls.baz == baz).\
>> >>> >>> >> >> >> >             filter(cls.qux == qux)
>> >>> >>> >> >> >> >
>> >>> >>> >> >> >> >         item = query.first()
>> >>> >>> >> >> >> >         updated = False
>> >>> >>> >> >> >> >
>> >>> >>> >> >> >> >         if not item:
>> >>> >>> >> >> >> >             try:
>> >>> >>> >> >> >> >                 with session.begin_nested():   # run
>> > inside
>> >>> >>> >> >> >> > a
>> >>> >>> >> >> >> > SAVEPOINT
>> >>> >>> >> >> >> >                     updated = True
>> >>> >>> >> >> >> >                     item = cls(bar=bar, baz=baz,
>> >>> >>> >> >> >> > qux=qux,
>> >>> >>> >> >> >> > **kwargs)
>> >>> >>> >> >> >> >                     session.add(item)
>> >>> >>> >> >> >> >                     session.flush()
>> >>> >>> >> >> >> >             except sa.exc.IntegrityError:
>> >>> >>> >> >> >> >                 item = query.first()
>> >>> >>> >> >> >> >                 if not item:
>> >>> >>> >> >> >> >                     raise
>> >>> >>> >> >> >> > Exception("invalidIntegrityError")
>> >>> >>> >> >> >> >             except:
>> >>> >>> >> >> >> >                 raise
>> >>> >>> >> >> >> >
>> >>> >>> >> >> >> >         if not updated:
>> >>> >>> >> >> >> >             for k, v in kwargs.iteritems():
>> >>> >>> >> >> >> >                 if getattr(item, k) != v:
>> >>> >>> >> >> >> >                     setattr(item, k, v)
>> >>> >>> >> >> >> >
>> >>> >>> >> >> >> >         return item
>> >>> >>> >> >> >> >
>> >>> >>> >> >> >> > With this code, i'm getting invalidIntegrityError. How
>> >>> >>> >> >> >> > is it possible?
>> >>> >>> >> >> >> > (it's also worth pointing out that this solution
>> >>> >>> >> >> >> > requires SA
>> >>> >>> >> >> >> > 0.8.2
>> >>> >>> >> >> >> > (otherwise, there is a problem with
>> >>> >>> >> >> >> > session.begin_nested)
>> >>> >>> >> >> >> >
>> >>> >>> >> >> >> >
>> >>> >>> >> >> >> > On Tuesday, August 27, 2013 6:40:03 PM UTC+3, Michael
>> >>> >>> >> >> >> > Bayer
>> >>> >>> >> >> >> > wrote:
>> >>> >>> >> >> >> >>
>> >>> >>> >> >> >> >> I'm not a fan of catching integrity errors, i prefer
>> >>> >>> >> >> >> >> to try to make sure they aren't going to happen, or
>> >>> >>> >> >> >> >> if they are, they aren't a normal occurrence and the
>> >>> >>> >> >> >> >> system is such that the particular operation can just
>> >>> >>> >> >> >> >> fail (of
>> >>> >>> >> >> >> >> course it depends on what it is).     A problem with
>> >>> >>> >> >> >> >> catching
>> >>> >>> >> >> >> >> the
>> >>> >>> >> >> >> >> integrity
>> >>> >>> >> >> >> >> error due to concurrent, conflicting operations is
>> >>> >>> >> >> >> >> that depending on backend and isolation level, you
>> >>> >>> >> >> >> >> can't be totally sure when the error is going to get
>> >>> >>> >> >> >> >> raised (e.g. serializable isolation vs. non).  Also
>> >>> >>> >> >> >> >> on a backend like Postgresql, the database can't
>> >>> >>> >> >> >> >> recover the transaction after an integrity error
>> >>> >>> >> >> >> >> unless you used a savepoint.
>> >>> >>> >> >> >> >>
>> >>> >>> >> >> >> >> But here you're doing the "concurrent transactions
>> >>> >>> >> >> >> >> need row identity X", so maybe it is appropriate
>> >>> >>> >> >> >> >> here.  Here is a rough idea of a transactional
>> >>> >>> >> >> >> >> pattern for that, noting this isn't tested:
>> >>> >>> >> >> >> >>
>> >>> >>> >> >> >> >> try:
>> >>> >>> >> >> >> >>     my_object =
>> >>> >>> >> >> >> >> Session.query(MyClass).filter(....).one()
>> >>> >>> >> >> >> >> except NoResultFound:
>> >>> >>> >> >> >> >>     try:
>> >>> >>> >> >> >> >>         with Session.begin_nested():   # run inside a
>> >>> >>> >> >> >> >> SAVEPOINT
>> >>> >>> >> >> >> >>             my_object = MyClass(...)
>> >>> >>> >> >> >> >>             Session.add(my_object)
>> >>> >>> >> >> >> >>             Session.flush()
>> >>> >>> >> >> >> >>     except IntegrityError:
>> >>> >>> >> >> >> >>         my_object =
>> >>> >>> >> >> >> >> Session.query(MyClass).filter(....).one()
>> >>> >>> >> >> >> >>
>> >>> >>> >> >> >> >>
>> >>> >>> >> >> >> >>
>> >>> >>> >> >> >> >>
>> >>> >>> >> >> >> >>
>> >>> >>> >> >> >> >> On Aug 27, 2013, at 11:13 AM, herzaso
>> >>> >>> >> >> >> >> <her...@gmail.com>
>> >>> >>> >> >> >> >> wrote:
>> >>> >>> >> >> >> >>
>> >>> >>> >> >> >> >> Suppose we are looking at a race condition, do you
>> >>> >>> >> >> >> >> also think this should be handled by catching the
>> >>> >>> >> >> >> >> IntegrityError?
>> >>> >>> >> >> >> >> If so, what should I do? only flush and do the
>> >>> >>> >> >> >> >> operation again?
>> >>> >>> >> >> >> >>
>> >>> >>> >> >> >> >> On Tuesday, August 27, 2013 5:42:23 PM UTC+3, Michael
>> >>> >>> >> >> >> >> Bayer
>> >>> >>> >> >> >> >> wrote:
>> >>> >>> >> >> >> >>>
>> >>> >>> >> >> >> >>> the word "occasional" is very meaningful.  It
>> >>> >>> >> >> >> >>> usually suggests race
>> >>> >>> >> >> >> >>> conditions.    Then with the word "tornado", the
>> >>> >>> >> >> >> >>> baysean
>> >>> >>> >> >> >> >>> filters
>> >>> >>> >> >> >> >>> are
>> >>> >>> >> >> >> >>> strongly leaning towards "race condition" at that
>> >>> >>> >> >> >> >>> point :).
>> >>> >>> >> >> >> >>>
>> >>> >>> >> >> >> >>> if an error is occurring only under volume then you
>> >>> >>> >> >> >> >>> have to revisit where race conditions can occur.
>> >>> >>> >> >> >> >>>
>> >>> >>> >> >> >> >>> On Aug 27, 2013, at 10:32 AM, herzaso
>> >>> >>> >> >> >> >>> <her...@gmail.com>
>> >>> >>> >> >> >> >>> wrote:
>> >>> >>> >> >> >> >>>
>> >>> >>> >> >> >> >>> I'm running a Tornado server without redundancy
>> >>> >>> >> >> >> >>> (only one process, requests can arrive at the same
>> >>> >>> >> >> >> >>> time but will be handled one at a
>> >>> >>> >> >> >> >>> time)
>> >>> >>> >> >> >> >>> I do agree that for large volumes, catching the
>> >>> >>> >> >> >> >>> IntegrityError would be better, but currently I am
>> >>> >>> >> >> >> >>> handling a single request at a time and I want to
>> >>> >>> >> >> >> >>> fix this problem before I move on ...
>> >>> >>> >> >> >> >>>
>> >>> >>> >> >> >> >>>
>> >>> >>> >> >> >> >>> On Tuesday, August 27, 2013 5:24:07 PM UTC+3, Simon
>> >>> >>> >> >> >> >>> King
>> >>> >>> >> >> >> >>> wrote:
>> >>> >>> >> >> >> >>>>
>> >>> >>> >> >> >> >>>> On Tue, Aug 27, 2013 at 2:31 PM, herzaso
>> >>> >>> >> >> >> >>>> <her...@gmail.com>
>> >>> >>> >> >> >> >>>> wrote:
>> >>> >>> >> >> >> >>>> > On Tuesday, August 27, 2013 3:55:50 PM UTC+3,
>> >>> >>> >> >> >> >>>> > Simon King
>> >>> >>> >> >> >> >>>> > wrote:
>> >>> >>> >> >> >> >>>> >>
>> >>> >>> >> >> >> >>>> >> On Tue, Aug 27, 2013 at 1:40 PM, herzaso
>> >>> >>> >> >> >> >>>> >> <her...@gmail.com>
>> >>> >>> >> >> >> >>>> >> wrote:
>> >>> >>> >> >> >> >>>> >> > I have a model with an ID column set as the
>> >>> >>> >> >> >> >>>> >> > primary key, though i'd like to be able to
>> >>> >>> >> >> >> >>>> >> > identify records by 3 other columns.
>> >>> >>> >> >> >> >>>> >> > For this situation, I've added a classmethod
>> >>> >>> >> >> >> >>>> >> > that will fetch the record if found or a new
>> >>> >>> >> >> >> >>>> >> > record if not.
>> >>> >>> >> >> >> >>>> >> > The problem i'm having is that every once in a
>> >>> >>> >> >> >> >>>> >> > while, I get IntegrityError trying to flush a
>> >>> >>> >> >> >> >>>> >> > change
>> >>> >>> >> >> >> >>>> >> >
>> >>> >>> >> >> >> >>>> >> > class Foo(Base):
>> >>> >>> >> >> >> >>>> >> >     __table_args__ =
>> >>> >>> >> >> >> >>>> >> > (sa.UniqueConstraint('bar', 'baz',
>> >>> >>> >> >> >> >>>> >> > 'qux'),)
>> >>> >>> >> >> >> >>>> >> >
>> >>> >>> >> >> >> >>>> >> >     id = sa.Column(Identifier,
>> >>> >>> >> >> >> >>>> >> > sa.Sequence('%s_id_seq'
>> >>> >>> >> >> >> >>>> >> > %
>> >>> >>> >> >> >> >>>> >> > __tablename__), nullable=False,
>> >>> >>> >> >> >> >>>> >> > primary_key=True)
>> >>> >>> >> >> >> >>>> >> >     bar = sa.Column(sa.BigInteger)
>> >>> >>> >> >> >> >>>> >> >     baz = sa.Column(sa.BigInteger)
>> >>> >>> >> >> >> >>>> >> >     qux = sa.Column(sa.BigInteger)
>> >>> >>> >> >> >> >>>> >> >     a1 = sa.Column(sa.BigInteger)
>> >>> >>> >> >> >> >>>> >> >     a2 = sa.Column(sa.BigInteger)
>> >>> >>> >> >> >> >>>> >> >
>> >>> >>> >> >> >> >>>> >> >     @classmethod
>> >>> >>> >> >> >> >>>> >> >     def get(cls, bar=None, baz=None, qux=None,
>> >>> >>> >> >> >> >>>> >> > **kwargs):
>> >>> >>> >> >> >> >>>> >> >         item = session.query(cls).\
>> >>> >>> >> >> >> >>>> >> >             filter(cls.bar== bar).\
>> >>> >>> >> >> >> >>>> >> >             filter(cls.baz == baz).\
>> >>> >>> >> >> >> >>>> >> >             filter(cls.qux == qux).\
>> >>> >>> >> >> >> >>>> >> >             first()
>> >>> >>> >> >> >> >>>> >> >
>> >>> >>> >> >> >> >>>> >> >         if item:
>> >>> >>> >> >> >> >>>> >> >             for k, v in kwargs.iteritems():
>> >>> >>> >> >> >> >>>> >> >                 if getattr(item, k) != v:
>> >>> >>> >> >> >> >>>> >> >                     setattr(item, k, v)
>> >>> >>> >> >> >> >>>> >> >         else:
>> >>> >>> >> >> >> >>>> >> >             item = cls(bar=bar, baz=baz,
>> >>> >>> >> >> >> >>>> >> > qux=qux,
>> >>> >>> >> >> >> >>>> >> > **kwargs)
>> >>> >>> >> >> >> >>>> >> >
>> >>> >>> >> >> >> >>>> >> >         return item
>> >>> >>> >> >> >> >>>> >> >
>> >>> >>> >> >> >> >>>> >> > This is the code I use to add/update records:
>> >>> >>> >> >> >> >>>> >> >
>> >>> >>> >> >> >> >>>> >> > foo = Foo.get(**item)
>> >>> >>> >> >> >> >>>> >> > session.merge(foo)
>> >>> >>> >> >> >> >>>> >> >
>> >>> >>> >> >> >> >>>> >> > I'm struggling with this problem for some time
>> >>> >>> >> >> >> >>>> >> > now, and would appreciate any help ...
>> >>> >>> >> >> >> >>>> >> >
>> >>> >>> >> >> >> >>>> >>
>> >>> >>> >> >> >> >>>> >> I'm not sure of the exact problem, but there are
>> >>> >>> >> >> >> >>>> >> a couple of things that you could investigate.
>> >>> >>> >> >> >> >>>> >>
>> >>> >>> >> >> >> >>>> >> Firstly, session.merge returns a copy of the
>> >>> >>> >> >> >> >>>> >> object, rather than adding the object that you
>> >>> >>> >> >> >> >>>> >> supplied into the session.
>> >>> >>> >> >> >> >>>> >> See
>> >>> >>> >> >> >> >>>> >>
>> >>> >>> >> >> >> >>>> >>
>> >>> >>> >> >> >> >>>> >>
>> >>> >>> >> >> >> >>>> >> http://docs.sqlalchemy.org/en/rel_0_8/orm/sessio
>> >>> >>> >> >> >> >>>> >> n.html#merging
>> >>> >>> >> >> >> >>>> >> for
>> >>> >>> >> >> >> >>>> >> details.
>> >>> >>> >> >> >> >>>> >>
>> >>> >>> >> >> >> >>>> >> Secondly, your "get" method sometimes returns
>> >>> >>> >> >> >> >>>> >> objects that are already part of the session (if
>> >>> >>> >> >> >> >>>> >> they were in the database), and sometimes
>> >>> >>> >> >> >> >>>> >> objects that are not in the session. It would
>> >>> >>> >> >> >> >>>> >> probably be more consistent to always return
>> >>> >>> >> >> >> >>>> >> objects that are part of the session, by putting
>> >>> >>> >> >> >> >>>> >> "session.add(item)" in your "else" clause.
>> >>> >>> >> >> >> >>>> >> This
>> >>> >>> >> >> >> >>>> >> would
>> >>> >>> >> >> >> >>>> >> get
>> >>> >>> >> >> >> >>>> >> rid
>> >>> >>> >> >> >> >>>> >> of the need for session.merge(). (If you want to
>> >>> >>> >> >> >> >>>> >> be able to use the "get" with non-global
>> >>> >>> >> >> >> >>>> >> sessions, pass the session as a
>> >>> >>> >> >> >> >>>> >> parameter.)
>> >>> >>> >> >> >> >>>> >>
>> >>> >>> >> >> >> >>>> >> Finally, if your session isn't auto-flushing, it
>> >>> >>> >> >> >> >>>> >> would be possible for you to call "get" twice
>> >>> >>> >> >> >> >>>> >> with the same parameters and get
>> >>> >>> >> >> >> >>>> >> 2
>> >>> >>> >> >> >> >>>> >> different
>> >>> >>> >> >> >> >>>> >> objects back.
>> >>> >>> >> >> >> >>>> >>
>> >>> >>> >> >> >> >>>> >> You may want to look at the UniqueObject recipe
>> >>> >>> >> >> >> >>>> >> in the
>> >>> >>> >> >> >> >>>> >> wiki:
>> >>> >>> >> >> >> >>>> >>
>> >>> >>> >> >> >> >>>> >>
>> >>> >>> >> >> >> >>>> >>
>> >>> >>> >> >> >> >>>> >> http://www.sqlalchemy.org/trac/wiki/UsageRecipes
>> >>> >>> >> >> >> >>>> >> /UniqueObject
>> >>> >>> >> >> >> >>>> >>
>> >>> >>> >> >> >> >>>> > Hi Simon,
>> >>> >>> >> >> >> >>>> > Thanks for the fast reply.
>> >>> >>> >> >> >> >>>> >
>> >>> >>> >> >> >> >>>> > I tried adding session.add(item) and
>> >>> >>> >> >> >> >>>> > session.flush() in the else clause in the past
>> >>> >>> >> >> >> >>>> > but that didn't solve my problem.
>> >>> >>> >> >> >> >>>> > I didn't however remove the merge, do you think
>> >>> >>> >> >> >> >>>> > that might be the problem?
>> >>> >>> >> >> >> >>>> >
>> >>> >>> >> >> >> >>>> > Regarding the flush, this code is part of an API
>> >>> >>> >> >> >> >>>> > server where a scoped_session is committed after
>> >>> >>> >> >> >> >>>> > each change. I haven't changed the autoflush
>> >>> >>> >> >> >> >>>> > parameter, and as I understand the default value
>> >>> >>> >> >> >> >>>> > is True making a flush before each commit or
>> >>> >>> >> >> >> >>>> > query.
>> >>> >>> >> >> >> >>>> >
>> >>> >>> >> >> >> >>>> > As for the UniqueObject recipe, thanks! Amazing
>> >>> >>> >> >> >> >>>> > that I never found it searching for a cure. As I
>> >>> >>> >> >> >> >>>> > see it basically does the same ...
>> >>> >>> >> >> >> >>>> >
>> >>> >>> >> >> >> >>>> > I never managed to reproduce this bug on my
>> >>> >>> >> >> >> >>>> > development environment.
>> >>> >>> >> >> >> >>>> > It only
>> >>> >>> >> >> >> >>>> > happens in my production environment.
>> >>> >>> >> >> >> >>>> > Do you suppose adding a session.add and removing
>> >>> >>> >> >> >> >>>> > the merge will solve this issue?
>> >>> >>> >> >> >> >>>> >
>> >>> >>> >> >> >> >>>> > Thanks,
>> >>> >>> >> >> >> >>>> > Ofir
>> >>> >>> >> >> >> >>>>
>> >>> >>> >> >> >> >>>> It's difficult to say without knowing more about
>> >>> >>> >> >> >> >>>> your system.
>> >>> >>> >> >> >> >>>> For
>> >>> >>> >> >> >> >>>> example, does your production system get multiple
>> >>> >>> >> >> >> >>>> concurrent API requests, or are they serialised? If
>> >>> >>> >> >> >> >>>> 2 requests can come in at approximately the same
>> >>> >>> >> >> >> >>>> time and are handled by 2 different threads (or
>> >>> >>> >> >> >> >>>> processes), then it is easy to imagine that the
>> >>> >>> >> >> >> >>>> first handler will check the database, find that an
>> >>> >>> >> >> >> >>>> entry doesn't exist, and create it.
>> >>> >>> >> >> >> >>>> But before it flushes the change to the database
>> >>> >>> >> >> >> >>>> (or even after it flushes, but before it commits,
>> >>> >>> >> >> >> >>>> depending on your transaction isolation), the
>> >>> >>> >> >> >> >>>> second handler will check for the same object, find
>> >>> >>> >> >> >> >>>> it missing, and so create it.
>> >>> >>> >> >> >> >>>>
>> >>> >>> >> >> >> >>>> To track down problems like this, you could ensure
>> >>> >>> >> >> >> >>>> that your development environment has the same
>> >>> >>> >> >> >> >>>> thread/process behaviour as the production
>> >>> >>> >> >> >> >>>> environment, then try submitting multiple
>> >>> >>> >> >> >> >>>> concurrent requests to it. If you add "time.sleep"
>> >>> >>> >> >> >> >>>> statements somewhere between the creation of the
>> >>> >>> >> >> >> >>>> object and the commit of the transaction you will
>> >>> >>> >> >> >> >>>> probably find it easier to trigger.
>> >>> >>> >> >> >> >>>>
>> >>> >>> >> >> >> >>>> To actually fix the problem, you could choose to
>> >>> >>> >> >> >> >>>> only handle a single request at a time (fine if you
>> >>> >>> >> >> >> >>>> don't expect a high volume of requests). If that's
>> >>> >>> >> >> >> >>>> not acceptable, you could catch the IntegrityError
>> >>> >>> >> >> >> >>>> and then re-process the request.
>> >>> >>> >> >> >> >>>>
>> >>> >>> >> >> >> >>>> Hope that helps,
>> >>> >>> >> >> >> >>>>
>> >>> >>> >> >> >> >>>> Simon
>> >>> >>> >> >> >> >>>
>
> --
> You received this message because you are subscribed to the Google Groups
> "sqlalchemy" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to sqlalchemy+unsubscr...@googlegroups.com.
> To post to this group, send email to sqlalchemy@googlegroups.com.
> Visit this group at http://groups.google.com/group/sqlalchemy.
> For more options, visit https://groups.google.com/groups/opt_out.

-- 
You received this message because you are subscribed to the Google Groups 
"sqlalchemy" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to sqlalchemy+unsubscr...@googlegroups.com.
To post to this group, send email to sqlalchemy@googlegroups.com.
Visit this group at http://groups.google.com/group/sqlalchemy.
For more options, visit https://groups.google.com/groups/opt_out.

Re: [sqlalchemy] Occasional IntegrityError when identifying model not by its ID

Reply via email to