On 3.9.2013 14:28, Simon King wrote:
> OK, I agree that doesn't sound like a race as such. It sounds more
> like some connection state is being shared between requests.

That might be it. We had a very similar bug: we stored ORM objects at module
level and did not immediately realize that these are shared among web requests
and db sessions.
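For anyone hunting the same bug, here is a minimal sketch of the pattern (invented names and an in-memory sqlite database for illustration): an ORM object created at module import time stays tied to one session and is silently shared by every request.

```python
# Sketch of the shared-module-state bug described above; names are invented.
from sqlalchemy import Column, Integer, create_engine
from sqlalchemy.orm import declarative_base, sessionmaker

Base = declarative_base()

class User(Base):
    __tablename__ = "users"
    id = Column(Integer, primary_key=True)

engine = create_engine("sqlite://")
Base.metadata.create_all(engine)

# BAD: one session, created at import time, shared by every request
shared_session = sessionmaker(bind=engine)()
shared_session.add(User(id=1))
shared_session.commit()
cached_user = shared_session.query(User).filter_by(id=1).one()  # module-level ORM object

def handle_request():
    # every "request" sees the very same instance from the same identity
    # map, so uncommitted changes made by one request leak into the next
    return shared_session.query(User).filter_by(id=1).one()
```

The fix is the usual one: create a fresh session per request and re-query inside it, rather than caching ORM instances at module scope.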

Ladislav Lenart


> I haven't used Tornado before and don't know much about it. Its main
> feature is that it uses asynchronous IO, unlike most other python web
> servers that use threads for request handling. How do you make sure
> that a single request uses a single SQLAlchemy session, and that the
> session gets closed properly at the end of the request?
> 
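One common answer to that question (a framework-agnostic sketch, not actual Tornado code; `handle_request` is invented): bind a scoped_session for the duration of a request and call `.remove()` when the request finishes, e.g. from the handler's on_finish hook.

```python
# Session-per-request sketch using scoped_session; names are invented.
from sqlalchemy import create_engine
from sqlalchemy.orm import scoped_session, sessionmaker

engine = create_engine("sqlite://")
Session = scoped_session(sessionmaker(bind=engine))

def handle_request(work):
    try:
        # Session() returns the same session for the whole request...
        return work(Session())
    finally:
        # ...and remove() closes it, so the next request starts fresh.
        Session.remove()
```

Note that scoped_session scopes by thread, which matches Tornado only as long as handlers do not interleave database work across the event loop; with truly concurrent async handlers, an explicit per-request session object is safer.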
> On Tue, Sep 3, 2013 at 1:08 PM, Ofir Herzas <herz...@gmail.com> wrote:
>> I agree, but when there is very little load on the system (e.g. 5 logged-in
>> users), what are the odds that the same user will get the same error 10
>> times?
>>
>> On 3 Sep, 2013 3:05 PM, "Simon King" <si...@simonking.org.uk> wrote:
>>>
>>> Race conditions can happen at any time, not just when the system is
>>> under heavy load. You only need 2 requests to arrive at approximately
>>> the same time to trigger a race condition.
>>>
>>> On Tue, Sep 3, 2013 at 12:49 PM, herzaso <herz...@gmail.com> wrote:
>>>> Another important thing - I don't think it's a race condition - the load
>>>> on the system isn't that high, and users getting this error continue to
>>>> get it all the time
>>>>
>>>>
>>>> On Tuesday, September 3, 2013 2:47:26 PM UTC+3, herzaso wrote:
>>>>>
>>>>> I've added session.close in my get method just to see if it solves the
>>>>> issue, and it doesn't!
>>>>> I'm so frustrated with this issue... not only do I feel helpless, I don't
>>>>> have any clue on how to get around it ...
>>>>>
>>>>> What if I make the change without the session? Would the session pick up
>>>>> the changes on its first query?
>>>>>
>>>>> On Monday, September 2, 2013 3:58:02 PM UTC+3, Simon King wrote:
>>>>>>
>>>>>> I'm no expert on isolation levels - I was just trying to help you
>>>>>> understand what the problem was :-)
>>>>>>
>>>>>> Fixing it really depends on how your application is supposed to work
>>>>>> in the face of concurrent requests. For this specific part of the
>>>>>> application, you probably want to be able to see the Foo object that
>>>>>> was created by the other transaction. Reducing the transaction
>>>>>> isolation is probably the easiest way to do that, but might have
>>>>>> knock-on effects in your application, so you ought to think carefully
>>>>>> before doing it.
>>>>>>
>>>>>> The alternative is to discard the existing session state when you get
>>>>>> into this situation (via session.close) and start a new transaction.
>>>>>> However, it wouldn't be appropriate to do this inside your "get"
>>>>>> method - "session lifecycle" operations like this really belong at an
>>>>>> outer scope, so making a change like this may require a certain amount
>>>>>> of restructuring.
>>>>>>
>>>>>> Basically, dealing with concurrent operations is hard, and SQLAlchemy
>>>>>> isn't going to magically make it any easier I'm afraid.
>>>>>>
>>>>>> Simon
>>>>>>
>>>>>> On Mon, Sep 2, 2013 at 1:40 PM, herzaso <her...@gmail.com> wrote:
>>>>>>> I'm sorry, it was a misunderstanding on my part regarding the
>>>>>>> transactions.
>>>>>>> So what are you saying? that I should replace the transaction
>>>>>>> isolation
>>>>>>> level?
>>>>>>>
>>>>>>>
>>>>>>> On Monday, September 2, 2013 3:29:25 PM UTC+3, Simon King wrote:
>>>>>>>>
>>>>>>>> What exactly do you mean by not using transactions? The Session always
>>>>>>>> works within a transaction:
>>>>>>>>
>>>>>>>> http://docs.sqlalchemy.org/en/rel_0_8/orm/session.html#managing-transactions
>>>>>>>>
>>>>>>>> I assume you are also using InnoDB tables.
>>>>>>>>
>>>>>>>> On Mon, Sep 2, 2013 at 1:19 PM, herzaso <her...@gmail.com> wrote:
>>>>>>>>> I do have it set as REPEATABLE READ.
>>>>>>>>> However, I don't use transactions in sqlalchemy
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On Monday, September 2, 2013 3:08:58 PM UTC+3, Simon King wrote:
>>>>>>>>>>
>>>>>>>>>> Do you know what transaction isolation level you are running at? The
>>>>>>>>>> default apparently is "REPEATABLE READ":
>>>>>>>>>>
>>>>>>>>>> http://dev.mysql.com/doc/refman/5.6/en/set-transaction.html#isolevel_repeatable-read
>>>>>>>>>>
>>>>>>>>>> The important sentence in that link is:
>>>>>>>>>>
>>>>>>>>>>   All consistent reads within the same transaction read the snapshot
>>>>>>>>>> established by the first read
>>>>>>>>>>
>>>>>>>>>> When you query the database for the first time, to see if the entity
>>>>>>>>>> already exists, you are setting that initial snapshot. If you run the
>>>>>>>>>> same query again (such as in your exception handler), you will get the
>>>>>>>>>> same results, whether or not another connection has inserted a
>>>>>>>>>> matching row in the meantime.
>>>>>>>>>>
>>>>>>>>>> Simon
>>>>>>>>>>
>>>>>>>>>> On Mon, Sep 2, 2013 at 12:54 PM, herzaso <her...@gmail.com> wrote:
>>>>>>>>>>> I'm not sure what to make of the results:
>>>>>>>>>>> On the first connection, I ran BEGIN and INSERT and both were
>>>>>>>>>>> successful, but when I tried the INSERT statement on the second
>>>>>>>>>>> connection, I got "ERROR 1205 (HY000): Lock wait timeout exceeded;
>>>>>>>>>>> try restarting transaction".
>>>>>>>>>>> Running the same query on the first connection produced the expected
>>>>>>>>>>> result, which is "ERROR 1062 (23000): Duplicate entry".
>>>>>>>>>>> After the ROLLBACK on the first connection, the INSERT statement
>>>>>>>>>>> worked well on the second connection.
>>>>>>>>>>>
>>>>>>>>>>> Regarding your second remark, the answer is yes, the error was due
>>>>>>>>>>> to the unique constraint on those columns.
>>>>>>>>>>>
>>>>>>>>>>> BTW: I'm working on MySQL.
>>>>>>>>>>>
>>>>>>>>>>> On Monday, September 2, 2013 1:31:12 PM UTC+3, Simon King wrote:
>>>>>>>>>>>>
>>>>>>>>>>>> I don't really know the answer, but I'd be interested in the
>>>>>>>>>>>> results of this experiment:
>>>>>>>>>>>>
>>>>>>>>>>>> Forget about SQLAlchemy for the moment, and start 2 plain SQL
>>>>>>>>>>>> connections to your database. In the first, type something like the
>>>>>>>>>>>> following:
>>>>>>>>>>>>
>>>>>>>>>>>> BEGIN;
>>>>>>>>>>>> INSERT INTO foo (bar, baz, qux) VALUES (1, 1, 1);
>>>>>>>>>>>>
>>>>>>>>>>>> Now in the second connection do the same. I assume it'll fail
>>>>>>>>>>>> because of the duplicate values.
>>>>>>>>>>>>
>>>>>>>>>>>> Now in the first connection issue a "ROLLBACK". You should now be
>>>>>>>>>>>> in a state where no matching row exists in the database, even
>>>>>>>>>>>> though you received an error about constraint violations.
>>>>>>>>>>>>
>>>>>>>>>>>> The results you see may be different, depending on your
>>>>>>>>>>>> transaction isolation level. (It may be that you don't get the
>>>>>>>>>>>> constraint violation at all until you try to commit the second
>>>>>>>>>>>> connection.)
>>>>>>>>>>>>
>>>>>>>>>>>> Another thing you could look at: are you sure that the error you
>>>>>>>>>>>> are getting is due to the unique constraint on bar/baz/qux, and not
>>>>>>>>>>>> some other constraint in the database?
>>>>>>>>>>>>
>>>>>>>>>>>> Simon
>>>>>>>>>>>>
>>>>>>>>>>>> On Mon, Sep 2, 2013 at 8:45 AM, herzaso <her...@gmail.com> wrote:
>>>>>>>>>>>>> I'm afraid it didn't solve my problem.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Here is my updated method:
>>>>>>>>>>>>>     @classmethod
>>>>>>>>>>>>>     def get(cls, bar=None, baz=None, qux=None, **kwargs):
>>>>>>>>>>>>>         query = session.query(cls).\
>>>>>>>>>>>>>             filter(cls.bar == bar).\
>>>>>>>>>>>>>             filter(cls.baz == baz).\
>>>>>>>>>>>>>             filter(cls.qux == qux)
>>>>>>>>>>>>>
>>>>>>>>>>>>>         item = query.first()
>>>>>>>>>>>>>         updated = False
>>>>>>>>>>>>>
>>>>>>>>>>>>>         if not item:
>>>>>>>>>>>>>             try:
>>>>>>>>>>>>>                 with session.begin_nested():  # run inside a SAVEPOINT
>>>>>>>>>>>>>                     updated = True
>>>>>>>>>>>>>                     item = cls(bar=bar, baz=baz, qux=qux, **kwargs)
>>>>>>>>>>>>>                     session.add(item)
>>>>>>>>>>>>>                     session.flush()
>>>>>>>>>>>>>             except sa.exc.IntegrityError:
>>>>>>>>>>>>>                 item = query.first()
>>>>>>>>>>>>>                 if not item:
>>>>>>>>>>>>>                     raise Exception("invalidIntegrityError")
>>>>>>>>>>>>>             except:
>>>>>>>>>>>>>                 raise
>>>>>>>>>>>>>
>>>>>>>>>>>>>         if not updated:
>>>>>>>>>>>>>             for k, v in kwargs.iteritems():
>>>>>>>>>>>>>                 if getattr(item, k) != v:
>>>>>>>>>>>>>                     setattr(item, k, v)
>>>>>>>>>>>>>
>>>>>>>>>>>>>         return item
>>>>>>>>>>>>>
>>>>>>>>>>>>> With this code, I'm getting invalidIntegrityError. How is it
>>>>>>>>>>>>> possible?
>>>>>>>>>>>>> (It's also worth pointing out that this solution requires SA
>>>>>>>>>>>>> 0.8.2; otherwise, there is a problem with session.begin_nested.)
>>>>>>>>>>>>>
>>>>>>>>>>>>> On Tuesday, August 27, 2013 6:40:03 PM UTC+3, Michael Bayer wrote:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> I'm not a fan of catching integrity errors; I prefer to try to
>>>>>>>>>>>>>> make sure they aren't going to happen, or if they are, they
>>>>>>>>>>>>>> aren't a normal occurrence and the system is such that the
>>>>>>>>>>>>>> particular operation can just fail (of course it depends on what
>>>>>>>>>>>>>> it is). A problem with catching the integrity error due to
>>>>>>>>>>>>>> concurrent, conflicting operations is that depending on backend
>>>>>>>>>>>>>> and isolation level, you can't be totally sure when the error is
>>>>>>>>>>>>>> going to get raised (e.g. serializable isolation vs. non). Also
>>>>>>>>>>>>>> on a backend like Postgresql, the database can't recover the
>>>>>>>>>>>>>> transaction after an integrity error unless you used a savepoint.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> But here you're doing the "concurrent transactions need row
>>>>>>>>>>>>>> identity X" case, so maybe it is appropriate here. Here is a
>>>>>>>>>>>>>> rough idea of a transactional pattern for that, noting this isn't
>>>>>>>>>>>>>> tested:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> try:
>>>>>>>>>>>>>>     my_object = Session.query(MyClass).filter(....).one()
>>>>>>>>>>>>>> except NoResultFound:
>>>>>>>>>>>>>>     try:
>>>>>>>>>>>>>>         with Session.begin_nested():  # run inside a SAVEPOINT
>>>>>>>>>>>>>>             my_object = MyClass(...)
>>>>>>>>>>>>>>             Session.add(my_object)
>>>>>>>>>>>>>>             Session.flush()
>>>>>>>>>>>>>>     except IntegrityError:
>>>>>>>>>>>>>>         my_object = Session.query(MyClass).filter(....).one()
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
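For reference, a self-contained, runnable rendering of that pattern (the structure is the one quoted above; the sqlite database and the concrete Foo model are stand-ins I invented for illustration, since the original targets MySQL):

```python
# Runnable sketch of the SAVEPOINT-based get-or-create pattern quoted above.
from sqlalchemy import Column, Integer, UniqueConstraint, create_engine
from sqlalchemy.exc import IntegrityError
from sqlalchemy.orm import declarative_base, sessionmaker
from sqlalchemy.orm.exc import NoResultFound

Base = declarative_base()

class Foo(Base):
    __tablename__ = "foo"
    __table_args__ = (UniqueConstraint("bar", "baz", "qux"),)
    id = Column(Integer, primary_key=True)
    bar = Column(Integer)
    baz = Column(Integer)
    qux = Column(Integer)

engine = create_engine("sqlite://")
Base.metadata.create_all(engine)
session = sessionmaker(bind=engine)()

def get_or_create(session, bar, baz, qux):
    query = session.query(Foo).filter_by(bar=bar, baz=baz, qux=qux)
    try:
        return query.one()
    except NoResultFound:
        try:
            # SAVEPOINT: a failed INSERT rolls back only this block,
            # leaving the enclosing transaction usable
            with session.begin_nested():
                obj = Foo(bar=bar, baz=baz, qux=qux)
                session.add(obj)
                session.flush()
            return obj
        except IntegrityError:
            # a concurrent transaction inserted the row first; load theirs
            return query.one()
```

Whether the IntegrityError branch actually fires at this point depends on the backend and isolation level, exactly as discussed above.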
>>>>>>>>>>>>>> On Aug 27, 2013, at 11:13 AM, herzaso <her...@gmail.com> wrote:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Suppose we are looking at a race condition, do you also think
>>>>>>>>>>>>>> this should be handled by catching the IntegrityError?
>>>>>>>>>>>>>> If so, what should I do? Only flush and do the operation again?
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> On Tuesday, August 27, 2013 5:42:23 PM UTC+3, Michael Bayer wrote:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> The word "occasional" is very meaningful. It usually suggests
>>>>>>>>>>>>>>> race conditions. Then with the word "tornado", the Bayesian
>>>>>>>>>>>>>>> filters are strongly leaning towards "race condition" at that
>>>>>>>>>>>>>>> point :).
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> If an error is occurring only under volume then you have to
>>>>>>>>>>>>>>> revisit where race conditions can occur.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> On Aug 27, 2013, at 10:32 AM, herzaso <her...@gmail.com> wrote:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> I'm running a Tornado server without redundancy (only one
>>>>>>>>>>>>>>> process; requests can arrive at the same time but will be
>>>>>>>>>>>>>>> handled one at a time).
>>>>>>>>>>>>>>> I do agree that for large volumes, catching the IntegrityError
>>>>>>>>>>>>>>> would be better, but currently I am handling a single request at
>>>>>>>>>>>>>>> a time and I want to fix this problem before I move on ...
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> On Tuesday, August 27, 2013 5:24:07 PM UTC+3, Simon King wrote:
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> On Tue, Aug 27, 2013 at 2:31 PM, herzaso <her...@gmail.com> wrote:
>>>>>>>>>>>>>>>>> On Tuesday, August 27, 2013 3:55:50 PM UTC+3, Simon King wrote:
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> On Tue, Aug 27, 2013 at 1:40 PM, herzaso <her...@gmail.com> wrote:
>>>>>>>>>>>>>>>>>>> I have a model with an ID column set as the primary key,
>>>>>>>>>>>>>>>>>>> though I'd like to be able to identify records by 3 other
>>>>>>>>>>>>>>>>>>> columns.
>>>>>>>>>>>>>>>>>>> For this situation, I've added a classmethod that will fetch
>>>>>>>>>>>>>>>>>>> the record if found or a new record if not.
>>>>>>>>>>>>>>>>>>> The problem I'm having is that every once in a while, I get
>>>>>>>>>>>>>>>>>>> IntegrityError trying to flush a change.
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> class Foo(Base):
>>>>>>>>>>>>>>>>>>>     __table_args__ = (sa.UniqueConstraint('bar', 'baz', 'qux'),)
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>     id = sa.Column(Identifier,
>>>>>>>>>>>>>>>>>>>                    sa.Sequence('%s_id_seq' % __tablename__),
>>>>>>>>>>>>>>>>>>>                    nullable=False, primary_key=True)
>>>>>>>>>>>>>>>>>>>     bar = sa.Column(sa.BigInteger)
>>>>>>>>>>>>>>>>>>>     baz = sa.Column(sa.BigInteger)
>>>>>>>>>>>>>>>>>>>     qux = sa.Column(sa.BigInteger)
>>>>>>>>>>>>>>>>>>>     a1 = sa.Column(sa.BigInteger)
>>>>>>>>>>>>>>>>>>>     a2 = sa.Column(sa.BigInteger)
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>     @classmethod
>>>>>>>>>>>>>>>>>>>     def get(cls, bar=None, baz=None, qux=None, **kwargs):
>>>>>>>>>>>>>>>>>>>         item = session.query(cls).\
>>>>>>>>>>>>>>>>>>>             filter(cls.bar == bar).\
>>>>>>>>>>>>>>>>>>>             filter(cls.baz == baz).\
>>>>>>>>>>>>>>>>>>>             filter(cls.qux == qux).\
>>>>>>>>>>>>>>>>>>>             first()
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>         if item:
>>>>>>>>>>>>>>>>>>>             for k, v in kwargs.iteritems():
>>>>>>>>>>>>>>>>>>>                 if getattr(item, k) != v:
>>>>>>>>>>>>>>>>>>>                     setattr(item, k, v)
>>>>>>>>>>>>>>>>>>>         else:
>>>>>>>>>>>>>>>>>>>             item = cls(bar=bar, baz=baz, qux=qux, **kwargs)
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>         return item
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> This is the code I use to add/update records:
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> foo = Foo.get(**item)
>>>>>>>>>>>>>>>>>>> session.merge(foo)
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> I'm struggling with this problem for some time now, and
>>>>>>>>>>>>>>>>>>> would appreciate any help ...
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> I'm not sure of the exact problem, but there are a couple of
>>>>>>>>>>>>>>>>>> things that you could investigate.
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Firstly, session.merge returns a copy of the object, rather
>>>>>>>>>>>>>>>>>> than adding the object that you supplied into the session. See
>>>>>>>>>>>>>>>>>> http://docs.sqlalchemy.org/en/rel_0_8/orm/session.html#merging
>>>>>>>>>>>>>>>>>> for details.
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Secondly, your "get" method sometimes returns objects that are
>>>>>>>>>>>>>>>>>> already part of the session (if they were in the database),
>>>>>>>>>>>>>>>>>> and sometimes objects that are not in the session. It would
>>>>>>>>>>>>>>>>>> probably be more consistent to always return objects that are
>>>>>>>>>>>>>>>>>> part of the session, by putting "session.add(item)" in your
>>>>>>>>>>>>>>>>>> "else" clause. This would get rid of the need for
>>>>>>>>>>>>>>>>>> session.merge(). (If you want to be able to use "get" with
>>>>>>>>>>>>>>>>>> non-global sessions, pass the session as a parameter.)
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Finally, if your session isn't auto-flushing, it would be
>>>>>>>>>>>>>>>>>> possible for you to call "get" twice with the same parameters
>>>>>>>>>>>>>>>>>> and get 2 different objects back.
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> You may want to look at the UniqueObject recipe in the wiki:
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> http://www.sqlalchemy.org/trac/wiki/UsageRecipes/UniqueObject
>>>>>>>>>>>>>>>>>>
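The heart of that recipe, condensed into a sketch (the Tag model and function name are mine, not the wiki code verbatim): memoize lookups per session, so that two calls with the same natural key inside one session return the same object, whether pending or persistent.

```python
# Condensed sketch of the unique-object idea; model and names are invented.
from sqlalchemy import Column, Integer, String, create_engine
from sqlalchemy.orm import declarative_base, sessionmaker

Base = declarative_base()

class Tag(Base):
    __tablename__ = "tags"
    id = Column(Integer, primary_key=True)
    name = Column(String, unique=True)

engine = create_engine("sqlite://")
Base.metadata.create_all(engine)
session = sessionmaker(bind=engine)()

def unique_tag(session, name):
    # per-session cache keyed on the natural key
    cache = session.info.setdefault("unique_cache", {})
    key = (Tag, name)
    if key not in cache:
        with session.no_autoflush:
            obj = session.query(Tag).filter_by(name=name).first()
        if obj is None:
            obj = Tag(name=name)  # not in the database: create it, pending
            session.add(obj)
        cache[key] = obj
    return cache[key]
```

This addresses exactly the "call get twice and receive 2 different objects" hazard Simon mentions, because the second call hits the cache instead of the (not-yet-flushed) database.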
>>>>>>>>>>>>>>>>> Hi Simon,
>>>>>>>>>>>>>>>>> Thanks for the fast reply.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> I tried adding session.add(item) and session.flush() in the
>>>>>>>>>>>>>>>>> else clause in the past, but that didn't solve my problem.
>>>>>>>>>>>>>>>>> I didn't however remove the merge; do you think that might be
>>>>>>>>>>>>>>>>> the problem?
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Regarding the flush, this code is part of an API server where
>>>>>>>>>>>>>>>>> a scoped_session is committed after each change. I haven't
>>>>>>>>>>>>>>>>> changed the autoflush parameter, and as I understand, the
>>>>>>>>>>>>>>>>> default value is True, making a flush before each commit or
>>>>>>>>>>>>>>>>> query.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> As for the UniqueObject recipe, thanks! Amazing that I never
>>>>>>>>>>>>>>>>> found it searching for a cure. As I see it, it basically does
>>>>>>>>>>>>>>>>> the same ...
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> I never managed to reproduce this bug in my development
>>>>>>>>>>>>>>>>> environment. It only happens in my production environment.
>>>>>>>>>>>>>>>>> Do you suppose adding a session.add and removing the merge
>>>>>>>>>>>>>>>>> will solve this issue?
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>>>>> Ofir
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> It's difficult to say without knowing more about your system.
>>>>>>>>>>>>>>>> For example, does your production system get multiple
>>>>>>>>>>>>>>>> concurrent API requests, or are they serialised? If 2 requests
>>>>>>>>>>>>>>>> can come in at approximately the same time and are handled by 2
>>>>>>>>>>>>>>>> different threads (or processes), then it is easy to imagine
>>>>>>>>>>>>>>>> that the first handler will check the database, find that an
>>>>>>>>>>>>>>>> entry doesn't exist, and create it. But before it flushes the
>>>>>>>>>>>>>>>> change to the database (or even after it flushes, but before it
>>>>>>>>>>>>>>>> commits, depending on your transaction isolation), the second
>>>>>>>>>>>>>>>> handler will check for the same object, find it missing, and so
>>>>>>>>>>>>>>>> create it.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> To track down problems like this, you could ensure that your
>>>>>>>>>>>>>>>> development environment has the same thread/process behaviour
>>>>>>>>>>>>>>>> as the production environment, then try submitting multiple
>>>>>>>>>>>>>>>> concurrent requests to it. If you add "time.sleep" statements
>>>>>>>>>>>>>>>> somewhere between the creation of the object and the commit of
>>>>>>>>>>>>>>>> the transaction, you will probably find it easier to trigger.
>>>>>>>>>>>>>>>>
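That check-then-act window can even be reproduced without a database, using two threads and a sleep as suggested above (a toy model I made up, where a plain dict stands in for the unique-keyed table):

```python
# Deterministic reproduction of the check-then-act race described above.
import threading
import time

db = {}         # stands in for the unique-keyed table
creations = []  # records every "INSERT"

def get_or_create(key):
    if key not in db:        # 1. check: is the row there?
        time.sleep(0.1)      # widen the race window, per the advice above
        db[key] = object()   # 2. act: both threads reach this line
        creations.append(key)
    return db[key]

threads = [threading.Thread(target=get_or_create, args=("foo",))
           for _ in range(2)]
for t in threads:
    t.start()
for t in threads:
    t.join()

# Both threads passed the "not in db" check before either wrote, so the
# "unique" key was created twice - with a real unique constraint, the
# second INSERT is the IntegrityError from this thread.
```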
>>>>>>>>>>>>>>>> To actually fix the problem, you could choose to only handle a
>>>>>>>>>>>>>>>> single request at a time (fine if you don't expect a high
>>>>>>>>>>>>>>>> volume of requests). If that's not acceptable, you could catch
>>>>>>>>>>>>>>>> the IntegrityError and then re-process the request.
>>>>>>>>>>>>>>>> Hope that helps,
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Simon
>>>>>>>>>>>>>>>

-- 
You received this message because you are subscribed to the Google Groups 
"sqlalchemy" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to sqlalchemy+unsubscr...@googlegroups.com.
To post to this group, send email to sqlalchemy@googlegroups.com.
Visit this group at http://groups.google.com/group/sqlalchemy.
For more options, visit https://groups.google.com/groups/opt_out.
