The only other suggestion I've got is to enable DEBUG-level logging (either via 
echo="debug" when creating the engine, or using the techniques on 
http://docs.sqlalchemy.org/en/rel_0_8/core/engines.html#configuring-logging). 
That should enable you to see exactly which SQL statements are being sent on 
which connections. When the error occurs, look at all the SQL sent on that 
connection and perhaps you will see what is causing the problem.
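
For example, a minimal sketch (the connection URL is a placeholder):

    import logging

    from sqlalchemy import create_engine

    # echo="debug" logs each SQL statement plus its result rows
    engine = create_engine("mysql://user:pass@localhost/mydb", echo="debug")

    # equivalently, configure the standard logging module directly
    logging.basicConfig()
    logging.getLogger("sqlalchemy.engine").setLevel(logging.DEBUG)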


On 6 Sep 2013, at 21:42, herzaso <herz...@gmail.com> wrote:

> Hi Simon,
> I've removed the scoped_session altogether but the problem still persists.
> 
> Several important things:
> 1. The problem occurs even after I restart my service
> 2. Previously, when I talked to one of my clients who had this issue, I asked 
> her to check it on a different computer and it was OK (I can't figure it out)
> 3. This issue happens only for 1 model (out of about 20)
> 4. I remember reading somewhere that SA uses primary keys to track 
> records (in the identity map), and since I don't use the PK to identify 
> records of this model, I thought it might be related.
> 5. I was never able to reproduce this bug; I only see it through the 
> exceptions in my production environment
> 
> Any other thoughts? Can you make sense of this issue being related to certain 
> computers?
> 
> Thanks,
> Ofir
> 
> On Thursday, September 5, 2013 1:13:49 PM UTC+3, Simon King wrote:
> (Remember, I still know next to nothing about Tornado, so this is pure 
> assumption on my part) 
> 
> By default, SQLAlchemy scoped sessions use thread-local variables to 
> ensure that each thread gets its own session. Tornado's asynchronous 
> features process multiple requests "simultaneously" on the same 
> thread. So consider this set of steps 
> 
> 1. Request 1 comes in for your "special" asynchronous action. Since 
> this is the first request, a new thread-local session gets created. 
> 
> 2. Your special action does whatever it is that causes it to pause and 
> yield control back to Tornado 
> 
> 3. Request 2 comes in for a normal synchronous action. Tornado handles 
> it on the same thread that processed Request 1. Since the session is 
> still open from Request 1, it gets re-used, including whatever state 
> was set up in Request 1. 
> 
> 4. At the end of Request 2, you call session.commit/rollback/close 
> 
> 5. Tornado hands control back to the Request 1 handler, which now 
> checks out a new session (because its previous one has been closed). 
> Any state that Request 1 had previously put in the session will have 
> been committed or rolled back at the end of Request 2 
> 
> Putting a "session.close" at the beginning of every request won't 
> really fix the problem. It just means that in the example above, 
> Request 2 would throw away whatever Request 1 was doing. 
> 
> I don't think you can use thread-local scoped sessions with 
> asynchronous Tornado code. You could use a different scoping mechanism 
> (see the scopefunc parameter to scoped_session at 
> http://docs.sqlalchemy.org/en/rel_0_8/orm/session.html#sqlalchemy.orm.scoping.scoped_session),
> or you could get rid of scoped sessions altogether, and instead create 
> a session at the beginning of a request and pass it to your handler 
> functions (or attach it to the Request object). 
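> 
> A minimal sketch of the scopefunc approach (not tested; 
> current_request_id is a hypothetical helper that would have to 
> return a value unique to the request currently being handled, and 
> the connection URL is a placeholder): 
> 
>     from sqlalchemy import create_engine
>     from sqlalchemy.orm import scoped_session, sessionmaker
> 
>     engine = create_engine("mysql://user:pass@localhost/mydb")
> 
>     def current_request_id():
>         # must identify the request being processed right now
>         raise NotImplementedError
> 
>     Session = scoped_session(sessionmaker(bind=engine),
>                              scopefunc=current_request_id)
> 
>     # remember to call Session.remove() when each request finishes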
> 
> Simon 
> 
> 
> On Thu, Sep 5, 2013 at 8:08 AM, herzaso <her...@gmail.com> wrote: 
> > Thanks Simon, 
> > I have one action that is a bit "different" and might be a culprit. All 
> > other actions are normal with no special asynchronous stuff, 
> > but before I try to track down this condition, do you think a simple 
> > session.close at the beginning of each request will do the job? 
> > 
> > Thanks, 
> > Ofir 
> > 
> > 
> > On Tuesday, September 3, 2013 3:54:16 PM UTC+3, Simon King wrote: 
> >> 
> >> I don't honestly know, but if this were my project I would be trying 
> >> very hard to ensure that both (session.commit or session.rollback) 
> >> followed by session.close were being called at the end of every single 
> >> request. 
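> >> 
> >> Something along these lines (a rough sketch, not tested; 
> >> handle_request stands in for your actual handler logic): 
> >> 
> >>     try:
> >>         handle_request()
> >>         session.commit()
> >>     except:
> >>         session.rollback()
> >>         raise
> >>     finally:
> >>         session.close()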
> >> 
> >> Are you using any of Tornado's asynchronous capabilities (e.g. 
> >> http://www.tornadoweb.org/en/stable/overview.html#non-blocking-asynchronous-requests)?
> >> If so, I don't imagine they would interact very well with a scoped 
> >> session. 
> >> 
> >> On Tue, Sep 3, 2013 at 1:40 PM, Ofir Herzas <her...@gmail.com> wrote: 
> >> > I have a gateway class (RequestHandler) that all requests go 
> >> > through. In the handler, I have a try ... finally block, and the 
> >> > session.close call is in the finally clause. 
> >> > Let's say that I have a situation where I don't close the 
> >> > session ... could that lead to this issue, assuming that I 
> >> > commit changes at the end of each request? 
> >> > 
> >> > Thanks, 
> >> > Ofir 
> >> > 
> >> > -----Original Message----- 
> >> > From: sqlal...@googlegroups.com [mailto:sqlal...@googlegroups.com] On 
> >> > Behalf Of Simon King 
> >> > Sent: Tuesday, September 03, 2013 3:28 PM 
> >> > To: sqlal...@googlegroups.com 
> >> > Subject: Re: [sqlalchemy] Occasional IntegrityError when identifying 
> >> > model 
> >> > not by its ID 
> >> > 
> >> > OK, I agree that doesn't sound like a race as such. It sounds 
> >> > more like some connection state is being shared between requests. 
> >> > 
> >> > I haven't used Tornado before and don't know much about it. Its 
> >> > main feature is that it uses asynchronous IO, unlike most other 
> >> > Python web servers that use threads for request handling. How do 
> >> > you make sure that a single request uses a single SQLAlchemy 
> >> > session, and that the session gets closed properly at the end of 
> >> > the request? 
> >> > 
> >> > On Tue, Sep 3, 2013 at 1:08 PM, Ofir Herzas <her...@gmail.com> wrote: 
> >> >> I agree, but when there is very little load on the system (e.g. 5 
> >> >> logged-in users), what are the odds that the same user will get 
> >> >> the same error 10 times? 
> >> >> 
> >> >> On 3 Sep, 2013 3:05 PM, "Simon King" <si...@simonking.org.uk> wrote: 
> >> >>> 
> >> >>> Race conditions can happen at any time, not just when the system is 
> >> >>> under heavy load. You only need 2 requests to arrive at approximately 
> >> >>> the same time to trigger a race condition. 
> >> >>> 
> >> >>> On Tue, Sep 3, 2013 at 12:49 PM, herzaso <her...@gmail.com> wrote: 
> >> >>> > Another important thing - I don't think it's a race condition - the 
> >> >>> > load on the system isn't that high and users getting this error 
> >> >>> > continue to get it all the time 
> >> >>> > 
> >> >>> > 
> >> >>> > On Tuesday, September 3, 2013 2:47:26 PM UTC+3, herzaso wrote: 
> >> >>> >> 
> >> >>> >> I've added session.close in my get method just to see if it solves 
> >> >>> >> the issue and it doesn't! 
> >> >>> >> I'm so frustrated with this issue... not only do I feel helpless, 
> >> >>> >> I don't have any clue on how to get around it ... 
> >> >>> >> 
> >> >>> >> What if I make the change without the session? Would the session 
> >> >>> >> pick up the changes on its first query? 
> >> >>> >> 
> >> >>> >> On Monday, September 2, 2013 3:58:02 PM UTC+3, Simon King wrote: 
> >> >>> >>> 
> >> >>> >>> I'm no expert on isolation levels - I was just trying to help you 
> >> >>> >>> understand what the problem was :-) 
> >> >>> >>> 
> >> >>> >>> Fixing it really depends on how your application is supposed to 
> >> >>> >>> work in the face of concurrent requests. For this specific part 
> >> >>> >>> of the application, you probably want to be able to see the Foo 
> >> >>> >>> object that was created by the other transaction. Reducing the 
> >> >>> >>> transaction isolation is probably the easiest way to do that, but 
> >> >>> >>> might have knock-on effects in your application, so you ought to 
> >> >>> >>> think carefully before doing it. 
> >> >>> >>> 
> >> >>> >>> The alternative is to discard the existing session state when you 
> >> >>> >>> get into this situation (via session.close) and start a new 
> >> >>> >>> transaction. However, it wouldn't be appropriate to do this inside 
> >> >>> >>> your "get" method - "session lifecycle" operations like this really 
> >> >>> >>> belong at an outer scope, so making a change like this may require 
> >> >>> >>> a certain amount of restructuring. 
> >> >>> >>> 
> >> >>> >>> Basically, dealing with concurrent operations is hard, and 
> >> >>> >>> SQLAlchemy isn't going to magically make it any easier I'm afraid. 
> >> >>> >>> 
> >> >>> >>> Simon 
> >> >>> >>> 
> >> >>> >>> On Mon, Sep 2, 2013 at 1:40 PM, herzaso <her...@gmail.com> wrote: 
> >> >>> >>> > I'm sorry, it was a misunderstanding on my part regarding the 
> >> >>> >>> > transactions. 
> >> >>> >>> > So what are you saying? That I should change the transaction 
> >> >>> >>> > isolation level? 
> >> >>> >>> > 
> >> >>> >>> > 
> >> >>> >>> > On Monday, September 2, 2013 3:29:25 PM UTC+3, Simon King wrote: 
> >> >>> >>> >> 
> >> >>> >>> >> What exactly do you mean by not using transactions? The 
> >> >>> >>> >> Session always works within a transaction: 
> >> >>> >>> >> 
> >> >>> >>> >> http://docs.sqlalchemy.org/en/rel_0_8/orm/session.html#managing-transactions 
> >> >>> >>> >> 
> >> >>> >>> >> I assume you are also using InnoDB tables. 
> >> >>> >>> >> 
> >> >>> >>> >> On Mon, Sep 2, 2013 at 1:19 PM, herzaso <her...@gmail.com> 
> >> >>> >>> >> wrote: 
> >> >>> >>> >> > I do have it set as REPEATABLE READ. 
> >> >>> >>> >> > However, I don't use transactions in SQLAlchemy. 
> >> >>> >>> >> > 
> >> >>> >>> >> > 
> >> >>> >>> >> > On Monday, September 2, 2013 3:08:58 PM UTC+3, Simon King 
> >> >>> >>> >> > wrote: 
> >> >>> >>> >> >> 
> >> >>> >>> >> >> Do you know what transaction isolation level you are 
> >> >>> >>> >> >> running at? The default apparently is "REPEATABLE READ": 
> >> >>> >>> >> >> 
> >> >>> >>> >> >> http://dev.mysql.com/doc/refman/5.6/en/set-transaction.html#isolevel_repeatable-read 
> >> >>> >>> >> >> 
> >> >>> >>> >> >> The important sentence in that link is: 
> >> >>> >>> >> >> 
> >> >>> >>> >> >>   All consistent reads within the same transaction read the 
> >> >>> >>> >> >> snapshot established by the first read 
> >> >>> >>> >> >> 
> >> >>> >>> >> >> When you query the database for the first time, to see if 
> >> >>> >>> >> >> the entity already exists, you are setting that initial 
> >> >>> >>> >> >> snapshot. If you run the same query again (such as in your 
> >> >>> >>> >> >> exception handler), you will get the same results, whether 
> >> >>> >>> >> >> or not another connection has inserted a matching row in 
> >> >>> >>> >> >> the meantime. 
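> >> >>> >>> >> >> 
> >> >>> >>> >> >> For example, a sketch of that behaviour in terms of the Foo 
> >> >>> >>> >> >> model from earlier in this thread (Session here is your 
> >> >>> >>> >> >> session factory; it assumes another connection inserts and 
> >> >>> >>> >> >> commits a matching row between the two queries): 
> >> >>> >>> >> >> 
> >> >>> >>> >> >>     s1 = Session()
> >> >>> >>> >> >>     s1.query(Foo).filter_by(bar=1).first()  # establishes the snapshot
> >> >>> >>> >> >> 
> >> >>> >>> >> >>     # ... another connection INSERTs a matching row and COMMITs ...
> >> >>> >>> >> >> 
> >> >>> >>> >> >>     s1.query(Foo).filter_by(bar=1).first()  # still None: same snapshot
> >> >>> >>> >> >>     s1.rollback()                           # ends the transaction
> >> >>> >>> >> >>     s1.query(Foo).filter_by(bar=1).first()  # new transaction sees the row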
> >> >>> >>> >> >> 
> >> >>> >>> >> >> Simon 
> >> >>> >>> >> >> 
> >> >>> >>> >> >> On Mon, Sep 2, 2013 at 12:54 PM, herzaso <her...@gmail.com> 
> >> >>> >>> >> >> wrote: 
> >> >>> >>> >> >> > I'm not sure what to make of the results: 
> >> >>> >>> >> >> > On the first connection, I ran BEGIN and INSERT and both 
> >> >>> >>> >> >> > were successful, but when I tried the INSERT statement on 
> >> >>> >>> >> >> > the second connection, I got "ERROR 1205 (HY000): Lock 
> >> >>> >>> >> >> > wait timeout exceeded; try restarting transaction". 
> >> >>> >>> >> >> > Running the same INSERT on the first connection produced 
> >> >>> >>> >> >> > the expected result, which is "ERROR 1062 (23000): 
> >> >>> >>> >> >> > Duplicate entry". 
> >> >>> >>> >> >> > After the ROLLBACK on the first connection, the INSERT 
> >> >>> >>> >> >> > statement worked well on the second connection. 
> >> >>> >>> >> >> > 
> >> >>> >>> >> >> > Regarding your second remark, the answer is yes, the 
> >> >>> >>> >> >> > error was due to the unique constraint on those columns. 
> >> >>> >>> >> >> > 
> >> >>> >>> >> >> > BTW: I'm working on MySQL. 
> >> >>> >>> >> >> > 
> >> >>> >>> >> >> > On Monday, September 2, 2013 1:31:12 PM UTC+3, Simon King 
> >> >>> >>> >> >> > wrote: 
> >> >>> >>> >> >> >> 
> >> >>> >>> >> >> >> I don't really know the answer, but I'd be interested in 
> >> >>> >>> >> >> >> the results of this experiment: 
> >> >>> >>> >> >> >> 
> >> >>> >>> >> >> >> Forget about SQLAlchemy for the moment, and start 2 
> >> >>> >>> >> >> >> plain SQL connections to your database. In the first, 
> >> >>> >>> >> >> >> type something like the 
> >> >>> >>> >> >> >> following: 
> >> >>> >>> >> >> >> 
> >> >>> >>> >> >> >> BEGIN; 
> >> >>> >>> >> >> >> INSERT INTO foo (bar, baz, qux) VALUES (1, 1, 1); 
> >> >>> >>> >> >> >> 
> >> >>> >>> >> >> >> Now in the second connection do the same. I assume it'll 
> >> >>> >>> >> >> >> fail because of the duplicate values. 
> >> >>> >>> >> >> >> 
> >> >>> >>> >> >> >> Now in the first connection issue a "ROLLBACK". You 
> >> >>> >>> >> >> >> should now be in a state where no matching row exists in 
> >> >>> >>> >> >> >> the database, even though you received an error about 
> >> >>> >>> >> >> >> constraint violations. 
> >> >>> >>> >> >> >> 
> >> >>> >>> >> >> >> The results you see may be different, depending on your 
> >> >>> >>> >> >> >> transaction isolation level. (It may be that you don't 
> >> >>> >>> >> >> >> get the constraint violation at all until you try to 
> >> >>> >>> >> >> >> commit the second connection). 
> >> >>> >>> >> >> >> 
> >> >>> >>> >> >> >> Another thing you could look at: are you sure that the 
> >> >>> >>> >> >> >> error you are getting is due to the unique constraint on 
> >> >>> >>> >> >> >> bar/baz/qux, and not some other constraint in the 
> >> >>> >>> >> >> >> database? 
> >> >>> >>> >> >> >> 
> >> >>> >>> >> >> >> Simon 
> >> >>> >>> >> >> >> 
> >> >>> >>> >> >> >> On Mon, Sep 2, 2013 at 8:45 AM, herzaso 
> >> >>> >>> >> >> >> <her...@gmail.com> 
> >> >>> >>> >> >> >> wrote: 
> >> >>> >>> >> >> >> > I'm afraid it didn't solve my problem. 
> >> >>> >>> >> >> >> > 
> >> >>> >>> >> >> >> > Here is my updated method: 
> >> >>> >>> >> >> >> >     @classmethod
> >> >>> >>> >> >> >> >     def get(cls, bar=None, baz=None, qux=None, **kwargs):
> >> >>> >>> >> >> >> >         query = session.query(cls).\
> >> >>> >>> >> >> >> >             filter(cls.bar == bar).\
> >> >>> >>> >> >> >> >             filter(cls.baz == baz).\
> >> >>> >>> >> >> >> >             filter(cls.qux == qux)
> >> >>> >>> >> >> >> > 
> >> >>> >>> >> >> >> >         item = query.first()
> >> >>> >>> >> >> >> >         updated = False
> >> >>> >>> >> >> >> > 
> >> >>> >>> >> >> >> >         if not item:
> >> >>> >>> >> >> >> >             try:
> >> >>> >>> >> >> >> >                 with session.begin_nested():  # run inside a SAVEPOINT
> >> >>> >>> >> >> >> >                     updated = True
> >> >>> >>> >> >> >> >                     item = cls(bar=bar, baz=baz, qux=qux, **kwargs)
> >> >>> >>> >> >> >> >                     session.add(item)
> >> >>> >>> >> >> >> >                     session.flush()
> >> >>> >>> >> >> >> >             except sa.exc.IntegrityError:
> >> >>> >>> >> >> >> >                 item = query.first()
> >> >>> >>> >> >> >> >                 if not item:
> >> >>> >>> >> >> >> >                     raise Exception("invalidIntegrityError")
> >> >>> >>> >> >> >> > 
> >> >>> >>> >> >> >> >         if not updated:
> >> >>> >>> >> >> >> >             for k, v in kwargs.iteritems():
> >> >>> >>> >> >> >> >                 if getattr(item, k) != v:
> >> >>> >>> >> >> >> >                     setattr(item, k, v)
> >> >>> >>> >> >> >> > 
> >> >>> >>> >> >> >> >         return item
> >> >>> >>> >> >> >> > 
> >> >>> >>> >> >> >> > With this code, I'm getting invalidIntegrityError. How is 
> >> >>> >>> >> >> >> > that possible? 
> >> >>> >>> >> >> >> > (It's also worth pointing out that this solution requires 
> >> >>> >>> >> >> >> > SA 0.8.2; otherwise, there is a problem with 
> >> >>> >>> >> >> >> > session.begin_nested.) 
> >> >>> >>> >> >> >> > 
> >> >>> >>> >> >> >> > 
> >> >>> >>> >> >> >> > On Tuesday, August 27, 2013 6:40:03 PM UTC+3, Michael 
> >> >>> >>> >> >> >> > Bayer 
> >> >>> >>> >> >> >> > wrote: 
> >> >>> >>> >> >> >> >> 
> >> >>> >>> >> >> >> >> I'm not a fan of catching integrity errors; I prefer 
> >> >>> >>> >> >> >> >> to try to make sure they aren't going to happen, or 
> >> >>> >>> >> >> >> >> if they are, they aren't a normal occurrence and the 
> >> >>> >>> >> >> >> >> system is such that the particular operation can just 
> >> >>> >>> >> >> >> >> fail (of course it depends on what it is). A problem 
> >> >>> >>> >> >> >> >> with catching the integrity error due to concurrent, 
> >> >>> >>> >> >> >> >> conflicting operations is that depending on backend 
> >> >>> >>> >> >> >> >> and isolation level, you can't be totally sure when 
> >> >>> >>> >> >> >> >> the error is going to get raised (e.g. serializable 
> >> >>> >>> >> >> >> >> isolation vs. non). Also on a backend like Postgresql, 
> >> >>> >>> >> >> >> >> the database can't recover the transaction after an 
> >> >>> >>> >> >> >> >> integrity error unless you used a savepoint. 
> >> >>> >>> >> >> >> >> 
> >> >>> >>> >> >> >> >> But here you're doing the "concurrent transactions 
> >> >>> >>> >> >> >> >> need row identity X" pattern, so maybe it is 
> >> >>> >>> >> >> >> >> appropriate here. Here is a rough idea of a 
> >> >>> >>> >> >> >> >> transactional pattern for that, noting this isn't 
> >> >>> >>> >> >> >> >> tested: 
> >> >>> >>> >> >> >> >> 
> >> >>> >>> >> >> >> >> from sqlalchemy.exc import IntegrityError
> >> >>> >>> >> >> >> >> from sqlalchemy.orm.exc import NoResultFound
> >> >>> >>> >> >> >> >> 
> >> >>> >>> >> >> >> >> try:
> >> >>> >>> >> >> >> >>     my_object = Session.query(MyClass).filter(....).one()
> >> >>> >>> >> >> >> >> except NoResultFound:
> >> >>> >>> >> >> >> >>     try:
> >> >>> >>> >> >> >> >>         with Session.begin_nested():  # run inside a SAVEPOINT
> >> >>> >>> >> >> >> >>             my_object = MyClass(...)
> >> >>> >>> >> >> >> >>             Session.add(my_object)
> >> >>> >>> >> >> >> >>             Session.flush()
> >> >>> >>> >> >> >> >>     except IntegrityError:
> >> >>> >>> >> >> >> >>         my_object = Session.query(MyClass).filter(....).one()
> >> >>> >>> >> >> >> >> 
> >> >>> >>> >> >> >> >> 
> >> >>> >>> >> >> >> >> 
> >> >>> >>> >> >> >> >> 
> >> >>> >>> >> >> >> >> 
> >> >>> >>> >> >> >> >> On Aug 27, 2013, at 11:13 AM, herzaso 
> >> >>> >>> >> >> >> >> <her...@gmail.com> 
> >> >>> >>> >> >> >> >> wrote: 
> >> >>> >>> >> >> >> >> 
> >> >>> >>> >> >> >> >> Suppose we are looking at a race condition: do you 
> >> >>> >>> >> >> >> >> also think this should be handled by catching the 
> >> >>> >>> >> >> >> >> IntegrityError? 
> >> >>> >>> >> >> >> >> If so, what should I do? Only flush and do the 
> >> >>> >>> >> >> >> >> operation again? 
> >> >>> >>> >> >> >> >> 
> >> >>> >>> >> >> >> >> On Tuesday, August 27, 2013 5:42:23 PM UTC+3, Michael 
> >> >>> >>> >> >> >> >> Bayer 
> >> >>> >>> >> >> >> >> wrote: 
> >> >>> >>> >> >> >> >>> 
> >> >>> >>> >> >> >> >>> The word "occasional" is very meaningful. It usually 
> >> >>> >>> >> >> >> >>> suggests race conditions. Then with the word 
> >> >>> >>> >> >> >> >>> "tornado", the Bayesian filters are strongly leaning 
> >> >>> >>> >> >> >> >>> towards "race condition" at that point :). 
> >> >>> >>> >> >> >> >>> 
> >> >>> >>> >> >> >> >>> If an error is occurring only under volume, then you 
> >> >>> >>> >> >> >> >>> have to revisit where race conditions can occur. 
> >> >>> >>> >> >> >> >>> 
> >> >>> >>> >> >> >> >>> On Aug 27, 2013, at 10:32 AM, herzaso 
> >> >>> >>> >> >> >> >>> <her...@gmail.com> 
> >> >>> >>> >> >> >> >>> wrote: 
> >> >>> >>> >> >> >> >>> 
> >> >>> >>> >> >> >> >>> I'm running a Tornado server without redundancy 
> >> >>> >>> >> >> >> >>> (only one process; requests can arrive at the same 
> >> >>> >>> >> >> >> >>> time but will be handled one at a time). 
> >> >>> >>> >> >> >> >>> I do agree that for large volumes, catching the 
> >> >>> >>> >> >> >> >>> IntegrityError would be better, but currently I am 
> >> >>> >>> >> >> >> >>> handling a single request at a time and I want to 
> >> >>> >>> >> >> >> >>> fix this problem before I move on ... 
> >> >>> >>> >> >> >> >>> 
> >> >>> >>> >> >> >> >>> 
> >> >>> >>> >> >> >> >>> On Tuesday, August 27, 2013 5:24:07 PM UTC+3, Simon 
> >> >>> >>> >> >> >> >>> King 
> >> >>> >>> >> >> >> >>> wrote: 
> >> >>> >>> >> >> >> >>>> 
> >> >>> >>> >> >> >> >>>> On Tue, Aug 27, 2013 at 2:31 PM, herzaso 
> >> >>> >>> >> >> >> >>>> <her...@gmail.com> 
> >> >>> >>> >> >> >> >>>> wrote: 
> >> >>> >>> >> >> >> >>>> > On Tuesday, August 27, 2013 3:55:50 PM UTC+3, 
> >> >>> >>> >> >> >> >>>> > Simon King 
> >> >>> >>> >> >> >> >>>> > wrote: 
> >> >>> >>> >> >> >> >>>> >> 
> >> >>> >>> >> >> >> >>>> >> On Tue, Aug 27, 2013 at 1:40 PM, herzaso 
> >> >>> >>> >> >> >> >>>> >> <her...@gmail.com> 
> >> >>> >>> >> >> >> >>>> >> wrote: 
> >> >>> >>> >> >> >> >>>> >> > I have a model with an ID column set as the 
> >> >>> >>> >> >> >> >>>> >> > primary key, though I'd like to be able to 
> >> >>> >>> >> >> >> >>>> >> > identify records by 3 other columns. 
> >> >>> >>> >> >> >> >>>> >> > For this situation, I've added a classmethod 
> >> >>> >>> >> >> >> >>>> >> > that will fetch the record if found, or a new 
> >> >>> >>> >> >> >> >>>> >> > record if not. 
> >> >>> >>> >> >> >> >>>> >> > The problem I'm having is that every once in a 
> >> >>> >>> >> >> >> >>>> >> > while, I get an IntegrityError trying to flush 
> >> >>> >>> >> >> >> >>>> >> > a change: 
> >> >>> >>> >> >> >> >>>> >> > 
> >> >>> >>> >> >> >> >>>> >> > class Foo(Base):
> >> >>> >>> >> >> >> >>>> >> >     __table_args__ = (sa.UniqueConstraint('bar', 'baz', 'qux'),)
> >> >>> >>> >> >> >> >>>> >> > 
> >> >>> >>> >> >> >> >>>> >> >     id = sa.Column(Identifier,
> >> >>> >>> >> >> >> >>>> >> >                    sa.Sequence('%s_id_seq' % __tablename__),
> >> >>> >>> >> >> >> >>>> >> >                    nullable=False, primary_key=True)
> >> >>> >>> >> >> >> >>>> >> >     bar = sa.Column(sa.BigInteger)
> >> >>> >>> >> >> >> >>>> >> >     baz = sa.Column(sa.BigInteger)
> >> >>> >>> >> >> >> >>>> >> >     qux = sa.Column(sa.BigInteger)
> >> >>> >>> >> >> >> >>>> >> >     a1 = sa.Column(sa.BigInteger)
> >> >>> >>> >> >> >> >>>> >> >     a2 = sa.Column(sa.BigInteger)
> >> >>> >>> >> >> >> >>>> >> > 
> >> >>> >>> >> >> >> >>>> >> >     @classmethod
> >> >>> >>> >> >> >> >>>> >> >     def get(cls, bar=None, baz=None, qux=None, **kwargs):
> >> >>> >>> >> >> >> >>>> >> >         item = session.query(cls).\
> >> >>> >>> >> >> >> >>>> >> >             filter(cls.bar == bar).\
> >> >>> >>> >> >> >> >>>> >> >             filter(cls.baz == baz).\
> >> >>> >>> >> >> >> >>>> >> >             filter(cls.qux == qux).\
> >> >>> >>> >> >> >> >>>> >> >             first()
> >> >>> >>> >> >> >> >>>> >> > 
> >> >>> >>> >> >> >> >>>> >> >         if item:
> >> >>> >>> >> >> >> >>>> >> >             for k, v in kwargs.iteritems():
> >> >>> >>> >> >> >> >>>> >> >                 if getattr(item, k) != v:
> >> >>> >>> >> >> >> >>>> >> >                     setattr(item, k, v)
> >> >>> >>> >> >> >> >>>> >> >         else:
> >> >>> >>> >> >> >> >>>> >> >             item = cls(bar=bar, baz=baz, qux=qux, **kwargs)
> >> >>> >>> >> >> >> >>>> >> > 
> >> >>> >>> >> >> >> >>>> >> >         return item
> >> >>> >>> >> >> >> >>>> >> > 
> >> >>> >>> >> >> >> >>>> >> > This is the code I use to add/update records: 
> >> >>> >>> >> >> >> >>>> >> > 
> >> >>> >>> >> >> >> >>>> >> > foo = Foo.get(**item) 
> >> >>> >>> >> >> >> >>>> >> > session.merge(foo) 
> >> >>> >>> >> >> >> >>>> >> > 
> >> >>> >>> >> >> >> >>>> >> > I've been struggling with this problem for some 
> >> >>> >>> >> >> >> >>>> >> > time now, and would appreciate any help ... 
> >> >>> >>> >> >> >> >>>> >> > 
> >> >>> >>> >> >> >> >>>> >> 
> >> >>> >>> >> >> >> >>>> >> I'm not sure of the exact problem, but there are 
> >> >>> >>> >> >> >> >>>> >> a couple of things that you could investigate. 
> >> >>> >>> >> >> >> >>>> >> 
> >> >>> >>> >> >> >> >>>> >> Firstly, session.merge returns a copy of the 
> >> >>> >>> >> >> >> >>>> >> object, rather than adding the object that you 
> >> >>> >>> >> >> >> >>>> >> supplied into the session. See 
> >> >>> >>> >> >> >> >>>> >> http://docs.sqlalchemy.org/en/rel_0_8/orm/session.html#merging 
> >> >>> >>> >> >> >> >>>> >> for details. 
> >> >>> >>> >> >> >> >>>> >> 
> >> >>> >>> >> >> >> >>>> >> Secondly, your "get" method sometimes returns 
> >> >>> >>> >> >> >> >>>> >> objects that are already part of the session (if 
> >> >>> >>> >> >> >> >>>> >> they were in the database), and sometimes objects 
> >> >>> >>> >> >> >> >>>> >> that are not in the session. It would probably be 
> >> >>> >>> >> >> >> >>>> >> more consistent to always return objects that are 
> >> >>> >>> >> >> >> >>>> >> part of the session, by putting "session.add(item)" 
> >> >>> >>> >> >> >> >>>> >> in your "else" clause. This would get rid of the 
> >> >>> >>> >> >> >> >>>> >> need for session.merge(). (If you want to be able 
> >> >>> >>> >> >> >> >>>> >> to use the "get" with non-global sessions, pass 
> >> >>> >>> >> >> >> >>>> >> the session as a parameter.) 
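> >> >>> >>> >> >> >> >>>> >> 
> >> >>> >>> >> >> >> >>>> >> For example, a rough sketch (not tested) combining 
> >> >>> >>> >> >> >> >>>> >> both suggestions, with the session passed in as a 
> >> >>> >>> >> >> >> >>>> >> parameter and session.add in the "else" clause: 
> >> >>> >>> >> >> >> >>>> >> 
> >> >>> >>> >> >> >> >>>> >>     @classmethod
> >> >>> >>> >> >> >> >>>> >>     def get(cls, session, bar=None, baz=None, qux=None, **kwargs):
> >> >>> >>> >> >> >> >>>> >>         item = session.query(cls).\
> >> >>> >>> >> >> >> >>>> >>             filter(cls.bar == bar).\
> >> >>> >>> >> >> >> >>>> >>             filter(cls.baz == baz).\
> >> >>> >>> >> >> >> >>>> >>             filter(cls.qux == qux).\
> >> >>> >>> >> >> >> >>>> >>             first()
> >> >>> >>> >> >> >> >>>> >> 
> >> >>> >>> >> >> >> >>>> >>         if item:
> >> >>> >>> >> >> >> >>>> >>             for k, v in kwargs.iteritems():
> >> >>> >>> >> >> >> >>>> >>                 if getattr(item, k) != v:
> >> >>> >>> >> >> >> >>>> >>                     setattr(item, k, v)
> >> >>> >>> >> >> >> >>>> >>         else:
> >> >>> >>> >> >> >> >>>> >>             item = cls(bar=bar, baz=baz, qux=qux, **kwargs)
> >> >>> >>> >> >> >> >>>> >>             session.add(item)  # item is now tracked by the session
> >> >>> >>> >> >> >> >>>> >> 
> >> >>> >>> >> >> >> >>>> >>         return item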
> >> >>> >>> >> >> >> >>>> >> 
> >> >>> >>> >> >> >> >>>> >> Finally, if your session isn't auto-flushing, it 
> >> >>> >>> >> >> >> >>>> >> would be possible for you to call "get" twice 
> >> >>> >>> >> >> >> >>>> >> with the same parameters and get 2 different 
> >> >>> >>> >> >> >> >>>> >> objects back. 
> >> >>> >>> >> >> >> >>>> >> 
> >> >>> >>> >> >> >> >>>> >> You may want to look at the UniqueObject recipe 
> >> >>> >>> >> >> >> >>>> >> in the wiki: 
> >> >>> >>> >> >> >> >>>> >> 
> >> >>> >>> >> >> >> >>>> >> http://www.sqlalchemy.org/trac/wiki/UsageRecipes/UniqueObject 
> >> >>> >>> >> >> >> >>>> >> 
> >> >>> >>> >> >> >> >>>> > Hi Simon, 
> >> >>> >>> >> >> >> >>>> > Thanks for the fast reply. 
> >> >>> >>> >> >> >> >>>> > 
> >> >>> >>> >> >> >> >>>> > I tried adding session.add(item) and 
> >> >>> >>> >> >> >> >>>> > session.flush() in the else clause in the past, 
> >> >>> >>> >> >> >> >>>> > but that didn't solve my problem. 
> >> >>> >>> >> >> >> >>>> > I didn't, however, remove the merge; do you think 
> >> >>> >>> >> >> >> >>>> > that might be the problem? 
> >> >>> >>> >> >> >> >>>> > 
> >> >>> >>> >> >> >> >>>> > Regarding the flush, this code is part of an API 
> >> >>> >>> >> >> >> >>>> > server where a scoped_session is committed after 
> >> >>> >>> >> >> >> >>>> > each change. I haven't changed the autoflush 
> >> >>> >>> >> >> >> >>>> > parameter, and as I understand it, the default 
> >> >>> >>> >> >> >> >>>> > value is True, making a flush happen before each 
> >> >>> >>> >> >> >> >>>> > commit or query. 
> >> >>> >>> >> >> >> >>>> > 
> >> >>> >>> >> >> >> >>>> > As for the UniqueObject recipe, thanks! Amazing 
> >> >>> >>> >> >> >> >>>> > that I never found it while searching for a cure. 
> >> >>> >>> >> >> >> >>>> > As far as I can see, it basically does the same 
> >> >>> >>> >> >> >> >>>> > thing ... 
> >> >>> >>> >> >> >> >>>> > 
> >> >>> >>> >> >> >> >>>> > I never managed to reproduce this bug in my 
> >> >>> >>> >> >> >> >>>> > development environment. It only happens in my 
> >> >>> >>> >> >> >> >>>> > production environment. 
> >> >>> >>> >> >> >> >>>> > Do you suppose adding a session.add and removing 
> >> >>> >>> >> >> >> >>>> > the merge will solve this issue? 
> >> >>> >>> >> >> >> >>>> > 
> >> >>> >>> >> >> >> >>>> > Thanks, 
> >> >>> >>> >> >> >> >>>> > Ofir 
> >> >>> >>> >> >> >> >>>> 
> >> >>> >>> >> >> >> >>>> It's difficult to say without knowing more about 
> >> >>> >>> >> >> >> >>>> your system. For example, does your production 
> >> >>> >>> >> >> >> >>>> system get multiple concurrent API requests, or are 
> >> >>> >>> >> >> >> >>>> they serialised? If 2 requests can come in at 
> >> >>> >>> >> >> >> >>>> approximately the same time and are handled by 2 
> >> >>> >>> >> >> >> >>>> different threads (or processes), then it is easy 
> >> >>> >>> >> >> >> >>>> to imagine that the first handler will check the 
> >> >>> >>> >> >> >> >>>> database, find that an entry doesn't exist, and 
> >> >>> >>> >> >> >> >>>> create it. But before it flushes the change to the 
> >> >>> >>> >> >> >> >>>> database (or even after it flushes, but before it 
> >> >>> >>> >> >> >> >>>> commits, depending on your transaction isolation), 
> >> >>> >>> >> >> >> >>>> the second handler will check for the same object, 
> >> >>> >>> >> >> >> >>>> find it missing, and so create it. 
> >> >>> >>> >> >> >> >>>> 
> >> >>> >>> >> >> >> >>>> To track down problems like this, you could ensure 
> >> >>> >>> >> >> >> >>>> that your development environment has the same 
> >> >>> >>> >> >> >> >>>> thread/process behaviour as the production 
> >> >>> >>> >> >> >> >>>> environment, then try submitting multiple 
> >> >>> >>> >> >> >> >>>> concurrent requests to it. If you add "time.sleep" 
> >> >>> >>> >> >> >> >>>> statements somewhere between the creation of the 
> >> >>> >>> >> >> >> >>>> object and the commit of the transaction, you will 
> >> >>> >>> >> >> >> >>>> probably find it easier to trigger. 
> >> >>> >>> >> >> >> >>>> 
> >> >>> >>> >> >> >> >>>> To actually fix the problem, you could choose to 
> >> >>> >>> >> >> >> >>>> only handle a single request at a time (fine if you 
> >> >>> >>> >> >> >> >>>> don't expect a high volume of requests). If that's 
> >> >>> >>> >> >> >> >>>> not acceptable, you could catch the IntegrityError 
> >> >>> >>> >> >> >> >>>> and then re-process the request. 
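> >> >>> >>> >> >> >> >>>> 
> >> >>> >>> >> >> >> >>>> For instance (a sketch, not tested; process_request 
> >> >>> >>> >> >> >> >>>> is a hypothetical stand-in for your handler logic): 
> >> >>> >>> >> >> >> >>>> 
> >> >>> >>> >> >> >> >>>>     from sqlalchemy.exc import IntegrityError
> >> >>> >>> >> >> >> >>>> 
> >> >>> >>> >> >> >> >>>>     try:
> >> >>> >>> >> >> >> >>>>         process_request()
> >> >>> >>> >> >> >> >>>>         session.commit()
> >> >>> >>> >> >> >> >>>>     except IntegrityError:
> >> >>> >>> >> >> >> >>>>         session.rollback()
> >> >>> >>> >> >> >> >>>>         process_request()  # re-process once against fresh state
> >> >>> >>> >> >> >> >>>>         session.commit()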
> >> >>> >>> >> >> >> >>>> 
> >> >>> >>> >> >> >> >>>> Hope that helps, 
> >> >>> >>> >> >> >> >>>> 
> >> >>> >>> >> >> >> >>>> Simon 
> >> >>> >>> >> >> >> >>> 