On Tue, Aug 27, 2013 at 2:31 PM, herzaso <herz...@gmail.com> wrote:
> On Tuesday, August 27, 2013 3:55:50 PM UTC+3, Simon King wrote:
>>
>> On Tue, Aug 27, 2013 at 1:40 PM, herzaso <her...@gmail.com> wrote:
>> > I have a model with an ID column set as the primary key, though i'd like
>> > to
>> > be able to identify records by 3 other columns.
>> > For this situation, I've added a classmethod that will fetch the record
>> > if
>> > found or a new record if not.
>> > The problem i'm having is that every once in a while, I get
>> > IntegrityError
>> > trying to flush a change
>> >
>> > class Foo(Base):
>> >     __table_args__ = (sa.UniqueConstraint('bar', 'baz', 'qux'),)
>> >
>> >     id = sa.Column(Identifier, sa.Sequence('%s_id_seq' % __tablename__),
>> > nullable=False, primary_key=True)
>> >     bar = sa.Column(sa.BigInteger)
>> >     baz = sa.Column(sa.BigInteger)
>> >     qux = sa.Column(sa.BigInteger)
>> >     a1 = sa.Column(sa.BigInteger)
>> >     a2 = sa.Column(sa.BigInteger)
>> >
>> >     @classmethod
>> >     def get(cls, bar=None, baz=None, qux=None, **kwargs):
>> >         item = session.query(cls).\
>> >             filter(cls.bar== bar).\
>> >             filter(cls.baz == baz).\
>> >             filter(cls.qux == qux).\
>> >             first()
>> >
>> >         if item:
>> >             for k, v in kwargs.iteritems():
>> >                 if getattr(item, k) != v:
>> >                     setattr(item, k, v)
>> >         else:
>> >             item = cls(bar=bar, baz=baz, qux=qux, **kwargs)
>> >
>> >         return item
>> >
>> > This is the code I use to add/update records:
>> >
>> > foo = Foo.get(**item)
>> > session.merge(foo)
>> >
>> > I'm struggling with this problem for some time now, and would appreciate
>> > any
>> > help ...
>> >
>>
>> I'm not sure of the exact problem, but there are a couple of things
>> that you could investigate.
>>
>> Firstly, session.merge returns a copy of the object, rather than
>> adding the object that you supplied into the session. See
>> http://docs.sqlalchemy.org/en/rel_0_8/orm/session.html#merging for
>> details.
>>
>> Secondly, your "get" method sometimes returns objects that are already
>> part of the session (if they were in the database), and sometimes
>> objects that are not in the session. It would probably be more
>> consistent to always return objects that are part of the session, by
>> putting "session.add(item)" in your "else" clause. This would get rid
>> of the need for session.merge(). (If you want to be able to use the
>> "get" with non-global sessions, pass the session as a parameter.)
>>
>> Finally, if your session isn't auto-flushing, it would be possible for
>> you to call "get" twice with the same parameters and get 2 different
>> objects back.
>>
>> You may want to look at the UniqueObject recipe in the wiki:
>> http://www.sqlalchemy.org/trac/wiki/UsageRecipes/UniqueObject
>>
> Hi Simon,
> Thanks for the fast reply.
>
> I tried adding session.add(item) and session.flush() in the else clause in
> the past but that didn't solve my problem.
> I didn't however remove the merge, do you think that might be the problem?
>
> Regarding the flush, this code is part of an API server where a
> scoped_session is committed after each change. I haven't changed the
> autoflush parameter, and as I understand the default value is True making a
> flush before each commit or query.
>
> As for the UniqueObject recipe, thanks! Amazing that I never found it
> searching for a cure. As I see it basically does the same ...
>
> I never managed to reproduce this bug on my development environment. It only
> happens in my production environment.
> Do you suppose adding a session.add and removing the merge will solve this
> issue?
>
> Thanks,
> Ofir

It's difficult to say without knowing more about your system. For
example, does your production system get multiple concurrent API
requests, or are they serialised? If 2 requests can come in at
approximately the same time and are handled by 2 different threads (or
processes), then it is easy to imagine that the first handler will
check the database, find that an entry doesn't exist, and create it.
But before it flushes the change to the database (or even after it
flushes, but before it commits, depending on your transaction
isolation), the second handler will check for the same object, find it
missing, and so create it.

To track down problems like this, you could ensure that your
development environment has the same thread/process behaviour as the
production environment, then try submitting multiple concurrent
requests to it. If you add "time.sleep" statements somewhere between
the creation of the object and the commit of the transaction you will
probably find it easier to trigger.

To actually fix the problem, you could choose to only handle a single
request at a time (fine if you don't expect a high volume of
requests). If that's not acceptable, you could catch the
IntegrityError and then re-process the request.

Hope that helps,

Simon

-- 
You received this message because you are subscribed to the Google Groups 
"sqlalchemy" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to sqlalchemy+unsubscr...@googlegroups.com.
To post to this group, send email to sqlalchemy@googlegroups.com.
Visit this group at http://groups.google.com/group/sqlalchemy.
For more options, visit https://groups.google.com/groups/opt_out.

Reply via email to