Re: [sqlalchemy] Session.add performance

Michael Bayer Thu, 16 Dec 2010 08:10:42 -0800

On Dec 16, 2010, at 10:31 AM, Michael Bayer wrote:

> 
> On Dec 16, 2010, at 12:39 AM, Julian Scheid wrote:
> 
>> In an application that is heavy on inserts and updates, cProfile
>> output is dominated by Session.add in which about 45% of time is
>> spent. Most of that time, in turn, is spent in cascade_iterator (43%).
>> I can provide more detailed information if needed.
>> 
>> The application does aggressive caching of data and has set
>> expire_on_commit=False, in order to keep database load down. Is that
>> the reason for Session.add slowness?
>> 
>> Is there a way I can speed this up while keeping a similar level of
>> cache aggressiveness?
>> 
>> For example, in one test run Session.__contains__ was invoked 25m
>> times over the course of only a few minutes, accounting for 27% of
>> total time spent.  Could it be a good idea to try and override this
>> function with one that's optimized for this specific use case?
>> 
>> Also, so far I haven't spent any effort expunging objects from the
>> session as soon as possible.  Some objects might linger for longer
>> than necessary.  Would they contribute to Session.add's overhead?
> 
> A major part of development resources as of late has been focused on add() 
> and cascade_iterator().   I would advise trying out the 0.7 tip from 
> mercurial, where we've cut a lot of overhead out of many areas of the 
> flush, including add() + cascade_iterator (see 
> http://techspot.zzzeek.org/2010/12/12/a-tale-of-three-profiles/ for some 
> profiling output).  
> 
> Things like inlining Session.__contains__ are good ideas if they are shown to 
> be prominent in a slow profile, so if you want to send along a test script to 
> me that illustrates your bottlenecks I can work on its pain points and add it 
> to our suite.


I uploaded runsnakerun details for add() + cascade_iterator from the current 
0.6.6 and 0.7 tips, run against the test program from that post.   It's a total 
of 11,000 add() calls.  0.7 is on the top.   0.7's larger percentage overall is 
due to performance increases elsewhere.   The fact that it's a lower percentage 
than your case is probably due to the lower number of relationship()s in the 
test script, since cascade_iterator() increases in time for each relationship.

http://imgur.com/a/SNkhq
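
For anyone who wants to put together the kind of test script asked for above, 
here is a minimal sketch of pulling call counts and time shares for specific 
functions out of a cProfile run.  It uses only the standard library.  The 
names profile_report, FakeSession and workload are made up for illustration; 
FakeSession is a toy stand-in and is NOT SQLAlchemy's Session, so the numbers 
it produces say nothing about SQLAlchemy itself.

```python
import cProfile
import pstats


def profile_report(workload, func_names):
    """Run workload() under cProfile and summarize the named functions.

    Returns {name: {"calls": total call count,
                    "share": cumulative time / total profiled time}}.
    Functions are matched by bare name only, so same-named functions
    from different modules are summed together.
    """
    profiler = cProfile.Profile()
    profiler.enable()
    try:
        workload()
    finally:
        profiler.disable()

    stats = pstats.Stats(profiler)
    total = stats.total_tt  # sum of internal time over all profiled functions
    report = {name: {"calls": 0, "share": 0.0} for name in func_names}

    # stats.stats maps (filename, lineno, funcname) ->
    # (primitive calls, total calls, internal time, cumulative time, callers)
    for (_file, _line, funcname), (_cc, nc, _tt, ct, _callers) in stats.stats.items():
        if funcname in report:
            report[funcname]["calls"] += nc
            if total > 0:
                report[funcname]["share"] += ct / total
    return report


class FakeSession:
    """Hypothetical stand-in for illustration only; not SQLAlchemy's Session."""

    def __init__(self):
        self._members = set()

    def __contains__(self, obj):
        return id(obj) in self._members

    def add(self, obj):
        if obj not in self:
            self._members.add(id(obj))


def workload():
    # 11,000 add() calls, matching the count mentioned above.
    session = FakeSession()
    objs = [object() for _ in range(11000)]
    for obj in objs:
        session.add(obj)


if __name__ == "__main__":
    for name, row in profile_report(workload, ["add", "__contains__"]).items():
        print("%-14s calls=%d share=%.1f%%" % (name, row["calls"], row["share"] * 100))
```

To use it against a real application, replace workload with a callable that 
performs your own add() calls and pass names such as "add", "cascade_iterator" 
and "__contains__".  Because "share" is based on cumulative time, nested 
functions overlap (time in __contains__ is also counted inside add), so the 
shares are not expected to sum to 100%.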




>> 
>> -- 
>> You received this message because you are subscribed to the Google Groups 
>> "sqlalchemy" group.
>> To post to this group, send email to sqlalch...@googlegroups.com.
>> To unsubscribe from this group, send email to 
>> sqlalchemy+unsubscr...@googlegroups.com.
>> For more options, visit this group at 
>> http://groups.google.com/group/sqlalchemy?hl=en.
>> 
