Ed Rahn <edsr...@gmail.com> wrote:

> 
> On 01/10/2015 04:04 PM, Michael Bayer wrote:
>> Ed Rahn <edsr...@gmail.com> wrote:
>> 
>>> On 01/10/2015 01:51 PM, Jonathan Vanasco wrote:
>>>> It's hard to tell what could cause the issue:
>>>> 
>>>> How many horses are there?
>>> 463827
>>>> What is going on in that other function?
>>> Just a regex search, two selects and a commit
>>>> Are there foreign key checks involved with the commit?
>>> Yep
>>> 
>>> 
>>> When I set up the DB to log all queries, I can see the commit happening in 
>>> less than a second, so I'm sure the slowness is somewhere between my function 
>>> and the DB.
>>> 
>>>> If you want to break a huge batch into smaller commits, there is the 
>>>> `Windowed Range Query` strategy -- 
>>>> https://bitbucket.org/zzzeek/sqlalchemy/wiki/UsageRecipes/WindowedRangeQuery
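>>>> 
>>>> Very roughly, the idea is something like this (just a sketch; "Horse" stands 
>>>> in for your mapped class and an integer primary key is assumed -- the actual 
>>>> recipe at that link computes the windows with a row_number() window function 
>>>> rather than fixed id ranges):
>>>> 
>>>>     from sqlalchemy import func
>>>> 
>>>>     window = 1000
>>>>     max_id = session.query(func.max(Horse.id)).scalar() or 0
>>>>     for start in range(0, max_id + 1, window):
>>>>         chunk = (session.query(Horse)
>>>>                  .filter(Horse.id >= start, Horse.id < start + window)
>>>>                  .all())
>>>>         for horse in chunk:
>>>>             do_something(horse)   # hypothetical per-row work
>>>>         session.commit()          # commit in small batches, keeping the Session small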
>>> That's kind of what I'm doing right now, but it just seems so hacky; I'd like 
>>> to find a proper solution to the problem. That page says it's for result sets 
>>> "too large to fetch all at once", but I have plenty of memory, so how can I 
>>> tell SQLA or psycopg2 to load all the results into memory?
>> first off, psycopg2 loads everything into memory at once in all cases, as 
>> does the SQLAlchemy ORM, unless specific steps are taken to not do this.    
>> For 460000 objects this will be *very* slow: that's a lot of rows to turn 
>> into fully persisted and session-tracked Python objects, and in Python even 
>> allocating the memory to hold them all takes appreciable time.
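>> 
>> (The "specific steps", roughly, are a server-side cursor plus yield_per -- 
>> something along these lines; note that yield_per has caveats with eager loading:)
>> 
>>     q = (session.query(Horse)
>>          .execution_options(stream_results=True)
>>          .yield_per(1000))
>>     for horse in q:
>>         pass   # handle each row; rows arrive in batches rather than all at once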
>> 
>> However, this is all before that commit(), which is where you claim the 
>> slowness is.   This seems doubtful, but if it really is in the commit() and 
>> not just the fetch, then something else would be going on.  If you have 
>> 400000 objects in memory, and they are all clean, session.commit() is not 
>> slow at all.   Unless you have some kind of event going on that is tracking 
>> them all, or you’re using a very old version of SQLAlchemy with mutable 
>> types, or something else that I could only guess (and I try not to spend 
>> time guessing on this list).
>> 
>> So before you begin trying to solve a problem, it’s best to understand the 
>> nature of the problem, so I’d suggest walking through the steps in 
>> SQLAlchemy’s own frequently asked questions (as this is certainly a question 
>> that is frequently asked, make no mistake!)  "How can I profile a SQLAlchemy 
>> powered application?" at 
>> http://docs.sqlalchemy.org/en/rel_0_9/faq/performance.html#how-can-i-profile-a-sqlalchemy-powered-application.
>>    This is the very best list of steps to fully analyze where things are 
>> going wrong.   In your case I think something is going to stand out under 
>> code profiling.   It may very well be something SQLAlchemy is doing that is 
>> taking too long; if we identify it, we can talk about strategies to mitigate 
>> whatever it might be (there are always ways to mitigate).
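>> 
>> (The heart of that recipe is just cProfile wrapped around the suspect block; 
>> roughly:)
>> 
>>     import cProfile, pstats
>> 
>>     pr = cProfile.Profile()
>>     pr.enable()
>>     session.commit()     # or whichever call appears to be the slow one
>>     pr.disable()
>>     pstats.Stats(pr).sort_stats("cumulative").print_stats(30)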
> I didn't get a chance to look at this until now. However, the commit after the 
> select never returns, so I don't know why that changed.
> 
> Profiling the code shows that most of the time is spent in _expire in state.py:
> https://github.com/zzzeek/sqlalchemy/blob/master/lib/sqlalchemy/orm/state.py#L360
> 
> Where should I go from here?

turn off expire_on_commit, and/or reduce the number of objects that are 
resident in the Session at one time (at least when the commit happens).
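
For example, roughly (assuming your Session comes from a sessionmaker and that 
"engine" is your Engine):

    from sqlalchemy.orm import sessionmaker

    Session = sessionmaker(bind=engine, expire_on_commit=False)

    # or, on an existing session:
    session.expire_on_commit = False

    # and/or expunge objects you no longer need before the commit,
    # so there is less for the Session to track:
    session.expunge_all()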

