When we want to test if a Python program has a "leak", we do that by checking how many uncollected objects are present. This is done via the gc module:
    import gc
    print("total number of objects:", len(gc.get_objects()))

That's the only real way to measure whether the memory used by Python objects is growing unbounded. Looking at the memory usage in "top" shows what the interpreter as a whole takes up - the CPython interpreter in more modern releases does release memory back to the OS, but only occasionally; older versions don't release it at all.

If you're doing an operation that loads thousands of rows, those rows are virtually always loaded entirely into memory by the DBAPI before your program or SQLAlchemy is ever given the chance to fetch a single row. I haven't yet looked closely at your case here, but that's often at the core of scripts that use much more memory than expected.

There are ways to get *some* DBAPIs to not do this (particularly psycopg2, if you're using Postgresql; see http://docs.sqlalchemy.org/en/latest/orm/query.html?highlight=yield_per#sqlalchemy.orm.query.Query.yield_per and http://docs.sqlalchemy.org/en/latest/core/connections.html?highlight=stream_results#sqlalchemy.engine.base.Connection.execution_options), though the better solution is usually to load chunks of records at a time (one such recipe that I use for this is here: http://www.sqlalchemy.org/trac/wiki/UsageRecipes/WindowedRangeQuery). Or better yet, consider whether the problem can be solved entirely on the SQL side (that depends entirely on exactly what you're trying to do with the data in question).

On Feb 22, 2012, at 9:46 AM, Vlad K. wrote:

> Okay, after several test cases, various join combinations with or without
> relationships, with or without cherrypicking columns that are really used
> from the joined models, I've come to the conclusion that the only problem
> I'm having here is that there is no garbage collection.
> Python memory use just keeps growing at a rate that, of course, depends on
> the size of models used and data queried, but it just keeps growing,
> regardless of release/deletion of instances or isolating one row's
> processing in its own committed transaction.
>
> I also found this:
>
> http://permalink.gmane.org/gmane.comp.python.sqlalchemy.user/30087
>
> So it appears I'm having the same problem.
>
> Am I understanding correctly that because of this, SQLAlchemy ORM is in my
> case useless if I have to process thousands of rows, because the memory used
> to process each row (along with corresponding joined models etc...) will not
> be released? So basically I'd have to use SQLA without the ORM, for this
> particular use case?
>
> Or is this some memory leak bug?
>
> If so, any suggestions or examples on how I switch from ORM use to non-ORM
> if I want to retain the named tuples returned by queries and avoid rewriting
> half the app?
>
> Thanks.
>
> .oO V Oo.
>
> --
> You received this message because you are subscribed to the Google Groups
> "sqlalchemy" group.
> To post to this group, send email to sqlalchemy@googlegroups.com.
> To unsubscribe from this group, send email to
> sqlalchemy+unsubscr...@googlegroups.com.
> For more options, visit this group at
> http://groups.google.com/group/sqlalchemy?hl=en.
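To make the gc check above concrete, here's a minimal self-contained sketch of watching the tracked-object count across a deliberately leaky operation. The `_cache` list and `leaky_operation` are made-up stand-ins for whatever your processing loop does; the point is only that `len(gc.get_objects())` rises and stays risen when references are retained:

```python
import gc

def count_objects():
    # Number of container objects the garbage collector is tracking.
    return len(gc.get_objects())

# Simulate a "leak": a module-level cache that is never cleared.
_cache = []

def leaky_operation(n):
    for i in range(n):
        _cache.append([i])  # each list is a gc-tracked container

baseline = count_objects()
leaky_operation(10000)
gc.collect()  # reclaim anything genuinely unreachable first
grown = count_objects()
print("before:", baseline, "after:", grown)
```

Run the check at the same point in each iteration of your main loop; a count that climbs steadily across iterations, even after `gc.collect()`, is the signature of retained references rather than interpreter overhead.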
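As for loading in chunks: the WindowedRangeQuery recipe linked above does the real work with SQLAlchemy constructs. The sketch below only illustrates the shape of the idea in plain Python, where `ROWS` and `fetch_window` are hypothetical stand-ins for a table and an ordered, LIMITed query such as `SELECT ... WHERE id > :last_id ORDER BY id LIMIT :size`:

```python
# Fake "table": 100 rows keyed by an ordered integer id.
ROWS = [{"id": i, "value": i * 2} for i in range(1, 101)]

def fetch_window(last_id, size):
    # Stand-in for an ordered, LIMITed SQL query starting past last_id.
    return [row for row in ROWS if row["id"] > last_id][:size]

def windowed_rows(size=25):
    # Yield rows window by window; only `size` rows are resident at a time.
    last_id = 0
    while True:
        window = fetch_window(last_id, size)
        if not window:
            break
        for row in window:
            yield row
        last_id = window[-1]["id"]

total = sum(row["value"] for row in windowed_rows())
print(total)  # → 10100
```

Because each window is fetched only after the previous one is exhausted, peak memory is bounded by the window size rather than the full result set - which is exactly what the DBAPI's fetch-everything-up-front behavior otherwise defeats.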