On Dec 3, 2011, at 2:31 PM, kris wrote: > I have a database migration using sqlalchemy migrate that needs to migrate > several million rows. > After running for several days, I see the job has grown to consume all > available memory and > is swapping. > > Can someone point me to a description of what is the best way to manage > memory in the session > during a long transaction? > i.e. > > 1. Does session.flush remove instances of the objects in the session > 2. Should I break the migration into several subtransactions? > 3. use expunge at different times?
You should at least be calling flush() every thousand records or so, and ensure that none of the objects involved in that flush remain strongly referenced afterwards. When an object is "clean", it is only weakly referenced by the Session so will fall out of scope naturally. The Session also maintains a list of objects that came into the current transaction as "new", so that if a rollback occurs, these can be evicted, but that list is also self-cleaning when the newly inserted objects are dereferenced. Making sure you flush() periodically and don't maintain references in memory should allow you an app that runs forever without memory growth. The other thing to watch out for is very large selects. Most DBAPIs fully buffer a result set in memory, and the ORM by default does this as well. If you select tens of thousands of rows in one result set, the memory usage of the Python interpreter will grow to fit that space, and then generally never be returned to the OS (CPython behavior). So try to avoid having very large, single result sets - there are ways to get an ORM result to "stream" but there are a bunch of limitations (only certain backends, can't use eager loading, etc.), it's better to avoid it. -- You received this message because you are subscribed to the Google Groups "sqlalchemy" group. To post to this group, send email to sqlalchemy@googlegroups.com. To unsubscribe from this group, send email to sqlalchemy+unsubscr...@googlegroups.com. For more options, visit this group at http://groups.google.com/group/sqlalchemy?hl=en.