[sqlalchemy] Removing a session (Really)

2011-06-26 Thread Warwick Prince
Hi Michael

I'm having an issue with memory usage that I would appreciate some insight into.

I have a fairly straightforward process that works perfectly as far as delivering the desired updates to the DB etc., however it accumulates memory (just like a leak) and I cannot find a way to release it.

I have a multi-threaded process.  Each thread creates a scoped session from the standard global Session = scoped_session(sessionmaker()) construct.  Each thread does some work using mapped objects to update some rows and add some rows to the (MySQL on Windows) DB.  All this works perfectly and as expected.  For various reasons, I flush/commit after each row is updated/inserted.  After the batch of updates is complete, I call session.remove() (in an attempt to dump the session), then wait a while and do the entire thing again.  At the start of each run I create a new session = Session(), do the updates, and finish with session.remove().
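
For reference, the pattern each thread follows is roughly the below - a cut-down sketch, not my real code; Record, the sqlite URL and the data are just placeholders:

    import threading
    from sqlalchemy import create_engine, Column, Integer, String
    from sqlalchemy.orm import scoped_session, sessionmaker
    from sqlalchemy.ext.declarative import declarative_base

    Base = declarative_base()

    class Record(Base):                          # placeholder for my real mapped classes
        __tablename__ = "record"
        id = Column(Integer, primary_key=True)
        value = Column(String(50))

    engine = create_engine("sqlite:///demo.db")  # placeholder; the real target is MySQL
    Base.metadata.create_all(engine)
    Session = scoped_session(sessionmaker(bind=engine))  # global registry shared by all threads

    def batch_worker(values):
        session = Session()                      # thread-local session from the registry
        for value in values:
            session.add(Record(value=value))
            session.commit()                     # flush/commit after each row, as described above
        Session.remove()                         # discard the thread-local session after the batch

    threads = [threading.Thread(target=batch_worker, args=(["a", "b", "c"],))
               for _ in range(2)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()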

My assumption would be that the memory is the session's cache of objects being managed - which I can understand.  What I can't understand is why, when I delete *everything* (e.g. del engine, del meta, del session) and even stop the thread, the memory is still consumed.  I must stop the entire process before the memory is returned to the system.  After around 10 hours of running I've used 2GB+ of memory and everything crashes.

BTW: I have created a version of my code that does everything EXCEPT the SA part(s), and no memory accumulates at all (just checking that it wasn't my own code causing the issue!).  i.e. it loops over the other database (non-SQL), reading all the data that I WOULD use to update the SQL database via SA.  When SA is not involved, nothing happens to the memory.

Any hints on how I can a) see what is being held, and b) dump it?

I'm using SQLAlchemy 0.6.3 and Python 2.6.3 on Windows.  BTW: I tried to update to the latest 0.7.x and Python 2.7.2, but that broke everything in a spectacular way - I'll leave that one for another day.

Cheers
Warwick
 




Re: [sqlalchemy] Removing a session (Really)

2011-06-26 Thread Michael Bayer

On Jun 26, 2011, at 4:31 AM, Warwick Prince wrote:

 Hi Michael
 
 I'm having an issue with memory usage that I would appreciate some insight into.
 
 I have a fairly straightforward process that works perfectly as far as delivering the desired updates to the DB etc., however it accumulates memory (just like a leak) and I cannot find a way to release it.
 
 I have a multi-threaded process.  Each thread creates a scoped session from the standard global Session = scoped_session(sessionmaker()) construct.  Each thread does some work using mapped objects to update some rows and add some rows to the (MySQL on Windows) DB.  All this works perfectly and as expected.  For various reasons, I flush/commit after each row is updated/inserted.  After the batch of updates is complete, I call session.remove() (in an attempt to dump the session), then wait a while and do the entire thing again.  At the start of each run I create a new session = Session(), do the updates, and finish with session.remove().
 
 My assumption would be that the memory is the session's cache of objects being managed - which I can understand.  What I can't understand is why, when I delete *everything* (e.g. del engine, del meta, del session) and even stop the thread, the memory is still consumed.  I must stop the entire process before the memory is returned to the system.  After around 10 hours of running I've used 2GB+ of memory and everything crashes.

By "the memory is still consumed", if you're talking about the memory of your process, that's Python's behavior - once the process's memory usage grows to X, it stays at X no matter what you dereference within the process.  So the key is to manage how large a collection ever gets filled up in the first place.  The only true measure of Python objects being leaked is the size of gc.get_objects().  If that size is managed, that's as far as Python code can go towards managing memory.

So I'm assuming you just mean the size of the process.  If you're dealing with large numbers of rows being loaded into memory, you'd need to cut down on the maximum number of objects loaded at once.
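
A rough sketch of what that measurement can look like (run_batch() here is just a stand-in for one pass of your update loop):

    import gc

    def run_batch():
        pass                       # stand-in for one pass of the update/insert loop

    def live_object_count():
        gc.collect()               # collect first so only truly live objects are counted
        return len(gc.get_objects())

    baseline = live_object_count()
    for i in range(10):
        run_batch()
        print("run %d: %d objects tracked by gc (baseline %d)"
              % (i, live_object_count(), baseline))

If that count levels off from run to run, Python isn't leaking objects, and the process size is just Python holding on to its high-water mark.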

The Session does not strongly reference anything, except for what is present in the .new and .dirty collections.  If those are empty, it is not strongly referencing anything, and as long as gc is enabled, the number of objects in memory will be managed.  Older versions of the Session in 0.4, 0.5 and such were not as good at this, but in 0.6 and 0.7 it's quite solid - there is a whole suite of unit tests that ensure SQLAlchemy components like Engine, Session, schema, etc. do not leak memory under a variety of setup/teardown situations.  But it seems like you don't yet know whether you're experiencing a problem at the Python object level.
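
One hedged way to check that at the object level - assuming a mapped class Record and a scoped_session registry called Session along the lines of the sketch earlier in the thread - is to watch whether a loaded instance can actually be collected once you let go of it:

    import gc
    import weakref

    session = Session()
    session.add(Record(value="x"))
    session.commit()

    obj = session.query(Record).first()
    print(session.new, session.dirty)   # both empty, so the Session holds no strong references

    ref = weakref.ref(obj)              # track the instance without keeping it alive
    del obj
    Session.remove()                    # throw away the thread-local session
    gc.collect()
    print(ref())                        # None means nothing else was keeping the row alive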



 i.e. it loops over the other database (non-SQL), reading all the data that I WOULD use to update the SQL database via SA.  When SA is not involved, nothing happens to the memory.

Note that DBAPIs, particularly older versions of MySQLdb, may have memory leaks of their own, and most DBAPIs, when asked to fetch a result, will load the full set of rows into memory before fetchone() is ever called, thus causing a large, unconditional increase in memory use if you are fetching very large result sets.
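
As a rough sketch of the "keep each fetch small" idea - again assuming a mapped class Record with an integer primary key id - you can read a large table in bounded primary-key chunks so that no single statement asks the DBAPI to buffer everything at once (the chunk size is arbitrary):

    CHUNK = 1000                       # arbitrary batch size

    def iter_records_in_chunks(session, chunk=CHUNK):
        """Yield Record rows in bounded batches so no single statement
        asks the DBAPI to buffer the entire table."""
        last_id = 0
        while True:
            rows = (session.query(Record)
                           .filter(Record.id > last_id)
                           .order_by(Record.id)
                           .limit(chunk)
                           .all())
            if not rows:
                break
            for row in rows:
                yield row
            last_id = rows[-1].id
            session.expunge_all()      # detach the processed batch so gc can reclaim it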

 
 Any hints on how I can a) see what is being held, and b) dump it?

gc.get_objects()





Re: [sqlalchemy] Removing a session (Really)

2011-06-26 Thread Warwick Prince
Excellent - thanks  :-)

Warwick
