On Wed, Feb 22, 2012 at 5:18 PM, Michael Bayer <mike...@zzzcomputing.com> wrote:
>
> On Feb 22, 2012, at 2:46 PM, Claudio Freire wrote:
>
>> On Wed, Feb 22, 2012 at 4:29 PM, Michael Bayer <mike...@zzzcomputing.com> wrote:
>>>> thanks for your reply. I haven't yet tested this with a profiler to
>>>> see exactly what is happening, but the bottom line is that the
>>>> overall memory use grows with each iteration (or transaction
>>>> processed), to the point of grinding the server to a halt, and top
>>>> shows only the Python process involved consuming all the memory.
>>>
>>> yeah, like I said, that tells you almost nothing until you start
>>> looking at gc.get_objects(). If the size of gc.get_objects() grows
>>> continuously for 50 iterations or more, never decreasing even when
>>> gc.collect() is called, then it's a leak. Otherwise it's just too
>>> much data being loaded at once.
>>
>> I've noticed that compiling queries (either explicitly or implicitly)
>> tends to *fragment* memory. There seem to be long-lived caches in the
>> PG compiler at least. I can't remember exactly where, but I could
>> take another look.
>>
>> I'm talking about rather old versions of SQLA, 0.3 and 0.5.
>
> 0.3's code is entirely gone, years ago. I wouldn't even know what
> silly things it was doing.

Like I said, I would have to take another look into the matter to
validate it against 0.7.
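In the meantime, the check you describe is easy to script; here's a
minimal sketch (do_work() stands in for one iteration of the OP's
transaction processing):

    import gc

    def check_for_leak(do_work, iterations=50):
        # Repeat the suspect operation and record the live object
        # count after a full collection each time.
        counts = []
        for _ in range(iterations):
            do_work()
            gc.collect()
            counts.append(len(gc.get_objects()))
        return counts

    # Counts that climb monotonically across all iterations point to a
    # leak; a plateau after the first few iterations is usually just
    # caches warming up.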
> In 0.5 and beyond, there's a cache of identifiers for quoting
> purposes. If you are creating perhaps thousands of tables with
> hundreds of columns, all names being unique, then this cache might
> start to become a blip on the radar. For the expected use case of a
> schema with at most several hundred tables, this should not be a
> significant size.

Fixed schema, but the code did create lots of aliases for dynamic
queries.

> I don't know much what it means for a Python script to "fragment"
> memory, and I don't really think there's some kind of set of Python
> programming practices that deterministically link to whether or not
> a script fragments a lot. Alex Martelli talks about it here:
> http://stackoverflow.com/questions/1316767/how-can-i-explicitly-free-memory-in-python
> The suggestion there is that if you truly need to load tons of data
> into memory, doing it in a subprocess is the only way to guarantee
> that the memory is freed back to the OS.

I wrote a bit about that[0]. The issue is with long-lived objects, big
or small, many or few.

> As it stands, there are no known memory leaks in SQLAlchemy itself

I can attest to that. I have a backend running 24/7 and, barring some
external force, it can keep up for months with no noticeable leaks.

> and if you look at our tests under aaa_profiling/test_memusage.py
> you can see we are exhaustively ensuring that the size of
> gc.get_objects() does not grow unbounded in all sorts of awkward
> situations. To illustrate potential new memory leaks, we need
> succinct test cases that demonstrate a simple ascending growth in
> memory usage.

Like I said, it's not so much a leak as a fragmentation situation,
where long-lived objects at high memory positions can prevent the
process's heap from shrinking. Sketches of the alias pattern, the
subprocess workaround, and the fragmentation effect follow below.

[0] http://revista.python.org.ar/2/en/html/memory-fragmentation.html
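The dynamic-alias pattern I mean was roughly this (hypothetical table;
every generated alias gets a unique name, so each one is new to the
quoting machinery):

    from sqlalchemy import Table, Column, Integer, MetaData, select

    metadata = MetaData()
    data = Table('data', metadata, Column('id', Integer, primary_key=True))

    for i in range(10000):
        a = data.alias('data_alias_%d' % i)   # unique name per query
        stmt = select([a.c.id])
        str(stmt)  # compiling runs each new name through identifier quoting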
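The subprocess approach from the Stack Overflow answer would look
something like this (heavy_work() is a placeholder for the actual
data-heavy step):

    from multiprocessing import Process

    def heavy_work():
        # stand-in for loading tons of data into memory
        data = ['x' * 1024 for _ in range(100000)]
        print('processed %d chunks' % len(data))

    if __name__ == '__main__':
        p = Process(target=heavy_work)
        p.start()
        p.join()  # the child's entire heap goes back to the OS on exit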
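And the fragmentation effect itself, as a rough illustration (whether
memory actually returns to the OS depends on the allocator, the Python
build and the platform):

    import gc

    bulk = [object() for _ in range(1000000)]  # small objects fill many arenas
    survivors = bulk[::10000]                  # a few long-lived stragglers
    del bulk                                   # logically free 99.99% of them
    gc.collect()
    # The survivors are scattered across the heap, which can keep their
    # mostly-empty arenas pinned; resident memory stays far above what
    # the live objects actually need.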