Hello all:

We are running into a very strange memory leak issue.  Here are the details:

- The code executes select statements directly on a table using SA
0.5.2, Postgres 8.3, and Psycopg2, all running on Linux (Ubuntu 8.04)
- Querying a table with approximately 70000 records
- When we query the table, convert the results to a list, and then
free both the list and the result proxy, the memory does not appear
to be released
- If we query the table with ever-increasing size limits (LIMIT
clause), the amount of memory consumed increases
- If we query with a limit of, say, 50000 and then perform smaller
queries, the smaller queries do not increase the amount of memory
consumed overall
- In other words, memory usage caps out at a level that appears to be
linear in the maximum number of rows ever returned

It looks like something in SA, Psycopg2, or PG is holding references
to the returned rows, but we have not been able to track it down.  I am
working on a completely standalone application to demonstrate the
issue.  Before I go much further, though, I wanted to post this
question and a code sample (not standalone, but it shows the commands
we are using) in case anyone sees something fishy.
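
One thing that might help narrow it down is counting live objects by
type with gc before and after closing the result proxy.  A rough
sketch of that kind of check (the 'RowProxy' name is an assumption
about how the rows show up in gc.get_objects()):

import gc

def countLiveObjects(type_name):
   # Rough diagnostic: how many gc-tracked objects with the given type
   # name are still alive?  (Assumes the rows appear as, e.g., 'RowProxy'.)
   return sum(1 for obj in gc.get_objects()
              if type(obj).__name__ == type_name)

If the count stays high after the result proxy is closed and
gc.collect() has run, something really is keeping row references
alive; if it drops back down, the growth is coming from somewhere
else.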

Any ideas?

Thanks,
Allen

----------------------------------------------------------------------------------
import os, sys, gc
import psycopg2
import sqlalchemy as sa
import sqlalchemy.exc

class DataProvider(object):
   def __init__(self, dbUri, tableName):
      self._dbUri       = dbUri
      self._dbEngine    = None
      self._metadata    = None
      self._table       = None
      self._tableName   = tableName

   def initialize(self):
      self._dbEngine = sa.create_engine(self._dbUri)
      self._metadata = sa.MetaData()
      self._metadata.bind = self._dbEngine

      # Build up the table metadata reflectively
      name_parts = self._tableName.split('.')

      if len(name_parts) > 1:
         self._table = sa.Table(name_parts[1], self._metadata, autoload = True,
                                schema = name_parts[0])
      else:
         self._table = sa.Table(name_parts[0], self._metadata, autoload = True)

   def query(self, limit = None):
      s = self._table.select()

      if limit is not None:
         s = s.limit(limit)

      print "objs before: ", len(gc.get_objects())
      print "garbage: ", len(gc.garbage)
      result_proxy = s.execute()
      result_objects = [x for x in result_proxy]
      #print "num results: ", len(result_objects)
      result_proxy.close()
      # Drop the references to the row objects so they can be collected
      result_objects = []
      results = []
      gc.collect(2)
      print "objs after: ", len(gc.get_objects())
      print "garbage: ", len(gc.garbage)
      return results

db_uri          = sys.argv[1]
collection_name = sys.argv[2]

prov = DataProvider(db_uri, collection_name)
prov.initialize()

limits = [10000, 20000, 30000, 40000, 50000, 60000,
            10000, 20000, 30000, 40000, 50000, 60000]
# Query
for x, limit in enumerate(limits):
   print "----[ %s ] limit: %s ----" % (x, limit)
   gc.collect(2)
   printMemoryInformation()
   features = prov.query(limit = limit)
   features = None
   gc.collect(2)
   printMemoryInformation()

----------------------------------------------------------------------------------
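
printMemoryInformation isn't included above; a minimal sketch of what
such a helper could look like on Linux is below (it just dumps the
VmSize/VmRSS lines from /proc/self/status; the real helper may
differ):

import os

def printMemoryInformation():
   # Sketch only: print the process's VmSize/VmRSS lines from
   # /proc/self/status (Linux-specific stand-in for the real helper)
   status = open('/proc/%d/status' % os.getpid())
   try:
      for line in status:
         if line.startswith('VmSize') or line.startswith('VmRSS'):
            print line.strip()
   finally:
      status.close()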
