[sqlalchemy] Re: Query caching and garbage collection of related instances

Michael Bayer Sun, 08 Nov 2009 11:29:46 -0800


On Nov 7, 2009, at 6:03 PM, Adam Dziendziel wrote:


>
> Hi,
>
> I am trying to use the query caching solution described here:
> http://svn.sqlalchemy.org/sqlalchemy/trunk/examples/query_caching/per_session.py
>
> In most cases it works, the returned records are cached, I store them
> in a LRU cache modeled after http://code.activestate.com/recipes/498245/
>
> However, when I run a long running operation, which operates on
> hundreds of other records, apparently the garbage collection is run on
> the session's weak-referencing identity map. The cache keeps the
> returned records, but other eagerly loaded related instances of the
> returned records are lost.

That's not possible, since those records are strongly referenced by
their cached parents (see below), which in turn are strongly
referenced by the cache. The GC will only collect objects that have
a strong reference count of zero.
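
A minimal standard-library sketch of that point, not SQLAlchemy code. It assumes a weakref.WeakValueDictionary as a stand-in for the session's weak-referencing identity map and a plain dict as a stand-in for the LRU cache; Parent and Child are hypothetical classes. The child stays reachable for as long as the cached parent holds it in its __dict__:

```python
import gc
import weakref


class Child:
    pass


class Parent:
    def __init__(self, child):
        # A plain attribute assignment stores a strong reference
        # in the parent's __dict__.
        self.child = child


# Stand-in for a weak-referencing identity map (an assumption for
# illustration; this is not SQLAlchemy's actual class).
identity_map = weakref.WeakValueDictionary()

# Stand-in for the application-level cache, which holds strong references.
cache = {}

parent = Parent(Child())
identity_map["parent"] = parent
identity_map["child"] = parent.child
cache["parent"] = parent

# Drop the local name; only the cache keeps the parent alive now.
del parent
gc.collect()
child_alive_while_cached = "child" in identity_map

# Evict the parent from the cache; nothing references parent or child.
del cache["parent"]
gc.collect()
child_alive_after_eviction = "child" in identity_map
```

The child only drops out of the weak map once the parent itself has been evicted from the cache.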


> The ORM issues queries to load them again
> from the database. I understand that there are no strong references
> between an instance and other related instances.

Assuming you are not using "dynamic" loaders, that's not correct.
When a collection or attribute is "eagerly loaded", it is placed
within the parent's __dict__ during the load operation. Similarly
for "lazy loads", once the lazily loading attribute is referenced, the
newly loaded collection or attribute is placed in the parent's __dict__.
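
To illustrate the "placed in the parent's __dict__" behavior in plain Python, here is a rough sketch using a non-data descriptor. LazyAttribute, load_children and Parent are hypothetical names, and SQLAlchemy's real attribute instrumentation is considerably more involved; the sketch only shows the general idea of load-on-first-access followed by a strong reference on the instance:

```python
class LazyAttribute:
    """Non-data descriptor that loads a value on first access.

    Hypothetical illustration only, not SQLAlchemy's implementation.
    """

    def __init__(self, loader):
        self.loader = loader

    def __set_name__(self, owner, name):
        self.name = name

    def __get__(self, instance, owner=None):
        if instance is None:
            return self
        value = self.loader(instance)
        # Store the result on the instance. Because this descriptor
        # defines no __set__, later lookups find the __dict__ entry
        # first and never reach the loader again.
        instance.__dict__[self.name] = value
        return value


load_calls = []


def load_children(parent):
    # Stands in for the SELECT a lazy load would emit.
    load_calls.append(parent.name)
    return ["child-a", "child-b"]


class Parent:
    children = LazyAttribute(load_children)

    def __init__(self, name):
        self.name = name


p = Parent("p1")
loaded_before = "children" in p.__dict__
first = p.children   # triggers the simulated load
second = p.children  # served from p.__dict__
loaded_after = "children" in p.__dict__
```

Until the attribute is first touched there is nothing in __dict__, so there is no parent-to-child reference yet; afterwards the parent holds the loaded value strongly.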

In this case it sounds more like the parents have collections or  
attributes which are to be lazily loaded upon first access, so the  
link between parent/child hasn't yet been established.   In the case  
of a many-to-one, you can say "parent.child" and not see any SQL, even  
though the "lazy load" operation was invoked, because a simple  
identity lookup is performed in the session.
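
A simplified sketch of why the same many-to-one access can be silent at one moment and emit SQL at another. It assumes a weak identity map keyed by (class, primary key); load_many_to_one, Child and the "SELECT ..." string are placeholders for illustration, not SQLAlchemy internals:

```python
import gc
import weakref


class Child:
    def __init__(self, pk):
        self.pk = pk


# Hypothetical stand-ins: a weak identity map keyed by (class, primary key)
# and a list recording every simulated SELECT.
identity_map = weakref.WeakValueDictionary()
emitted_sql = []


def load_many_to_one(pk):
    """Return the Child for pk, querying only on an identity-map miss."""
    key = (Child, pk)
    obj = identity_map.get(key)
    if obj is None:
        emitted_sql.append("SELECT ... WHERE id = %d" % pk)
        obj = Child(pk)
        identity_map[key] = obj
    return obj


# Some earlier query loaded Child 7 and the application still holds it.
held = load_many_to_one(7)
sql_after_first_load = len(emitted_sql)

# Lazy load while the child is still referenced: identity-map hit, no SQL.
same = load_many_to_one(7)
sql_after_hit = len(emitted_sql)

# Once nothing references the child, the weak map forgets it and the
# next lazy load has to query again.
del held, same
gc.collect()
reloaded = load_many_to_one(7)
sql_after_miss = len(emitted_sql)
```

If the parent never actually referenced the child, the child's lifetime depends on whatever else happens to hold it, which would match queries reappearing during a long-running operation.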


>
> What is the best solution to keep related instances in a session?

Sounds like you need to ensure your eager loads are working properly
(or use my other suggestion below).

>
> If I create a session with weak_identity_map=False, then during my
> long running operation I will run out of memory, unless I expunge
> unused records, however, it is easy to miss one record and the
> identity map will be growing anyway.

I recommend against using that option; we're trying to decide whether
to drop it across the board, since it's pretty much legacy.

>
> Is it possible to get a list of referenced instances of another
> instance, so that I could store the list together with the instance in
> the MRU cache?

What I usually do when I want to control what gets cached (since I'm
usually serializing into memcached), and I don't want to worry about
what the particular eager loading configuration is, is to make a
method like "full_load()" which ensures all the important attributes
and collections are present. This will issue lazy loads for anything
that wasn't already loaded:

def full_load(self):
    # Touching each attribute triggers its lazy load if it is not
    # already present in self.__dict__.
    self.collection1
    self.some_reference
    return self

However, if you are truly eager loading all of those attributes then  
this step is unnecessary.


