[sqlalchemy] Re: "Incremental" populate_existing behaviour

Jasper K Wed, 20 Oct 2010 13:01:38 -0700

On Oct 20, 2:37 pm, Michael Bayer <mike...@zzzcomputing.com> wrote:
> On Oct 20, 2010, at 1:34 PM, Jasper K wrote:
>
> > Hi,
>
> > We have a use case where we would like to have "incremental updates"
> > to existing session objects (i.e. populate_existing()-type behaviour
> > where an existing object's lazy-loaded attributes are not reset by
> > subsequent populate_existing queries).
>
> If I were to implement that, I'd issue Session.expire(obj, ['attrname']) on 
> those attributes which I'd like to be "populated", and then just query 
> normally, not using populate_existing(), which was always a "workaround" 
> feature introduced before we had comprehensive expiration support.
>
> > We have come up with a solution by implementing a subclassed
> > LazyLoader strategy that returns a no-op function from
> > "create_row_processor" (basically skipping line 474 in
> > sqlalchemy.orm.strategies in version 0.5.8). Is this a safe place to
> > put this functionality? Note that this session is never used for
> > flushing; it is used as a read-only caching session.
>
> > from sqlalchemy.orm.strategies import LazyLoader
> > class IncrementalLazyLoader(LazyLoader):
>
> >    def create_row_processor(self, *args, **kwargs):
> >        if self.is_class_level:
> >            def new_execute(state, dict_, row, **flags):
> >                pass
> >            return (new_execute, None)
> >        else:
> >            return super(IncrementalLazyLoader, self).create_row_processor(
> >                *args, **kwargs)
>
> It's definitely not safe to write a LoaderStrategy right now.  The current
> interface of "return a two tuple" is something that ends up changing all the
> time, and it's actually a 3-tuple in the current tip for 0.6.5.  I'd like to
> come up with some way to make the LoaderStrategy interface have a more
> resilient API, but it also operates at the most performance-critical position
> in the whole ORM, which is why its API is not easily subclassable (it used to
> be just a simple "populate_instance(self, obj)" type of thing).  I
> originally conceived of LS being something very public and subclassable, but
> I've never seen anyone actually try, both because there aren't too many
> reasons for it and also because it's rare to see someone brave enough to wade
> in there (so I commend you for that).  So it is somewhat ironic that just as
> I decided "well it looks like nobody uses this on the outside" and committed
> a non-compatible change, someone came along doing it :).
>
> As far as a solution here, assuming you don't like the expiration idea, I'm 
> not seeing something that works very well for 0.6.    The most obvious is a 
> publicly subclassable LoaderStrategy that presents a constant interface 
> which wouldn't change.  Except, in 0.7, we are deprecating *all* of the 
> "subclass X to extend the ORM" classes - they aren't going anywhere but the 
> new way to go will be to register listener functions with known events.   
> There is as yet no event for "populate attribute", though we could certainly 
> add one (though again, I'd need to find a way to sneak it in there without 
> adding latency in the vast majority of cases that don't use this hook).    
> But back to 0.6, something that would work across current 0.6's would be to 
> get the tuple from the superclass, and return a tuple of (None, None, [None]) 
> based on the size you get back - hacky for sure.   I don't foresee the 
> "create_row_processor" paradigm changing again for 0.6, though I can't be 
> 100% on that.


Unfortunately, I don't think Session.expire(obj, ['attrname']) will
work for our situation. If we ignore the forward-compatibility
problems, would the above LazyLoader subclass work for what we are
attempting to do without adversely affecting the internals of
SQLAlchemy?
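
To check that we have understood the size-matching suggestion, here is
a minimal sketch of it. BaseStrategy is a hypothetical stand-in for
LazyLoader (it is not SQLAlchemy code) and the tuple contents are
placeholders; only the idea of returning Nones of the same length as
the superclass result comes from the message above. It also assumes, as
that suggestion implies, that None entries are simply skipped.

```python
class BaseStrategy(object):
    """Hypothetical stand-in for LazyLoader; NOT the real SQLAlchemy class."""

    def __init__(self, tuple_size, is_class_level=True):
        # Per the thread: 2 entries in 0.5.8, 3 in the current 0.6.5 tip.
        self.tuple_size = tuple_size
        self.is_class_level = is_class_level

    def create_row_processor(self, *args, **kwargs):
        def populate(state, dict_, row, **flags):
            pass  # placeholder for the real per-row work
        return (populate,) + (None,) * (self.tuple_size - 1)


class IncrementalStrategy(BaseStrategy):
    def create_row_processor(self, *args, **kwargs):
        result = super(IncrementalStrategy, self).create_row_processor(
            *args, **kwargs)
        if self.is_class_level:
            # Same length as whatever the superclass returned, all None.
            return (None,) * len(result)
        return result


two = IncrementalStrategy(2).create_row_processor()
three = IncrementalStrategy(3).create_row_processor()
```

The subclass never hard-codes the tuple length, which is the only point
of the exercise.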

For future releases, what about providing different
populate_existing() functionality with a query option that doesn't
completely reset existing instances?

We actually found another issue relating to the populate_existing flag,
so perhaps more information about our situation will help towards a
better overall solution. First I will explain our setup, and then the
second problem we have with populate_existing as it works now.

As I mentioned in the first post, we have a cache of read-only objects
that are attached to a read-only "cache session". These objects live in
memory throughout the life of the application. Updates to these objects
are done through "user sessions" that have MapperExtension listeners
which track changed objects and then merge those changes over to the
"cache session" using session.merge(dont_load=True), all under
concurrent modification locks.

Our main goal is to eliminate as many unnecessary round trips to the
database as we can by using the cached data. Therefore, we would like
to "opportunistically" scrape fresh data from any SQL queries that are
run in the "cache session" and update the read-only objects with it.
Currently this is accomplished by supplying the "cache session" with a
CustomQuery class (using the query_cls parameter) that has the
_populate_existing flag always set.

However, this has two unfortunate side effects. The first I mentioned
in my other post: all previously loaded lazy attributes get reset by
the LazyLoader. The second is that Query._get skips checking the
session identity map for an object because of the populate_existing
flag.
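
To make the first side effect concrete, here is a toy model of the
difference between what we observe and what we would like. None of
these names are SQLAlchemy API: CachedObject, NOT_LOADED and both
functions are hypothetical and exist only for illustration.

```python
NOT_LOADED = object()  # sentinel: attribute has not been lazy-loaded yet


class CachedObject(object):
    """Toy stand-in for a mapped instance; not a SQLAlchemy class."""

    def __init__(self, name):
        self.name = name            # column attribute
        self.children = NOT_LOADED  # lazy-loaded relation


def populate_existing(obj, row):
    """Behaviour as we observe it: columns refreshed, lazy attributes reset."""
    obj.name = row["name"]
    obj.children = NOT_LOADED


def populate_incremental(obj, row):
    """Behaviour we want: columns refreshed, loaded lazy attributes kept."""
    obj.name = row["name"]


a = CachedObject("old")
a.children = ["c1", "c2"]  # pretend the lazy load already happened
populate_existing(a, {"name": "new"})

b = CachedObject("old")
b.children = ["c1", "c2"]
populate_incremental(b, {"name": "new"})
```

After this, a has lost its loaded children and would need another round
trip to get them back, while b keeps them and still picks up the fresh
column value.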

Our way around this second problem is to override Query.get so that it
ignores the populate_existing flag and checks the identity map anyway.
While this "hackiness" is something we would like to avoid, I suppose
that is not entirely possible given the complexity of multiple user
requirements! Maybe you have some ideas for how to improve this
specific use case?
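
In case it helps, the shape of the Query.get override is roughly as
follows. This is a toy model and not our actual code: ToySession,
ToyQuery and IdentityFirstQuery are hypothetical names, and the real
Query internals are not reproduced here.

```python
class ToySession(object):
    """Toy session with an identity map and a counter of database hits."""

    def __init__(self):
        self.identity_map = {}
        self.db_hits = 0

    def load_from_db(self, key):
        self.db_hits += 1
        obj = {"id": key}
        self.identity_map[key] = obj
        return obj


class ToyQuery(object):
    populate_existing = True  # always on, like our CustomQuery

    def __init__(self, session):
        self.session = session

    def get(self, key):
        # Models the behaviour we see: the flag bypasses the identity map.
        if not self.populate_existing and key in self.session.identity_map:
            return self.session.identity_map[key]
        return self.session.load_from_db(key)


class IdentityFirstQuery(ToyQuery):
    def get(self, key):
        # Our override: consult the identity map regardless of the flag.
        cached = self.session.identity_map.get(key)
        if cached is not None:
            return cached
        return super(IdentityFirstQuery, self).get(key)


session = ToySession()
query = IdentityFirstQuery(session)
first = query.get(1)
second = query.get(1)  # served from the identity map, no second db hit
```

The second get() returns the same object without touching the
database, which is the behaviour we want from the cache session.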

