[sqlalchemy] Re: [PATCH] Filtered one_to_many relationships (Experimental)

Michael Bayer Fri, 23 Mar 2007 09:57:52 -0800


OK, this is actually something people have asked for a lot.  in the  
beginning, recently, etc.  also for different reasons...i.e.  
convenience, or performance, etc.   So, first off let me start by  
illustrating how this use case is done right now.   Assuming your  
Address mapper has a backref "user" to the User mapper, its  just:

        for address in session.query(Address).filter_by(user=someuser).filter 
(address_table.c.postcode == 5000):
                print address.street

and of course, if you dont have the backref, then we go back to the  
regular SQL stuff:

        for address in session.query(Address).filter 
(user.id==someuser.id).filter(address_table.c.postcode == 5000):
                print address.street

in the second example, we are figuring out the "lazy" criterion  
ourselves.  in most cases this is pretty easy to figure out.  but we  
do have some exotic mappings these days, primarily because of  
polymorphic union queries, and it might not be as trivial to make up  
the join condition.  But,  the join conditions and bind params which  
have been calculated by "LazyLoader" are just sitting there, they can  
be currently pulled out with a little non-API attribute access but  
Ive no problem with adding some API-level accessors to get at the  
Query object calculated for a particular property (i.e. what you are  
using in your patch internally).

So if you want to take the above and make it all slick, all you have  
to do is stick the above query inside a property:

        class User(object):
                def _get_addresses(self):
                        return object_session(self).query(Address).filter 
(user.id==someuser.id)
                addresses = property(_get_addresses)

then you can just say pretty much what you want:

        for address in someuser.addresses.filter(address_table.c.postcode ==  
5000):
                print address.street

plus, the "addresses" object is a Query !  so you already have the  
full interface, and it has __iter__ and __getitem__() !  so the list  
operations on the base object work just fine (and they even work like  
SQLObjects)

        for address in someuser.addresses:   # __iter__() fires off the query !

        for address in someuser.addresses[3:5]  # slice applies OFFSET and  
LIMIT !

now lets look at the way your patch does it.

        addresses = user.addresses.filter(address_table.c.postcode == 5000)

seems easy.  right ?  remember that "user" is now in the session.   
anyone else that queries for "user" will get that same User  
instance.  but the rest of the app is going to assume normal  
relationship semantics on that collection....which means:

        print user.addresess   # <-- FAIL  - this code assumes the list is  
complete, its not, its already been loaded incompletely

        print someaddress in user.addresses # <-- FAIL - the address is not  
present

        user.addresses.remove(someaddress) # <-- ERROR - the address is not  
present

        user.addresses.insert(5, someotheraddress) # <-- FAIL - the list is  
incomplete, ordering will be incorrect

        session.flush()  # <-- FAIL - we have to figure out what items were  
added/removed/unchanged from the collection...but the data's  
incomplete !

This is an issue introduced by our usage of sessions, identity maps,  
etc., things that simpler ORMs dont have to worry about.  So thats  
the part of this I dont want to get into (and im hoping you dont,  
either).  Yes, the use case you want is desireable, and ive no  
problem adding in hooks to make it possible.  but No, I dont think it  
should be injected into the base attribute lazy-loading operation.    
i think the semantics of a "Query for related items" and that of a  
"mapped relationship" are just plain different, even though  
implementation-wise on the view side at least they are both "just a  
SQL query".  with the primary issue being that all mutating/uow- 
related operations go out the window with an incomplete collection.   
from a performance point of view,  the manipulation of huge  
collections could be achieved via loading "stub" objects which is  
something discussed long ago, which is how Hibernate addresses it  
(called "extra-lazy loading").  so if we ever did that, thats how we  
would do it (i.e. a complete collection of incomplete objects).

anyway back to the query side, Ive even been considering making folks  
happy with this and adding a new kind of mapped property called, i  
dont know...."query_relation()", "relquery()",  "lazyloader()",  
something like that.  and, right in the core, how often does that  
happen ?  it would copy the setup style of "relation()" but be a lot  
more view oriented.

        mapper(SomeClass, table, properties={
                'addresses':lazyloader(Address)
        })

and there you go !  does the same thing as the manual property code I  
illustrated does, returns a Query from which you can do whatever.

We also had a guy who wanted to do a more complex primaryjoin which  
was confusing the lazyloader.   I showed him how to manually tack on  
a lazy-callable.  but i want to stick that into lazyloader(),  
basically a "callable_" argument that fires off using the attribute's  
lazy-callable mechanism:

        def load_addresses(session, instance):
                return session.query(Address).filter 
(create_some_really_complicated_join_condition)

        mapper(SomeClass, table, properties={
                'addresses':lazyloader(callable_=load_addresses)
        })

the lazyloader() function above is doing almost nothing, its just  
tacking on whatever callable you'd like.   it can return a generative  
Query, Addresses, farm animals, whatever...just a convenient way to  
hook into the attribute manager's lazy-callable logic.

so as long as we can agree on the "its a read-only thing" aspect of  
this, we're good to go.  otherwise you have to define for me how all  
those mutating operations are going to work (and even then, its  
additional core complexity im not sure if i can add to my support-load).

On Mar 23, 2007, at 11:36 AM, Gaetan de Menten wrote:

> Hi list,
>
> The recent work that Michael has done about bringing SelectResults
> functionality to Query made me think: what if that allowed me to do
> what I've always wanted to have ever since I started using ORMs? The
> thing I've wanted to do is to have one-to-many relations that you can
> filter when you access them, and it generates the correct query behind
> the scene without fetching too many rows from the DB.
>
> So I dived into the code a bit deeper than I had up until now... In
> the end, I didn't use the SelectResults stuff at all, but I've managed
> to come up with a working patch to do what I wanted anyway. The most
> interesting part of it is the addition of a DeferredInstrumentedList
> which takes a callable as argument and actually call it only when the
> list is accessed.
>
> This allows to do stuff like that:
>
> for address in user.addresses.filter(address_table.c.postcode ==  
> 5000):
>      print address.street
>
> The patch is quite experimental (I'm pretty sure I broke some stuff).
> Also, it only includes a proof-of-concept filter method, but it should
> be almost trivial to add things like: limit, offset, order_by,
> distinct, list slicing and so on...
>
> And anyway, I'm not sure it's the correct way to go. I _think_ the
> best way to go would be if InstrumentedList would hold a
> "preconfigured" query object, that you could alter the same way you
> can alter the normal query objects, and which would only be executed
> when one access the list elements. But to get to this point would need
> a major refactoring of the code while what I've done is pretty simple.
>
> I also attach an (Elixir-made) example demonstrating the thing.
>
> Michael, if you are interested in this, I'll gladly translate the
> example into a plain SQLAlchemy unittest. And more generally, if there
> is anything I can do (within the limits of my knowledge of SQLAlchemy)
> to get this done the "correct" way and included into the trunk, please
> tell me. I'm so thrilled with this new toy, I hope it'll be possible
> to include it ! :)
>
> -- 
> Gaëtan de Menten
> http://openhex.org
>
> >
> <deferred_list_relation.diff>
> from elixir import *
>
> class A(Entity):
>     has_field('name', String(30))
>     has_many('b', of_kind='B')
>
> class B(Entity):
>     has_field('name', String(30))
>     belongs_to('a', of_kind='A')
>     has_field('extra', Integer)
>
> metadata.connect('sqlite:///')
> create_all()
>
> a1 = A(name='a1')
> a2 = A(name='a2')
> b1 = B(name='b1', a=a1, extra=10)
> b2 = B(name='b2', a=a2)
> b3 = B(name='b3', a=a1, extra=5)
>
> objectstore.flush()
> objectstore.clear()
>
> metadata.engine.echo = True
> a = A.get_by(name='a1')
> print [b.name for b in a.b.filter(B.c.extra < 6)]
>

--~--~---------~--~----~------------~-------~--~----~
You received this message because you are subscribed to the Google Groups 
"sqlalchemy" group.
To post to this group, send email to sqlalchemy@googlegroups.com
To unsubscribe from this group, send email to [EMAIL PROTECTED]
For more options, visit this group at 
http://groups.google.com/group/sqlalchemy?hl=en
-~----------~----~----~----~------~----~------~--~---

[sqlalchemy] Re: [PATCH] Filtered one_to_many relationships (Experimental)

Reply via email to