Never mind. I think I know how this works now. I didn't realise that `child_q` gets executed as soon as it's iterated (despite your comment). Also, the `child.parent_id` is used as key to fill the dict...
On Thursday, September 12, 2013 2:54:47 PM UTC+10, gbr wrote: > > I've still got a question... > > >> # This is creating an identity map (parent id -> children list), but how >> do we >> # know the `parent.id` at this point? The query hasn't been issued yet... >> collections = dict((k, list(v)) for k, v in groupby( >> child_q, >> lambda x:tuple([getattr(x, c.key) for c in remote_cols]) >> )) >> >> > child_q is a Query, which means it's an iterator. groupby() is an > itertools helper that also is an iterator. when dict(...list(v)...) is > invoked, the iterator is run and child_q is emitted as SQL to the database, > results are returned. > > > This results in only one query being issued, right? I am unclear on how > the mapping works. I guess this is creating a "parent_id -> [child, ...]" > map, but at the point of this line(s) the `parent_id` isn't known yet (as > the parent query wasn't executed yet). What's the key in `collection` then? > Or maybe I didn't understand when the actual query is being executed... > > > > On Thursday, September 12, 2013 11:54:20 AM UTC+10, gbr wrote: >> >> `select_entity_from` finally did the trick. I did >> >> qry = >> session.query(child_query).select_entity_from(parent_query).join(child_query, >> >> child_query.c.parent_id==parent_query.c.id) >> >> Thanks a lot for your help! >> >> On Wednesday, September 4, 2013 2:19:32 PM UTC+10, Michael Bayer wrote: >>> >>> >>> On Sep 3, 2013, at 11:03 PM, gbr <doub...@directbox.com> wrote: >>> >>> Thanks. That's quite an interesting piece of code. There's a bit of >>> magic happening in this code and it's not quite compatible for my use case >>> (use of queries instead of tables, no ORM mapping), so allow me to ask some >>> questions. I've annotated the code, so perhaps you can correct any of my >>> assumptions that are wrong. My aim is to apply a similar concept to two >>> queries that are not mapped to a class. >>> >>> >>> >>> keep in mind any ORM query has an accessor called .statement which will >>> give you the core select() construct. >>> >>> >>> def disjoint_load(query, attr): >>> # This is just to extract the join condition. >>> target = attr.prop.mapper >>> local_cols, remote_cols = zip(*attr.prop.local_remote_pairs) >>> >>> # As far as I can tell, this creates a SELECT from the original parent >>> query. >>> # I'm not sure how this join works, as `attr` is a reference to >>> # `Parent.children` (no a condition), but I guess I could replace it >>> # with a condition that I pass in to the function. >>> # The `order_by` may not be necessary... >>> # Question: Does this also work if `target` already is a select query >>> containing a CTE? >>> child_q = query.from_self(target).join(attr).order_by(*remote_cols) >>> >>> >>> "Parent.children" is as good as a (target, onclause) for Query.join() - >>> see the examples in the tutorial for how this is used. >>> >>> as far as a CTE, specifics will affect this but you can often use >>> query.select_entity_from(stmt) and the Query will use "stmt" in the place >>> of the original Entity. >>> >>> >>> if attr.prop.order_by: >>> # No idea why/what this does. Is this necessary? >>> child_q = child_q.order_by(*attr.prop.order_by) >>> >>> this is maintaining the "order_by" of the relationship(), if one was >>> present. >>> >>> >>> # This is creating an identity map (parent id -> children list), but >>> how do we >>> # know the `parent.id` at this point? The query hasn't been issued >>> yet... >>> collections = dict((k, list(v)) for k, v in groupby( >>> child_q, >>> lambda x:tuple([getattr(x, c.key) for c in remote_cols]) >>> )) >>> >>> >>> child_q is a Query, which means it's an iterator. groupby() is an >>> itertools helper that also is an iterator. when dict(...list(v)...) is >>> invoked, the iterator is run and child_q is emitted as SQL to the database, >>> results are returned. >>> >>> >>> >>> # `engine.echo=True` revealed that this is issuing 2 queries (which is >>> what I want) >>> # The order is (1) query for children (joining on parent query), (2) >>> parent query >>> # How/where is the children query attached to the parent query and >>> where is it sent? >>> parents = query.all() >>> >>> >>> well in the example here the child_q is just run right above before >>> query.all().... >>> >>> >>> # This does the final assignment of 'list of children per parent' -> >>> parent.children >>> for p in parents: >>> attributes.set_committed_value( >>> p, >>> attr.key, >>> collections.get( >>> tuple([getattr(p, c.key) for c in local_cols]), >>> ()) >>> ) >>> return parents >>> >>> >>> >>> >>> This is pretty much what I was looking for, but it needs a bit of >>> tinkering to work for me. Do you think it's advisable to use some dummy >>> classes to map the two queries to in order to reuse as much as possible >>> from the above (or adapt it to work with select queries)? What would be the >>> implications in terms of performance (would any of the ORM features such as >>> attribute tracking, identity map, etc. that I don't necessarily need be >>> used in such a case)? >>> >>> >>> assembling an ORM Query is more expensive than assembling a core >>> select(), but not much. as far as the load overhead, if the Query is told >>> to load individual columns, that overhead goes down to be very comparable >>> to that of the ResultProxy itself (returns plain tuples), or you can >>> execute the .statement returned by Query using execute(). >>> >> -- You received this message because you are subscribed to the Google Groups "sqlalchemy" group. To unsubscribe from this group and stop receiving emails from it, send an email to sqlalchemy+unsubscr...@googlegroups.com. To post to this group, send email to sqlalchemy@googlegroups.com. Visit this group at http://groups.google.com/group/sqlalchemy. For more options, visit https://groups.google.com/groups/opt_out.