I've still got a question...

    
>     # This is creating an identity map (parent id -> children list), but how 
> do we 
>     # know the `parent.id` at this point? The query hasn't been issued yet...
>     collections = dict((k, list(v)) for k, v in groupby(
>                     child_q, 
>                     lambda x:tuple([getattr(x, c.key) for c in remote_cols])
>                 ))
>
>
child_q is a Query, which means it's an iterator.  groupby() is an 
itertools helper that also is an iterator.  when dict(...list(v)...) is 
invoked, the iterator is run and child_q is emitted as SQL to the database, 
results are returned.


This results in only one query being issued, right? I am unclear on how the 
mapping works. I guess this is creating a "parent_id -> [child, ...]" map, 
but at the point of this line(s) the `parent_id` isn't known yet (as the 
parent query wasn't executed yet). What's the key in `collection` then? Or 
maybe I didn't understand when the actual query is being executed...



On Thursday, September 12, 2013 11:54:20 AM UTC+10, gbr wrote:
>
> `select_entity_from` finally did the trick. I did
>
> qry = 
> session.query(child_query).select_entity_from(parent_query).join(child_query, 
> child_query.c.parent_id==parent_query.c.id)
>
> Thanks a lot for your help!
>
> On Wednesday, September 4, 2013 2:19:32 PM UTC+10, Michael Bayer wrote:
>>
>>
>> On Sep 3, 2013, at 11:03 PM, gbr <doub...@directbox.com> wrote:
>>
>> Thanks. That's quite an interesting piece of code. There's a bit of magic 
>> happening in this code and it's not quite compatible for my use case (use 
>> of queries instead of tables, no ORM mapping), so allow me to ask some 
>> questions. I've annotated the code, so perhaps you can correct any of my 
>> assumptions that are wrong. My aim is to apply a similar concept to two 
>> queries that are not mapped to a class.
>>
>>
>>
>> keep in mind any ORM query has an accessor called .statement which will 
>> give you the core select() construct.
>>
>>
>> def disjoint_load(query, attr):
>>     # This is just to extract the join condition.
>>     target = attr.prop.mapper
>>     local_cols, remote_cols = zip(*attr.prop.local_remote_pairs)
>>     
>>     # As far as I can tell, this creates a SELECT from the original parent 
>> query.
>>     # I'm not sure how this join works, as `attr` is a reference to 
>>     # `Parent.children` (no a condition), but I guess I could replace it 
>>     # with a condition that I pass in to the function.
>>     # The `order_by` may not be necessary...
>>     # Question: Does this also work if `target` already is a select query 
>> containing a CTE?
>>     child_q = query.from_self(target).join(attr).order_by(*remote_cols)
>>
>>
>> "Parent.children" is as good as a (target, onclause) for Query.join() - 
>> see the examples in the tutorial for how this is used.
>>
>> as far as a CTE, specifics will affect this but you can often use 
>> query.select_entity_from(stmt) and the Query will use "stmt" in the place 
>> of the original Entity.
>>
>>
>>     if attr.prop.order_by:
>>         # No idea why/what this does. Is this necessary?
>>         child_q = child_q.order_by(*attr.prop.order_by)
>>
>> this is maintaining the "order_by" of the relationship(), if one was 
>> present.
>>
>>     
>>     # This is creating an identity map (parent id -> children list), but how 
>> do we 
>>     # know the `parent.id` at this point? The query hasn't been issued yet...
>>     collections = dict((k, list(v)) for k, v in groupby(
>>                     child_q, 
>>                     lambda x:tuple([getattr(x, c.key) for c in remote_cols])
>>                 ))
>>
>>
>> child_q is a Query, which means it's an iterator.  groupby() is an 
>> itertools helper that also is an iterator.  when dict(...list(v)...) is 
>> invoked, the iterator is run and child_q is emitted as SQL to the database, 
>> results are returned.
>>
>>
>>   
>>     # `engine.echo=True` revealed that this is issuing 2 queries (which is 
>> what I want)
>>     # The order is (1) query for children (joining on parent query), (2) 
>> parent query
>>     # How/where is the children query attached to the parent query and where 
>> is it sent?
>>     parents = query.all()
>>
>>
>> well in the example here the child_q is just run right above before 
>> query.all()....
>>
>>     
>>     # This does the final assignment of 'list of children per parent' -> 
>> parent.children
>>     for p in parents:
>>         attributes.set_committed_value(
>>             p, 
>>             attr.key, 
>>             collections.get(
>>                 tuple([getattr(p, c.key) for c in local_cols]), 
>>                 ())
>>         )
>>     return parents
>>
>>
>>
>>
>> This is pretty much what I was looking for, but it needs a bit of 
>> tinkering to work for me. Do you think it's advisable to use some dummy 
>> classes to map the two queries to in order to reuse as much as possible 
>> from the above (or adapt it to work with select queries)? What would be the 
>> implications in terms of performance (would any of the ORM features such as 
>> attribute tracking, identity map, etc. that I don't necessarily need be 
>> used in such a case)?
>>
>>
>> assembling an ORM Query is more expensive than assembling a core 
>> select(), but not much.  as far as the load overhead, if the Query is told 
>> to load individual columns, that overhead goes down to be very comparable 
>> to that of the ResultProxy itself (returns plain tuples), or you can 
>> execute the .statement returned by Query using execute().
>>
>

-- 
You received this message because you are subscribed to the Google Groups 
"sqlalchemy" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to sqlalchemy+unsubscr...@googlegroups.com.
To post to this group, send email to sqlalchemy@googlegroups.com.
Visit this group at http://groups.google.com/group/sqlalchemy.
For more options, visit https://groups.google.com/groups/opt_out.

Reply via email to