Re: [sqlalchemy] Mimik joinedload for core tables/queries

gbr Wed, 11 Sep 2013 22:14:25 -0700

Never mind. I think I know how this works now. I didn't realise that 
`child_q` gets executed as soon as it's iterated (despite your comment). 
Also, the `child.parent_id` is used as key to fill the dict...


On Thursday, September 12, 2013 2:54:47 PM UTC+10, gbr wrote:
>
> I've still got a question...
>
>     
>>     # This is creating an identity map (parent id -> children list), but how 
>> do we 
>>     # know the `parent.id` at this point? The query hasn't been issued yet...
>>     collections = dict((k, list(v)) for k, v in groupby(
>>                     child_q, 
>>                     lambda x:tuple([getattr(x, c.key) for c in remote_cols])
>>                 ))
>>
>>
> child_q is a Query, which means it's an iterator.  groupby() is an 
> itertools helper that also is an iterator.  when dict(...list(v)...) is 
> invoked, the iterator is run and child_q is emitted as SQL to the database, 
> results are returned.
>
>
> This results in only one query being issued, right? I am unclear on how 
> the mapping works. I guess this is creating a "parent_id -> [child, ...]" 
> map, but at the point of this line(s) the `parent_id` isn't known yet (as 
> the parent query wasn't executed yet). What's the key in `collection` then? 
> Or maybe I didn't understand when the actual query is being executed...
>
>
>
> On Thursday, September 12, 2013 11:54:20 AM UTC+10, gbr wrote:
>>
>> `select_entity_from` finally did the trick. I did
>>
>> qry = 
>> session.query(child_query).select_entity_from(parent_query).join(child_query,
>>  
>> child_query.c.parent_id==parent_query.c.id)
>>
>> Thanks a lot for your help!
>>
>> On Wednesday, September 4, 2013 2:19:32 PM UTC+10, Michael Bayer wrote:
>>>
>>>
>>> On Sep 3, 2013, at 11:03 PM, gbr <doub...@directbox.com> wrote:
>>>
>>> Thanks. That's quite an interesting piece of code. There's a bit of 
>>> magic happening in this code and it's not quite compatible for my use case 
>>> (use of queries instead of tables, no ORM mapping), so allow me to ask some 
>>> questions. I've annotated the code, so perhaps you can correct any of my 
>>> assumptions that are wrong. My aim is to apply a similar concept to two 
>>> queries that are not mapped to a class.
>>>
>>>
>>>
>>> keep in mind any ORM query has an accessor called .statement which will 
>>> give you the core select() construct.
>>>
>>>
>>> def disjoint_load(query, attr):
>>>     # This is just to extract the join condition.
>>>     target = attr.prop.mapper
>>>     local_cols, remote_cols = zip(*attr.prop.local_remote_pairs)
>>>     
>>>     # As far as I can tell, this creates a SELECT from the original parent 
>>> query.
>>>     # I'm not sure how this join works, as `attr` is a reference to 
>>>     # `Parent.children` (no a condition), but I guess I could replace it 
>>>     # with a condition that I pass in to the function.
>>>     # The `order_by` may not be necessary...
>>>     # Question: Does this also work if `target` already is a select query 
>>> containing a CTE?
>>>     child_q = query.from_self(target).join(attr).order_by(*remote_cols)
>>>
>>>
>>> "Parent.children" is as good as a (target, onclause) for Query.join() - 
>>> see the examples in the tutorial for how this is used.
>>>
>>> as far as a CTE, specifics will affect this but you can often use 
>>> query.select_entity_from(stmt) and the Query will use "stmt" in the place 
>>> of the original Entity.
>>>
>>>
>>>     if attr.prop.order_by:
>>>         # No idea why/what this does. Is this necessary?
>>>         child_q = child_q.order_by(*attr.prop.order_by)
>>>
>>> this is maintaining the "order_by" of the relationship(), if one was 
>>> present.
>>>
>>>     
>>>     # This is creating an identity map (parent id -> children list), but 
>>> how do we 
>>>     # know the `parent.id` at this point? The query hasn't been issued 
>>> yet...
>>>     collections = dict((k, list(v)) for k, v in groupby(
>>>                     child_q, 
>>>                     lambda x:tuple([getattr(x, c.key) for c in remote_cols])
>>>                 ))
>>>
>>>
>>> child_q is a Query, which means it's an iterator.  groupby() is an 
>>> itertools helper that also is an iterator.  when dict(...list(v)...) is 
>>> invoked, the iterator is run and child_q is emitted as SQL to the database, 
>>> results are returned.
>>>
>>>
>>>   
>>>     # `engine.echo=True` revealed that this is issuing 2 queries (which is 
>>> what I want)
>>>     # The order is (1) query for children (joining on parent query), (2) 
>>> parent query
>>>     # How/where is the children query attached to the parent query and 
>>> where is it sent?
>>>     parents = query.all()
>>>
>>>
>>> well in the example here the child_q is just run right above before 
>>> query.all()....
>>>
>>>     
>>>     # This does the final assignment of 'list of children per parent' -> 
>>> parent.children
>>>     for p in parents:
>>>         attributes.set_committed_value(
>>>             p, 
>>>             attr.key, 
>>>             collections.get(
>>>                 tuple([getattr(p, c.key) for c in local_cols]), 
>>>                 ())
>>>         )
>>>     return parents
>>>
>>>
>>>
>>>
>>> This is pretty much what I was looking for, but it needs a bit of 
>>> tinkering to work for me. Do you think it's advisable to use some dummy 
>>> classes to map the two queries to in order to reuse as much as possible 
>>> from the above (or adapt it to work with select queries)? What would be the 
>>> implications in terms of performance (would any of the ORM features such as 
>>> attribute tracking, identity map, etc. that I don't necessarily need be 
>>> used in such a case)?
>>>
>>>
>>> assembling an ORM Query is more expensive than assembling a core 
>>> select(), but not much.  as far as the load overhead, if the Query is told 
>>> to load individual columns, that overhead goes down to be very comparable 
>>> to that of the ResultProxy itself (returns plain tuples), or you can 
>>> execute the .statement returned by Query using execute().
>>>
>>

-- 
You received this message because you are subscribed to the Google Groups 
"sqlalchemy" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to sqlalchemy+unsubscr...@googlegroups.com.
To post to this group, send email to sqlalchemy@googlegroups.com.
Visit this group at http://groups.google.com/group/sqlalchemy.
For more options, visit https://groups.google.com/groups/opt_out.

Re: [sqlalchemy] Mimik joinedload for core tables/queries

Reply via email to