Re: [sqlalchemy] Eager loading hates single table inheritance

Jimmy Yuen Ho Wong Mon, 09 Jan 2012 17:56:27 -0800

On 10/1/12 12:19 AM, Michael Bayer wrote:
> A common theme in SQLAlchemy is that, despite all the flak we get for being 
> "complicated", SQLAlchemy is actually very simple.   It has a handful of 
> constructs which seek to do exactly the same thing in exactly the same way, 
> as consistently as possible.   So we sometimes get requests for "SQLAlchemy 
> should figure out XYZ" and it's like, well not really, that would be really 
> complicated.
Ah ha don't get me wrong but I didn't say SA was complicated, I said my
actual queries used in production is complicated. What's I've shown you
was just a simplified version of what I've narrowed down. SA sure had a
large surface area to get started because it does so much more, but I've
grown to love it. This is really the first time I tried to do something
complicated like this with SA and I was testing if SA would be smart
about this. I can't really live without SA now.
> you'd use two joinedload() twice for that.     So all of the examples where 
> you're distinguishing xyz_all() from xyz(), all the same thing.   
>
> It's silently ignored now so http://www.sqlalchemy.org/trac/ticket/2370 is 
> added.  If I can get to it in 0.7 it will emit a warning, will raise an error 
> in 0.8.
Ah, thanks for telling me this. Didn't notice this from the docs.
> In the subqueryload_all()/subqueryload() example, you'll note the second one 
> with two separate subqueryload() calls does in fact correctly do the 
> "subqueryload" twice.   There's no way SQLA would ever "figure out" 
> automatically that they're against the same table and then join the two 
> queries together, decisionmaking like would be enormously complicated as well 
> as the mechanics of how to get a single loader to coordinate among two 
> relationships.   If you use just one relationship() to SearchOption then 
> provide filtered accessors, then you get exactly the optimization you're 
> looking for.
Is it possible to include a special flag to the subqueryload() call to
tell SA to optimize for this case? SA already knows about the class
hierarchy to be able to distinguish between the classes and how they map
to the rows returned, changing the model mapping is not always feasible.
In fact, if SA could do this, subqueryload() will be the optimal
solution for this use case. joinedload(), even when it works, it's still
too wasteful. Using events will work for now, but it's not as obvious.
SA already is so smart about things, surely it can be smarter no? :P
> Ticket 2120 calls for at least an option "nested_joins=True", specifying that 
> the more correct/efficient system of nesting joins should be used.   This is 
> all 0.8 stuff, as the joined eager loading code would be destabilized by this 
> - if it turns out to be an isolated option it could move to 0.7.   Your set 
> of tests here makes me more interested in the issue though so perhaps we'll 
> see if I have time to try some things out.
Does it mean that the only way to eager load reliably in this
many-to-many single table inheritance use case, depending on the data,
is joinedload(innerjoin=True) for uselist=False relationships for now?
In my case, I can just use innerjoin=True all the way, but there may be
cases where the code has to switch between innerjoin and left join
depending on the relationships.
>
> Finally the contains_eager version.   SQLAlchemy again expects two distinct 
> sets of columns for each relationship, so you must join to the table twice:
>
>     sa1 = aliased(OriginOption)
>     sa2 = aliased(FoodOption)
>
>     p = session.query(Product)\
>             .join(sa1, Product.origin)\
>             .join(sa2, Product.foods)\
>             .options(contains_eager(Product.origin, alias=sa1),
>                      contains_eager(Product.foods, alias=sa2))\
>             .filter(Product.id == 2).one()
>
>
> Same theme here, you're hoping SQLAlchemy can "figure out" something in 
> Python, i.e. that only "certain" rows being routed to 
> Product.origin/Product.foods should be used for each, but it's not that 
> complicated.     It relies upon SQL to present it with the correct data 
> geometry and assumes it is correct.  Second-guessing what it gets back from 
> SQL to check, "oh is this some special 1% edge case where I might have to 
> filter extra types that I'm not supposed to receive here?" wouldn't be 
> efficient or consistent with how everything else works.
> consistently.
Ah thanks for this solution, but again, self joining for each
relationship is too inefficient. The point is, as a user, I think the
query with a single join already contains all the data needed to
reconstruct collections. Plus, I've already told SA that the results
will contain the rows that it can populate the collections with, why
can't it do that? That's the thought that went through my head.


Again, I think if contains_eager() accepts a flag that tells it to
optimize for this case, the code to handle this will be isolated.

-- 
You received this message because you are subscribed to the Google Groups 
"sqlalchemy" group.
To post to this group, send email to sqlalchemy@googlegroups.com.
To unsubscribe from this group, send email to 
sqlalchemy+unsubscr...@googlegroups.com.
For more options, visit this group at 
http://groups.google.com/group/sqlalchemy?hl=en.

Re: [sqlalchemy] Eager loading hates single table inheritance

Reply via email to