On Sep 23, 2010, at 8:18 PM, Anthony Tran wrote: > Hi everyone, > > I'm trying to do a query where I joinedload with a polymorphic child table. > The child table is using a joined table inheritance scheme. I'm using > with_polymorphic = '*' to accomplish the eager loading of polymorphic types. > The query that SQLAlchemy generates results in a sub select that returns all > the child rows. Here's the setup: > > > Here's the resulting SQL query: > > SELECT parent_table.id AS parent_table_id, > anon_1.child_base_table_id AS anon_1_child_base_table_id, > anon_1.child_base_table_parent_id AS anon_1_child_base_table_parent_id, > anon_1.child_base_table_child_type AS > anon_1_child_base_table_child_type, > anon_1.child1_table_child_id AS anon_1_child1_table_child_id, > anon_1.child1_table_child1_data AS anon_1_child1_table_child1_data, > anon_1.child2_table_child_id AS anon_1_child2_table_child_id, > anon_1.child2_table_child2_data AS anon_1_child2_table_child2_data > FROM parent_table > LEFT OUTER JOIN (SELECT > child_base_table.id AS child_base_table_id, > child_base_table.parent_id AS > child_base_table_parent_id, > child_base_table.child_type AS > child_base_table_child_type, > child1_table.child_id AS > child1_table_child_id, > child1_table.child1_data AS > child1_table_child1_data, > child2_table.child_id AS > child2_table_child_id, > child2_table.child2_data AS > child2_table_child2_data > FROM child_base_table > LEFT OUTER JOIN child1_table > ON child_base_table.id = child1_table.child_id > LEFT OUTER JOIN child2_table > ON child_base_table.id = > child2_table.child_id) > AS > anon_1 > ON parent_table.id = anon_1.child_base_table_parent_id > WHERE parent_table.id = 999999 > > Note that the inner select has no WHERE clause which results in selecting all > of the child rows. Without the with_polymorphic option, SQLAlchemy doesn't > produce the inner select. Using subquery_load causes the same problem. Is > there a way to prevent this from happening and eager load the polymorphic > tables?
So, interesting terminology you used above - "a sub select that returns all the child rows". The SELECT that is the right hand target of the LEFT OUTER JOIN doesn't actually "return" any rows to your application - the outer one does. The subquery only represents "the set of all child rows". In relational theory, there is no difference between these two statements : select * from parent join child on parent.id = child.parent_id select * from parent join (select * from child) as c1 on parent.id=c1.parent_id SQLAlchemy's normal behavior makes great usage of this natural effect. The "set of all rows in 'child'" is the same as the "set of all rows in the set of all rows in 'child'". A good SQL optimizer, like that of Postgresql or Oracle, knows this, and in fact with PG my experience is that it's extremely difficult to get it to come up with a different execution plan by moving subqueries out into joins and such - it almost always figures out the common relational structure underneath (however, if you're on PG, I can't say for sure without trying how well the above query would optimize). SQLAlchemy's eager loader treats all "join targets" as the same thing - only the kind of selectable which comprises the "join target" changes. Now, that's the theory part. The practical part is, if you're using MySQL, a query like the above is going to hit you hard and mercilessly, as their optimizer is the worst I've ever seen. So my advice here depends strongly on which backend you're using. To be blunt, if you're using MySQL, I would not be using joined table inheritance for anything critical, at all, period. MySQL completely sucks at joins and it always will. SQLAlchemy is heavily oriented around relational theory, whereas MySQL is the database where they didn't think foreign keys were really that important until ten years into the project. If you're using Postgresql, I would be checking what EXPLAIN tells you - it may or may not make any difference if a WHERE clause were stuck inside the subquery. I will say about PG that it works much better on inner joins than outer ones, though that's hard to use when you're trying to scan out on polymorphic tables like the above. So lets say you're on MySQL or PG, you can't change your schema, you've observed that yes, the planner really is performing more poorly because the subquery isn't pre-limited (I really think thats only MySQL's quirk, though), what do you do ? Well then you have to construct the JOIN by hand and use contains_eager(): from sqlalchemy.orm import contains_eager subq = session.query(Child).filter(Child.parent_id==99999).subquery() query(Parent).outerjoin((subq, Parent.children)).filter(Parent.id=99999).options(contains_eager(Parent.children, alias=subq)).one() so the above is an example of more explicit usage when the default relational-theory approach is not practical. I haven't tested the snippet above but its built on patterns that are very well tested so hopefully should work out of the gate. > > Thanks! > Anthony > > -- > You received this message because you are subscribed to the Google Groups > "sqlalchemy" group. > To post to this group, send email to sqlalch...@googlegroups.com. > To unsubscribe from this group, send email to > sqlalchemy+unsubscr...@googlegroups.com. > For more options, visit this group at > http://groups.google.com/group/sqlalchemy?hl=en. -- You received this message because you are subscribed to the Google Groups "sqlalchemy" group. To post to this group, send email to sqlalch...@googlegroups.com. To unsubscribe from this group, send email to sqlalchemy+unsubscr...@googlegroups.com. For more options, visit this group at http://groups.google.com/group/sqlalchemy?hl=en.