On Sep 23, 2010, at 8:18 PM, Anthony Tran wrote:

> Hi everyone,
> 
> I'm trying to do a query where I joinedload with a polymorphic child table. 
> The child table is using a joined table inheritance scheme. I'm using 
> with_polymorphic = '*' to accomplish the eager loading of polymorphic types. 
> The query that SQLAlchemy generates results in a sub select that returns all 
> the child rows. Here's the setup:
> 
> 
> Here's the resulting SQL query:
> 
> SELECT parent_table.id                   AS parent_table_id,
>       anon_1.child_base_table_id         AS anon_1_child_base_table_id,
>       anon_1.child_base_table_parent_id  AS anon_1_child_base_table_parent_id,
>       anon_1.child_base_table_child_type AS 
> anon_1_child_base_table_child_type,
>       anon_1.child1_table_child_id       AS anon_1_child1_table_child_id,
>       anon_1.child1_table_child1_data    AS anon_1_child1_table_child1_data,
>       anon_1.child2_table_child_id       AS anon_1_child2_table_child_id,
>       anon_1.child2_table_child2_data    AS anon_1_child2_table_child2_data
> FROM   parent_table
>       LEFT OUTER JOIN (SELECT
>                           child_base_table.id         AS child_base_table_id,
>                               child_base_table.parent_id  AS 
> child_base_table_parent_id,
>                               child_base_table.child_type AS 
> child_base_table_child_type,
>                               child1_table.child_id       AS 
> child1_table_child_id,
>                               child1_table.child1_data    AS 
> child1_table_child1_data,
>                               child2_table.child_id       AS 
> child2_table_child_id,
>                               child2_table.child2_data    AS 
> child2_table_child2_data
>                        FROM   child_base_table
>                               LEFT OUTER JOIN child1_table
>                                 ON child_base_table.id = child1_table.child_id
>                               LEFT OUTER JOIN child2_table
>                                 ON child_base_table.id = 
> child2_table.child_id)
>                       AS
>                                           anon_1
>         ON parent_table.id = anon_1.child_base_table_parent_id
> WHERE  parent_table.id = 999999
> 
> Note that the inner select has no WHERE clause which results in selecting all 
> of the child rows. Without the with_polymorphic option, SQLAlchemy doesn't 
> produce the inner select. Using subquery_load causes the same problem. Is 
> there a way to prevent this from happening and eager load the polymorphic 
> tables? 

So, interesting terminology you used above - "a sub select that returns all the 
child rows".   The SELECT that is the right hand target of the LEFT OUTER JOIN 
doesn't actually "return" any rows to your application - the outer one does.   
The subquery only represents "the set of all child rows".   

In relational theory, there is no difference between these two statements :

        select * from parent join child on parent.id = child.parent_id

        select * from parent join (select * from child) as c1 on 
parent.id=c1.parent_id

SQLAlchemy's normal behavior makes great usage of this natural effect.   The 
"set of all rows in 'child'" is the same as the "set of all rows in the set of 
all rows in 'child'".   A good SQL optimizer, like that of Postgresql or 
Oracle, knows this, and in fact with PG my experience is that it's extremely 
difficult to get it to come up with a different execution plan by moving 
subqueries out into joins and such - it almost always figures out the common 
relational structure underneath (however, if you're on PG, I can't say for sure 
without trying how well the above query would optimize).   SQLAlchemy's eager 
loader treats all "join targets" as the same thing - only the kind of 
selectable which comprises the "join target" changes.

Now, that's the theory part.   The practical part is, if you're using MySQL, a 
query like the above is going to hit you hard and mercilessly, as their 
optimizer is the worst I've ever seen.    

So my advice here depends strongly on which backend you're using.   To be 
blunt, if you're using MySQL, I would not be using joined table inheritance for 
anything critical, at all, period.   MySQL completely sucks at joins and it 
always will.    SQLAlchemy is heavily oriented around relational theory, 
whereas MySQL is the database where they didn't think foreign keys were really 
that important until ten years into the project.

If you're using Postgresql, I would be checking what EXPLAIN tells you - it may 
or may not make any difference if a WHERE clause were stuck inside the 
subquery.   I will say about PG that it works much better on inner joins than 
outer ones, though that's hard to use when you're trying to scan out on 
polymorphic tables like the above.

So lets say you're on MySQL or PG, you can't change your schema, you've 
observed that yes, the planner really is performing more poorly because the 
subquery isn't pre-limited (I really think thats only MySQL's quirk, though), 
what do you do ?   Well then you have to construct the JOIN by hand and use 
contains_eager():
        
        from sqlalchemy.orm import contains_eager

        subq = session.query(Child).filter(Child.parent_id==99999).subquery()

        query(Parent).outerjoin((subq, 
Parent.children)).filter(Parent.id=99999).options(contains_eager(Parent.children,
 alias=subq)).one()

so the above is an example of more explicit usage when the default 
relational-theory approach is not practical.   I haven't tested the snippet 
above but its built on patterns that are very well tested so hopefully should 
work out of the gate.    












> 
> Thanks!
> Anthony
> 
> -- 
> You received this message because you are subscribed to the Google Groups 
> "sqlalchemy" group.
> To post to this group, send email to sqlalch...@googlegroups.com.
> To unsubscribe from this group, send email to 
> sqlalchemy+unsubscr...@googlegroups.com.
> For more options, visit this group at 
> http://groups.google.com/group/sqlalchemy?hl=en.

-- 
You received this message because you are subscribed to the Google Groups 
"sqlalchemy" group.
To post to this group, send email to sqlalch...@googlegroups.com.
To unsubscribe from this group, send email to 
sqlalchemy+unsubscr...@googlegroups.com.
For more options, visit this group at 
http://groups.google.com/group/sqlalchemy?hl=en.

Reply via email to