[sqlalchemy] Re: Polymorphic and inheritance with missing children

Richard Levasseur Mon, 11 Feb 2008 19:26:03 -0800

On Feb 8, 2008 2:27 PM, Michael Bayer <[EMAIL PROTECTED]> wrote:

>
>
> On Feb 8, 2008, at 3:49 PM, Richard Levasseur wrote:
>
> >
> > Hm, we could do that, but that means that we have to outerjoin to
> > ~15 other tables.  The primary table has ~200+ columns on it, each
> > child table has 10-20 columns, there are over a million records, and
> > our base filter criteria can be a bit complex.  Its indexed and
> > such, but the query needs to run in < 0.7 seconds (about how fast it
> > goes on dev).  I have concerns it won't perform as well (we tried
> > that and it really killed performance, but that was prior to a lot
> > of performance improvements we've made, so it might be feasible
> > again).
>
> you don't have to join to 15 tables.  You can also set select_table to
> the base table only, then write your mapper extension to do the work
> of _post_instance().  This extension calls the main populate_instance
> first, then issues a second query for the child table which is
> conditional.  This example is "hardcoded" to the example "kids" table
> but can be generalized if needed.  Notice that it populates the "kid"
> attributes with some default values, which is needed here because
> otherwise hitting them later will trigger a "broken" load (because
> there's no row in "kids"):



Ok, so I tried this.  It works fine, the only catch is that yeah, the pid
and cid are hardcoded in there.  I couldn't figure out any way to determine
them programatically.  I can get the join condition clause, but its just a
binary expression object (essentially saying "pid = cid"), and I don't know
how to extract the portions reliably (from the sounds of things, it doesn't
sound possible at all).  It looks like cid is always on the right, and pid
is always on the left, but I'm guessing thats just deterministic chance, and
wouldn't know what to do when its a more complicated expression.  This
worries me because I know there are other tables that will require more
complicated join conditions (where deleted/archived/hidden == 0 or IS NULL,
or some such thing)

It looks like the nested post_execute def has a reference to `statement`
that it uses to figure out the join condition and issue the subquery.  If I
could simply call that inside my extension's populate_instance and handle
the exception, that'd probably work.



> >
> > Thinking about this more, I essentially want to eagerload certain
> > inherited child tables on a case-by-case basis.  I think I've seen
> > similar questions about that here already.  I don't know the
> > internals, but it seems like a parent knows about its children
> > through its polymorphic map, so can't it figure out all the tables
> > it would have to join to on its own?
>
> if by "eagerload" you mean, "issue a second query for", then the above
> approach will get you started.  if you mean that it should construct
> joins from parent to child table, well yes it already does that if you
> query for a Child specifically, i.e. query(Child).all().  you also
> said you dont want to use a join for the base mapper since theres too
> many tables, so i dont think that's where you're referring to.   You
> can put select_table on whichever mappers you want to control who
> loads from what kinds of tables.



Ok, let me give a more practical example:

Lets say we have the following schema:
Persons(pid, etype, name, is_active)
Managers(pid, mid, level, full_team)
Engineers(pid, eid, language)

Lets say there's a single search on the webpage and users can enter in
queries like:
1) engineer.language:Java OR manager.level:5
2) name:john
3) engineer.language: python

the psuedo-sql for those queries should be something like
1) select * from persons left join engineers on pid=pid left join managers
on pid=pid and full_team=0 where engineers.language="Java" or
managers.level=5
2) select * from persons where name ='john''
3) select * from persons left join engineers using pid where
engineers.language='python'

Note that for (1), it has the additional full_team=0 condition as part of
the join itself.
So, depending on the user's query, we need to join to different tables.

What I meant by eagerloading: It may or may not need to select columns from
those tables depending on the user's view (perhaps they have the '
engineer.language' field turned on, so we'd always join to Engineers that
regardless to avoid issuing lots of sub-queries).

For the api, it'd be nice to do something like:
session.query(Person).filter(Manager.level=5).filter(Person.is_active==1)
And the orm uses the join conditions we defined elsewhere (probably on the
mapper?).

Right now, it'll join to the tables it needs, but it won't put in the join
conditions.  I know its in there somewhere, otherwise it couldn't do the
subquery (...right?).  Another catch I'm seeing is how to define those
additional join conditions (use select_table with a custom condition?).  It
looks like its magically picking up the foreign key references between
Managers/Engineers and Persons....just wish I knew how to tell it that
explicitly (since some things have additional join conditions).

Would it be better to abandon the orm portion of this, and use select()
instead?  I'd really like to avoid that since thats essentially what we're
doing now in php, and it works, but its messy at all levels.


>
> If you want to send through certain
> selectables on a per-query basis that is doable as well. I can't give
> more advice without specifically seeing what you mean by a "case-by-
> case" basis.


How would I do that?  This is to solve the case-by-case base query problem:
base_query = filter.query(Person).filter(Person.security_groups.in_(
user.get_groups()))
base_query.filter(Manager.level=5).filter(...etc...).  I think I can use the
select_from parameter...but then I have to pass that in every time I run a
query.

--~--~---------~--~----~------------~-------~--~----~
You received this message because you are subscribed to the Google Groups 
"sqlalchemy" group.
To post to this group, send email to sqlalchemy@googlegroups.com
To unsubscribe from this group, send email to [EMAIL PROTECTED]
For more options, visit this group at 
http://groups.google.com/group/sqlalchemy?hl=en
-~----------~----~----~----~------~----~------~--~---

[sqlalchemy] Re: Polymorphic and inheritance with missing children

Reply via email to