On 08/12/2016 05:07 PM, Douglas Russell wrote:
Sorry to return again so soon, but I think I have uncovered a bug in the
subqueryload.

Continuing my example above. I had some with_polymorphic subqueries,
followed by a non-polymorphic subquery.

|
r = session.query(
    B
).options(
    subqueryload(
        B.cs.of_type(c_c2)
    ).subqueryload(
        c_c2.C2.ds
    )
).all()
|

In the simplest case I have:
B
C->C2
D

What I expected to get back from the query was all the Bs, with all the
C2s(including C) for those Bs, with all the Ds for those Cs.
Unfortunately what happens (for any significant query) is that the
database explodes and downs the machine by using all available memory
instead. More concisely, it does a query which results in: All the Bs,
with all the C2s(including C) for those Bs (good so far!) with all the
Ds for all Cs!

that's something pretty much called a "comma join" and is a very common symptom of issues in SQLAlchemy eager loading and querying due to the heavy emphasis on connecting things together into joins. when the connection fails, you get "FROM a JOIN b, b JOIN c" and cartesian products. Builds for the issue at https://bitbucket.org/zzzeek/sqlalchemy/issues/3773 are going through for 1.0 / 1.1 and will mark this as resolved once those are merged into the repo.




In SQL, here is what I would expect:

|
SELECT t_b.id AS t_b_id
FROM t_b

SELECT t_c_1.type AS t_c_1_type, t_c_1.id AS t_c_1_id, t_c_1.b_id AS
t_c_1_b_id, t_c2_1.id AS t_c2_1_id, anon_1.t_b_id AS anon_1_t_b_id
FROM (SELECT t_b.id AS t_b_id
FROM t_b) AS anon_1 JOIN (t_c AS t_c_1 LEFT OUTER JOIN t_c2 AS t_c2_1 ON
t_c_1.id = t_c2_1.id) ON anon_1.t_b_id = t_c_1.b_id ORDER BY anon_1.t_b_id

SELECT t_d_1.id AS t_d_1_id, t_d_1.c_id AS t_d_1_c_id, t_c_1.id AS t_c_1_id
FROM (SELECT t_b.id AS t_b_id
FROM t_b) AS anon_1 JOIN t_c AS t_c_1 ON anon_1.t_b_id = t_c_1.b_id JOIN
t_d AS t_d_1 ON t_c_1.id = t_d_1.c_id ORDER BY t_c_1.id
|

The actual SQL generated:

|
SELECT t_b.id AS t_b_id
FROM t_b

SELECT t_c_1.type AS t_c_1_type, t_c_1.id AS t_c_1_id, t_c_1.b_id AS
t_c_1_b_id, t_c2_1.id AS t_c2_1_id, anon_1.t_b_id AS anon_1_t_b_id
FROM (SELECT t_b.id AS t_b_id
FROM t_b) AS anon_1 JOIN (t_c AS t_c_1 LEFT OUTER JOIN t_c2 AS t_c2_1 ON
t_c_1.id = t_c2_1.id) ON anon_1.t_b_id = t_c_1.b_id ORDER BY anon_1.t_b_id

SELECT t_d.id AS t_d_id, t_d.c_id AS t_d_c_id, t_c_1.id AS t_c_1_id
FROM (SELECT t_b.id AS t_b_id
FROM t_b) AS anon_1 JOIN t_c AS t_c_1 ON anon_1.t_b_id = t_c_1.b_id, t_c
AS t_c_2 LEFT OUTER JOIN t_c2 AS t_c2_1 ON t_c_2.id = t_c2_1.id JOIN t_d
ON t_c_2.id = t_d.c_id ORDER BY t_c_1.id
|

I was able to work around this by doing:

|
r = session.query(
    B
).options(
    subqueryload(
        B.cs.of_type(c_c2)
    ).subqueryload(
        c_c2.C2.ds.of_type(D)
    )
).all()
|

but that didn't seem to make sense to me as there was no polymorphism at
work in the type D so why would it need to have of_type?

Even if that is necessary, I'd have expected that the query would have
failed or dropped that part of the chain (like I was seeing before with
parts that could not successfully chain).

Full worked example
here: https://gist.github.com/dpwrussell/508e8b33fe92434eafe001ee806f355a

My knowledge of how subqueryload is extremely limited so it's possible
I'm doing something wrong here, but having made this test case, what I
was doing seems logical so I figured it was maybe a bug.

Cheers,

Douglas

On Thursday, 11 August 2016 19:08:57 UTC-4, Douglas Russell wrote:

    Thanks for your help, Mike!

    Douglas

    On Thu, 11 Aug 2016 at 17:14 Mike Bayer <mike...@zzzcomputing.com
    <mailto:mike...@zzzcomputing.com>> wrote:



        On 08/11/2016 03:35 PM, Douglas Russell wrote:
        > Of course! That makes sense.
        >
        > Does SQLAlchemy just discard the subqueryload chain parts that
        are on
        > the wrong type? It didn't seem to generate any SQL for those
        when they
        > were wrong.

        it validates the path as far as classes / attributes, but I
        think when
        subclassing and aliasing is used it might be considering just
        the types,
        e.g. a B2 and a with_polymorphic(B, [B2]) are of compatible types.
        There's a lot of ways these paths get built up (like with string
        names
        and such), as well as all kinds of "it's worked this way for years"
        kinds of things that may not always be present in testing, so
        more error
        checking in there would have to be added carefully.




        >
        > An example of this in the docs would be a great idea I think
        as I think
        > it's pretty tough to explain this in the abstract as you say!
        >
        > Thanks again,
        >
        > Douglas
        >
        > On Thursday, 11 August 2016 11:51:39 UTC-4, Mike Bayer wrote:
        >
        >
        >
        >     On 08/11/2016 09:53 AM, Douglas Russell wrote:
        >     > of_type was what I was searching for. Thanks.
        >     >
        >     > Unfortunately, it did not yield the desired results when I
        >     extended the
        >     > hierarchy to 3 levels (C->C2 with a relation between B
        and C).
        >     >
        >     > I've updated the Gist.
        >     >
        >     > B+B2 is successfully queried, but C+C2 is not. This will
        then
        >     result in
        >     > lazily generated queries when accessing C2.
        >     >
        >     > In fact, it's actually a little more confusing even than
        that.
        >     With that
        >     > query, it does (in the original query)
        >     > Query for all A+A2
        >     > Query for all B+B2 (from the subqueryload)
        >     >
        >     > Then once I start to print the hierarchy out, lazy queries:
        >     > Query for all C
        >     > Query for each C2
        >
        >
        >     I'm not sure how to document this (I guess just add
        another example)
        >     but
        >     once you've done of_type(), that's now the type you're on
        and if you
        >     chain from there, it has to be in terms of that new type.  The
        >     with_polymorphic() object you've made is that type (note
        the subclass
        >     namespace that's on the object):
        >
        >     b_b2 = with_polymorphic(B, [B2], flat=True)
        >
        >     r = session.query(A2).options(
        >
        subqueryload(A2.bs.of_type(b_b2)).subqueryload(b_b2.B2.cs.of_type(
        >              with_polymorphic(C, [C2], flat=True))
        >          )
        >     ).all()
        >
        >
        >
        >
        >     >
        >     > Cheers,
        >     >
        >     > Douglas
        >     >
        >     > On Wednesday, 10 August 2016 18:18:27 UTC-4, Douglas
        Russell wrote:
        >     >
        >     >     Hi,
        >     >
        >     >     I have two (well, more than two with more than two
        levels
        >     each, but
        >     >     this is the simplest recreation of the problem) parallel
        >     >     joined-table inheritance hierarchies
        >     >
        >     >     A -> A2
        >     >     B -> B2
        >     >
        >     >     B.a is a relationship to A, backref in A.bs.
        >     >
        >     >     The problem arises when I want to query this whole
        structure back
        >     >     with subqueries. The desired number of queries in
        this case
        >     would be
        >     >     twoish, depending on how the polymorphic queries are
        done. At any
        >     >     rate, the desire is to remove any lazy loading at all.
        >     >
        >     >     The problem lies in formulating a subqueryload which
        will do
        >     this.
        >     >     My first thought was:
        >     >
        >     >     |
        >     >     session.query(A2).options(subqueryload(A2.bs)).all()
        >     >     |
        >     >
        >     >     The problem with this is that what I wanted was all
        the A2s
        >     and all
        >     >     the B2s, but what I get is all the A2s, and all the
        Bs. When I
        >     >     access a property of B2, it lazy loads that single
        record.
        >     >
        >     >     I think the best solution is something along the
        lines of using a
        >     >     combination of subqueryload and joinedload (or maybe
        >     >     with_polymorphic?) to say: Give me all the A2s with
        a subquery
        >     for
        >     >     all the B2s linked to it. I can't figure out how to
        do this as
        >     the
        >     >     arguments to subqueryload and joinedload are
        relationships
        >     which I
        >     >     don't currently have.
        >     >
        >     >     Minimal Test
        >     >     Case:
        >
         https://gist.github.com/dpwrussell/508e8b33fe92434eafe001ee806f355a
        <https://gist.github.com/dpwrussell/508e8b33fe92434eafe001ee806f355a>
        >
         <https://gist.github.com/dpwrussell/508e8b33fe92434eafe001ee806f355a
        <https://gist.github.com/dpwrussell/508e8b33fe92434eafe001ee806f355a>>
        >     >
        >
         <https://gist.github.com/dpwrussell/508e8b33fe92434eafe001ee806f355a
        <https://gist.github.com/dpwrussell/508e8b33fe92434eafe001ee806f355a>
        >
         <https://gist.github.com/dpwrussell/508e8b33fe92434eafe001ee806f355a
        <https://gist.github.com/dpwrussell/508e8b33fe92434eafe001ee806f355a>>>
        >     >
        >     >     Is there a clever way to handle this?
        >     >
        >     >     The only thing I've done which works is to add a
        relationship
        >     >     between B to B2:
        >     >
        >     >     |
        >     >     b = relationship('B', backref='b2')
        >     >     |
        >     >
        >     >     Then I can do:
        >     >
        >     >     |
        >     >     r =
        >     >
        >
         session.query(A2).options(subqueryload(A2.bs).joinedload(B.b2)).all()
        >     >     |
        >     >
        >     >     but it would be better if I didn't have to go and
        add these
        >     >     relationships to the existing models as there are
        several levels,
        >     >     repositories and (most hinderingly) jurisdictions in
        play in
        >     my real
        >     >     project. It seems likely that there is a clever way
        as SQLAlchemy
        >     >     does such a good job with the polymorphic queries
        usually.
        >     >
        >     >     Cheers,
        >     >
        >     >     Douglas
        >     >
        >     > --
        >     > You received this message because you are subscribed to
        the Google
        >     > Groups "sqlalchemy" group.
        >     > To unsubscribe from this group and stop receiving emails
        from it,
        >     send
        >     > an email to sqlalchemy+...@googlegroups.com <javascript:>
        >     > <mailto:sqlalchemy+unsubscr...@googlegroups.com
        <mailto:sqlalchemy%2bunsubscr...@googlegroups.com> <javascript:>>.
        >     > To post to this group, send email to
        sqlal...@googlegroups.com
        >     <javascript:>
        >     > <mailto:sqlal...@googlegroups.com <javascript:>>.
        >     > Visit this group at
        https://groups.google.com/group/sqlalchemy
        <https://groups.google.com/group/sqlalchemy>
        >     <https://groups.google.com/group/sqlalchemy
        <https://groups.google.com/group/sqlalchemy>>.
        >     > For more options, visit
        https://groups.google.com/d/optout
        <https://groups.google.com/d/optout>
        >     <https://groups.google.com/d/optout
        <https://groups.google.com/d/optout>>.
        >
        > --
        > You received this message because you are subscribed to the Google
        > Groups "sqlalchemy" group.
        > To unsubscribe from this group and stop receiving emails from
        it, send
        > an email to sqlalchemy+unsubscr...@googlegroups.com
        <mailto:sqlalchemy%2bunsubscr...@googlegroups.com>
        > <mailto:sqlalchemy+unsubscr...@googlegroups.com
        <mailto:sqlalchemy%2bunsubscr...@googlegroups.com>>.
        > To post to this group, send email to
        sqlalchemy@googlegroups.com <mailto:sqlalchemy@googlegroups.com>
        > <mailto:sqlalchemy@googlegroups.com
        <mailto:sqlalchemy@googlegroups.com>>.
        > Visit this group at https://groups.google.com/group/sqlalchemy
        <https://groups.google.com/group/sqlalchemy>.
        > For more options, visit https://groups.google.com/d/optout
        <https://groups.google.com/d/optout>.

        --
        You received this message because you are subscribed to a topic
        in the Google Groups "sqlalchemy" group.
        To unsubscribe from this topic, visit
        https://groups.google.com/d/topic/sqlalchemy/mu56muQI6tM/unsubscribe
        <https://groups.google.com/d/topic/sqlalchemy/mu56muQI6tM/unsubscribe>.
        To unsubscribe from this group and all its topics, send an email
        to sqlalchemy+unsubscr...@googlegroups.com
        <mailto:sqlalchemy%2bunsubscr...@googlegroups.com>.
        To post to this group, send email to sqlalchemy@googlegroups.com
        <mailto:sqlalchemy@googlegroups.com>.
        Visit this group at https://groups.google.com/group/sqlalchemy
        <https://groups.google.com/group/sqlalchemy>.
        For more options, visit https://groups.google.com/d/optout
        <https://groups.google.com/d/optout>.

--
You received this message because you are subscribed to the Google
Groups "sqlalchemy" group.
To unsubscribe from this group and stop receiving emails from it, send
an email to sqlalchemy+unsubscr...@googlegroups.com
<mailto:sqlalchemy+unsubscr...@googlegroups.com>.
To post to this group, send email to sqlalchemy@googlegroups.com
<mailto:sqlalchemy@googlegroups.com>.
Visit this group at https://groups.google.com/group/sqlalchemy.
For more options, visit https://groups.google.com/d/optout.

--
You received this message because you are subscribed to the Google Groups 
"sqlalchemy" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to sqlalchemy+unsubscr...@googlegroups.com.
To post to this group, send email to sqlalchemy@googlegroups.com.
Visit this group at https://groups.google.com/group/sqlalchemy.
For more options, visit https://groups.google.com/d/optout.

Reply via email to