On Jan 22, 2014, at 11:54 AM, Simon King <si...@simonking.org.uk> wrote:

> Hi all,
> 
> I've been having a little trouble configuring a relationship between 2
> mapped classes where the join condition pulls in another 2 tables. I
> eventually got it working based on Mike's "method one" from
> http://stackoverflow.com/questions/17580649/sqlalchemy-relationships-across-multiple-tables,
> by adding foreign() and remote() around a couple of columns involved
> in the join.
> 
> However, I don't really understand what I'm doing, and probably as a
> result of that, the relationship doesn't work in conjunction with
> joinedload (one of the tables is missing from the FROM clause).

I’m actually not even sure the example I put in that stack overflow answer 
would work with joinedload().   joinedload() is designed to think of joining 
between two classes as a join between two tables, or if secondary is present, 
between three tables, and it is hardwired to do this by rendering a LEFT OUTER 
JOIN, which requires a specific table (or sub selectable) on each side.

When the primaryjoin mentions columns from a third (or fourth, etc.) table, 
lots of cases work when SQLAlchemy is rendering an “implicit join”, e.g. no 
JOIN keyword.   The lazyload is one such case; lazyload doesn’t even render the 
parent table, it only renders the target table in the SELECT.  if its WHERE 
clause happens to refer to some other table, that table gets pulled into the 
FROM clause, as the select() construct pulls everything from the WHERE clause 
into the FROM.

Within a join, that doesn’t work.  If you do a select() like this:

   select([a]).select_from(a.join(b, a.c.id == b.c.id)).where(b.c.c_id == 
c.c.id)

you’ll get the almost always not wanted SQL:
  
  SELECT a.* FROM c, a JOIN b ON a.id = b.id WHERE b.c_id = c.id

where above we have “c, a JOIN b”, that’s the common failure case here.  The 
“c” isn’t part of the join.

So really, “method one” is not a great answer here.  Some version of either 
“method two” or “method three”, that is a non-primary mapper() to the 
destination tables, will be more resilient to JOIN use cases.

This whole subject area is a big missing hole in the docs.   The use cases are 
complicated and we really don’t want to encourage long-chained primaryjoins in 
any case, unless they’re really needed.


> I can work up a proper example script, but before that I'd really like
> to understand the effect that foreign() and remote() have when
> constructing the relationship. The comment in the SO post says:
> 
> # B.id is "remote", well at the moment this is kind of
> # where the ORM wants it, it sort of means "this is where the stuff
> # starts that's not directly part of the A side"
> 
> In my situation, I've got tables A, B, C and D, with foreign keys:
> 
> A->B
> B->D
> C->A
> C->D
> 
> and then extra constraints between B and C, and A and D, and I'm
> trying to build a relationship from A to D. In all that, I've no real
> idea where to put foreign() or remote().

so really foreign() and remote() isn’t going to help with the join/joinedload 
use case, as there’s no system whereby joinedload() knows to render "a JOIN b 
JOIN c JOIN d” and such.  foreign() and remote() are just trying to tell the 
relationship which columns are part of which side, and in which direction data 
should be copied during a flush (e.g. when we flush user->addresses, we are 
always copying user.id -> address.user_id, that is, PK -> FK - the foreign() 
annotation is a way of controlling which column in “user.id == address.user_id” 
is the “FK”).

Going forward, the a->secondary->b technique, as well as the a-><non primary 
mapper of b, c, d, e…> technique, I am hoping will become more feasible, now 
that the ORM can nest JOINs together without needing to use a SELECT subquery.

However, trying out your use case the way I’d like, doesn’t work yet, which is 
that I’d like to set up “secondary” just as a join() object.   In 0.9.1, you 
still get a SELECT subquery for join() and joinedload(), though you’ll get the 
right results; but lazy loads are failing.   I can fix both issues but that 
would be in 0.9.2.

In 0.9.1, the following setup seems to work so far, but it requires a SELECT 
subquery for the “secondary” table to work fully, which is the join of B and C:

from sqlalchemy import *
from sqlalchemy.orm import *
from sqlalchemy.ext.declarative import declarative_base

Base = declarative_base()

class A(Base):
    __tablename__ = 'a'

    id = Column(Integer, primary_key=True)
    b_id = Column(ForeignKey('b.id'))


class B(Base):
    __tablename__ = 'b'

    id = Column(Integer, primary_key=True)
    d_id = Column(ForeignKey('d.id'))

class C(Base):
    __tablename__ = 'c'

    id = Column(Integer, primary_key=True)
    a_id = Column(ForeignKey('a.id'))
    d_id = Column(ForeignKey('d.id'))

class D(Base):
    __tablename__ = 'd'

    id = Column(Integer, primary_key=True)

a = A.__table__
b = B.__table__
d = D.__table__
c = C.__table__

j = join(b, d, b.c.d_id == d.c.id).join(c, c.c.d_id == d.c.id).alias()
A.d = relationship("D",
            secondary=j,
            primaryjoin=and_(a.c.b_id == j.c.b_id, a.c.id == j.c.c_a_id),
            secondaryjoin=d.c.id == j.c.b_d_id)


e = create_engine("postgresql://scott:tiger@localhost/test", echo=True)
Base.metadata.drop_all(e)
Base.metadata.create_all(e)

sess = Session(e)

sess.query(A).options(joinedload(A.d)).all()
sess.query(A).join(A.d).all()



Attachment: signature.asc
Description: Message signed with OpenPGP using GPGMail

Reply via email to