Re: [sqlalchemy] Relationship with complicated join conditions

Simon King Wed, 22 Jan 2014 16:18:09 -0800

> On 22 Jan 2014, at 23:45, Michael Bayer <mike...@zzzcomputing.com> wrote:
>
>
>> On Jan 22, 2014, at 11:54 AM, Simon King <si...@simonking.org.uk> wrote:
>>
>> Hi all,
>>
>> I've been having a little trouble configuring a relationship between 2
>> mapped classes where the join condition pulls in another 2 tables. I
>> eventually got it working based on Mike's "method one" from
>> http://stackoverflow.com/questions/17580649/sqlalchemy-relationships-across-multiple-tables,
>> by adding foreign() and remote() around a couple of columns involved
>> in the join.
>>
>> However, I don't really understand what I'm doing, and probably as a
>> result of that, the relationship doesn't work in conjunction with
>> joinedload (one of the tables is missing from the FROM clause).
>
> I'm actually not even sure the example I put in that stack overflow answer 
> would work with joinedload().   joinedload() is designed to think of joining 
> between two classes as a join between two tables, or if secondary is present, 
> between three tables, and it is hardwired to do this by rendering a LEFT 
> OUTER JOIN, which requires a specific table (or sub selectable) on each side.
>
> When the primaryjoin mentions columns from a third (or fourth, etc.) table, 
> lots of cases work when SQLAlchemy is rendering an "implicit join", e.g. no 
> JOIN keyword.   The lazyload is one such case; lazyload doesn't even render 
> the parent table, it only renders the target table in the SELECT.  if its 
> WHERE clause happens to refer to some other table, that table gets pulled 
> into the FROM clause, as the select() construct pulls everything from the 
> WHERE clause into the FROM.
>
> Within a join, that doesn't work.  If you do a select() like this:
>
>   select([a]).select_from(a.join(b, a.c.id == b.c.id)).where(b.c.c_id == 
> c.c.id)
>
> you'll get the almost always not wanted SQL:
>
>  SELECT a.* FROM c, a JOIN b ON a.id = b.id WHERE b.c_id = c.id
>
> where above we have "c, a JOIN b", that's the common failure case here.  The 
> "c" isn't part of the join.
>


Ah, OK, that certainly explains some of the behaviour I was seeing.


> So really, "method one" is not a great answer here.  Some version of either 
> "method two" or "method three", that is a non-primary mapper() to the 
> destination tables, will be more resilient to JOIN use cases.
>

I read the bit in the docs about non-primary mappers but was scared
off by the "almost never needed" warnings. Actually, for the purposes
I'm using it for, a completely separate class mapped to the select
would probably be fine.


> This whole subject area is a big missing hole in the docs.   The use cases 
> are complicated and we really don't want to encourage long-chained 
> primaryjoins in any case, unless they're really needed.
>
>
>> I can work up a proper example script, but before that I'd really like
>> to understand the effect that foreign() and remote() have when
>> constructing the relationship. The comment in the SO post says:
>>
>> # B.id is "remote", well at the moment this is kind of
>> # where the ORM wants it, it sort of means "this is where the stuff
>> # starts that's not directly part of the A side"
>>
>> In my situation, I've got tables A, B, C and D, with foreign keys:
>>
>> A->B
>> B->D
>> C->A
>> C->D
>>
>> and then extra constraints between B and C, and A and D, and I'm
>> trying to build a relationship from A to D. In all that, I've no real
>> idea where to put foreign() or remote().
>
> so really foreign() and remote() isn't going to help with the join/joinedload 
> use case, as there's no system whereby joinedload() knows to render "a JOIN b 
> JOIN c JOIN d" and such.  foreign() and remote() are just trying to tell the 
> relationship which columns are part of which side, and in which direction 
> data should be copied during a flush (e.g. when we flush user->addresses, we 
> are always copying user.id -> address.user_id, that is, PK -> FK - the 
> foreign() annotation is a way of controlling which column in "user.id == 
> address.user_id" is the "FK").
>

Ok, thanks for the explanation.


> Going forward, the a->secondary->b technique, as well as the a-><non primary 
> mapper of b, c, d, e...> technique, I am hoping will become more feasible, 
> now that the ORM can nest JOINs together without needing to use a SELECT 
> subquery.
>
> However, trying out your use case the way I'd like, doesn't work yet, which 
> is that I'd like to set up "secondary" just as a join() object.   In 0.9.1, 
> you still get a SELECT subquery for join() and joinedload(), though you'll 
> get the right results; but lazy loads are failing.   I can fix both issues 
> but that would be in 0.9.2.

> In 0.9.1, the following setup seems to work so far, but it requires a SELECT 
> subquery for the "secondary" table to work fully, which is the join of B and 
> C:
>
> from sqlalchemy import *
> from sqlalchemy.orm import *
> from sqlalchemy.ext.declarative import declarative_base
>
> Base = declarative_base()
>
> class A(Base):
>    __tablename__ = 'a'
>
>    id = Column(Integer, primary_key=True)
>    b_id = Column(ForeignKey('b.id'))
>
>
> class B(Base):
>    __tablename__ = 'b'
>
>    id = Column(Integer, primary_key=True)
>    d_id = Column(ForeignKey('d.id'))
>
> class C(Base):
>    __tablename__ = 'c'
>
>    id = Column(Integer, primary_key=True)
>    a_id = Column(ForeignKey('a.id'))
>    d_id = Column(ForeignKey('d.id'))
>
> class D(Base):
>    __tablename__ = 'd'
>
>    id = Column(Integer, primary_key=True)
>
> a = A.__table__
> b = B.__table__
> d = D.__table__
> c = C.__table__
>
> j = join(b, d, b.c.d_id == d.c.id).join(c, c.c.d_id == d.c.id).alias()
> A.d = relationship("D",
>            secondary=j,
>            primaryjoin=and_(a.c.b_id == j.c.b_id, a.c.id == j.c.c_a_id),
>            secondaryjoin=d.c.id == j.c.b_d_id)
>
>
> e = create_engine("postgresql://scott:tiger@localhost/test", echo=True)
> Base.metadata.drop_all(e)
> Base.metadata.create_all(e)
>
> sess = Session(e)
>
> sess.query(A).options(joinedload(A.d)).all()
> sess.query(A).join(A.d).all()

Mike, thanks as always for the very comprehensive answer. I'm
currently on 0.8 but this seems like a good incentive to upgrade. I'll
try it out tomorrow and will let you know how I get on.

Thanks again,

Simon

-- 
You received this message because you are subscribed to the Google Groups 
"sqlalchemy" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to sqlalchemy+unsubscr...@googlegroups.com.
To post to this group, send email to sqlalchemy@googlegroups.com.
Visit this group at http://groups.google.com/group/sqlalchemy.
For more options, visit https://groups.google.com/groups/opt_out.

Re: [sqlalchemy] Relationship with complicated join conditions

Reply via email to