On 05/02/2017 12:38 AM, Isaac Martin wrote:
Thank you very much for your response. This solution isn't working for me, and I'm 99% sure I'm not translating what you've written into something that works for my use case. My situation is somewhat more complicated than I initially wrote. My first question was in the vain hope that there was some easy way using declarative to tell anything loaded through a relationship() to load with a LIMIT. Obviously that isn't the case.

we can absolutely do that:


def message_join():
    subq = select([Messages.date]).\
        where(Messages.user_id == User.id).\
        order_by(Messages.date.desc()).\
        limit(1).offset(10).correlate(User).as_scalar()

    return and_(User.id == Messages.user_id, Messages.date > subq)


class User(Base):
    __tablename__ = 'user'
    id = Column(Integer, primary_key=True)
    messages = relationship(
        'Messages', order_by='desc(Messages.date)',
        primaryjoin=message_join)


class Messages(Base):
    __tablename__ = 'message'
    id = Column(Integer, primary_key=True)
    user_id = Column(ForeignKey('user.id'))
    date = Column(Date)



done, works great with joinedload(), lazy loading, etc:

q = s.query(User).options(joinedload(User.messages))




In my situation, I have a relationship which I have defined by hand. There are not foreign keys between the messages and the objects being loaded (Connections). When I try to use your lateral solution I get an error thrown that there are no foreign key relationships, which is true.

Here is my model edited down to what I believe are the minimal relevant fields. I'm trying to return Connections with a limited number of objects in the Connection.messages field:

/class Connection(Base):/
/    __tablename__ = 'Connection'/
/
/
/    user_1_id = Column(Integer, ForeignKey('User.uid'))/
/    user_2_id = Column(Integer, ForeignKey('User.uid'))/
/    messages =  relationship('Message',/
/ secondary="join(MessageRecipient, Message, MessageRecipient.message_id == Message.uid)",/ / primaryjoin= 'or_(Connection.user_1_id == MessageRecipient.recipient_id,'/ / 'Connection.user_2_id == MessageRecipient.recipient_id)',/
/                             order_by="Message.created.desc()")/
/
/
/class MessageRecipient(Base):/
/    __tablename__ = 'MessageRecipient'/
/
/
/    recipient_id = Column(Integer, ForeignKey('User.uid'))/
/
/
/class Message(Base):/
/    __tablename__ = 'Message'/
/
/
/    created = Column(DateTime, default=func.current_timestamp())/
/    body = Column(String(2000))
/
/
/

Now, I'm using Postgres, so I did attempt to translate your lateral example:

/        subq = Message.query.\/
/ filter(and_(Message.uid == MessageRecipient.message_id, or_(MessageRecipient.uid == Connection.user_1_id,/ / MessageRecipient.uid == Connection.user_2_id))).\/
/            order_by(Message.created.desc()).limit(1).subquery().lateral()/
/
/
/        q = Connection.query.outerjoin(subq). \/
/            options(contains_eager(Connection.messages, alias=subq))/
/
/
The above throws an error:
Can't find any foreign key relationships between 'Connection' and '%(140122530861688 anon)s'.

I also tried your non-lateral example
/        subq = self.db.session.query(Message.created). \/
/ filter(and_(Message.uid == MessageRecipient.message_id, or_(MessageRecipient.recipient_id == Connection.user_1_id,/ / MessageRecipient.recipient_id == Connection.user_2_id))).\/
/            order_by(Message.created.desc()). \/
/            limit(1).offset(10).correlate(Connection).as_scalar()/
/
/
/        q = self.db.session.query(Connection).join(/
/            MessageRecipient,/
/            or_(MessageRecipient.recipient_id == Connection.user_1_id,/
/                MessageRecipient.recipient_id == Connection.user_2_id)).\/
/            join( Message,/
/ and_(MessageRecipient.message_id == Message.uid, Message.created > subq)/
/        ).options(contains_eager(Connection.messages)).all()/
/
/
The above does not throw an error, but it also does not return any entities. I'm going to keep working on the non-lateral example because I don't understand your example well enough to confidently say I'm being faithful to it. If you have other advice based on this new information then I'd really appreciate it. If I figure out what I was doing wrong I'll post here for posterity.

On Mon, May 1, 2017 at 7:34 PM, mike bayer <mike...@zzzcomputing.com <mailto:mike...@zzzcomputing.com>> wrote:



    On 05/01/2017 08:05 PM, Isaac Martin wrote:


        I am building an api which can return children of resources if
        the user requests it. For example, |user| has |messages|. I want
        the query to be able to limit the number of |message| objects
        that are returned.

        I found a useful tip about limiting the number of objects in
        child collections here
        
<http://stackoverflow.com/questions/9148316/how-to-limit-offset-sqlalchemy-orm-relations-result?noredirect=1&lq=1
        
<http://stackoverflow.com/questions/9148316/how-to-limit-offset-sqlalchemy-orm-relations-result?noredirect=1&lq=1>>.
        Basically, it indicates the following flow:

        |classUser(...):# ...messages
        
=relationship('Messages',order_by='desc(Messages.date)',lazy='dynamic')user
        =User.query.one()users.messages.limit(10)|

        My use case involves returning sometimes large numbers of users.

        If I were to follow the advice in that link and used |.limit()|
        then I would need to iterate over the entire collection of users
        calling |.limit()| on each one. This is much less efficient
        then, say, using |LIMIT|in the original sql expression which
        created the collection.

        My question is whether it is possible using declarative to
        efficiently(N+0) load a large collection of objects while
        limiting the number of children in their child collections using
        sqlalchemy?


        To be clear, the below is what I am trying to /avoid/.

        |users =User.query.all()messages ={}foruser
        inusers:messages[user.id
        <http://user.id>]=user.messages.limit(10).all()|

        I want to do something more like:

        |users =User.query.option(User.messages.limit(10)).all()|


    so you should ignore whether or not it uses "declarative", which has
    nothing to do with querying, and in fact at first ignore Query too,
    because first and foremost this is a SQL problem.  You want one SQL
    statement that does this.  What query in SQL would load lots of rows
    from the primary table, joined to the first ten rows of the
    secondary table for each primary?

    LIMIT is tricky because it's not actually part of the usual
    "relational algebra" calculation.  It's outside of that because it's
    an artificial limit on rows.    For example, my first thought on how
    to do this was wrong:

             select * from users left outer join (select * from messages
    limit 10) as anon_1 on users.id <http://users.id> = anon_1.user_id

    This is wrong because it only gets the first ten messages in the
    aggregate, disregarding user.  We want to get the first ten messages
    for each user, which means we need to do this "select from messages
    limit 10" individually for each user. That is, we need to correlate
    somehow.  A correlated subquery though is not usually allowed as a
    FROM element, and is only allowed as a SQL expression, it can only
    return a single column and a single row; we can't normally JOIN to a
    correlated subquery in plain vanilla SQL.   We can however,
    correlate inside the ON clause of the JOIN to make this possible in
    vanilla SQL.

    But first, if we are on a modern Postgresql version, we *can* break
    that usual rule of correlation and use a keyword called LATERAL,
    which allows correlation in a FROM clause.  LATERAL is only
    supported by modern Postgresql versions, and it makes this easy:

             select * from users left outer join lateral
             (select * from message where message.user_id = users.id
    <http://users.id> order by messages.date desc limit 10) as anon1 on
    users.id <http://users.id> = anon_1.user_id

    we support the LATERAL keyword.   The query above looks like this:


         subq = s.query(Messages).\
             filter(Messages.user_id == User.id).\
             order_by(Messages.date.desc()).limit(10).subquery().lateral()

         q = s.query(User).outerjoin(subq).\
              options(contains_eager(User.messages, alias=subq))

    Note that above, in order to SELECT both users and messages and
    produce them into the User.messages collection, the
    "contains_eager()" option must be used and for that the "dynamic"
    has to go away.  This is not the only option, you can for example
    build a second relationship for User.messages that doesn't have the
    "dynamic" or you can just load from query(User, Message) separately
    and organize the result tuples as needed.

    if you aren't using Postgresql, or a version of Postgresql that
    doesn't support LATERAL, the correlation has to be worked into the
    ON clause of the join instead.   The SQL looks like:

             select * from users left outer join messages on
    users.id <http://users.id> = messages.user_id and messages.date >
    (select date from messages where messages.user_id = users.id
    <http://users.id> order by date desc limit 1 offset 10)

    Here, in order to jam the LIMIT in there, we are actually stepping
    through the first 10 rows with OFFSET and then doing LIMIT 1 to get
    the date that represents the lower bound date we want for each
    user.  Then we have to join while comparing on that date, which can
    be expensive if this column isn't indexed and also can be inaccurate
    if there are duplicate dates.

    This query looks like:

    subq = s.query(Messages.date).\
         filter(Messages.user_id == User.id).\
         order_by(Messages.date.desc()).\
         limit(1).offset(10).correlate(User).as_scalar()

    q = s.query(User).join(
         Messages,
         and_(User.id == Messages.user_id, Messages.date >= subq)
    ).options(contains_eager(User.messages))


    These kinds of queries are the kind that I don't trust without a
    good test, so POC below includes both versions including a sanity check.


    from sqlalchemy import *
    from sqlalchemy.orm import *
    from sqlalchemy.ext.declarative import declarative_base
    import datetime

    Base = declarative_base()


    class User(Base):
         __tablename__ = 'user'
         id = Column(Integer, primary_key=True)
         messages = relationship(
             'Messages', order_by='desc(Messages.date)')

    class Messages(Base):
         __tablename__ = 'message'
         id = Column(Integer, primary_key=True)
         user_id = Column(ForeignKey('user.id <http://user.id>'))
         date = Column(Date)

    e = create_engine("postgresql://scott:tiger@localhost/test", echo=True)
    Base.metadata.drop_all(e)
    Base.metadata.create_all(e)

    s = Session(e)

    s.add_all([
         User(id=i, messages=[
             Messages(id=(i * 20) + j, date=datetime.date(2017, 3, j))
             for j in range(1, 20)
         ]) for i in range(1, 51)
    ])

    s.commit()

    top_ten_dates = set(datetime.date(2017, 3, j) for j in range(10, 20))


    def run_test(q):
         all_u = q.all()
         assert len(all_u) == 50
         for u in all_u:

             messages = u.messages
             assert len(messages) == 10

             for m in messages:
                 assert m.user_id == u.id <http://u.id>

             received = set(m.date for m in messages)

             assert received == top_ten_dates

    # version 1.   no LATERAL

    s.close()

    subq = s.query(Messages.date).\
         filter(Messages.user_id == User.id).\
         order_by(Messages.date.desc()).\
         limit(1).offset(10).correlate(User).as_scalar()

    q = s.query(User).join(
         Messages,
         and_(User.id == Messages.user_id, Messages.date > subq)
    ).options(contains_eager(User.messages))

    run_test(q)

    # version 2.  LATERAL

    s.close()

    subq = s.query(Messages).\
         filter(Messages.user_id == User.id).\
         order_by(Messages.date.desc()).limit(10).subquery().lateral()

    q = s.query(User).outerjoin(subq).\
         options(contains_eager(User.messages, alias=subq))

    run_test(q)






        SO post here:
        
http://stackoverflow.com/questions/43727268/limit-child-collections-in-initial-query-sqlalchemy/43727705?noredirect=1#comment74499641_43727705
        
<http://stackoverflow.com/questions/43727268/limit-child-collections-in-initial-query-sqlalchemy/43727705?noredirect=1#comment74499641_43727705>

-- SQLAlchemy -
        The Python SQL Toolkit and Object Relational Mapper

        http://www.sqlalchemy.org/

        To post example code, please provide an MCVE: Minimal, Complete,
        and Verifiable Example. See http://stackoverflow.com/help/mcve
        <http://stackoverflow.com/help/mcve> for a full description.
        ---
        You received this message because you are subscribed to the
        Google Groups "sqlalchemy" group.
        To unsubscribe from this group and stop receiving emails from
        it, send an email to sqlalchemy+unsubscr...@googlegroups.com
        <mailto:sqlalchemy%2bunsubscr...@googlegroups.com>
        <mailto:sqlalchemy+unsubscr...@googlegroups.com
        <mailto:sqlalchemy%2bunsubscr...@googlegroups.com>>.
        To post to this group, send email to sqlalchemy@googlegroups.com
        <mailto:sqlalchemy@googlegroups.com>
        <mailto:sqlalchemy@googlegroups.com
        <mailto:sqlalchemy@googlegroups.com>>.
        Visit this group at https://groups.google.com/group/sqlalchemy
        <https://groups.google.com/group/sqlalchemy>.
        For more options, visit https://groups.google.com/d/optout
        <https://groups.google.com/d/optout>.


-- SQLAlchemy - The Python SQL Toolkit and Object Relational Mapper

    http://www.sqlalchemy.org/

    To post example code, please provide an MCVE: Minimal, Complete, and
    Verifiable Example.  See http://stackoverflow.com/help/mcve
    <http://stackoverflow.com/help/mcve> for a full description.
    --- You received this message because you are subscribed to a topic
    in the Google Groups "sqlalchemy" group.
    To unsubscribe from this topic, visit
    https://groups.google.com/d/topic/sqlalchemy/lOedjqu61G4/unsubscribe
    <https://groups.google.com/d/topic/sqlalchemy/lOedjqu61G4/unsubscribe>.
    To unsubscribe from this group and all its topics, send an email to
    sqlalchemy+unsubscr...@googlegroups.com
    <mailto:sqlalchemy%2bunsubscr...@googlegroups.com>.
    To post to this group, send email to sqlalchemy@googlegroups.com
    <mailto:sqlalchemy@googlegroups.com>.
    Visit this group at https://groups.google.com/group/sqlalchemy
    <https://groups.google.com/group/sqlalchemy>.
    For more options, visit https://groups.google.com/d/optout
    <https://groups.google.com/d/optout>.


--
SQLAlchemy -
The Python SQL Toolkit and Object Relational Mapper

http://www.sqlalchemy.org/

To post example code, please provide an MCVE: Minimal, Complete, and Verifiable Example. See http://stackoverflow.com/help/mcve for a full description.
---
You received this message because you are subscribed to the Google Groups "sqlalchemy" group. To unsubscribe from this group and stop receiving emails from it, send an email to sqlalchemy+unsubscr...@googlegroups.com <mailto:sqlalchemy+unsubscr...@googlegroups.com>. To post to this group, send email to sqlalchemy@googlegroups.com <mailto:sqlalchemy@googlegroups.com>.
Visit this group at https://groups.google.com/group/sqlalchemy.
For more options, visit https://groups.google.com/d/optout.

--
SQLAlchemy - The Python SQL Toolkit and Object Relational Mapper

http://www.sqlalchemy.org/

To post example code, please provide an MCVE: Minimal, Complete, and Verifiable 
Example.  See  http://stackoverflow.com/help/mcve for a full description.
--- You received this message because you are subscribed to the Google Groups "sqlalchemy" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to sqlalchemy+unsubscr...@googlegroups.com.
To post to this group, send email to sqlalchemy@googlegroups.com.
Visit this group at https://groups.google.com/group/sqlalchemy.
For more options, visit https://groups.google.com/d/optout.

Reply via email to