Well that was a very silly mistake. End of the day oversight is what
happened here. I didn't notice that "offset" was set to 10 and my unit test
only had 2 values. When I set offset to 0 it worked great.

Sincerest thanks for this! I hadn't seen as_scalar used before. I'm looking
forward to trying subqueries like this myself in the future.

On Mon, May 1, 2017 at 9:38 PM, Isaac Martin <melchoi...@gmail.com> wrote:

> Thank you very much for your response. This solution isn't working for me,
> and I'm 99% sure I'm not translating what you've written into something
> that works for my use case. My situation is somewhat more complicated than
> I initially wrote. My first question was in the vain hope that there was
> some easy way using declarative to tell anything loaded through a
> relationship() to load with a LIMIT. Obviously that isn't the case.
>
> In my situation, I have a relationship which I have defined by hand. There
> are not foreign keys between the messages and the objects being loaded
> (Connections). When I try to use your lateral solution I get an error
> thrown that there are no foreign key relationships, which is true.
>
> Here is my model edited down to what I believe are the minimal relevant
> fields. I'm trying to return Connections with a limited number of objects
> in the Connection.messages field:
>
> *class Connection(Base):*
> *    __tablename__ = 'Connection'*
>
> *    user_1_id = Column(Integer, ForeignKey('User.uid'))*
> *    user_2_id = Column(Integer, ForeignKey('User.uid'))*
> *    messages =  relationship('Message',*
> *                             secondary="join(MessageRecipient, Message,
> MessageRecipient.message_id == Message.uid)",*
> *                             primaryjoin= 'or_(Connection.user_1_id ==
> MessageRecipient.recipient_id,'*
> *                                           'Connection.user_2_id ==
> MessageRecipient.recipient_id)',*
> *                             order_by="Message.created.desc()")*
>
> *class MessageRecipient(Base):*
> *    __tablename__ = 'MessageRecipient'*
>
> *    recipient_id = Column(Integer, ForeignKey('User.uid'))*
>
> *class Message(Base):*
> *    __tablename__ = 'Message'*
>
> *    created = Column(DateTime, default=func.current_timestamp())*
>
> *    body = Column(String(2000))*
>
>
> Now, I'm using Postgres, so I did attempt to translate your lateral
> example:
>
> *        subq = Message.query.\*
> *            filter(and_(Message.uid == MessageRecipient.message_id,
> or_(MessageRecipient.uid == Connection.user_1_id,*
> *                                            MessageRecipient.uid ==
> Connection.user_2_id))).\*
> *
> order_by(Message.created.desc()).limit(1).subquery().lateral()*
>
> *        q = Connection.query.outerjoin(subq). \*
> *            options(contains_eager(Connection.messages, alias=subq))*
>
> The above throws an error:
> Can't find any foreign key relationships between 'Connection' and
> '%(140122530861688 anon)s'.
>
> I also tried your non-lateral example
> *        subq = self.db.session.query(Message.created). \*
> *            filter(and_(Message.uid == MessageRecipient.message_id,
> or_(MessageRecipient.recipient_id == Connection.user_1_id,*
> *                            MessageRecipient.recipient_id ==
> Connection.user_2_id))).\*
> *            order_by(Message.created.desc()). \*
> *            limit(1).offset(10).correlate(Connection).as_scalar()*
>
> *        q = self.db.session.query(Connection).join(*
> *            MessageRecipient,*
> *            or_(MessageRecipient.recipient_id == Connection.user_1_id,*
> *                MessageRecipient.recipient_id == Connection.user_2_id)).\*
> *            join( Message,*
> *            and_(MessageRecipient.message_id == Message.uid,
> Message.created > subq)*
> *        ).options(contains_eager(Connection.messages)).all()*
>
> The above does not throw an error, but it also does not return any
> entities. I'm going to keep working on the non-lateral example because I
> don't understand your example well enough to confidently say I'm being
> faithful to it. If you have other advice based on this new information then
> I'd really appreciate it.  If I figure out what I was doing wrong I'll post
> here for posterity.
>
> On Mon, May 1, 2017 at 7:34 PM, mike bayer <mike...@zzzcomputing.com>
> wrote:
>
>>
>>
>> On 05/01/2017 08:05 PM, Isaac Martin wrote:
>>
>>>
>>> I am building an api which can return children of resources if the user
>>> requests it. For example, |user| has |messages|. I want the query to be
>>> able to limit the number of |message| objects that are returned.
>>>
>>> I found a useful tip about limiting the number of objects in child
>>> collections here <http://stackoverflow.com/ques
>>> tions/9148316/how-to-limit-offset-sqlalchemy-orm-relations-
>>> result?noredirect=1&lq=1>. Basically, it indicates the following flow:
>>>
>>> |classUser(...):# ...messages =relationship('Messages',order
>>> _by='desc(Messages.date)',lazy='dynamic')user
>>> =User.query.one()users.messages.limit(10)|
>>>
>>> My use case involves returning sometimes large numbers of users.
>>>
>>> If I were to follow the advice in that link and used |.limit()| then I
>>> would need to iterate over the entire collection of users calling
>>> |.limit()| on each one. This is much less efficient then, say, using
>>> |LIMIT|in the original sql expression which created the collection.
>>>
>>> My question is whether it is possible using declarative to
>>> efficiently(N+0) load a large collection of objects while limiting the
>>> number of children in their child collections using sqlalchemy?
>>>
>>>
>>> To be clear, the below is what I am trying to /avoid/.
>>>
>>> |users =User.query.all()messages ={}foruser inusers:messages[user.id
>>> ]=user.messages.limit(10).all()|
>>>
>>> I want to do something more like:
>>>
>>> |users =User.query.option(User.messages.limit(10)).all()|
>>>
>>
>> so you should ignore whether or not it uses "declarative", which has
>> nothing to do with querying, and in fact at first ignore Query too, because
>> first and foremost this is a SQL problem.  You want one SQL statement that
>> does this.  What query in SQL would load lots of rows from the primary
>> table, joined to the first ten rows of the secondary table for each primary?
>>
>> LIMIT is tricky because it's not actually part of the usual "relational
>> algebra" calculation.  It's outside of that because it's an artificial
>> limit on rows.    For example, my first thought on how to do this was wrong:
>>
>>         select * from users left outer join (select * from messages limit
>> 10) as anon_1 on users.id = anon_1.user_id
>>
>> This is wrong because it only gets the first ten messages in the
>> aggregate, disregarding user.  We want to get the first ten messages for
>> each user, which means we need to do this "select from messages limit 10"
>> individually for each user. That is, we need to correlate somehow.  A
>> correlated subquery though is not usually allowed as a FROM element, and is
>> only allowed as a SQL expression, it can only return a single column and a
>> single row; we can't normally JOIN to a correlated subquery in plain
>> vanilla SQL.   We can however, correlate inside the ON clause of the JOIN
>> to make this possible in vanilla SQL.
>>
>> But first, if we are on a modern Postgresql version, we *can* break that
>> usual rule of correlation and use a keyword called LATERAL, which allows
>> correlation in a FROM clause.  LATERAL is only supported by modern
>> Postgresql versions, and it makes this easy:
>>
>>         select * from users left outer join lateral
>>         (select * from message where message.user_id = users.id order by
>> messages.date desc limit 10) as anon1 on users.id = anon_1.user_id
>>
>> we support the LATERAL keyword.   The query above looks like this:
>>
>>
>>     subq = s.query(Messages).\
>>         filter(Messages.user_id == User.id).\
>>         order_by(Messages.date.desc()).limit(10).subquery().lateral()
>>
>>     q = s.query(User).outerjoin(subq).\
>>          options(contains_eager(User.messages, alias=subq))
>>
>> Note that above, in order to SELECT both users and messages and produce
>> them into the User.messages collection, the "contains_eager()" option must
>> be used and for that the "dynamic" has to go away.  This is not the only
>> option, you can for example build a second relationship for User.messages
>> that doesn't have the "dynamic" or you can just load from query(User,
>> Message) separately and organize the result tuples as needed.
>>
>> if you aren't using Postgresql, or a version of Postgresql that doesn't
>> support LATERAL, the correlation has to be worked into the ON clause of the
>> join instead.   The SQL looks like:
>>
>>         select * from users left outer join messages on
>>         users.id = messages.user_id and messages.date > (select date
>> from messages where messages.user_id = users.id order by date desc limit
>> 1 offset 10)
>>
>> Here, in order to jam the LIMIT in there, we are actually stepping
>> through the first 10 rows with OFFSET and then doing LIMIT 1 to get the
>> date that represents the lower bound date we want for each user.  Then we
>> have to join while comparing on that date, which can be expensive if this
>> column isn't indexed and also can be inaccurate if there are duplicate
>> dates.
>>
>> This query looks like:
>>
>> subq = s.query(Messages.date).\
>>     filter(Messages.user_id == User.id).\
>>     order_by(Messages.date.desc()).\
>>     limit(1).offset(10).correlate(User).as_scalar()
>>
>> q = s.query(User).join(
>>     Messages,
>>     and_(User.id == Messages.user_id, Messages.date >= subq)
>> ).options(contains_eager(User.messages))
>>
>>
>> These kinds of queries are the kind that I don't trust without a good
>> test, so POC below includes both versions including a sanity check.
>>
>>
>> from sqlalchemy import *
>> from sqlalchemy.orm import *
>> from sqlalchemy.ext.declarative import declarative_base
>> import datetime
>>
>> Base = declarative_base()
>>
>>
>> class User(Base):
>>     __tablename__ = 'user'
>>     id = Column(Integer, primary_key=True)
>>     messages = relationship(
>>         'Messages', order_by='desc(Messages.date)')
>>
>> class Messages(Base):
>>     __tablename__ = 'message'
>>     id = Column(Integer, primary_key=True)
>>     user_id = Column(ForeignKey('user.id'))
>>     date = Column(Date)
>>
>> e = create_engine("postgresql://scott:tiger@localhost/test", echo=True)
>> Base.metadata.drop_all(e)
>> Base.metadata.create_all(e)
>>
>> s = Session(e)
>>
>> s.add_all([
>>     User(id=i, messages=[
>>         Messages(id=(i * 20) + j, date=datetime.date(2017, 3, j))
>>         for j in range(1, 20)
>>     ]) for i in range(1, 51)
>> ])
>>
>> s.commit()
>>
>> top_ten_dates = set(datetime.date(2017, 3, j) for j in range(10, 20))
>>
>>
>> def run_test(q):
>>     all_u = q.all()
>>     assert len(all_u) == 50
>>     for u in all_u:
>>
>>         messages = u.messages
>>         assert len(messages) == 10
>>
>>         for m in messages:
>>             assert m.user_id == u.id
>>
>>         received = set(m.date for m in messages)
>>
>>         assert received == top_ten_dates
>>
>> # version 1.   no LATERAL
>>
>> s.close()
>>
>> subq = s.query(Messages.date).\
>>     filter(Messages.user_id == User.id).\
>>     order_by(Messages.date.desc()).\
>>     limit(1).offset(10).correlate(User).as_scalar()
>>
>> q = s.query(User).join(
>>     Messages,
>>     and_(User.id == Messages.user_id, Messages.date > subq)
>> ).options(contains_eager(User.messages))
>>
>> run_test(q)
>>
>> # version 2.  LATERAL
>>
>> s.close()
>>
>> subq = s.query(Messages).\
>>     filter(Messages.user_id == User.id).\
>>     order_by(Messages.date.desc()).limit(10).subquery().lateral()
>>
>> q = s.query(User).outerjoin(subq).\
>>     options(contains_eager(User.messages, alias=subq))
>>
>> run_test(q)
>>
>>
>>
>>
>>
>>>
>>> SO post here: http://stackoverflow.com/quest
>>> ions/43727268/limit-child-collections-in-initial-query-sqlal
>>> chemy/43727705?noredirect=1#comment74499641_43727705
>>>
>>> --
>>> SQLAlchemy -
>>> The Python SQL Toolkit and Object Relational Mapper
>>>
>>> http://www.sqlalchemy.org/
>>>
>>> To post example code, please provide an MCVE: Minimal, Complete, and
>>> Verifiable Example. See http://stackoverflow.com/help/mcve for a full
>>> description.
>>> ---
>>> You received this message because you are subscribed to the Google
>>> Groups "sqlalchemy" group.
>>> To unsubscribe from this group and stop receiving emails from it, send
>>> an email to sqlalchemy+unsubscr...@googlegroups.com <mailto:
>>> sqlalchemy+unsubscr...@googlegroups.com>.
>>> To post to this group, send email to sqlalchemy@googlegroups.com
>>> <mailto:sqlalchemy@googlegroups.com>.
>>> Visit this group at https://groups.google.com/group/sqlalchemy.
>>> For more options, visit https://groups.google.com/d/optout.
>>>
>>
>> --
>> SQLAlchemy - The Python SQL Toolkit and Object Relational Mapper
>>
>> http://www.sqlalchemy.org/
>>
>> To post example code, please provide an MCVE: Minimal, Complete, and
>> Verifiable Example.  See  http://stackoverflow.com/help/mcve for a full
>> description.
>> --- You received this message because you are subscribed to a topic in
>> the Google Groups "sqlalchemy" group.
>> To unsubscribe from this topic, visit https://groups.google.com/d/to
>> pic/sqlalchemy/lOedjqu61G4/unsubscribe.
>> To unsubscribe from this group and all its topics, send an email to
>> sqlalchemy+unsubscr...@googlegroups.com.
>> To post to this group, send email to sqlalchemy@googlegroups.com.
>> Visit this group at https://groups.google.com/group/sqlalchemy.
>> For more options, visit https://groups.google.com/d/optout.
>>
>
>

-- 
SQLAlchemy - 
The Python SQL Toolkit and Object Relational Mapper

http://www.sqlalchemy.org/

To post example code, please provide an MCVE: Minimal, Complete, and Verifiable 
Example.  See  http://stackoverflow.com/help/mcve for a full description.
--- 
You received this message because you are subscribed to the Google Groups 
"sqlalchemy" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to sqlalchemy+unsubscr...@googlegroups.com.
To post to this group, send email to sqlalchemy@googlegroups.com.
Visit this group at https://groups.google.com/group/sqlalchemy.
For more options, visit https://groups.google.com/d/optout.

Reply via email to