Re: [sqlalchemy] How to use load_only or with_entities with relationships

Mike Bayer Thu, 10 Jan 2019 07:19:14 -0800

On Thu, Jan 10, 2019 at 8:34 AM Gmoney <grgg...@gmail.com> wrote:
>
> Thanks for the very clear reply, that helps understand the underpinnings a 
> bit more.  Hope you didn't take the list of errors as a complaint - just a 
> list of what I tried... but the error message enhancements look like  a good 
> addition that will help when I forget all this and repeat it a year from now.


the load_only() stuff is recent and I absolutely felt your
frustration, I very often have very similar sequences where I'm trying
to get the ORM to do something intuitive and every single thing I try
doesn't do it - this is because after 13 years of development I don't
have a memory of every use case either, so finding these patches of
looks intuitive and raises an inscrutable error message are bugs that
need to be fixed.


>
> As you said, I was relying on the rendered SQL to debug, so that part makes 
> sense now too, and It is indeed working with the defaultload.
>
> The part that still confuses me a bit is that if do the default and select 
> all fields (including relationships) and don't do any specific deferrals, the 
> ORM knows what to do for joins etc - a simple Task.query() works.

this may be because you have lazy="joined" on your relationship, it
will automatically fetch Task.comments in the same query.

> Is there something about the deferral/downselecting that breaks that ability 
> and requires the more involved syntax specifying the secondary joins/loads 
> for the relationships?

when you give Query instructions about columns, it always needs to
know where in the "path" those columns are present.   you could have
multiple relationships from Task that join to Comment:

class Task:
    new_comments = relationship(Comment, primaryjoin=task.id ==
comment.new_task_id)
    old_comments = relationship(Comment, primaryjoin=task.id ==
comment.old_task_id)

So above, if you wanted to only load the columns on Task.new_comments
and not Task.old_comments, that's why this explicitness is necessary.
  I could above say:

   query(Task).options(
        defaultload(Task.new_comments).load_only(Comment.authorname),
       defaultload(Task.old_comments).load_only(Comment.date))

the two paths are:

Task / new_comments / Comment / authorname
Task / old_comments / Comment / date

The same entities can be present in multiple places in an entity query.



> I was hoping to make this part very simple for the people using the Model 
> where they could simply say - Task.query().load_only(comments)  (or 
> with_entities, or....  something of similar complexity) without needing to 
> understand the underpinnings of the model.

I would consider that load_only() is itself a SQL performance
optimization so it's hard to expect users to know how to apply
performance optimizations to a query while simultaneously not having
knowledge of the structure of the tables they want to optimize.

>
> Thanks again for the help
>
>
> On Wednesday, January 9, 2019 at 12:38:13 AM UTC-5, Mike Bayer wrote:
>>
>> every case is now listed out for improved error reporting in:
>>
>> https://github.com/sqlalchemy/sqlalchemy/issues/4433
>>
>>
>> On Wed, Jan 9, 2019 at 12:21 AM Mike Bayer <mik...@zzzcomputing.com> wrote:
>> >
>> > On Tue, Jan 8, 2019 at 4:16 PM Gmoney <grg...@gmail.com> wrote:
>> > >
>> > > Unable to limit the columns returned from a query using .load_only or 
>> > > .with_entities.  I feel like I'm missing a key concept here, and could 
>> > > use some guidance.
>> > >
>> > > One column of interest is a relationship - which I'm guessing is my 
>> > > problem.
>> > > class Task(Base):
>> > >     __tablename__ = 'TASK'
>> > >     id = Column('ID', VARCHAR2(250), primary_key=True)
>> > >     subject = Column('SUBJECT', VARCHAR2(250))
>> > >     comments =relationship('Comment', back_populates='task', 
>> > > lazy='joined')
>> > >
>> > > class Comment(Base):
>> > >     __tablename__ = 'COMMENT'
>> > >     id = Column('ID', Integer, primary_key=True)
>> > >     msg_id = Column('MSG_ID', VARCHAR2(250), ForeignKey(Task.id))
>> > >     comment_given_by = Column('COMMENT_GIVEN_BY', VARCHAR2(250))
>> > >     task = relationship('Task', back_populates='comments', lazy='joined')
>> > >
>> > > Running the query without any attempt to limit columns works fine (ie. 
>> > > the Classes/models seem OK)
>> > > query = session.query(Task)
>> > >
>> > > But now I tried .with_entities.
>> > > If you ONLY request actual Columns, and not relationships, it lists the 
>> > > columns and tables in the select, but does not add the ON clause 
>> > > (cartesian join)
>> > > If I request the relationship, it generates invalid SQL:
>> > > query = session.query(Task).with_entities(Task.id, Task.comments)
>> > >     # SELECT
>> > >     # "TASK"."ID",
>> > >     # "TASK"."ID" = "COMMENT"."MSG_ID" AS comments
>> > >     # FROM
>> > >     # "TASK",
>> > >     # "COMMENT"
>> > >
>> > >
>> > > So trying with .load_only:
>> > > No luck many ways, all failing... every random thing I try results in a 
>> > > different exception - I feel like I'm getting farther and just guessing 
>> > > now
>> >
>> > The documentation for what you are trying to do is at:
>> >
>> > https://docs.sqlalchemy.org/en/latest/orm/loading_columns.html#deferred-loading-with-multiple-entities
>> >
>> > When there are multiple entities, the Query never jumps between two
>> > entities without you specifying how you get there.    in this case,
>> > the way you navigate between them is via Task.comments, so that
>> > navigation must be stated:
>> >
>> > session.query(Task).options(load_only(Task.id),
>> > defaultload(Task.comments).load_only(Comment.comment_given_by))
>> >
>> > You in fact did pretty much the same query at the end of your attempts
>> > and I think you probably relied upon print(query) to determine if it
>> > was "working" or not, because your query with subqueryload() works,
>> > you just would not be able to see this by print(query) alone because
>> > subqueryload() emits a second query only when the objects are loaded.
>> >
>> >
>> > >
>> > > query = session.query(Task, Comment).options(load_only(Task.id, 
>> > > Comment.comment_given_by))
>> > > ---->sqlalchemy.exc.ArgumentError: Wildcard loader can only be used with 
>> > > exactly one entity.  Use Load(ent) to specify specific entities.
>> >
>> > load_only means, defer('*') + undefer(Task.id) +
>> > undefer(Comment.comment_given_by).     But the wildcard doesn't know
>> > how to do Task, Comment together, you would need to say
>> > Load(Task).load_only(Task.id),
>> > Load(Comment).load_only(Comment.comment_given_by), but again this is
>> > not what you want because you are asking for a cartesian product in
>> > the above query.   Error message is not great here because it comes
>> > from the defer('*') that you can't see happening here, so that should
>> > be improved.
>> >
>> >
>> >
>> > > query = session.query(Task).options(load_only('id', 'comments'))
>> > > ----> Exception: can't locate strategy for <class 
>> > > 'sqlalchemy.orm.relationships.RelationshipProperty'> (('deferred', 
>> > > False), ('instrument', True))
>> >
>> > load_only() does not apply to relationships like "comments', only to
>> > columns.   again not a great error message here, that could be
>> > improved.
>> >
>> >
>> > >
>> > > query = session.query(Task).options(load_only(Task.id, 
>> > > Comment.comment_given_by))
>> > > ----> sqlalchemy.exc.ArgumentError: Can't find property 
>> > > 'comment_given_by' on any entity specified in this Query.  Note the full 
>> > > path from root (Mapper|Task|TASK) to target entity must be specified.
>> >
>> > Task has nothing to do with Comment unless you state this in terms of
>> > Task.comments, which is what "full path" means
>> >
>> >
>> > >
>> > > query = session.query(Task, Comment).options(load_only(Task.id, 
>> > > Comment.comment_given_by))
>> > > ----> sqlalchemy.exc.ArgumentError: Wildcard loader can only be used 
>> > > with exactly one entity.  Use Load(ent) to specify specific entities.
>> >
>> > you already tried this one above
>> >
>> >
>> > >
>> > > query = session.query(Task).options(load_only(Task.id, Task.comments))
>> > > ----> Exception: can't locate strategy for <class 
>> > > 'sqlalchemy.orm.relationships.RelationshipProperty'> (('deferred', 
>> > > False), ('instrument', True))
>> >
>> > this is the same thing as the other one you did, string attribute
>> > names or Object.attrname, same thing
>> >
>> > >
>> > > query = session.query(Task).options(load_only(Task.comments), 
>> > > subqueryload("comments").load_only('comment_given_by'))
>> > > ----> No exception but only generates:  "SELECT "TASK"."ID" FROM "TASK"
>> >
>> > I'm not sure you tested this correctly, this one actually works - the
>> > subqueryload("comments") cancels out the load_only(Task.comments) that
>> > would normally raise an error.  Subqueryload is emitted as a second
>> > query, so if you just said print(query), you would not have seen it.
>> >
>> > Here is what it emits if I run on a model like this:
>> >
>> > s.query(A).options(load_only(A.bs), 
>> > subqueryload(A.bs).load_only(B.data)).all()
>> >
>> > output:
>> >
>> > SELECT a.id AS a_id
>> > FROM a
>> >
>> > SELECT b.id AS b_id, b.data AS b_data, anon_1.a_id AS anon_1_a_id
>> > FROM (SELECT a.id AS a_id
>> > FROM a) AS anon_1 JOIN b ON anon_1.a_id = b.a_id ORDER BY anon_1.a_id
>> >
>> >
>> >
>> > >
>> > > --
>> > > SQLAlchemy -
>> > > The Python SQL Toolkit and Object Relational Mapper
>> > >
>> > > http://www.sqlalchemy.org/
>> > >
>> > > To post example code, please provide an MCVE: Minimal, Complete, and 
>> > > Verifiable Example. See http://stackoverflow.com/help/mcve for a full 
>> > > description.
>> > > ---
>> > > You received this message because you are subscribed to the Google 
>> > > Groups "sqlalchemy" group.
>> > > To unsubscribe from this group and stop receiving emails from it, send 
>> > > an email to sqlalchemy+...@googlegroups.com.
>> > > To post to this group, send email to sqlal...@googlegroups.com.
>> > > Visit this group at https://groups.google.com/group/sqlalchemy.
>> > > For more options, visit https://groups.google.com/d/optout.
>
> --
> SQLAlchemy -
> The Python SQL Toolkit and Object Relational Mapper
>
> http://www.sqlalchemy.org/
>
> To post example code, please provide an MCVE: Minimal, Complete, and 
> Verifiable Example. See http://stackoverflow.com/help/mcve for a full 
> description.
> ---
> You received this message because you are subscribed to the Google Groups 
> "sqlalchemy" group.
> To unsubscribe from this group and stop receiving emails from it, send an 
> email to sqlalchemy+unsubscr...@googlegroups.com.
> To post to this group, send email to sqlalchemy@googlegroups.com.
> Visit this group at https://groups.google.com/group/sqlalchemy.
> For more options, visit https://groups.google.com/d/optout.

-- 
SQLAlchemy - 
The Python SQL Toolkit and Object Relational Mapper

http://www.sqlalchemy.org/

To post example code, please provide an MCVE: Minimal, Complete, and Verifiable 
Example.  See  http://stackoverflow.com/help/mcve for a full description.
--- 
You received this message because you are subscribed to the Google Groups 
"sqlalchemy" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to sqlalchemy+unsubscr...@googlegroups.com.
To post to this group, send email to sqlalchemy@googlegroups.com.
Visit this group at https://groups.google.com/group/sqlalchemy.
For more options, visit https://groups.google.com/d/optout.

Re: [sqlalchemy] How to use load_only or with_entities with relationships

Reply via email to