On Wed, May 22, 2019, at 10:29 PM, bvbav wrote:
> Ok, so yes I see those queries for the two examples you posted.
> 
> For my original model (first post), with dynamic it takes about 6sec to load 
> ~70,000 amount of data, going by the first post then this means all residents 
> and pets from an address. Which I would like to speed up.
> 
> So I tried the eager load selectin (because I thought dynamic affects all() - 
> which you mentioned it does not), and I get the query from example 2. But 
> this one is slower because there are more queries - I used a profiler 
> (sqltap) and it shows the query count as being 3 times as much.

The "selectin" load also batches in groups of 500, because a very large "IN()" 
clause starts to make the text of the query too large and some databases like 
Oracle have a limit on the size of an IN() in any case. This is still vastly 
fewer queries than loading once per relationship per object. if you want no 
batching at all use "subquery" loading instead, however these queries will have 
a more complex execution plan which may or may not be slower.


> 
> So ideally, I'd like to somehow query/join to get those ~70,000 records 
> within a faster amount of time, rather than 6 secs. Is there a proper way to 
> do this with eager loading or perhaps another way?

you also need to determine where the time is actually being spent. It could be 
on the invocation of the query, in which case fewer queries would be better, it 
could be on the interpretation of the query (e.g. the plan), it could be for 
the client to fetch the rows over the network, it could be that much time taken 
up in Python to build the objects up which is also a slow process and is likely 
where most of the time is spent; to speed that up, you'd fetch tuples or rows 
instead of objects, but you then lose all the convenience of traversing through 
each relationship() when you get your rows back. There's a lot of detail on how 
to approach profiling at 
https://docs.sqlalchemy.org/en/13/faq/performance.html#how-can-i-profile-a-sqlalchemy-powered-application
 as well as demonstrations at 
https://docs.sqlalchemy.org/en/13/orm/examples.html#examples-performance .



> 
> 
> 
> On Wednesday, May 22, 2019 at 4:49:58 PM UTC-7, Mike Bayer wrote:
>> 
>> 
>> On Wed, May 22, 2019, at 6:44 PM, bvbav wrote:
>>> Thanks for replying, but your response doesn't really help. Perhaps my post 
>>> wasn't clear...
>> 
>> not really sure, I answered what I thought I was seeing, and having gone 
>> through your new response, I am still seeing the same thing being asked, so 
>> what we do is we start using code to answer the question rather than words. 
>> that way there's no ambiguity.
>> 
>>> 
>>> I'm aware the "dynamic" relationship returns a query, and not a collection. 
>>> It's mainly used for large collections and when someone wants to perform 
>>> filters, etc instead of getting the collection as a list.
>>> 
>>> Because of the nature of dynamic, when I do that join (re-posted here), it 
>>> emits a SELECT statement for every attribute in the relationship. Correct 
>>> me if I'm wrong on this. So the join query is slow for extremely large 
>>> amount.
>>> 
>>> db.session.query(Address, Resident, Pet).filter(Address.id == 
>>> 1).join(Address.residents).join(Resident.pets).all()
>> 
>> So above, is "residents" the relationship with lazy="dynamic" configured? 
>> Whether or not it is, "Address.residents" is at the class level. This 
>> returns a SQL expression object. It does not invoke any kind of dynamic 
>> query behavior at query time. That is, when the "all()" above has completed, 
>> there will be exactly one "SELECT" being emitted. If that is not what you 
>> are seeing, then there is something much more elaborate going on in your 
>> case like an event listener. See Example 1 at the end of the email which 
>> illustrates this.
>> 
>> Now when we go beyond the call to all(), and we have a whole list of objects 
>> that you now iterate through, and Address.residents is in fact the "dynamic" 
>> relationship, and you access it at the instance level, it will run a new 
>> Query all over again. This is the expected behavior if Address.residents is 
>> lazy="dynamic" and there is no way to change that except to not use 
>> lazy="dynamic".
>> 
>> So are we on the same page? Otherwise if you see something different we have 
>> to clear that part up first.
>> 
>> 
>> 
>> 
>>> Here I'm basically getting all residents and pets that live at an address.
>>> 
>>> Because it's slow, I thought switching to eager loading would work. I made 
>>> two relationships to keep the dynamic an option - shouldn't have included 
>>> as that caused confusion. So ignore "residents_dynamic" etc.
>>> 
>>> But with the "selectin" loading, it resulted in 3x the number of queries 
>>> while using that same join above. 
>> 
>> So when you say 3x the number of queries, I don't know what we are 
>> multiplying by three since the previous thing you describe is not what 
>> happens based on your description, "SELECT IN" does not emit a query per 
>> object withn the scope of the parent Query. You might be seeing 1. per 
>> object lazy (or dynamic?) loads that aren't actually "selectin" loads, or 2. 
>> "selectin"loads that are a secondary effect of each of the per-object loads 
>> in #1. In both cases the problem is the same, that it seems you are seeing 
>> per-object loads going on that aren't "selectin" loads.
>> 
>> Example 2 below illustrates what I think is the usage you are describing. At 
>> the end, even though there are 30 objects there are only four queries. The 
>> main one, then two SELECT IN queries for A->B->C, then another SELECT IN 
>> query that I think is due to B->C, because it queries for A and B at the top 
>> level, which is slightly more queries than we'd prefer but still nothing 
>> dramatic.
>> 
>> the way to get this worked out is to please provide a self-contained, short 
>> proof of concept that creates the most minimal mapping possible, installs 
>> the most minimal amount of test data possible into a sqlite:// memory 
>> database, then illustrates the query you are running (this is also known as 
>> an MCVE). Because I don't really know what you are seeing. Please confirm 
>> you are also on the SQLAlchemy 1.3 series.
>> 
>> 
>> 
>> 
>> Example 1 - dynamic relationship has no effect on all():
>> 
>> from sqlalchemy import *
>> from sqlalchemy.orm import *
>> from sqlalchemy.ext.declarative import declarative_base
>> from sqlalchemy.ext.declarative import declared_attr
>> 
>> Base = declarative_base()
>> 
>> 
>> class A(Base):
>>  __tablename__ = 'a'
>> 
>>  id = Column(Integer, primary_key=True)
>>  data = Column(String)
>>  bs = relationship("B", lazy="dynamic")
>> 
>> 
>> class B(Base):
>>  __tablename__ = 'b'
>>  id = Column(Integer, primary_key=True)
>>  a_id = Column(ForeignKey("a.id"))
>>  data = Column(String)
>>  cs = relationship("C", lazy="dynamic")
>> 
>> class C(Base):
>>  __tablename__ = 'c'
>>  id = Column(Integer, primary_key=True)
>>  b_id = Column(ForeignKey("b.id"))
>>  data = Column(String)
>> 
>> e = create_engine("sqlite://", echo=True)
>> Base.metadata.create_all(e)
>> 
>> s = Session(e)
>> 
>> s.add_all([
>>  A(bs=[B(cs=[C(), C()]), B(cs=[C(), C()]), B(cs=[C(), C()])]),
>>  A(bs=[B(cs=[C(), C()]), B(cs=[C(), C()]), B(cs=[C(), C()])]),
>>  A(bs=[B(cs=[C(), C()]), B(cs=[C(), C()]), B(cs=[C(), C()])]),
>> ])
>> s.commit()
>> s.close()
>> 
>> 
>> s.query(A, B, C).filter(A.id == 1).join(A.bs).join(B.cs).all()
>> 
>> 
>> single query at the end after the INSERTs:
>> 
>> SELECT a.id AS a_id, a.data AS a_data, b.id AS b_id, b.a_id AS b_a_id, 
>> b.data AS b_data, c.id AS c_id, c.b_id AS c_b_id, c.data AS c_data
>> FROM a JOIN b ON a.id = b.a_id JOIN c ON b.id = c.b_id
>> WHERE a.id = ?
>> 
>> 
>> 
>> Example 2: selectin is one query per class-level relationship, not per 
>> instance:
>> 
>> from sqlalchemy import *
>> from sqlalchemy.orm import *
>> from sqlalchemy.ext.declarative import declarative_base
>> from sqlalchemy.ext.declarative import declared_attr
>> 
>> Base = declarative_base()
>> 
>> 
>> class A(Base):
>>  __tablename__ = 'a'
>> 
>>  id = Column(Integer, primary_key=True)
>>  data = Column(String)
>>  bs_dyn = relationship("B", lazy="dynamic")
>>  bs = relationship("B", lazy="selectin")
>> 
>> class B(Base):
>>  __tablename__ = 'b'
>>  id = Column(Integer, primary_key=True)
>>  a_id = Column(ForeignKey("a.id"))
>>  data = Column(String)
>>  cs_dyn = relationship("C", lazy="dynamic")
>>  cs = relationship("C", lazy="selectin")
>> 
>> class C(Base):
>>  __tablename__ = 'c'
>>  id = Column(Integer, primary_key=True)
>>  b_id = Column(ForeignKey("b.id"))
>>  data = Column(String)
>> 
>> e = create_engine("sqlite://", echo=True)
>> Base.metadata.create_all(e)
>> 
>> s = Session(e)
>> 
>> s.add_all([
>>  A(bs=[B(cs=[C(), C()]), B(cs=[C(), C()]), B(cs=[C(), C()])]),
>>  A(bs=[B(cs=[C(), C()]), B(cs=[C(), C()]), B(cs=[C(), C()])]),
>>  A(bs=[B(cs=[C(), C()]), B(cs=[C(), C()]), B(cs=[C(), C()])]),
>> ])
>> s.commit()
>> s.close()
>> 
>> 
>> s.query(A, B, C).filter(A.id == 1).join(A.bs).join(B.cs).all()
>> 
>> 
>> queries at the end which are, the s.query(), the two SELECT IN for A.bs -> 
>> B.cs, then a SELECT IN for B->cs since it is specified at the top level as 
>> well:
>> 
>> SELECT a.id AS a_id, a.data AS a_data, b.id AS b_id, b.a_id AS b_a_id, 
>> b.data AS b_data, c.id AS c_id, c.b_id AS c_b_id, c.data AS c_data
>> FROM a JOIN b ON a.id = b.a_id JOIN c ON b.id = c.b_id
>> WHERE a.id = ?
>> 2019-05-22 19:39:25,156 INFO sqlalchemy.engine.base.Engine (1,)
>> 2019-05-22 19:39:25,160 INFO sqlalchemy.engine.base.Engine SELECT b.a_id AS 
>> b_a_id, b.id AS b_id, b.data AS b_data
>> FROM b
>> WHERE b.a_id IN (?) ORDER BY b.a_id
>> 2019-05-22 19:39:25,160 INFO sqlalchemy.engine.base.Engine (1,)
>> 2019-05-22 19:39:25,161 INFO sqlalchemy.engine.base.Engine SELECT c.b_id AS 
>> c_b_id, c.id AS c_id, c.data AS c_data
>> FROM c
>> WHERE c.b_id IN (?, ?, ?) ORDER BY c.b_id
>> 2019-05-22 19:39:25,162 INFO sqlalchemy.engine.base.Engine (1, 2, 3)
>> 2019-05-22 19:39:25,163 INFO sqlalchemy.engine.base.Engine SELECT c.b_id AS 
>> c_b_id, c.id AS c_id, c.data AS c_data
>> FROM c
>> WHERE c.b_id IN (?, ?, ?) ORDER BY c.b_id
>> 2019-05-22 19:39:25,164 INFO sqlalchemy.engine.base.Engine (1, 2, 3)
>> 
>> 
>> 
>> 
>> 
>> 
>>> 
>>> 
>>> 
>>> 
>>> So I was wondering if anyone knows of a way to do that query so it is fast. 
>>> Perhaps the 3x number of queries is because it eager loads 
>>> Address.residents and Resident.pets, then since I'm also joining, it adds 
>>> onto that? Or perhaps I need to change my query to work with "selectin"? 
>>> 
>>> 
>>> On Wednesday, May 22, 2019 at 1:33:11 PM UTC-7, Mike Bayer wrote:
>>>> 
>>>> 
>>>> On Wed, May 22, 2019, at 3:56 PM, bvbav wrote:
>>>>> Hi
>>>>> 
>>>>> I was wondering if I can get some advice on this...
>>>>> 
>>>>> Let's say I have these models:
>>>>> 
>>>>> class Pet(db.Model):
>>>>>  id = db.Column(..., primary_key=True)
>>>>>  name = db.Column(...)
>>>>> 
>>>>> 
>>>>> resident_pet = db.Table('resident_pet', db.Column('resident_id', 
>>>>> db.Integer, db.ForeignKey('resident.id'), primary_key=True), db.Column
>>>>> `('pet_id', db.Integer, db.ForeignKey('pet.id'), primary_key=True)
>>>>> 
>>>>> class Resident(db.Model):`
>>>>> `` id = db.Column(..., primary_key=True)
>>>>>  name = db.Column(...)``
>>>>> `````` pets``` = db.relationship('Resident', secondary=resident_pet, 
>>>>> lazy='dynamic')```
>>>>> ``````resident_address``` = db.Table('`````````resident_address```', 
>>>>> db.Column('resident_id', db.Integer, db.ForeignKey('resident.id'), 
>>>>> primary_key=True), db.Column
>>>>> `('address_id', db.Integer, db.ForeignKey('address.id'), primary_key=True)
>>>>> 
>>>>> 
>>>>> class Address(db.Model):
>>>>>  ...
>>>>>  residents = ````````````````db.relationship('Resident', 
>>>>> secondary=resident_address, lazy='dynamic')``````````
>>>>> 
>>>>> Now I want to query tables Address, Resident, Pet like this:
>>>>> 
>>>>> db.session.query(Address, Resident, Pet).filter(Address.id == 
>>>>> 1).join(Address.residents).join(Resident.pets).all()
>>>>> 
>>>>> Now, let's say hypothetically an address (represents a home) can have a 
>>>>> large number of residents (e.g 10,000+), and each resident can have the 
>>>>> same number of pets (e.g 10,000+). This is the reason why I used 
>>>>> lazy='dynamic', so I can do quick filters etc. But this causes a problem 
>>>>> when I do joins like the one above. Because with the dynamic loading, 
>>>>> individual SELECT queries will be emitted for every pet for every 
>>>>> resident which can result in a low performance? Is my assumption correct?
>>>>> 
>>>>> I thought about switching to 'selectin' while keeping 'dynamic', so there 
>>>>> would be two relationships
>>>>> 
>>>>> `````class Address(db.Model):
>>>>>  ...`````
>>>>> ``````````` residents = ````````````````db.relationship('Resident', 
>>>>> secondary=resident_address, lazy='selectin')
```````````` residents_dynamic = ````````````````db.relationship('Resident', 
secondary=resident_address, lazy='dynamic')```````````
>>>>> 
>>>>> However, doing this (while keeping the same JOIN query above, results in 
>>>>> a lot more queries and take a longer amount of time to finish. I also 
>>>>> figured since I'm querying everything, there's no point in eager loading 
>>>>> Resident.pets, because it's not like I'm getting one instance of Resident 
>>>>> and expect to have the pets collection already there.
>>>>> 
>>>>> How can I get better performance for this kind of join?
>>>> 
>>>> the mechanism of "dynamic" loading is that your ORM instance contains a 
>>>> link to a special Query object, and it does not store any Python list of 
>>>> items. This is entirely different from what relationship does in all other 
>>>> cases, which is that your ORM instance has a Python list, set, or other 
>>>> kind of collection that simply stores the related objects in memory once 
>>>> loaded. So it is not possible for a single relationship to perfom in both 
>>>> a "dynamic" as well as a fixed "eager load" style of use.
>>>> 
>>>> Therefore, for normal use of your objects when you don't need to add 
>>>> additional filter criteria, you should use the "residents" relatoinship 
>>>> that does not use "dynamic" loading. Only when you need to make a special 
>>>> SQL query that is essentially Address.residents + extra SQL filtering, 
>>>> would the "residents_dynamic" relationship be used, as it by definition 
>>>> has to query the database on each use (else it wouldn't be "dynamic"). 
>>>> hope this helps.
>>>> 
>>>> 
>>>>> 

>>>>> --
>>>>> SQLAlchemy - 
>>>>> The Python SQL Toolkit and Object Relational Mapper
>>>>> 
>>>>> http://www.sqlalchemy.org/
>>>>> 
>>>>> To post example code, please provide an MCVE: Minimal, Complete, and 
>>>>> Verifiable Example. See http://stackoverflow.com/help/mcve for a full 
>>>>> description.
>>>>> --- 
>>>>> You received this message because you are subscribed to the Google Groups 
>>>>> "sqlalchemy" group.
>>>>> To unsubscribe from this group and stop receiving emails from it, send an 
>>>>> email to sqlal...@googlegroups.com.
>>>>> To post to this group, send email to sqlal...@googlegroups.com.
>>>>> Visit this group at https://groups.google.com/group/sqlalchemy.
>>>>> To view this discussion on the web visit 
>>>>> https://groups.google.com/d/msgid/sqlalchemy/ab7a84b6-225a-40ec-8615-1ecfe7cf4ce9%40googlegroups.com
>>>>>  
>>>>> <https://groups.google.com/d/msgid/sqlalchemy/ab7a84b6-225a-40ec-8615-1ecfe7cf4ce9%40googlegroups.com?utm_medium=email&utm_source=footer>.
>>>>> For more options, visit https://groups.google.com/d/optout.
>>>> 
>>> 

>>> --
>>> SQLAlchemy - 
>>> The Python SQL Toolkit and Object Relational Mapper
>>> 
>>> http://www.sqlalchemy.org/
>>> 
>>> To post example code, please provide an MCVE: Minimal, Complete, and 
>>> Verifiable Example. See http://stackoverflow.com/help/mcve for a full 
>>> description.
>>> --- 
>>> You received this message because you are subscribed to the Google Groups 
>>> "sqlalchemy" group.
>>> To unsubscribe from this group and stop receiving emails from it, send an 
>>> email to sqlal...@googlegroups.com.
>>> To post to this group, send email to sqlal...@googlegroups.com.
>>> Visit this group at https://groups.google.com/group/sqlalchemy.
>>> To view this discussion on the web visit 
>>> https://groups.google.com/d/msgid/sqlalchemy/cd7b05b0-97ee-49e6-8159-2bd6aff34763%40googlegroups.com
>>>  
>>> <https://groups.google.com/d/msgid/sqlalchemy/cd7b05b0-97ee-49e6-8159-2bd6aff34763%40googlegroups.com?utm_medium=email&utm_source=footer>.
>>> For more options, visit https://groups.google.com/d/optout.
>> 
> 

> --
>  SQLAlchemy - 
>  The Python SQL Toolkit and Object Relational Mapper
> 
> http://www.sqlalchemy.org/
> 
>  To post example code, please provide an MCVE: Minimal, Complete, and 
> Verifiable Example. See http://stackoverflow.com/help/mcve for a full 
> description.
>  --- 
>  You received this message because you are subscribed to the Google Groups 
> "sqlalchemy" group.
>  To unsubscribe from this group and stop receiving emails from it, send an 
> email to sqlalchemy+unsubscr...@googlegroups.com.
>  To post to this group, send email to sqlalchemy@googlegroups.com.
>  Visit this group at https://groups.google.com/group/sqlalchemy.
>  To view this discussion on the web visit 
> https://groups.google.com/d/msgid/sqlalchemy/9824f5b0-b4f2-463a-b6ff-256aef5f6b49%40googlegroups.com
>  
> <https://groups.google.com/d/msgid/sqlalchemy/9824f5b0-b4f2-463a-b6ff-256aef5f6b49%40googlegroups.com?utm_medium=email&utm_source=footer>.
>  For more options, visit https://groups.google.com/d/optout.

-- 
SQLAlchemy - 
The Python SQL Toolkit and Object Relational Mapper

http://www.sqlalchemy.org/

To post example code, please provide an MCVE: Minimal, Complete, and Verifiable 
Example.  See  http://stackoverflow.com/help/mcve for a full description.
--- 
You received this message because you are subscribed to the Google Groups 
"sqlalchemy" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to sqlalchemy+unsubscr...@googlegroups.com.
To post to this group, send email to sqlalchemy@googlegroups.com.
Visit this group at https://groups.google.com/group/sqlalchemy.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/sqlalchemy/36eb9f1c-9bee-4bc7-b824-15c5f59d7002%40www.fastmail.com.
For more options, visit https://groups.google.com/d/optout.

Reply via email to