On Wed, Jun 6, 2018 at 6:53 AM, Tom Flannaghan <tomflannag...@gmail.com> wrote:
> Hi all,
>
> We have just upgraded to sqlalchemy 1.2.7 (from 1.1.14), and had a
> performance issue with a query that uses a lot of joinedloads that was
> caused by the automatic baking of all relationship queries that was
> introduced in 1.2.
>
> Say we have a set of tables with relationships Book.pages, Page.font and
> Font.layout. We have a query of this form:
>
> pages = joinedload(Book.pages)
> option1 = pages.joinedload(Page.font)
> option2 = pages.joinedload(Page.layout)
>
> query = session().query(Book).options(option1, option2)
>
> The important point here is that the pages object defined on line 1 is
> reused in both option1 and option2. Now suppose we fetch another
> relationship that wasn't joined-loaded on the returned instances. This will
> case another query as it is not loaded already, and this query will be baked
> due to the change in 1.2 to bake all relationship loads.
>
> We found that the construction of the cache key for baking this query
> becomes very slow as the number of options of this form increases, and is in
> fact quadratic in the number of such options (we have ~25 such options in
> our problematic query). This is due to each option containing all of the
> joinedloads inside its _to_bind attribute, and
> _UnboundLoad._generate_cache_key has to process everything in the _to_bind
> list. E.g. in this example:
>
> print([[str(i) for i in load.path] for load in option1._to_bind])
> print([[str(i) for i in load.path] for load in option2._to_bind])
>
> [['Book.pages'], ['Book.pages', 'Page.font'], ['Book.pages', 'Page.layout']]
>
> [['Book.pages'], ['Book.pages', 'Page.font'], ['Book.pages', 'Page.layout']]
>
>
> Therefore, when generating the key for each option we are processing the
> joinedloads from all of the options, leading to the quadratic performance
> degradation.
>
> We fixed it by avoiding reusing the joinedload for Book.pages by doing this:
>
> option1 = joinedload(Book.pages).joinedload(Page.font)
> option2 = joinedload(Book.pages).joinedload(Page.layout)
>
> The resulting query is unchanged, but the cache key function is now just
> linear in the number of joinedloads as each option has only its
> relationships in its _to_bind attribute. In our case, this completely solved
> the performance issue.
>
> I'm not sure whether this behaviour is a bug or whether joinedloads aren't
> intended to be reused. If the latter, it would be great if they raised a
> warning if reused like this (and a mention of this issue in the docs).

I'd classify this as a bug which should be fixed.    I would need to
find time to dig into the structure of the options objects but it
seems as though if the options themselves copied out to a new option,
or the cache key could be smarter about what it's doing.   I don't
understand exactly what you've observed yet but it sounds like in the
re-use case it is spending a lot of time with redundant data that
isn't needed.  this is up at
https://bitbucket.org/zzzeek/sqlalchemy/issues/4270/cache-key-calc-for-loader-options-w-reuse


>
> Thanks,
> Tom
>
> --
> SQLAlchemy -
> The Python SQL Toolkit and Object Relational Mapper
>
> http://www.sqlalchemy.org/
>
> To post example code, please provide an MCVE: Minimal, Complete, and
> Verifiable Example. See http://stackoverflow.com/help/mcve for a full
> description.
> ---
> You received this message because you are subscribed to the Google Groups
> "sqlalchemy" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to sqlalchemy+unsubscr...@googlegroups.com.
> To post to this group, send email to sqlalchemy@googlegroups.com.
> Visit this group at https://groups.google.com/group/sqlalchemy.
> For more options, visit https://groups.google.com/d/optout.

-- 
SQLAlchemy - 
The Python SQL Toolkit and Object Relational Mapper

http://www.sqlalchemy.org/

To post example code, please provide an MCVE: Minimal, Complete, and Verifiable 
Example.  See  http://stackoverflow.com/help/mcve for a full description.
--- 
You received this message because you are subscribed to the Google Groups 
"sqlalchemy" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to sqlalchemy+unsubscr...@googlegroups.com.
To post to this group, send email to sqlalchemy@googlegroups.com.
Visit this group at https://groups.google.com/group/sqlalchemy.
For more options, visit https://groups.google.com/d/optout.

Reply via email to