Re: [sqlalchemy] Marshmallow speed and lazy/eager loading

Mike Bayer Sat, 21 Sep 2019 14:10:09 -0700


On Sat, Sep 21, 2019, at 6:15 AM, YKdvd wrote:
> I'm using the "marshmallow" serialization library to dump out SQLAlchemy 
> objects, and I'm trying to track down some slowness issues. The objects have 
> various relationships (using marshmallow's Nested fields in the mm schema), 
> and some of those have their own relationships, etc. As an example with round 
> numbers, I might have 50 objects. The SQLAlchemy query might take 0.01 
> seconds with lazy loading, and 0.1 seconds with some obvious relationships 
> given eagerloading options on the query. 
> 
> The marshmallow dump might then take 5 seconds. But if I immediately make the 
> dump call again, it might take only 0.5 seconds - even with a new instance of 
> the marshmallow schema, which would seem to indicate it isn't purely poor 
> performance by marshmallow. I've been assuming the issue is SQLAlchemy 
> lazyloading triggers being fired off as marshmallow accesses the various 
> relationships in the 50 objects while dumping them - on the second dump, all 
> the triggers have been sprung, and there's no SQLAlchemy overhead as 
> marshmallow accesses the objects.
> 
> I've tried adding some eagerloading options (joinedload or subqueryload) for 
> some of the relationships, and this does seem to slightly improve things for 
> the first dump, but nothing like the half second time the second one takes. 
> Is there any way to configure a query to tell it to eagerload all possible 
> relationships instead of specifying them individually, just so I know I 
> haven't missed any?


I wouldn't "assume" the problem is lazyloading until you can see what it is 
actually doing. first step is turn on SQL echoing, and look for queries being 
emitted when this serialization occurs. Then apply appropriate eagerloading to 
relationships manually, and when the "serialize" step emits no SQL then you 
know the loading is not the problem. 

I don't know how marshmallow works or what it does but I would not assume the 
performance problem is anywhere in particular until its understood. marshmallow 
could be caching the serialization or something like that. so after ensuring no 
lazy loads occur, I'd look into cProfile to see where the work is being done, 
see https://docs.sqlalchemy.org/en/13/faq/performance.html#code-profiling . at 
the very least it will tell you what is so different between the first and 
second serializations.


> And is there any other one-time SQLAlchemy overhead other than lazyloading 
> that might cause such difference between marshmallow's first and second use 
> of the objects? 

there is one form of "lazy loading" which is for many-to-one relationships, 
which will often retrieve the target object from the identity map without 
emitting SQL. This is a fast operation but not as fast as when that attribute 
is already loaded. it wouldn't be a major factor unless there are a huge number 
of these going on. as always, the cProfile will show you where the work is 
happening, and if you don't know how to interpret it, you can send the actual 
profile dump (acquired by calling profile.dump_stats(filename), the result is a 
binary file ) on the mailing list here and I can explain where the time is 
being taken.



> 
> 
> 
> 
> I'm currently using 1.18 (until recently we had to support Python 2.6), but I 
> briefly tried swapping in the current 1.3.x and the times seemed similar.
> 
> 
> 

> --
>  SQLAlchemy - 
>  The Python SQL Toolkit and Object Relational Mapper
> 
> http://www.sqlalchemy.org/
> 
>  To post example code, please provide an MCVE: Minimal, Complete, and 
> Verifiable Example. See http://stackoverflow.com/help/mcve for a full 
> description.
>  --- 
>  You received this message because you are subscribed to the Google Groups 
> "sqlalchemy" group.
>  To unsubscribe from this group and stop receiving emails from it, send an 
> email to sqlalchemy+unsubscr...@googlegroups.com.
>  To view this discussion on the web visit 
> https://groups.google.com/d/msgid/sqlalchemy/f116a9e8-aa8b-43f8-a3c6-5b1e91fc3b3c%40googlegroups.com
>  
> <https://groups.google.com/d/msgid/sqlalchemy/f116a9e8-aa8b-43f8-a3c6-5b1e91fc3b3c%40googlegroups.com?utm_medium=email&utm_source=footer>.

-- 
SQLAlchemy - 
The Python SQL Toolkit and Object Relational Mapper

http://www.sqlalchemy.org/

To post example code, please provide an MCVE: Minimal, Complete, and Verifiable 
Example.  See  http://stackoverflow.com/help/mcve for a full description.
--- 
You received this message because you are subscribed to the Google Groups 
"sqlalchemy" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to sqlalchemy+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/sqlalchemy/00fa7804-be91-46ef-8a5d-ae1f32db03ea%40www.fastmail.com.

Re: [sqlalchemy] Marshmallow speed and lazy/eager loading

Reply via email to