On Sat, Sep 21, 2019, at 6:15 AM, YKdvd wrote: > I'm using the "marshmallow" serialization library to dump out SQLAlchemy > objects, and I'm trying to track down some slowness issues. The objects have > various relationships (using marshmallow's Nested fields in the mm schema), > and some of those have their own relationships, etc. As an example with round > numbers, I might have 50 objects. The SQLAlchemy query might take 0.01 > seconds with lazy loading, and 0.1 seconds with some obvious relationships > given eagerloading options on the query. > > The marshmallow dump might then take 5 seconds. But if I immediately make the > dump call again, it might take only 0.5 seconds - even with a new instance of > the marshmallow schema, which would seem to indicate it isn't purely poor > performance by marshmallow. I've been assuming the issue is SQLAlchemy > lazyloading triggers being fired off as marshmallow accesses the various > relationships in the 50 objects while dumping them - on the second dump, all > the triggers have been sprung, and there's no SQLAlchemy overhead as > marshmallow accesses the objects. > > I've tried adding some eagerloading options (joinedload or subqueryload) for > some of the relationships, and this does seem to slightly improve things for > the first dump, but nothing like the half second time the second one takes. > Is there any way to configure a query to tell it to eagerload all possible > relationships instead of specifying them individually, just so I know I > haven't missed any?
I wouldn't "assume" the problem is lazyloading until you can see what it is actually doing. first step is turn on SQL echoing, and look for queries being emitted when this serialization occurs. Then apply appropriate eagerloading to relationships manually, and when the "serialize" step emits no SQL then you know the loading is not the problem. I don't know how marshmallow works or what it does but I would not assume the performance problem is anywhere in particular until its understood. marshmallow could be caching the serialization or something like that. so after ensuring no lazy loads occur, I'd look into cProfile to see where the work is being done, see https://docs.sqlalchemy.org/en/13/faq/performance.html#code-profiling . at the very least it will tell you what is so different between the first and second serializations. > And is there any other one-time SQLAlchemy overhead other than lazyloading > that might cause such difference between marshmallow's first and second use > of the objects? there is one form of "lazy loading" which is for many-to-one relationships, which will often retrieve the target object from the identity map without emitting SQL. This is a fast operation but not as fast as when that attribute is already loaded. it wouldn't be a major factor unless there are a huge number of these going on. as always, the cProfile will show you where the work is happening, and if you don't know how to interpret it, you can send the actual profile dump (acquired by calling profile.dump_stats(filename), the result is a binary file ) on the mailing list here and I can explain where the time is being taken. > > > > > I'm currently using 1.18 (until recently we had to support Python 2.6), but I > briefly tried swapping in the current 1.3.x and the times seemed similar. > > > > -- > SQLAlchemy - > The Python SQL Toolkit and Object Relational Mapper > > http://www.sqlalchemy.org/ > > To post example code, please provide an MCVE: Minimal, Complete, and > Verifiable Example. See http://stackoverflow.com/help/mcve for a full > description. > --- > You received this message because you are subscribed to the Google Groups > "sqlalchemy" group. > To unsubscribe from this group and stop receiving emails from it, send an > email to sqlalchemy+unsubscr...@googlegroups.com. > To view this discussion on the web visit > https://groups.google.com/d/msgid/sqlalchemy/f116a9e8-aa8b-43f8-a3c6-5b1e91fc3b3c%40googlegroups.com > > <https://groups.google.com/d/msgid/sqlalchemy/f116a9e8-aa8b-43f8-a3c6-5b1e91fc3b3c%40googlegroups.com?utm_medium=email&utm_source=footer>. -- SQLAlchemy - The Python SQL Toolkit and Object Relational Mapper http://www.sqlalchemy.org/ To post example code, please provide an MCVE: Minimal, Complete, and Verifiable Example. See http://stackoverflow.com/help/mcve for a full description. --- You received this message because you are subscribed to the Google Groups "sqlalchemy" group. To unsubscribe from this group and stop receiving emails from it, send an email to sqlalchemy+unsubscr...@googlegroups.com. To view this discussion on the web visit https://groups.google.com/d/msgid/sqlalchemy/00fa7804-be91-46ef-8a5d-ae1f32db03ea%40www.fastmail.com.