Hey Mike, Thanks for the detailed response. I will try looking at my queries and post back soon. Also thanks for the heads up on sqlalchemy update :)
Nikunj On Saturday, January 14, 2017 at 9:09:11 AM UTC-8, Mike Bayer wrote: > > > On 01/14/2017 03:06 AM, 'Nikunj Yadav' via sqlalchemy wrote: > > Hey guys, > > I have a flask app using sqlalchemy. It is a read intensive app but > > writes are not frequent. > > > > SQLAlchemy==0.9.0 > > > > Going through the documentation I know about various APIs that I can > > use. But I am interested in knowing, assuming the dumbest setting that I > > could have done is it possible that sqlalchemy is keeping a lot of > > references in memory ? > > > > What is the best practice around this ? > > There are a few angles to the "running out of memory" issue. > > The most basic one is this: somewhere in your application, just once, > something did a SELECT that results in lots of rows, like, 100 thousand > or maybe even a million. Maybe you only fetched one, or ten, of those > rows, and you thought that this didn't use a lot of memory. However, > all the Python DBAPIs by default will load *all* the rows into memory > returned by that SELECT, before they give SQLAlchemy just the first row > at all. > > Once the DBAPI loads in a ton of rows, the Python interpreter has to > expand its memory footprint (e.g. what you'd see in top), and then it > never gets returned. The Python interpreter generally doesn't return > all memory to the OS once it allocates it, this is a complicated issue > that also changes with different Python versions, there's some good > discussion at > http://stackoverflow.com/questions/15455048/releasing-memory-in-python > on this as well as > > http://effbot.org/pyfaq/why-doesnt-python-release-the-memory-when-i-delete-a-large-object.htm, > > > which is an easier read but probably has outdated information in some > respects. > > Continuing on that angle, the "Python VM doesn't return all the memory > to the OS" issue is also a problem if your application does at some > point load in many thousands of ORM objects at once, and then throws > them away - ORM objects are a lot more heavyweight than a plain row from > a cursor, so if you at some point handle a large query in memory like > that, you'll blow up the memory used by the interpreter. The ORM Query > object, building on the assumption that the DBAPI generally loaded all > the rows in anyway, also doesn't give you back the first object until it > has processed the entire DBAPI rowset into ORM objects. You can alter > this behavior using the yield_per() API call: > > http://docs.sqlalchemy.org/en/latest/orm/query.html?highlight=yield_per#sqlalchemy.orm.query.Query.yield_per > > . You'll note it has a lot of caveats with eager loading. > > The best practice for the above issues is simply to never SELECT more > rows than you'd like to have in memory at once. This means if you are > paging through a million rows, use limiting / windowing / criteria to > ensure you only get a few hundred at a time. The naive approach to > this is to use LIMIT/OFFSET, however OFFSET has the issue that the > database has to scan through all the previous rows, limiting > performance. For a better way, you need to have WHERE criteria that > gets just a chunk of rows at a time - the recipe at > > https://bitbucket.org/zzzeek/sqlalchemy/wiki/UsageRecipes/WindowedRangeQuery > shows a technique I use often when doing reporting-style applications - > there is also the probably simpler approach detailed at > http://use-the-index-luke.com/no-offset ; make sure you are ordering, > and on each chunk, get a handle to something you can use to get the next > chunk. > > All of that said, there's another memory angle too, depending on what > you're getting at. Since you are on 0.9, there was a large pass made at > structural memory use in 1.0 which you might be interested in. This > change basically means lots of internal objects used by table metadata, > mappings, and events were made to use __slots__, to dramatically reduce > the memory footprint of a large set of mappings. This won't save you > from using up a lot of memory if you load a million rows, but if you > have a large application with hundreds of mappings, using 1.0 should > save you a significant chunk of memory when your application first > starts up and loads all of its mappings. > > In general, if you want to do memory profiling on your application I'd > recommend looking at heapy http://guppy-pe.sourceforge.net/ . It's > stylistically old school but it does the job. It's what I used to > target the best areas for the __slots__ overhaul in 1.0. > > > > > > > > > > > > > > > > > > > > > -- > > SQLAlchemy - > > The Python SQL Toolkit and Object Relational Mapper > > > > http://www.sqlalchemy.org/ > > > > To post example code, please provide an MCVE: Minimal, Complete, and > > Verifiable Example. See http://stackoverflow.com/help/mcve for a full > > description. > > --- > > You received this message because you are subscribed to the Google > > Groups "sqlalchemy" group. > > To unsubscribe from this group and stop receiving emails from it, send > > an email to sqlalchemy+...@googlegroups.com <javascript:> > > <mailto:sqlalchemy+unsubscr...@googlegroups.com <javascript:>>. > > To post to this group, send email to sqlal...@googlegroups.com > <javascript:> > > <mailto:sqlal...@googlegroups.com <javascript:>>. > > Visit this group at https://groups.google.com/group/sqlalchemy. > > For more options, visit https://groups.google.com/d/optout. > -- SQLAlchemy - The Python SQL Toolkit and Object Relational Mapper http://www.sqlalchemy.org/ To post example code, please provide an MCVE: Minimal, Complete, and Verifiable Example. See http://stackoverflow.com/help/mcve for a full description. --- You received this message because you are subscribed to the Google Groups "sqlalchemy" group. To unsubscribe from this group and stop receiving emails from it, send an email to sqlalchemy+unsubscr...@googlegroups.com. To post to this group, send email to sqlalchemy@googlegroups.com. Visit this group at https://groups.google.com/group/sqlalchemy. For more options, visit https://groups.google.com/d/optout.