Re: [sqlalchemy] SqlAlchemy memory usage

'Nikunj Yadav' via sqlalchemy Tue, 17 Jan 2017 11:17:03 -0800

Hey Mike, 
Thanks for the detailed response. I will try looking at my queries and post 
back soon. Also thanks for the heads up on sqlalchemy update :)


Nikunj

On Saturday, January 14, 2017 at 9:09:11 AM UTC-8, Mike Bayer wrote:
>
>
> On 01/14/2017 03:06 AM, 'Nikunj Yadav' via sqlalchemy wrote: 
> > Hey guys, 
> > I have a flask app using sqlalchemy.  It is a read intensive app but 
> > writes are not frequent. 
> > 
> > SQLAlchemy==0.9.0 
> > 
> > Going through the documentation I know about various APIs that I can 
> > use. But I am interested in knowing, assuming the dumbest setting that I 
> > could have done is it possible that sqlalchemy is keeping a lot of 
> > references in memory ? 
> > 
> > What is the best practice around this ? 
>
> There are a few angles to the "running out of memory" issue. 
>
> The most basic one is this:  somewhere in your application, just once, 
> something did a SELECT that results in lots of rows, like, 100 thousand 
> or maybe even a million.   Maybe you only fetched one, or ten, of those 
> rows, and you thought that this didn't use a lot of memory.  However, 
> all the Python DBAPIs by default will load *all* the rows into memory 
> returned by that SELECT, before they give SQLAlchemy just the first row 
> at all. 
>
> Once the DBAPI loads in a ton of rows, the Python interpreter has to 
> expand its memory footprint (e.g. what you'd see in top), and then it 
> never gets returned.  The Python interpreter generally doesn't return 
> all memory to the OS once it allocates it, this is a complicated issue 
> that also changes with different Python versions, there's some good 
> discussion at 
> http://stackoverflow.com/questions/15455048/releasing-memory-in-python 
> on this as well as 
>
> http://effbot.org/pyfaq/why-doesnt-python-release-the-memory-when-i-delete-a-large-object.htm,
>  
>
> which is an easier read but probably has outdated information in some 
> respects. 
>
> Continuing on that angle, the "Python VM doesn't return all the memory 
> to the OS" issue is also a problem if your application does at some 
> point load in many thousands of ORM objects at once, and then throws 
> them away - ORM objects are a lot more heavyweight than a plain row from 
> a cursor, so if you at some point handle a large query in memory like 
> that, you'll blow up the memory used by the interpreter.   The ORM Query 
> object, building on the assumption that the DBAPI generally loaded all 
> the rows in anyway, also doesn't give you back the first object until it 
> has processed the entire DBAPI rowset into ORM objects.   You can alter 
> this behavior using the yield_per() API call: 
>
> http://docs.sqlalchemy.org/en/latest/orm/query.html?highlight=yield_per#sqlalchemy.orm.query.Query.yield_per
>  
> .  You'll note it has a lot of caveats with eager loading. 
>
> The best practice for the above issues is simply to never SELECT more 
> rows than you'd like to have in memory at once.  This means if you are 
> paging through a million rows, use limiting / windowing / criteria to 
> ensure you only get a few hundred at a time.    The naive approach to 
> this is to use LIMIT/OFFSET, however OFFSET has the issue that the 
> database has to scan through all the previous rows, limiting 
> performance.  For a better way, you need to have WHERE criteria that 
> gets just a chunk of rows at a time - the recipe at 
>
> https://bitbucket.org/zzzeek/sqlalchemy/wiki/UsageRecipes/WindowedRangeQuery 
> shows a technique I use often when doing reporting-style applications - 
> there is also the probably simpler approach detailed at 
> http://use-the-index-luke.com/no-offset ; make sure you are ordering, 
> and on each chunk, get a handle to something you can use to get the next 
> chunk. 
>
> All of that said, there's another memory angle too, depending on what 
> you're getting at.  Since you are on 0.9, there was a large pass made at 
> structural memory use in 1.0 which you might be interested in.  This 
> change basically means lots of internal objects used by table metadata, 
> mappings, and events were made to use __slots__, to dramatically reduce 
> the memory footprint of a large set of mappings.  This won't save you 
> from using up a lot of memory if you load a million rows, but if you 
> have a large application with hundreds of mappings, using 1.0 should 
> save you a significant chunk of memory when your application first 
> starts up and loads all of its mappings. 
>
> In general, if you want to do memory profiling on your application I'd 
> recommend looking at heapy http://guppy-pe.sourceforge.net/ .   It's 
> stylistically old school but it does the job.  It's what I used to 
> target the best areas for the __slots__ overhaul in 1.0. 
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
> > 
> > 
> > -- 
> > SQLAlchemy - 
> > The Python SQL Toolkit and Object Relational Mapper 
> > 
> > http://www.sqlalchemy.org/ 
> > 
> > To post example code, please provide an MCVE: Minimal, Complete, and 
> > Verifiable Example. See http://stackoverflow.com/help/mcve for a full 
> > description. 
> > --- 
> > You received this message because you are subscribed to the Google 
> > Groups "sqlalchemy" group. 
> > To unsubscribe from this group and stop receiving emails from it, send 
> > an email to sqlalchemy+...@googlegroups.com <javascript:> 
> > <mailto:sqlalchemy+unsubscr...@googlegroups.com <javascript:>>. 
> > To post to this group, send email to sqlal...@googlegroups.com 
> <javascript:> 
> > <mailto:sqlal...@googlegroups.com <javascript:>>. 
> > Visit this group at https://groups.google.com/group/sqlalchemy. 
> > For more options, visit https://groups.google.com/d/optout. 
>

-- 
SQLAlchemy - 
The Python SQL Toolkit and Object Relational Mapper

http://www.sqlalchemy.org/

To post example code, please provide an MCVE: Minimal, Complete, and Verifiable 
Example.  See  http://stackoverflow.com/help/mcve for a full description.
--- 
You received this message because you are subscribed to the Google Groups 
"sqlalchemy" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to sqlalchemy+unsubscr...@googlegroups.com.
To post to this group, send email to sqlalchemy@googlegroups.com.
Visit this group at https://groups.google.com/group/sqlalchemy.
For more options, visit https://groups.google.com/d/optout.

Re: [sqlalchemy] SqlAlchemy memory usage

Reply via email to