On Feb 6, 2014, at 12:34 PM, Rick Otten <rottenwindf...@gmail.com> wrote:

> Hello,
> 
> I'm using SQLAlchemy 0.8.4 with PostgreSQL 9.3.  I have a situation where I 
> need to run the same set of queries millions of times in one session.  I 
> think it would be very helpful from a performance perspective if I could 
> prepare my query ahead of time.  (The data under the queries is not changing 
> significantly during a run.)
> 
> Is this possible?  Is SQLAlchemy already doing it behind the scenes for me 
> magically?

The Python DBAPI (see http://www.python.org/dev/peps/pep-0249/) doesn’t have an 
explicit “prepared statements” construct.  What it does have is the 
“executemany()” method, which passes a statement and a list of parameter sets 
to the DBAPI, which can then decide to use prepared statements, or whatever 
other means it prefers, to optimize the operation.

SQLAlchemy documents the use of executemany() most fully in the Core tutorial:

http://docs.sqlalchemy.org/en/rel_0_9/core/tutorial.html#executing-multiple-statements
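
For example, with a Core table, passing a list of parameter dictionaries to 
execute() invokes the DBAPI’s executemany() under the hood.  A minimal sketch; 
the table definition and connection URL here are illustrative:

    from sqlalchemy import create_engine, MetaData, Table, Column, Integer, String

    engine = create_engine("postgresql://scott:tiger@localhost/test")
    metadata = MetaData()
    users = Table("users", metadata,
        Column("id", Integer, primary_key=True),
        Column("name", String(50)),
    )
    metadata.create_all(engine)

    conn = engine.connect()
    # a list of parameter dictionaries triggers executemany()
    conn.execute(users.insert(), [
        {"name": "ed"},
        {"name": "wendy"},
        {"name": "mary"},
    ])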


> I was imagining/hoping I'd find something like this:
> 
>    # prepare the query:
>    myPreparedQuery = mySession.query(stuff).filter(parameter 
> definitions).prepare()
> 
>    # run the query whenever I need it during my session:
>    myPreparedQuery.parameters(stuff).fetchall()

Prepared statements don’t really apply much to SELECT statements; the 
performance gains there are marginal, as you typically invoke a particular 
SELECT just once within a transaction, and prepared statements don’t 
necessarily carry across transaction or connection boundaries on all backends.
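
That said, if you want to experiment with server-side prepared statements on 
PostgreSQL anyway, you can drop down to the raw DBAPI connection and issue 
PREPARE/EXECUTE yourself.  A minimal sketch, assuming psycopg2; the statement 
name and query are hypothetical, and note the prepared statement only lives 
for the duration of that database session:

    from sqlalchemy import create_engine

    engine = create_engine("postgresql://scott:tiger@localhost/test")
    conn = engine.raw_connection()
    try:
        cursor = conn.cursor()
        # PREPARE is scoped to this database session only
        cursor.execute(
            "PREPARE my_select (integer) AS "
            "SELECT * FROM users WHERE id = $1")
        for user_id in (1, 2, 3):
            cursor.execute("EXECUTE my_select (%s)", (user_id,))
            rows = cursor.fetchall()
        cursor.execute("DEALLOCATE my_select")
        conn.commit()
    finally:
        conn.close()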

There are various performance concerns related to SELECT queries, and solving 
them depends very much on identifying where a particular performance issue 
resides.  There’s overhead on the Python/SQLAlchemy side in formulating the 
statement string; there’s overhead in passing it to the database to be parsed 
(by far the smallest part); there’s overhead in retrieving raw rows and 
columns; and there’s Python/SQLAlchemy overhead in marshaling those rows into 
Python objects, particularly ORM objects, which are relatively very expensive.

For each of these areas of overhead there are different strategies, some more 
exotic than others, for minimizing it, and you’d want to dig into each one 
individually after doing some profiling.  For background on profiling, see 
http://stackoverflow.com/questions/1171166/how-can-i-profile-a-sqlalchemy-powered-application/1175677#1175677 .
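
At its simplest, wrapping the hot section in cProfile along the lines of that 
answer looks roughly like this (run_my_queries() is a hypothetical stand-in 
for your own query loop):

    import cProfile
    import pstats

    pr = cProfile.Profile()
    pr.enable()
    run_my_queries()   # hypothetical function containing the hot loop
    pr.disable()
    # show the 25 most expensive calls, sorted by cumulative time
    pstats.Stats(pr).sort_stats("cumulative").print_stats(25)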

I also have a writeup on the performance differences as one moves between Core 
and ORM, as well as between executemany() and single execute() calls, which is 
here: 
http://stackoverflow.com/questions/11769366/why-is-sqlalchemy-insert-with-sqlite-25-times-slower-than-using-sqlite3-directly/11769768#11769768 .
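
The short version of that comparison, as a hedged sketch (the mapping, row 
counts, and connection URL are illustrative):

    from sqlalchemy import Column, Integer, String, create_engine
    from sqlalchemy.ext.declarative import declarative_base
    from sqlalchemy.orm import Session

    Base = declarative_base()

    class User(Base):
        __tablename__ = "users"
        id = Column(Integer, primary_key=True)
        name = Column(String(50))

    engine = create_engine("postgresql://scott:tiger@localhost/test")
    Base.metadata.create_all(engine)

    # ORM: every object goes through unit-of-work bookkeeping
    session = Session(engine)
    session.add_all([User(name="user %d" % i) for i in range(10000)])
    session.commit()

    # Core: one executemany() call, far less per-row Python overhead
    engine.execute(
        User.__table__.insert(),
        [{"name": "user %d" % i} for i in range(10000)],
    )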

