On Feb 6, 2014, at 12:34 PM, Rick Otten <rottenwindf...@gmail.com> wrote:
> Hello,
>
> I'm using SQLAlchemy 0.8.4 with PostgreSQL 9.3. I have a situation where I
> need to run the same set of queries millions of times in one session. I
> think it would be very helpful from a performance perspective if I could
> prepare my query ahead of time. (The data under the queries is not changing
> significantly during a run.)
>
> Is this possible? Is SQLAlchemy already doing it behind the scenes for me
> magically?

The Python DBAPI (see http://www.python.org/dev/peps/pep-0249/) doesn't have an explicit "prepared statements" construct. What it does have is the executemany() method, which passes a statement and a list of parameter sets to the DBAPI, which can then make the decision to use prepared statements or whatever other means it prefers to optimize the approach. SQLAlchemy documents the use of executemany() most fully in the Core tutorial:

http://docs.sqlalchemy.org/en/rel_0_9/core/tutorial.html#executing-multiple-statements

> I was imagining/hoping I'd find something like this:
>
> # prepare the query:
> myPreparedQuery = mySession.query(stuff).filter(parameter
> definitions).prepare()
>
> # run the query whenever I need it during my session:
> myPreparedQuery.parameters(stuff).fetchall()

Prepared statements don't really apply much to SELECT statements; the performance gains from them are marginal, since you typically invoke a particular SELECT just once within a transaction, and prepared statements don't necessarily carry across transaction or connection boundaries on all backends. There are various performance concerns related to SELECT queries, and solving them depends very much on identifying where a particular performance issue resides.
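To make the DBAPI-level construct concrete, here's a minimal sketch of executemany() using the stdlib sqlite3 driver (itself a PEP 249 DBAPI); the table and data are made up for illustration:

```python
import sqlite3

# an in-memory database for illustration
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")

# executemany() sends one statement plus many parameter sets;
# the driver is free to prepare the statement once and reuse it
# for every parameter set
conn.executemany(
    "INSERT INTO users (id, name) VALUES (?, ?)",
    [(1, "ed"), (2, "wendy"), (3, "mary")],
)
conn.commit()

count = conn.execute("SELECT COUNT(*) FROM users").fetchone()[0]
print(count)  # 3
```

At the SQLAlchemy Core level, the same pattern is `conn.execute(table.insert(), list_of_dicts)`, as shown in the tutorial section linked above.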
There's overhead on the side of Python/SQLAlchemy formulating the string statement; there's overhead in passing it to the database to be parsed (which is by far the smallest part); there's overhead in retrieving raw rows and columns; and there's Python/SQLAlchemy overhead in marshaling those rows into Python objects, particularly ORM objects, which are relatively very expensive. For each of these areas of overhead there are different strategies, some more exotic than others, for minimizing it, and you'd want to dig into each one individually after doing profiling.

For background on profiling, see http://stackoverflow.com/questions/1171166/how-can-i-profile-a-sqlalchemy-powered-application/1175677#1175677 . I also have a writeup on the performance differences as one moves between Core and ORM, as well as between executemany() and non, which is here: http://stackoverflow.com/questions/11769366/why-is-sqlalchemy-insert-with-sqlite-25-times-slower-than-using-sqlite3-directly/11769768#11769768 .
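As a starting point for the profiling step, here's a minimal sketch using the stdlib cProfile module, in the spirit of the first linked answer; marshal_rows() is a made-up stand-in for whatever part of your application you actually want to measure:

```python
import cProfile
import io
import pstats

def marshal_rows():
    # stand-in workload: convert raw tuples into dicts, loosely
    # simulating the row-marshaling overhead described above
    raw = [(i, "name%d" % i) for i in range(10000)]
    return [{"id": r[0], "name": r[1]} for r in raw]

profiler = cProfile.Profile()
profiler.enable()
result = marshal_rows()
profiler.disable()

# report the functions with the highest cumulative time
out = io.StringIO()
pstats.Stats(profiler, stream=out).sort_stats("cumulative").print_stats(5)
print(out.getvalue())
```

Running the real workload under the profiler this way shows which of the overhead areas above actually dominates, before you reach for any of the more exotic strategies.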