Re: [sqlalchemy] Prepared Statements in Postgresql

2014-02-12 Thread Tony Locke
Hi, just to confirm, the executemany() method in pg8000 does use prepared 
statements.
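
For illustration, a minimal sketch of what that looks like from the driver side (the connection arguments and the table are made up):

    import pg8000

    conn = pg8000.connect(user="postgres", database="test")
    cur = conn.cursor()
    # pg8000 prepares this INSERT once on the server and re-executes it
    # for each parameter tuple.
    cur.executemany("INSERT INTO items (id, name) VALUES (%s, %s)",
                    ((1, "a"), (2, "b"), (3, "c")))
    conn.commit()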

Cheers,

Tony.

On Friday, 15 January 2010 17:16:09 UTC, Michael Bayer wrote:

 mozillalives wrote:
  Hello Everyone,
 
  I am new to both sqlalchemy and elixir, but I have been using them for
  the past couple of weeks and I really like them. But I have a question
  about prepared statements for Postgresql.
 
  For one specific application, I am doing a bunch of inserts
  (200,000+). From what I can tell, it looks like these are not prepared
  statements. I rewrote the code to issue prepared statements and this
  cuts the insertion time in half, but the code is crude. My questions
  are:

 how did you use prepared statements in Python if you don't know whether
 psycopg2 uses prepared statements?  was this in another language or did
 you implement a raw socket connection to your database?

 
  Is there a way to tell sqlalchemy or the engine (which would be
  psycopg2, correct?) to use prepared statements?

 to efficiently execute the same statement many times, use the
 executemany style of execution - the tutorial describes this at

 http://www.sqlalchemy.org/docs/05/sqlexpression.html#executing-multiple-statements
 .   I don't think that psycopg2 actually uses prepared statements for
 this purpose but I am not sure.  The DBAPI executemany() method is used.

 
  I've noticed some opinions online indicating that psycopg2 does not
  have prepared statement support (e.g. -
  
 http://www.cerias.purdue.edu/site/blog/post/beware_sql_injections_due_to_missing_prepared_statement_support/
 )

 the comment at the bottom of that post ultimately references a psycopg2
 message from 2007 so you'd need to ask the psycopg2 folks for updated
 information.  However psycopg2 can do an executemany with great
 efficiency as it is using methodologies for which you'd have to ask them,
 so if they don't use PG's actual prepared mechanism, it's probably
 unnecessary.  psycopg2 is an extremely mature and high-performing product.


  - can I plug another engine into sqlalchemy that does?

 there's the pg8000 engine which may or may not do this.  But it's written
 in pure Python, is not as fast as psycopg2, and is very new and not widely
 used since its author doesn't seem to promote it very much (but it is a
 very well written library).


 
  If I can't do any of the above and just need to prepare the statements
  manually, is there at least a method in sqlalchemy to properly quote
  my data before sending it to postgres?

 Despite some of the FUD-like links mentioned on that blog, SQLAlchemy, as
 it says on the website since the day we launched 5 years ago, always uses
 bind parameters, in all cases, for all literal values, everywhere.  We do
 not and have never quoted anything within SQLA as that is left up to the
 services provided by the DBAPI.   The DBAPI does not have a prepared
 statement API.  It has executemany(), for which the underlying
 implementation may or may not use prepared statements + server-level bind
 processing as an implementation detail.  Psycopg2 handles the quoting in
 this case.  cx_Oracle, OTOH, uses Oracle's native data binding facilities
 provided by OCI.  The DBAPI abstracts this detail away.

 
  Thanks,
  Phil
 
 
 





Re: [sqlalchemy] Prepared Statements in Postgresql

2014-02-12 Thread Claudio Freire
On Wed, Feb 12, 2014 at 5:03 PM, Tony Locke tlo...@tlocke.org.uk wrote:

 I've noticed some opinions online indicating that psycopg2 does not
 have prepared statement support (e.g. -

 http://www.cerias.purdue.edu/site/blog/post/beware_sql_injections_due_to_missing_prepared_statement_support/)

 the comment at the bottom of that post ultimately references a psycopg2
 message from 2007 so you'd need to ask the psycopg2 folks for updated
 information.  However psycopg2 can do an executemany with great
 efficiency as it is using methodologies for which you'd have to ask them,
 so if they don't use PG's actual prepared mechanism, its probably
 unnecessary.  psycopg2 is an extremely mature and high performing product.

What it doesn't support is libpq's wire protocol for prepared
statements. But you can prepare and execute statements by issuing the
corresponding SQL (that will use the wire protocol for SQL execution,
which is a tad less efficient but still more efficient than separate
queries).

psycopg2's executemany isn't much more sophisticated than multiple
separate queries since it internally does exactly that. It may be a
tad faster since it's done in C, but I doubt the difference is
significant.

But, if you want an executemany that works in psycopg2 as it would in
pg8000, you can PREPARE and then executemany the EXECUTE queries.
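
A rough sketch of that, with plain psycopg2 and a made-up table:

    import psycopg2

    conn = psycopg2.connect("dbname=test")
    cur = conn.cursor()

    # Prepare once per connection; $1/$2 are PostgreSQL's positional parameters.
    cur.execute("PREPARE bulk_ins (int, text) AS "
                "INSERT INTO items (id, name) VALUES ($1, $2)")

    # executemany() then only ships EXECUTE statements with fresh parameters.
    cur.executemany("EXECUTE bulk_ins (%s, %s)",
                    [(1, "a"), (2, "b"), (3, "c")])

    cur.execute("DEALLOCATE bulk_ins")   # free the server-side plan when done
    conn.commit()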



Re: [sqlalchemy] Prepared Statements in Postgresql

2014-02-12 Thread Michael Bayer

On Feb 12, 2014, at 4:07 PM, Claudio Freire klaussfre...@gmail.com wrote:

 
 But, if you want an executemany that works in psycopg2 as it would in
 pg8000, you can PREPARE and then executemany the EXECUTE queries.

I’ve worked a lot with pg8000, including contributing fairly broad 
architectural changes aimed at greater performance, but as a pure Python 
driver it still has dramatically more overhead than psycopg2 on the Python 
side unless you’re using PyPy.  So it’s something of a tossup whether 
“prepared + pure Python” or “non-prepared but very optimized C” is better in 
individual cases.





Re: [sqlalchemy] prepared statements

2014-02-06 Thread Michael Bayer

On Feb 6, 2014, at 12:34 PM, Rick Otten rottenwindf...@gmail.com wrote:

 Hello,
 
 I'm using SQLAlchemy 0.8.4 with PostgreSQL 9.3.  I have a situation where I 
 need to run the same set of queries millions of times in one session.  I 
 think it would be very helpful from a performance perspective if I could 
 prepare my query ahead of time.  (The data under the queries is not changing 
 significantly during a run.)
 
 Is this possible?  Is SQLAlchemy already doing it behind the scenes for me 
 magically?

the Python DBAPI (see http://www.python.org/dev/peps/pep-0249/) doesn’t have an 
explicit “prepared statements” construct.  what it does have is the 
“executemany()” command, which passes a statement and a list of parameter sets 
to the DBAPI, which can then make the decision to use prepared statements or 
whatever other means it prefers to optimize the approach.

SQLAlchemy documents the use of executemany() most fully in the Core tutorial:

http://docs.sqlalchemy.org/en/rel_0_9/core/tutorial.html#executing-multiple-statements
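
For reference, a small sketch of that style: handing a list of parameter sets to a single execute() call makes SQLAlchemy use cursor.executemany() underneath (the engine URL and table here are made up):

    from sqlalchemy import create_engine, MetaData, Table, Column, Integer, String

    engine = create_engine("postgresql+psycopg2://localhost/test")
    metadata = MetaData()
    items = Table("items", metadata,
                  Column("id", Integer, primary_key=True),
                  Column("name", String(50)))
    metadata.create_all(engine)

    conn = engine.connect()
    # A list of parameter sets triggers the DBAPI executemany() call,
    # letting the driver optimize however it sees fit.
    conn.execute(items.insert(),
                 [{"id": 1, "name": "a"},
                  {"id": 2, "name": "b"},
                  {"id": 3, "name": "c"}])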


 I was imagining/hoping I'd find something like this:
 
# prepare the query:
myPreparedQuery = mySession.query(stuff).filter(parameter 
 definitions).prepare()
 
# run the query whenever I need it during my session:
myPreparedQuery.parameters(stuff).fetchall()

prepared statements don’t really apply much to SELECT statements; the 
performance gains from such are marginal, as you typically invoke a particular 
SELECT just once within a transaction, and prepared statements don’t 
necessarily carry across transaction or connection boundaries on backends.

There are various performance concerns related to SELECT queries, and solving 
them depends much on identifying where a particular performance issue resides.  
There’s overhead on the side of Python/SQLAlchemy formulating the string 
statement, there’s overhead on the side of passing it to the database to be 
parsed (which is by far the most infinitesimal part), there’s overhead on 
retrieving raw rows and columns, and there’s Python/SQLAlchemy overhead on 
marshaling those rows into Python objects, particularly ORM objects, which 
are relatively very expensive.

For these areas of overhead there are different strategies, some more exotic 
than others, of minimizing overhead, and you’d want to dig into each one 
individually after doing profiling.  For background on profiling, see 
http://stackoverflow.com/questions/1171166/how-can-i-profile-a-sqlalchemy-powered-application/1175677#1175677
 .  

I also have a writeup on the performance differences as one moves between core 
and ORM as well as between executemany() and non, which is here: 
http://stackoverflow.com/questions/11769366/why-is-sqlalchemy-insert-with-sqlite-25-times-slower-than-using-sqlite3-directly/11769768#11769768
 .






Re: [sqlalchemy] prepared statements

2014-02-06 Thread Michael Bayer

On Feb 6, 2014, at 12:34 PM, Rick Otten rottenwindf...@gmail.com wrote:

 Hello,
 
 I'm using SQLAlchemy 0.8.4 with PostgreSQL 9.3.  I have a situation where I 
 need to run the same set of queries millions of times in one session.  

After sending that, I just read the words “same set of queries millions of 
times in one session”.  That raises a red flag for me.   If it were me and I 
had to pull millions of rows from a database I’d be looking for ways to SELECT 
lots of rows at once, in batches.Millions of individual queries with no 
option for batching suggests that each subsequent query somehow relies upon the 
results of the previous one, which would be a pretty exotic case in itself but 
I still might see if a more advanced feature like window functions can be used 
to help with that (a window function lets you write criteria against the rows 
that have already been returned thus far).

That said, if you truly need to run millions of queries, you might want to 
consider using psycopg2 directly, or at best you’d use only SQLAlchemy Core (no 
ORM) and probably make use of the compiled_cache feature (see 
http://docs.sqlalchemy.org/en/rel_0_9/core/connections.html?highlight=execution_options#sqlalchemy.engine.Connection.execution_options.params.compiled_cache).
 It’s going to be a time consuming operation in any case.
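
A sketch of the compiled_cache option mentioned above (names are illustrative): handing a dictionary to the connection lets repeated executions of the same Core construct skip the statement-compilation step each time.

    from sqlalchemy import (create_engine, MetaData, Table, Column,
                            Integer, String, bindparam)

    engine = create_engine("postgresql+psycopg2://localhost/test")
    metadata = MetaData()
    items = Table("items", metadata,
                  Column("id", Integer, primary_key=True),
                  Column("name", String(50)))

    cache = {}
    conn = engine.connect().execution_options(compiled_cache=cache)
    stmt = items.select().where(items.c.id == bindparam("item_id"))

    # The compiled form of stmt is built once and then reused from the cache
    # on every subsequent execution.
    for item_id in (1, 2, 3):
        row = conn.execute(stmt, item_id=item_id).first()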










Re: [sqlalchemy] prepared statements

2014-02-06 Thread Claudio Freire
On Thu, Feb 6, 2014 at 3:08 PM, Michael Bayer mike...@zzzcomputing.com wrote:
 I was imagining/hoping I'd find something like this:

# prepare the query:
myPreparedQuery = mySession.query(stuff).filter(parameter 
 definitions).prepare()

# run the query whenever I need it during my session:
myPreparedQuery.parameters(stuff).fetchall()

 prepared statements don't really apply much to SELECT statements, the 
 performance gains from such are marginal as you typically invoke a particular 
 SELECT just once within a transaction, and prepared statements don't 
 necessarily carry across transaction or connection boundaries on backends.

That's simply false. Because of the following:

 There are various performance concerns related to SELECT queries and solving 
 them depends much on identifying where a particular performance issue 
 resides.   There's overhead on the side of Python/SQLAlchemy formulating the 
 string statement, theres overhead on the side of passing it to the database 
 to be parsed (which is by far the most infinitesimal part)

That part is most definitely not always so infinitesimally small. I
recall one case (granted, one among so many) of a complex query that
was very fast to execute. The query had dozens of joins, so it took
considerable planning time on the database side. Planning took around
300ms, where execution took only 25ms. So, having spotted the
bottleneck, I switched to using prepared statements (it was a query
that was run several times in the session, not millions of times, but
several) with different arguments. The benefit was considerable.

To do that (I was using SQLA 0.5) I had to compile the select object
and generate the query string; I used Text in the bindparams to
replace them with $1, $2, etc. (as postgres likes it), and then
built a PREPARE statement and an EXECUTE one. It was rather
simple, having SQLA generate the query string.

The result was a 10-fold increase in performance.

DBAPI doesn't have direct support for prepared statements, but hacking
them isn't that difficult. You just have to make sure you send the
PREPARE, EXECUTE and the DEALLOCATE (never forget the DEALLOCATE) on
the same connection. Which is easy, just ask SQLAlchemy's session for
its connection and use it.
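
A rough sketch of that hand-rolled approach (all names are made up, the old-style select([...]) matches the SQLA versions of that era, and the placeholder rewrite is deliberately crude):

    from sqlalchemy import (create_engine, MetaData, Table, Column,
                            Integer, String, select, bindparam)

    engine = create_engine("postgresql+psycopg2://localhost/test")
    metadata = MetaData()
    items = Table("items", metadata,
                  Column("id", Integer, primary_key=True),
                  Column("name", String(50)))

    # Let SQLAlchemy render the SELECT, then rewrite its single named
    # placeholder into PostgreSQL's $1 form.
    stmt = select([items]).where(items.c.id == bindparam("item_id"))
    sql = str(stmt.compile(dialect=engine.dialect)) % {"item_id": "$1"}

    raw = engine.raw_connection()          # stay on one DBAPI connection
    cur = raw.cursor()
    cur.execute("PREPARE fetch_item AS " + sql)
    for item_id in (1, 2, 3):
        cur.execute("EXECUTE fetch_item (%s)", (item_id,))
        print(cur.fetchall())
    cur.execute("DEALLOCATE fetch_item")   # never forget the DEALLOCATE
    raw.close()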



Re: [sqlalchemy] prepared statements

2014-02-06 Thread Jonathan Vanasco
just wondering -- would it be possible to mimic this behavior using a 
custom view for this select ( in postgresql ) and then querying that ? i 
think the query planner might only run on the creation.  



Re: [sqlalchemy] prepared statements

2014-02-06 Thread Claudio Freire
On Thu, Feb 6, 2014 at 4:00 PM, Jonathan Vanasco jonat...@findmeon.com wrote:
 just wondering -- would it be possible to mimic this behavior using a custom
 view for this select ( in postgresql ) and then querying that ? i think the
 query planner might only run on the creation.

I think views don't cache the plan, they're handled as rules.

What you'd need is a function (pg's version of stored procedures).



Re: [sqlalchemy] prepared statements

2014-02-06 Thread Michael Bayer

On Feb 6, 2014, at 1:31 PM, Claudio Freire klaussfre...@gmail.com wrote:

 
 There are various performance concerns related to SELECT queries and solving 
 them depends much on identifying where a particular performance issue 
 resides.   There's overhead on the side of Python/SQLAlchemy formulating the 
 string statement, theres overhead on the side of passing it to the database 
 to be parsed (which is by far the most infinitesimal part)
 
 That part is most definitely not always so infinitesimally small. I
 recall one case (granted, one among so many) of a complex query that
 was very fast to execute. The query had dozens of joins, so it took
 considerable planning time on the database side. Planning took around
 300ms, where execution took only 25ms. So, having spotted the
 bottleneck, I switched to using prepared statements (it was a query
 that was run several times in the session, not millions of times, but
 several) with different arguments. The benefit was considerable.

ideally the database would know how to cache plans based on the string form of 
the statement.  Oracle does this.  It appears that PostgreSQL does not unless 
you specifically use a prepared statement.   

 
 DBAPI doesn't have direct support for prepared statements, but hacking
 them isn't that difficult. You just have to make sure you send the
 PREPARE, EXECUTE and the DEALLOCATE (never forget the DEALLOCATE) on
 the same connection. Which is easy, just ask SQLAlchemy's session for
 its connection and use it.

psycopg2 is an extremely mature library and I find it curious that such a 
feature has not been added.  They’ve posted their rationale here: 
http://initd.org/psycopg/articles/2012/10/01/prepared-statements-psycopg/

in any case, prepared statements aren’t part of the DBAPI nor built into 
psycopg2 yet, so this is out of SQLAlchemy’s realm, thankfully.   If you want 
to use the recipe at that link it would have to be hand-rolled.






Re: [sqlalchemy] prepared statements

2014-02-06 Thread Claudio Freire
On Thu, Feb 6, 2014 at 4:29 PM, Rick Otten rottenwindf...@gmail.com wrote:
 Thanks Claudio - I'll mull over the pros and cons of explicitly managing the
 connections to prepare the statements vs just being patient while the job
 runs vs gains I might achieve elsewhere.

Remember, prepared statements will only help if planning time is a
considerable fraction of execution time. The queries in need of this
would be complex. Try to EXPLAIN ANALYZE the queries you repeatedly
execute to see whether preparing them is worth the hassle. I don't
remember which version of postgres introduced planning time in the
explain output, but if you don't have it, just take the difference between
the reported execution time and what you time from your end; minus
network latency, that difference is roughly what you'd save by preparing.
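
A quick way to do that comparison from Python (the connection string and query are made up; newer PostgreSQL versions also print a "Planning time" line directly):

    import time
    import psycopg2

    conn = psycopg2.connect("dbname=test")
    cur = conn.cursor()
    query = "SELECT * FROM items WHERE id = 42"

    # Server-side view: EXPLAIN ANALYZE reports execution (and, on newer
    # versions, planning) time.
    cur.execute("EXPLAIN ANALYZE " + query)
    for (line,) in cur.fetchall():
        print(line)

    # Client-side view: wall-clock time includes planning plus network.
    start = time.time()
    cur.execute(query)
    cur.fetchall()
    print("client-side total: %.1f ms" % ((time.time() - start) * 1000))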



Re: [sqlalchemy] prepared statements

2014-02-06 Thread Claudio Freire
On Thu, Feb 6, 2014 at 4:38 PM, Michael Bayer mike...@zzzcomputing.com wrote:
 On Feb 6, 2014, at 1:31 PM, Claudio Freire klaussfre...@gmail.com wrote:


 There are various performance concerns related to SELECT queries and 
 solving them depends much on identifying where a particular performance 
 issue resides.   There's overhead on the side of Python/SQLAlchemy 
 formulating the string statement, theres overhead on the side of passing it 
 to the database to be parsed (which is by far the most infinitesimal part)

 That part is most definitely not always so infinitesimally small. I
 recall one case (granted, one among so many) of a complex query that
 was very fast to execute. The query had dozens of joins, so it took
 considerable planning time on the database side. Planning took around
 300ms, where execution took only 25ms. So, having spotted the
 bottleneck, I switched to using prepared statements (it was a query
 that was run several times in the session, not millions of times, but
 several) with different arguments. The benefit was considerable.

 ideally the database would know how to cache plans based on the string form 
 of the statement.  Oracle does this.  It appears that Postrgresql does not 
 unless you specifically use a prepared statement.

Well, it cannot. At least not simplistically, because postgres uses
the literals in the statement (the ones likely to change) to
specialize plan cost, and different values can produce different plans
(depending on stats). For instance, a query with a where clause
deleted=false that selects 99% of a table would do a sequential
scan, whereas deleted=true might use an index (if there was such an
index).

They thus don't cache plans unless specifically asked for (with
prepare), because those plans may very well be inferior (they don't
use value-specific knowledge).



Re: [sqlalchemy] prepared statements

2014-02-06 Thread Michael Bayer

On Feb 6, 2014, at 2:43 PM, Claudio Freire klaussfre...@gmail.com wrote:

 On Thu, Feb 6, 2014 at 4:38 PM, Michael Bayer mike...@zzzcomputing.com 
 wrote:
 On Feb 6, 2014, at 1:31 PM, Claudio Freire klaussfre...@gmail.com wrote:
 
 
 There are various performance concerns related to SELECT queries and 
 solving them depends much on identifying where a particular performance 
 issue resides.   There's overhead on the side of Python/SQLAlchemy 
 formulating the string statement, theres overhead on the side of passing 
 it to the database to be parsed (which is by far the most infinitesimal 
 part)
 
 That part is most definitely not always so infinitesimally small. I
 recall one case (granted, one among so many) of a complex query that
 was very fast to execute. The query had dozens of joins, so it took
 considerable planning time on the database side. Planning took around
 300ms, where execution took only 25ms. So, having spotted the
 bottleneck, I switched to using prepared statements (it was a query
 that was run several times in the session, not millions of times, but
 several) with different arguments. The benefit was considerable.
 
 ideally the database would know how to cache plans based on the string form 
 of the statement. Oracle does this.  It appears that Postrgresql does not 
 unless you specifically use a prepared statement.
 
 Well, it cannot. At least not simplistically, because postgres uses
 the literals in the statement (the ones likely to change)
 to
 specialize plan cost, and different values can produce different plans
 (depending on stats). For instance, a query with a where clause
 deleted=false that selects 99% of a table would do a sequential
 scan, whereas deleted=true might use an index (if there was such an
 index).
 
 They thus don't cache plans unless specifically asked for (with
 prepare), because those plans may very well be inferior (they don't
 use value-specific knowledge).

right, but from an API POV, it’s more tedious since we have to track the handle 
on the outside.

This could be implemented as a cursor event even in conjunction with a custom 
execution option and possibly using compiled_cache as well.






Re: [sqlalchemy] prepared statements

2014-02-06 Thread Claudio Freire
On Thu, Feb 6, 2014 at 4:47 PM, Michael Bayer mike...@zzzcomputing.com wrote:
 On Feb 6, 2014, at 2:43 PM, Claudio Freire klaussfre...@gmail.com wrote:

 On Thu, Feb 6, 2014 at 4:38 PM, Michael Bayer mike...@zzzcomputing.com 
 wrote:
 On Feb 6, 2014, at 1:31 PM, Claudio Freire klaussfre...@gmail.com wrote:


 There are various performance concerns related to SELECT queries and 
 solving them depends much on identifying where a particular performance 
 issue resides.   There's overhead on the side of Python/SQLAlchemy 
 formulating the string statement, theres overhead on the side of passing 
 it to the database to be parsed (which is by far the most infinitesimal 
 part)

 That part is most definitely not always so infinitesimally small. I
 recall one case (granted, one among so many) of a complex query that
 was very fast to execute. The query had dozens of joins, so it took
 considerable planning time on the database side. Planning took around
 300ms, where execution took only 25ms. So, having spotted the
 bottleneck, I switched to using prepared statements (it was a query
 that was run several times in the session, not millions of times, but
 several) with different arguments. The benefit was considerable.

 ideally the database would know how to cache plans based on the string form 
 of the statement. Oracle does this.  It appears that Postrgresql does not 
 unless you specifically use a prepared statement.

 Well, it cannot. At least not simplistically, because postgres uses
 the literals in the statement (the ones likely to change)
 to
 specialize plan cost, and different values can produce different plans
 (depending on stats). For instance, a query with a where clause
 deleted=false that selects 99% of a table would do a sequential
 scan, whereas deleted=true might use an index (if there was such an
 index).

 They thus don't cache plans unless specifically asked for (with
 prepare), because those plans may very well be inferior (they don't
 use value-specific knowledge).

 right, but from an API POV, its more tedious since we have to track the 
 handle on the outside.

 This could be implemented as a cursor event even in conjunction with a custom 
 execution option and possibly using compiled_cache as well.

That would be tricky. You'd have to maintain a tally of which
connection prepared which query (think the complications of
multi-engine setups), and make sure you deallocate the handle before
returning the connection to the pool. Not to mention coming up with
unique handles.



Re: [sqlalchemy] prepared statements

2014-02-06 Thread Michael Bayer

On Feb 6, 2014, at 2:58 PM, Claudio Freire klaussfre...@gmail.com wrote:

 On Thu, Feb 6, 2014 at 4:47 PM, Michael Bayer mike...@zzzcomputing.com 
 wrote:
 On Feb 6, 2014, at 2:43 PM, Claudio Freire klaussfre...@gmail.com wrote:
 
 On Thu, Feb 6, 2014 at 4:38 PM, Michael Bayer mike...@zzzcomputing.com 
 wrote:
 On Feb 6, 2014, at 1:31 PM, Claudio Freire klaussfre...@gmail.com wrote:
 
 
 There are various performance concerns related to SELECT queries and 
 solving them depends much on identifying where a particular performance 
 issue resides.   There's overhead on the side of Python/SQLAlchemy 
 formulating the string statement, theres overhead on the side of passing 
 it to the database to be parsed (which is by far the most infinitesimal 
 part)
 
 That part is most definitely not always so infinitesimally small. I
 recall one case (granted, one among so many) of a complex query that
 was very fast to execute. The query had dozens of joins, so it took
 considerable planning time on the database side. Planning took around
 300ms, where execution took only 25ms. So, having spotted the
 bottleneck, I switched to using prepared statements (it was a query
 that was run several times in the session, not millions of times, but
 several) with different arguments. The benefit was considerable.
 
 ideally the database would know how to cache plans based on the string 
 form of the statement. Oracle does this.  It appears that Postrgresql does 
 not unless you specifically use a prepared statement.
 
 Well, it cannot. At least not simplistically, because postgres uses
 the literals in the statement (the ones likely to change)
 to
 specialize plan cost, and different values can produce different plans
 (depending on stats). For instance, a query with a where clause
 deleted=false that selects 99% of a table would do a sequential
 scan, whereas deleted=true might use an index (if there was such an
 index).
 
 They thus don't cache plans unless specifically asked for (with
 prepare), because those plans may very well be inferior (they don't
 use value-specific knowledge).
 
 right, but from an API POV, its more tedious since we have to track the 
 handle on the outside.
 
 This could be implemented as a cursor event even in conjunction with a 
 custom execution option and possibly using compiled_cache as well.
 
 That would be tricky. You'd have to maintain a tally of which
 connection prepared which query (think the complications of
 multi-engine setups), and make sure you deallocate the handle before
 returning the connection to the pool. Not to mention coming up with
 unique handles.

you can use a per-connection cache using connection.info, and you can also 
manage the lifecycle of the connection and the cache of items with connection 
and/or pool events as well (e.g. if connection.rollback() clears out prepared 
statements).   lots of functions already do things like this, like keeping 
track of XA handles and such.  it wouldn’t really be hard at all actually, as 
long as the DBAPI itself behaves nicely when lots of handles are allocated.   

the execution_options functionality would then be usable as a way to enable the 
feature on a per-connection or per-statement basis, like:

    stmt = select([some_table]).execution_options(pg_prepared_statement=True)
    conn.execute(stmt)

the string form of stmt could be linked to the generated handle so that any 
number of stmt objects whose string form matches the target would work.
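
a rough sketch of that bookkeeping (this is not an existing feature; all names below are invented, and it assumes statements already rendered with $1, $2 style parameters):

    from sqlalchemy import create_engine, event

    engine = create_engine("postgresql+psycopg2://localhost/test")

    def prepared_handle(connection, sql_text):
        # Return a PREPARE handle for this SQL text on this connection,
        # creating it the first time it is seen.
        cache = connection.info.setdefault("pg_prepared", {})
        name = cache.get(sql_text)
        if name is None:
            name = "sqla_ps_%d" % len(cache)
            connection.execute("PREPARE %s AS %s" % (name, sql_text))
            cache[sql_text] = name
        return name

    @event.listens_for(engine, "rollback")
    def forget_prepared(conn):
        # PostgreSQL discards statements PREPAREd inside a transaction that
        # rolls back; dropping the whole map here is coarse but keeps the
        # cache honest.
        conn.info.pop("pg_prepared", None)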









Re: [sqlalchemy] prepared statements

2014-02-06 Thread Jonathan Vanasco


On Thursday, February 6, 2014 2:18:51 PM UTC-5, Klauss wrote:

 I think views don't cache the plan, they're handled as rules. 

 What you'd need is a function (pg's version of stored procedures). 


I had time to look it up; this generally seems correct. Looking at some 
explain output, it seems like the plan for join conditions under the view 
is kept -- but you still get hit with a planning pass on the view itself.  I 
could be interpreting this wrong, and this likely has minimal use.

something that might be applicable to the original poster's usage is a 
parallel search.

we have a lightweight read-through cache working with sqlalchemy for 
'simple' queries (1-2 kv pairs).  cache misses are aggregated and then 
tossed into a single select.  

dumb example:

   select * from table where id = 1;
   select * from table where id = 2;
   select * from table where id = 3;
   select * from table where id = 4;
   select * from table where id = 5;

becomes

   select * from table where id in (1,2,3,4,5)

The performance improvement is shockingly significant; I think it's a mix 
of the planning, fetching and fewer roundtrips/connectivity.  If that's an 
option, I'd suggest trying it.
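
a sketch of that aggregation in SQLAlchemy Core (names are illustrative):

    from sqlalchemy import create_engine, MetaData, Table, Column, Integer, String

    engine = create_engine("postgresql+psycopg2://localhost/test")
    metadata = MetaData()
    items = Table("items", metadata,
                  Column("id", Integer, primary_key=True),
                  Column("name", String(50)))

    # ids that missed the cache, fetched in one round trip instead of five
    missed_ids = [1, 2, 3, 4, 5]
    with engine.connect() as conn:
        rows = conn.execute(
            items.select().where(items.c.id.in_(missed_ids))).fetchall()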




[sqlalchemy] Prepared Statements in Postgresql

2010-01-15 Thread mozillalives
Hello Everyone,

I am new to both sqlalchemy and elixir, but I have been using them for
the past couple of weeks and I really like them. But I have a question
about prepared statements for Postgresql.

For one specific application, I am doing a bunch of inserts
(200,000+). From what I can tell, it looks like these are not prepared
statements. I rewrote the code to issue prepared statements and this
cuts the insertion time in half, but the code is crude. My questions
are:

Is there a way to tell sqlalchemy or the engine (which would be
psycopg2, correct?) to use prepared statements?

I've noticed some opinions online indicating that psycopg2 does not
have prepared statement support (e.g. -
http://www.cerias.purdue.edu/site/blog/post/beware_sql_injections_due_to_missing_prepared_statement_support/)
- can I plug another engine into sqlalchemy that does?

If I can't do any of the above and just need to prepare the statements
manually, is there at least a method in sqlalchemy to properly quote
my data before sending it to postgres?

Thanks,
Phil




Re: [sqlalchemy] Prepared Statements in Postgresql

2010-01-15 Thread Michael Bayer
mozillalives wrote:
 Hello Everyone,

 I am new to both sqlalchemy and elixir, but I have been using them for
 the past couple of weeks and I really like them. But I have a question
 about prepared statements for Postgresql.

 For one specific application, I am doing a bunch of inserts
 (200,000+). From what I can tell, it looks like these are not prepared
 statements. I rewrote the code to issue prepared statements and this
 cuts the insertion time in half, but the code is crude. My questions
 are:

how did you use prepared statements in Python if you don't know whether
psycopg2 uses prepared statements?  was this in another language or did
you implement a raw socket connection to your database?


 Is there a way to tell sqlalchemy or the engine (which would be
 psycopg2, correct?) to use prepared statements?

to efficiently execute the same statement many times, use the
executemany style of execution - the tutorial describes this at
http://www.sqlalchemy.org/docs/05/sqlexpression.html#executing-multiple-statements
.   I don't think that psycopg2 actually uses prepared statements for
this purpose but I am not sure.  The DBAPI executemany() method is used.


 I've noticed some opinions online indicating that psycopg2 does not
 have prepared statement support (e.g. -
 http://www.cerias.purdue.edu/site/blog/post/beware_sql_injections_due_to_missing_prepared_statement_support/)

the comment at the bottom of that post ultimately references a psycopg2
message from 2007 so you'd need to ask the psycopg2 folks for updated
information.  However psycopg2 can do an executemany with great
efficiency as it is using methodologies for which you'd have to ask them,
so if they don't use PG's actual prepared mechanism, it's probably
unnecessary.  psycopg2 is an extremely mature and high-performing product.


 - can I plug another engine into sqlalchemy that does?

there's the pg8000 engine which may or may not do this.  But it's written
in pure Python, is not as fast as psycopg2, and is very new and not widely
used since its author doesn't seem to promote it very much (but it is a
very well written library).



 If I can't do any of the above and just need to prepare the statements
 manually, is there at least a method in sqlalchemy to properly quote
 my data before sending it to postgres?

Despite some of the FUD-like links mentioned on that blog, SQLAlchemy, as
it says on the website since the day we launched 5 years ago, always uses
bind parameters, in all cases, for all literal values, everywhere.  We do
not and have never quoted anything within SQLA as that is left up to the
services provided by the DBAPI.   The DBAPI does not have a prepared
statement API.  It has executemany(), for which the underlying implementation
may or may not use prepared statements + server-level bind processing as an
implementation detail.  Psycopg2 handles the quoting in this case. 
cx_Oracle, OTOH, uses Oracle's native data binding facilities provided by
OCI.  The DBAPI abstracts this detail away.




 Thanks,
 Phil



