The MAJOR benefit of Microsoft's approach is that it works on existing applications,

        Yes, that is a nice benefit!
Is there a way to turn it on/off? Or is it smart enough to only cache plans for cases where it is relevant? For instance, I absolutely want some queries to be planned according to the real parameters (it makes a huge difference on some search queries, as expected), whereas most simple queries, like the proverbial select by ID, could be cached without problems...
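To illustrate the distinction I mean (a minimal sketch; the table, column, and statement names here are invented):

    -- A skewed search query: planning with the real parameter value lets
    -- the planner use the column statistics for that value, so I want it
    -- planned per call, not run from a cached generic plan.
    SELECT * FROM documents WHERE category = 'some_rare_category';

    -- The proverbial select by ID: the plan is the same whatever the
    -- value, so caching it is harmless.
    PREPARE doc_by_id(int) AS
        SELECT * FROM documents WHERE id = $1;
    EXECUTE doc_by_id(42);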

and, most importantly, makes NO assumptions about the "volatile" server state. A few cases where the Microsoft solution works, while yours will fail, are:

    * Server restart and the like, such as a failover (you need to redo a
      global prepare).
    * Cleanup and instantiation of a prepared statement.

Hehe, actually, mine does work after a restart, since the statements are stored in a database-specific system catalog, which is persistent. What I store is not the result of PREPARE (a plan) but the text of the SQL query "PREPARE foo.....", that is, I just cut the GLOBAL from "GLOBAL PREPARE" and store the rest. The actual PREPARE is performed by each connection when it encounters an EXECUTE request and doesn't find the cached plan. It is actually extremely simple ;) Did you expect a fancy shared-memory cache (ahem...)? No, no, it's very basic.

This way, if a table was dropped and recreated, or anything else happened that could invalidate a plan since the GLOBAL PREPARE was issued, there is no problem, since there was no globally stored plan anyway, just some SQL text. Also, if a needed table was dropped, the user gets the same error message he would have got issuing a PREPARE on the associated SQL query string. The overhead of each connection doing its own PREPARE is negligible, since, if you use this feature, you intend to issue the query many, many times during the life of the persistent connection.
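To make the mechanism concrete, a minimal sketch of the flow described above (the statement name and table are made up, and GLOBAL PREPARE is the feature under discussion, not existing PostgreSQL syntax):

    -- Issued once; only the SQL text "PREPARE user_by_id(int) AS ..."
    -- is stored in the per-database system catalog, no plan.
    GLOBAL PREPARE user_by_id(int) AS
        SELECT * FROM users WHERE id = $1;

    -- Later, in any persistent connection:
    EXECUTE user_by_id(42);
    -- If this backend has no prepared statement named user_by_id, it
    -- fetches the stored text from the catalog, runs the PREPARE locally,
    -- then executes; subsequent EXECUTEs reuse the connection-local plan.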

What you are doing for a global query cache is already under consideration, and having a plan invalidation mechanism on schema changes or, maybe, statistics updates was a step in that direction. Your code mostly contributes the other parts already.

        As I said, it is much simpler than that: I store no plans ;)
        Of course, this means it only works with persistent connections.

Another consideration is whether most tasks are CPU bound or IO bound. A better, per-query plan might reduce IO load due to better use of statistics in that single case, while for CPU-bound workloads it is very nice to reduce the planning overhead significantly.

Well, if it is IO bound, then this thing is useless. However, since the purpose is to optimize often-used, simple queries, the likes of which abound in web applications, it is relevant... because if this kind of simple select becomes IO bound, and you have a few on each page, you're really in trouble...

Another possible implementation would be to use a connection pooler which, when opening a new connection, can be configured to send a SQL script containing all the PREPARE statements.
This is, IMHO, an application-side feature that might be a good addition to PHP and other languages that provide the "persistent connection" feature.
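For example, the pooler could run something like this hypothetical startup script on every new server connection (statement names and queries are made up):

    -- prepare_all.sql, run by the pooler right after connecting
    PREPARE user_by_id(int) AS
        SELECT * FROM users WHERE id = $1;
    PREPARE orders_for_user(int) AS
        SELECT * FROM orders WHERE user_id = $1 ORDER BY created_at DESC;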

On second thought, if it is not in Postgres, I'd rather put this in the connection pooler, because this way it can be used by different applications. But then you have to use a connection pooler. Also, by putting it in Postgres, statements are only prepared as needed, whereas the pooler would have to issue a lot of PREPAREs at connection startup, making new connection startup slower.


