On Tue, 16 Feb 2010 15:22:00 +0100, Greg Stark <gsst...@mit.edu> wrote:

> There's a second problem though. We don't actually know how long any
> given query is going to take to plan or execute. We could just
> remember how long it took to plan and execute last time, or how long it
> took to plan last time and the average execution time since we cached
> that plan. Perhaps we should track the stddev of the execution time,
> or the max execution time of the plan? I.e. there are still unanswered
> questions about the precise heuristic to use, but I bet we can come up
> with something reasonable.

This could be an occasion to implement plan caching...

Web 2.0 and AJAX mean fewer heavy page reloads with (usually) lots of queries, and more small, simple queries: selects returning one or a few rows each time the user clicks on something.

See the benchmark here (PG 8.4.2, MySQL 5.1.37):
http://purity.bobfuck.net/posts/postgres/2010-02-Prep/

If prepared statements are used, MySQL is not faster for "small, simple selects"... However, when prepared statements are not used, most of the backend's CPU time is spent parsing and planning.
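For concreteness, here is a minimal sketch of what "using prepared statements" looks like from the application side, in Python with psycopg2 (the database, table and column names are invented for illustration):

    import psycopg2

    conn = psycopg2.connect("dbname=test")   # hypothetical database
    cur = conn.cursor()

    # Parse + plan happen once, at PREPARE time.
    cur.execute("PREPARE get_user (int) AS "
                "SELECT * FROM users WHERE user_id = $1")

    # Every subsequent EXECUTE reuses the stored plan and skips
    # parsing/planning entirely.
    for uid in (1, 2, 3):
        cur.execute("EXECUTE get_user (%s)", (uid,))
        print(cur.fetchone())

The catch is that the prepared statement lives only in that session, which is exactly the problem with pooled connections below.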

The problem with prepared statements is that they're a chore to use in web apps, especially in PHP: after grabbing a connection from the pool, you don't know whether it already has the prepared plans in it or not.
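As an illustration of that chore, a hedged sketch (Python/psycopg2 again, statement name and query invented) of the boilerplate the application ends up writing: check pg_prepared_statements on whatever connection the pool handed back, and PREPARE lazily if it's missing:

    def execute_prepared(cur, name, prepare_sql, param):
        # Does this particular pooled connection already have the plan?
        cur.execute("SELECT 1 FROM pg_prepared_statements WHERE name = %s",
                    (name,))
        if cur.fetchone() is None:
            cur.execute(prepare_sql)     # parse + plan, once per connection
        cur.execute(f"EXECUTE {name} (%s)", (param,))
        return cur.fetchall()

    rows = execute_prepared(
        cur, "get_user",
        "PREPARE get_user (int) AS SELECT * FROM users WHERE user_id = $1",
        42)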

The postmaster could keep a hash of already-prepared plans, keyed on the $-parameterized query text, and when it receives a Parse+Bind message, look the query up in this cache and fetch the existing plan, avoiding planning entirely.
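To make the idea concrete, here is a rough client-side illustration of the same mechanism (Python/psycopg2; the helper name and hashing scheme are invented, and this is not backend code). Server-side, the cache would be keyed the same way on the parameterized query text; here it simply maps query text to a prepared-statement name on one connection (in a pooled setup it would have to be tracked per connection):

    import hashlib

    _stmt_names = {}   # $-parameterized query text -> prepared statement name

    def cached_execute(cur, query_text, params):
        name = _stmt_names.get(query_text)
        if name is None:
            # First time this query text is seen: pay parse + plan once.
            name = "auto_" + hashlib.md5(query_text.encode()).hexdigest()[:12]
            cur.execute(f"PREPARE {name} AS {query_text}")
            _stmt_names[query_text] = name
        placeholders = ", ".join(["%s"] * len(params))
        cur.execute(f"EXECUTE {name} ({placeholders})", params)
        return cur.fetchall()

    # e.g. cached_execute(cur, "SELECT * FROM users WHERE user_id = $1", (42,))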

This could be done by the connection pooler too, but the pooler doesn't have the information needed to decide whether it's wise to cache a plan or not.

Of course, all the subtlety is in determining whether the plan is reusable with other parameter values...

- After planning and executing the query, only cache it if the planning time is a significant fraction of the total query time (as said previously).
- Only simple queries should be automatically cached like this.
- Perhaps some measure of "plan volatility"? For the examples in the link above, it's quite easy in at least 2 of the cases: a search on a UNIQUE column can't return more than 1 row, so volatility is zero; the plan depends only on the table size. (A rough sketch of such a heuristic follows below.)
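Putting those three points together, a rough sketch of what such a cache-or-not heuristic might look like (plain Python, with invented thresholds and inputs that the backend would have to supply):

    def worth_caching(plan_time, exec_time, num_relations, is_unique_lookup):
        """Decide whether an automatically planned query is worth caching."""
        # 1. Only when planning is a significant part of the total query time
        #    (threshold invented: planning must be at least half of the total).
        if plan_time < 0.5 * (plan_time + exec_time):
            return False
        # 2. Only simple queries, e.g. a single relation, no joins.
        if num_relations > 1:
            return False
        # 3. Zero "plan volatility": a lookup on a UNIQUE column returns at
        #    most one row whatever the parameter is, so the plan depends only
        #    on table size, not on the parameter value.
        return is_unique_lookup

    # e.g. a 0.5 ms plan / 0.05 ms execute single-row SELECT would qualify:
    # worth_caching(0.0005, 0.00005, 1, True)  -> True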
