Why limit ourselves with Oracle?  How all major proprietary RDBMSs do it.

Thanks for the links. Very interesting.
The DB2 document especially mentions an important point : in order to make their planner/optimizer smarter, they had to make it slower, hence it became crucial to cache the plans. Contrast this with MySQL where using prepared statements gains nothing : the "optimizer" does so little work that it actually doesn't matter.

So, basically, Orcale :
- Parses the query every time (identifies tables, permissions etc) (soft parse) - From that parsed query it looks up a cached plan (the lookup key could then be different depending on the schema etc)
- If not, it must plan the query (hard parse).
Also the Oracle doc mentions that the soft parsing should be avoided by using prepared statements in the application (ie Parse once and Bind lots of times) So, Oracle will redo the parsing + permissions check each time, unless prepared statements are used, in which case it's direct execution.

And DB2 :
Er, the document is not very clear about what it actually does, but the stats look nice ;)

I liked your global prepared statements idea much better. Named the
statements is no problem: DB frontends do that for you anyway
sometimes.

Hm. The "global statements" and the cache would complement each other actually. Why not.

When the user wants to name the statements, he can do so (and perhaps control who can execute what, etc, like with stored procs)
Permission checking overhead will be there at each execution.
Should the plan be cached locally ? (RAM consumption times N bakends...)
Cached per user once permissions have been checked ? (avoids the overhead of rechecking permissions)
What about the search path ?
(I'd force the global statements to use the default search path no matter what, being explicit is better than "why does it stop working ?")

Can the application or the database library name the statements ?
I'm not so sure. This could work for compiled languages (what about when you run several applications ? or several versions of the same application ? do we need a uniqueness of statement names from all developers all over the world ?) Solution : make each application use a different user name, and global prepared statements only visible to the user that created them, perhaps. This conflicts with some desirable features, though. It needs more thinking.

What about non-compiled languages ? It will not be possible to generate a list of statements beforehands... And queries are also constructed dynamically by frameworks such as Rails, which makes naming them impossible, but caching the plans would work well.

So, some situations would benefit from a plan cache,

Frankly, I think you're better off storing them in a table. Shared
memory is a limited resource and you cannot change how much you've

I'd say that unless you have a perverse application that will try all the permutations of column names just to make sure the query is different every time, how many different queries would you want to cache ?... probably less than 1000... so it wouldn't take more than a couple megabytes...

allocated after the server has started. It does mean you'll have to
serialise/deserialise them, but this will be cheaper than replanning,
right?

What would be the overhead of a catalog lookup to get a cached plan for a statement that returns 1 row ? Would the catalog cache make it fast enough ?
        And what about deserialization ?...

I am not too sure that plans and statistical counters should be stored
together...

        Not sure either.

Probably plans should go in one place, and statistics should go to the
stats collector (I know he's not quite ready for this ;)).

        That's the problem...

Hm, a limit on how much memory can be used for plans
(query_plan_cache_size GUC?), and a LRU/LFU expiration
of old plans?

        Now it gets hairy ;)
Yes memory size should be limited. But how to make a LRU cleaner which doesn't create lots of contention ?... Luckily, with a hash having a fixed number of buckets, it is easier (clean a bucket every N seconds for instance).

Perhaps a GUC for controlling query cache should heve three values:
 none -- don't cache any statement
 smart -- use heuristics for deciding whether to cache it
all -- force caching all queries -- for uncommon/statistical/testing purposes.

        I would not volunteer to write that heuristic ;)
Although there would be a very simple solution : if time to parse > some percentage of time to execute then cache. The hairiness is in the plan dependence (or independence) on parameter values, ideally we only want to cache plans that would be good for all parameter values, only the user knows that precisely. Although it could be possible to examine the column histograms...

 (like mysql, /* flags */ SELECT blah )

I don't like the hint flags. They tend to haunt later on (when the database gets smarter, but application forces it to be dumb). I would say a GUC.

I don't like them either... needs a better solution like a flag in PQexecParams, but this would cause lots of trouble, so it's not really possible...



--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Reply via email to