Re: [HACKERS] Cached Query Plans (was: global prepared statements)

PFC Sun, 13 Apr 2008 17:48:02 -0700

Why limit ourselves with Oracle?  How all major proprietary RDBMSs do it.


Thanks for the links. Very interesting.

The DB2 document especially mentions an important point : in order to maketheir planner/optimizer smarter, they had to make it slower, hence itbecame crucial to cache the plans.Contrast this with MySQL where using prepared statements gains nothing :the "optimizer" does so little work that it actually doesn't matter.


So, basically, Orcale :

- Parses the query every time (identifies tables, permissions etc) (softparse)- From that parsed query it looks up a cached plan (the lookup key couldthen be different depending on the schema etc)

- If not, it must plan the query (hard parse).

Also the Oracle doc mentions that the soft parsing should be avoided byusing prepared statements in the application (ie Parse once and Bind lotsof times)So, Oracle will redo the parsing + permissions check each time, unlessprepared statements are used, in which case it's direct execution.


And DB2 :

Er, the document is not very clear about what it actually does, but thestats look nice ;)

I liked your global prepared statements idea much better. Named the
statements is no problem: DB frontends do that for you anyway
sometimes.

Hm. The "global statements" and the cache would complement each otheractually. Why not.

When the user wants to name the statements, he can do so (and perhapscontrol who can execute what, etc, like with stored procs)

Permission checking overhead will be there at each execution.
Should the plan be cached locally ? (RAM consumption times N bakends...)

Cached per user once permissions have been checked ? (avoids the overheadof rechecking permissions)

What about the search path ?

(I'd force the global statements to use the default search path no matterwhat, being explicit is better than "why does it stop working ?")


Can the application or the database library name the statements ?

I'm not so sure. This could work for compiled languages (what about whenyou run several applications ? or several versions of the same application? do we need a uniqueness of statement names from all developers all overthe world ?) Solution : make each application use a different user name,and global prepared statements only visible to the user that created them,perhaps. This conflicts with some desirable features, though. It needsmore thinking.

What about non-compiled languages ? It will not be possible to generate alist of statements beforehands... And queries are also constructeddynamically by frameworks such as Rails, which makes naming themimpossible, but caching the plans would work well.


So, some situations would benefit from a plan cache,

Frankly, I think you're better off storing them in a table. Shared
memory is a limited resource and you cannot change how much you've

I'd say that unless you have a perverse application that will try all thepermutations of column names just to make sure the query is differentevery time, how many different queries would you want to cache ?...probably less than 1000... so it wouldn't take more than a couplemegabytes...

allocated after the server has started. It does mean you'll have to
serialise/deserialise them, but this will be cheaper than replanning,
right?

What would be the overhead of a catalog lookup to get a cached plan for astatement that returns 1 row ? Would the catalog cache make it fast enough?

        And what about deserialization ?...

I am not too sure that plans and statistical counters should be stored
together...


        Not sure either.

Probably plans should go in one place, and statistics should go to the
stats collector (I know he's not quite ready for this ;)).


        That's the problem...

Hm, a limit on how much memory can be used for plans
(query_plan_cache_size GUC?), and a LRU/LFU expiration
of old plans?


        Now it gets hairy ;)

Yes memory size should be limited. But how to make a LRU cleaner whichdoesn't create lots of contention ?... Luckily, with a hash having a fixednumber of buckets, it is easier (clean a bucket every N seconds forinstance).

Perhaps a GUC for controlling query cache should heve three values:
 none -- don't cache any statement
 smart -- use heuristics for deciding whether to cache it
all -- force caching all queries -- for uncommon/statistical/testingpurposes.


        I would not volunteer to write that heuristic ;)

Although there would be a very simple solution : if time to parse > somepercentage of time to execute then cache.The hairiness is in the plan dependence (or independence) on parametervalues, ideally we only want to cache plans that would be good for allparameter values, only the user knows that precisely. Although it could bepossible to examine the column histograms...

 (like mysql, /* flags */ SELECT blah )
I don't like the hint flags. They tend to haunt later on (when thedatabase gets smarter, but application forces it to be dumb). I wouldsay a GUC.

I don't like them either... needs a better solution like a flag inPQexecParams, but this would cause lots of trouble, so it's not reallypossible...




--
Sent via pgsql-hackers mailing list ([email protected])
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Cached Query Plans (was: global prepared statements)

Reply via email to