On Sat, Apr 24, 2010 at 1:31 PM, Tom Lane <t...@sss.pgh.pa.us> wrote: > Robert Haas <robertmh...@gmail.com> writes: >> At least AIUI, the use case for this feature is that you want to avoid >> creating "the same" temporary table over and over again. > > The context that I've seen it come up in is that people don't want to > clutter their functions with create-it-if-it-doesn't-exist logic, > which you have to have given the current behavior of temp tables. > Any performance gain from reduced catalog churn would be gravy.
I think there's a significant contingent on this mailing list who feel that that gravy would be rather tasty and would like very much to enjoy some of it along with their temporary table tetrazzini. > Aside from the DROP problem, I think this implementation proposal > has one other big shortcoming: what are you going to do about > table statistics? In many cases, you really *have* to do an ANALYZE > once you've populated a temp table, if you want to get decent plans > for it. Where will you put those stats? For a first cut, I had thought about ignoring the problem. Now, that may sound stupid, because now if two different backends have very different distributions of data in the table and both do an ANALYZE, one set of statistics will clobber the other set of statistics. On the flip side, for some usage patterns, it might be actually work out to a win. Maybe the data I'm putting in here today is a great deal like the data I put in here yesterday, and planning it with yesterday's statistics doesn't cost enough to be worth a re-ANALYZE. If we don't want to do that, I suppose one option is to create a pg_statistic-like table in the backend's temporary tablespace and put them there; or we could put them into a backend-local hash table. The current setup of pg_statistic is actually somewhat weak for a number of things we might want to do: for example, it might be interesting to gather statistics for the subset of a table for which a particular partial index is predOK. When such an index is available for a particular query, we could use the statistics for that subset of the table instead of the overall statistics for the table, and get better estimates. Or we could even let the user specify predicates which will cause the table to have a different statistical distribution than the table as a whole, and gather statistics for the subset that matches the predicate. One approach would be to make the starelid column able to reference something other than a relation OID, although I don't think that actually helps with the global temp table problem because if we use the real pg_statistic to store the data then we have to arrange to clean it up. ...Robert -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers