On Jul 7, 2009, at 9:31 AM, "Kevin Grittner" <kevin.gritt...@wicourts.gov> wrote:

Robert Haas <robertmh...@gmail.com> wrote:

I'm interested in hearing from anyone who has practical experience
with tuning these variables, or any ideas on what we should test to
get a better idea as to how to set them.

I don't remember any clear resolution to the wild variations in plan
time mentioned here:

http://archives.postgresql.org/pgsql-hackers/2009-06/msg00743.php

I think it would be prudent to try to figure out why small changes in
the query caused the large changes in the plan times Andres was
seeing.  Has anyone else ever seen such behavior?  Can we get
examples?  (It should be enough to get the statistics and the schema,
since this is about planning time, not run time.)

Well, there's not really enough information there to figure out specifically what was happening, but from 10,000 feet, join_collapse_limit and from_collapse_limit constrain the join order. If the estimates are all accurate, setting them to a value < infinity will either leave the plans unchanged or make them worse. If it's making them better, then the estimates are off and the join order constraint happens to be preventing the planner from considering the case that really hurts you. But that's mostly luck.
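
Roughly what that constraint looks like in practice (a minimal sketch; the
tables a, b, and c are just placeholders, and 8 is the default for both
limits):

    SHOW join_collapse_limit;     -- 8 by default
    SET join_collapse_limit = 1;  -- planner must join in the order written
    EXPLAIN SELECT *
      FROM a
      JOIN b ON b.a_id = a.id
      JOIN c ON c.b_id = b.id;    -- forced to join a, then b, then c
    RESET join_collapse_limit;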

My own experience is that when we investigate a complaint about a
query not performing to user or application programmer expectations,
we have sometimes found that boosting these values has helped.  We
boost them overall (in postgresql.conf) without ever having seen a
downside.  We currently have geqo disabled and set both collapse
limits to 20.  We should probably just set them both to several
hundred and not wait until some query with more than 20 tables
performs badly, but I'm not sure we have any of those yet.
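
For anyone wanting to try the same setup, it boils down to the following
(shown here as per-session SET commands; Kevin sets them cluster-wide in
postgresql.conf):

    SET geqo = off;                -- disable the genetic optimizer entirely
    SET join_collapse_limit = 20;
    SET from_collapse_limit = 20;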

In short, my experience is that when setting these higher has made any
difference at all, it has always generated a plan that saved more time
than the extra planning required.  Well, I'd bet that there has been
an increase in the plan time of some queries which wound up with the
same plan anyway, but the difference has never been noticeable; the
net effect has been a plus for us.

You have a big dataset, AIUI, so the right values for you might be too high for some people with, say, OLTP workloads.
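
One way to split the difference, if it ever comes to that (the role names
here are made up for illustration): override the limits per role or per
database, so the reporting users pay for the more expensive planning and the
OLTP application keeps the cheaper default:

    ALTER ROLE reporting_user SET join_collapse_limit = 20;
    ALTER ROLE reporting_user SET from_collapse_limit = 20;
    ALTER ROLE oltp_app SET join_collapse_limit = 8;  -- the default anyway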

I guess the question is whether there is anyone who has had a contrary
experience.  (There must have been some benchmarks to justify adding
geqo at some point?)

GEQO or something like it is certainly needed for very large planning problems. The non-GEQO planner takes exponential time in the size of the problem, so at some point that's going to get ugly. But triggering it at the level we do now seems unnecessarily pessimistic about what constitutes too much planning.
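
For reference, the knob that controls that trigger point (12 is the current
default; raising it keeps the exhaustive planner in play for larger joins,
at the risk of long planning times):

    SHOW geqo;             -- on by default
    SHOW geqo_threshold;   -- 12 by default
    SET geqo_threshold = 20;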

...Robert

