On Wed, Feb 18, 2009 at 1:34 AM, Tom Lane <t...@sss.pgh.pa.us> wrote:
> Robert Haas <robertmh...@gmail.com> writes:
>> I'm interested to know whether anyone else shares my belief that
>> nested loops are the cause of most really bad plans.  What usually
>> happens to me is that the planner develops some unwarranted optimism
>> about the number of rows likely to be generated by the outer side of
>> the join and decides that it's not worth sorting the inner side or
>> building a hash table or using an index, and that the right thing to
>> do is just rescan the inner node on every pass.  When the outer side
>> returns three or four orders of magnitude more results than expected,
>> ka-pow!
>
> And then there is the other half of the world, who complain because it
> *didn't* pick a nestloop for some query that would have run in much less
> time if it had.
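
To put rough numbers on the "ka-pow" part (these are made-up figures, not real planner arithmetic): a nestloop repeats the inner node's work once per outer row, so an outer-side estimate that is off by four orders of magnitude inflates the inner-side work by the same four orders of magnitude:

#include <stdio.h>

int main(void)
{
    double inner_pages = 1000.0;            /* pages fetched by one rescan of the inner node (illustrative) */
    double est_outer_rows = 10.0;           /* what the planner expected */
    double actual_outer_rows = 100000.0;    /* what the outer side actually returned */

    printf("expected inner work: %.0f page fetches\n", est_outer_rows * inner_pages);
    printf("actual inner work:   %.0f page fetches\n", actual_outer_rows * inner_pages);
    printf("a hash join would have read the inner rel once: %.0f pages\n", inner_pages);
    return 0;
}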

Well, that's my question: is that really the other half of the world, or is it the other 5% of the world?  And how does it happen?  In my experience, most really bad plans are caused by bad selectivity estimates, and the #1 source of those is unknown expressions.

(It now appears that Josh's problems are caused by overestimating the cost of a page fetch, perhaps due to caching effects.  Those are discussed upthread, and I'm still interested to see whether we can reach any sort of consensus on a reasonable approach to attacking that problem.  My own experience has been that this problem is not quite as bad: it can throw the cost off by a factor of 5, but not by a factor of 800,000, as in my example of three unknown expressions with a combined selectivity of 0.1.)

...Robert
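
P.S. For anyone wondering where the 800,000 comes from, here is the back-of-the-envelope version, assuming each of the three unknown expressions falls back to the planner's 0.005 default (DEFAULT_EQ_SEL in selfuncs.h) and the clauses are treated as independent; the exact default that applies depends on the expression, so take this as an illustration rather than a trace of the real estimate:

#include <stdio.h>

int main(void)
{
    double default_sel = 0.005;     /* assumed per-clause fallback when there are no usable stats */
    double estimated = default_sel * default_sel * default_sel;    /* 1.25e-7, clauses assumed independent */
    double actual = 0.1;            /* combined selectivity in my example */

    printf("estimated selectivity: %g\n", estimated);
    printf("actual selectivity:    %g\n", actual);
    printf("rowcount off by a factor of %.0f\n", actual / estimated);   /* ~800,000 */
    return 0;
}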