On Sat, May 9, 2009 at 7:00 PM, Tom Lane <t...@sss.pgh.pa.us> wrote:
> I wrote:
>> ... So it appears to me that instead of taking an average-case correction
>> as is done in this patch and the old coding, we have to explicitly model
>> the matched-tuple and unmatched-tuple cases separately.
>
> I've applied the attached patch that does things this way.  I did not do
> anything about improving the detailed modeling of hash-bucket searching
> as Robert suggested in some later messages.  I think that's probably
> worth looking at, but it's a second-order consideration --- this patch
> already seems to bring the estimates for semi/antijoins much closer
> to reality.

I'll take a look at this when I get a chance, but I'm just playing with
test cases, so I share your hope that Kevin (or someone else with complex
queries against real data) will test it out.

> I am a bit concerned about the extra time spent on repeated selectivity
> estimates.  It might not matter too much since it's only done for semi
> and anti joins which aren't that common.  It would be good though if
> someone who has a lot of such joins could test CVS HEAD and see if
> performance has gotten worse (Kevin?).  We could refactor things to
> reduce the duplication of effort but I'd prefer to leave that sort of
> thing to 8.5.

Agreed.  I was worried about that when I wrote the emails to which you
refer above, but I don't know how else to get good estimates for all the
relevant cases.

...Robert

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers