On Sat, Sep 12, 2009 at 6:14 PM, Tom Lane <t...@sss.pgh.pa.us> wrote: > Robert Haas <robertmh...@gmail.com> writes: >> On Sep 6, 2009, at 10:45 AM, Tom Lane <t...@sss.pgh.pa.us> wrote: >>> ... But now that we have a plan for a less obviously broken costing >>> approach, maybe we should open the floodgates and allow >>> materialization >>> to be considered for any inner path that doesn't materialize itself >>> already > >> Maybe. I think some experimentation will be required. We also have >> to be aware of effects on planning time; match_unsorted_outer() is, >> AIR, a significant part of the CPU cost of planning large join problems. > > I've committed some changes pursuant to this discussion. It may be that > match_unsorted_outer gets a bit slower, but I'm not too worried about > that. My experience is that the code that tries different mergejoin > options eats way more cycles than the nestloop code does.
One problem with the current implementation of cost_rescan() is that it ignores caching effects. It seems to be faster to rescan a materialize node than it is to rescan a seqscan of a table, even if there are no restriction clauses, presumably because you get to skip tuple visibility checks and maybe some other overhead, too. But cost_rescan() thinks that rescanning the table will require rereading the whole thing from disk, which isn't right either - it probably ought to factor in effective_cache_size much as the estimates for iterated index scans do. I'm not sure how many real problems this is going to create. Another potential problem is that materializing a whole-table seqscan to avoid repeating the tuple visibility checks may be a win in some strict sense, but there are externalities: it's also going to use a lot more memory/disk than just rescanning the table. ...Robert -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers