On Tue, Sep 15, 2009 at 7:53 AM, Robert Haas <robertmh...@gmail.com> wrote: >>> Consider A IJ B, with >>> the scan over A implemented as an index scan. It seems to me that if >>> the join selectivity is < 1, then assuming there's a choice, we >>> probably want to join A to B and then do the heap fetches against A >>> afterwards. But if the join selectivity is > 1 (consider, for >>> example, a cross join), we probably want to do the heap fetches first. >> >> Hmm, good point. I didn't consider that join selectivity can be > 1. >> >> A more common scenario is that there's an additional filter condition on >> the HeapFetch, with a selectivity < 1. It can then cheaper to perform >> the heap fetches first and only join the remaining rows that satisfy the >> filter condition. > > Well, again, it seems to me that it entirely depends on whether the IJ > increases or decreases the number of rows. You want to do the heap > fetches at the point where there are the fewest of them to do, and you > can't know that a priori. When you start talking about "more common" > scenarios, what you really mean is "more common in the queries I > normally do", and that's not the same as what other people's queries > do. (See, for example, previous discussions on -performance, where it > turns out that my suggested "fix" for a particular kind of planner > problem is the exact opposite of Kevin Grittner's fix for a problem > with the same code; the existing code bounds a certain value from > below at 1 - I suggested raising it to 2, he suggested lowering it to > 0.)
I've been mulling over this problem all week. I haven't gotten all that far, but here are my thoughts. Suppose we're planning some joinrel within a large join nest. We have two partial paths A and B with identical pathkeys. Path A does not involve an index scan, so we know its exact cost. Path B involves an index scan, so a heap fetch will have to be done at some point, but since we can't yet know where the best place to do that heap fetch is, so we can't know the exact cost of B. Basically, the decision we face here is whether to keep both plans A and B or to discard one of them as clearly inferior to the other. As a preliminary observation, this is the logic that is performed by add_path(), which gets called A LOT in planning large join nests, so the performance impact of what happens there needs to be measured carefully. We can, however, put some bounds on the cost of B. We know what the cost of B is disregarding the heap fetch (or heap fetches) that will eventually need to be performed. This cost is also the minimum total cost of B, in the case where some yet-to-be-performed join is estimated to return no rows, and thus the heap fetch is estimated to not actually need to fetch anything. We can also bound the maximum cost of B: it certainly can't be any higher than the cost of doing all the heap fetches immediately above the index fetches. We could probably compute a tighter upper bound by considering various possible positions for the heap fetches at or below the level of the joinrel, but I doubt that it's worth the effort. Instead, what we can do is compare the low-estimate for B to the estimate for A. If it's higher, discard B. Otherwise, compare the high-estimate for B to the estimate for A. If it's lower, discard A. Otherwise, keep both paths. More generally, we can conceive of each path as having low and high estimates, and we can say that for two paths with identical pathkeys, P dominates P' if the high-estimate for P is less than the low-estimate for P'. In this view of the world, we don't actually need to represent the heap-fetches in the path trees during joinrel planning. Instead, we build up a set of possible paths for the whole joinrel, and then at the end, we go back and look at any remaining paths that require heap fetches to be inserted and figure out the best place to put them. The major problem I see with this approach is that it may slow down add_path() too much, but if that turns out to be the case I don't have another idea short of attacking the problem using some kind of heuristics that will probably be less accurate than an analysis of the type described above. Thoughts? ...Robert -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers