Tom Lane <t...@sss.pgh.pa.us> wrote: > "Kevin Grittner" <kevin.gritt...@wicourts.gov> writes: >> Simon Riggs <si...@2ndquadrant.com> wrote: >>> This topic has been discussed many times, yet I have never seen >>> an assessment that explains WHY we would want to do index-only >>> scans. > >> In databases with this feature, it's not too unusual for a query >> which uses just an index to run one or more orders of magnitude >> faster than a query which has to randomly access the heap for >> each index entry. That seems like enough evidence of its >> possible value in PostgreSQL to proceed to the point where >> benchmarks become possible. I'm assuming that, like all other >> features added as performance optimizations, it won't be >> committed until there are benchmarks showing the net benefit. > >> As a thought experiment, picture the relative costs of scanning a >> portion of an index in index sequence, and being done, versus >> scanning a portion of an index in index sequence and jumping to a >> random heap access for each index entry as you go. > > It's already the case that we'll flip over to a bitmap indexscan, > and thus get rid of most/all of the "random" page accesses, in > situations where this is likely to be a big win. Pointing to the > performance difference in databases that don't do that is > therefore not too convincing. Sure. Of course, if you're only accessing twenty thousand rows from a table containing fifty million rows, bitmap index scans could come out pretty close to random access times; but on the whole I agree that the scale of benefit in PostgreSQL won't tend to match what people see in other products. Note that my words were "enough evidence of its possible value in PostgreSQL to proceed to the point where benchmarks become possible." In particular, we might want to somehow try to make clear to people that the very wide indexes they are accustomed to creating to allow this optimization in other products might be inefficient compared to creating several one-column indexes which would enable bitmap logical operations. > I'm inclined to agree that index-only scans might be worth the > amount of work that's involved So we agree there. > ... but I share Simon's desire to see some proof before anything > gets committed. And we agree there. In fact, I can't think of anyone in the community who doesn't want to see that for *any* purported performance enhancement. My overall gut feel is that there will be some circumstances where the "covering index" optmization is much faster, and some where people expect it to be, but it isn't. The trickiest part of this might be developing a costing model which allows us to make the right choice most of the time. And even if we get it perfect, we can expect questions about why the covering index wasn't used, and requests for hints so they can force it. :-( -Kevin
-- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers