Gavin Sherry <[EMAIL PROTECTED]> writes: > It occured to me on the plane home that now that CLUSTER is fixed we may > be able to put pg_index.indisclustered to use. If CLUSTER was to set > indisclustered to true when it clusters a heap according to the given > index, we could speed up sequantial scans.
AFAICT you're assuming that the table is *exactly* ordered by the clustered attribute. While this is true at the instant CLUSTER completes, the exact ordering will be destroyed by the first insert or update :-(. I can't see much value in creating a whole new scan type that's only usable on a perfectly-clustered table. The existing approach to making the planner smart about clustered tables is to compute a physical-vs-logical-order-correlation statistic and use that to adjust the estimated cost of indexscans. I believe this is a more robust approach than considering a table to be "clustered" or "not clustered", since it can deal with the gradual degradation of clustered order over time. However, I will not make any great claims for the specific equations currently used for this purpose --- they're surely in need of improvement. Feel free to take a look and see if you have any ideas. The collection of the statistic is in commands/analyze.c and the use of it is in optimizer/path/costsize.c. regards, tom lane ---------------------------(end of broadcast)--------------------------- TIP 5: Have you checked our extensive FAQ? http://www.postgresql.org/users-lounge/docs/faq.html