Sean Chittenden <[EMAIL PROTECTED]> writes: >> If you CLUSTER on an index and then ANALYSE, you get a correlation of >> 1.0 (== optimum) for the first column of the index.
> Correlating of what to what? Of data to nearby data? Of data to > related data (ie, multi-column index?)? Of related data to pages on > disk? Not 100% sure in what context you're using the word > correlation... The correlation is between index order and heap order --- that is, are the tuples in the table physically in the same order as the index? The better the correlation, the fewer heap-page reads it will take to do an index scan. Note it is possible to measure correlation without regard to whether there actually is any index; ANALYZE is simply looking to see whether the values appear in increasing order according to the datatype's default sort operator. One problem we have is extrapolating from the single-column correlation stats computed by ANALYZE to appropriate info for multi-column indexes. It might be that the only reasonable fix for this is for ANALYZE to compute multi-column stats too when multi-column indexes are present. People are used to the assumption that you don't need to re-ANALYZE after creating a new index, but maybe we'll have to give that up. > But that value will degrade after time and at what rate? Does ANALYZE > maintain that value so that it's kept acurrate? You keep it up to date by ANALYZE-ing at suitable intervals. It's no different from any other statistic. regards, tom lane ---------------------------(end of broadcast)--------------------------- TIP 4: Don't 'kill -9' the postmaster