On Tue, 2008-01-29 at 10:55 +0000, Gregory Stark wrote:
> "Zeugswetter Andreas ADI SD" <[EMAIL PROTECTED]> writes:
>
> > Sorry, but I don't grok this at all. Why the heck would we care if we have 2
> > parts of the table perfectly clustered, because we started in the middle ?
> > Surely our stats collector should recognize such a table as perfectly
> > clustered. Does it not ? We are talking about one breakage in the readahead
> > logic here, this should only bring the clustered property from 100% to some
> > 99.99% depending on table size vs readahead window.
>
> Well clusteredness is used or could be used for a few different heuristics,
> not all of which this would be quite as well satisfied as readahead. But for

Can you give an example? Treating a file as a circular structure does
not impose any significant cost that I can see.

> It would be great if Postgres picked up a serious statistics geek who could
> pipe up in discussions like this with "how about using the Euler-Jacobian
> Centroid" or some such thing. If you have any suggestions of what metric to
> use and how to calculate the info we need from it that would be great.

Agreed.

> One suggestion from a long way back was scanning the index and counting how
> many times the item pointer moves backward to an earlier block. That would

An interesting metric.

As you say, we really need a statistician to definitively say what the
correct metrics are, and what kind of sampling we need to make good
estimates.

> still require a full index scan though. And it doesn't help for columns which
> aren't indexed though I'm not sure we need this info for columns which aren't
> indexed. It's also not clear how to interpolate from that the amount of random
> access a given query would perform.

I don't think "clusteredness" has any meaning at all in postgres for an
unindexed column. I suppose a table could be clustered without an index,
but currently there's no way to do that in postgresql.

Regards,
	Jeff Davis
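
For illustration only, the backward-move metric quoted above might be sketched as follows. This is not PostgreSQL code. The function names and the input (a plain list of heap block numbers, in the order an index scan would visit them) are assumptions made for the sketch, and "forgive one wraparound" is just one possible reading of treating the file as circular.

```python
def count_backward_moves(blocks):
    """Count how often the block number decreases while walking in index order.

    `blocks` is a hypothetical input: the heap block number of each item
    pointer, listed in index order.
    """
    return sum(1 for prev, cur in zip(blocks, blocks[1:]) if cur < prev)


def count_backward_moves_circular(blocks):
    """Same count, but forgive a single wraparound.

    Assumption: a table that is perfectly ordered except that it "starts in
    the middle" has exactly one backward move, so one is subtracted.
    """
    return max(count_backward_moves(blocks) - 1, 0)


# A table clustered in two parts because the data started in the middle:
two_part = [5, 6, 7, 0, 1, 2]
linear = count_backward_moves(two_part)
circular = count_backward_moves_circular(two_part)
```

Under these assumptions the two-part table shows a single backward move with the linear count and none with the circular one, which matches the point that one break should barely dent the clustered property. As noted in the thread, this still needs a full index scan, and it does not by itself say how much random access a given query would perform.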