James Rogers kirjutas N, 02.10.2003 kell 20:50: > To give a real world example, a standard query on one of our tables that > has not been CLUSTER-ed recently (i.e. within the last several days) > generates an average of ~2,000 cache misses. Recently CLUSTER-ed, it > generates ~0 cache misses on average. Needless to say, one is *much* > faster than the other.
So what you really need is the CLUSTER command to leave pages half-empty and the tuple placement logic on inserts/updates to place new tuples near the place where they would be placed by CLUSTER. I.e. the code that does actual inserting should be aware of CLUSTERing. I guess that similar behaviour (half-empty pages, or even each second page empty which is better as it creates less dirty buffers) could also significantly speed up updates on huge number of tuples, as then code could always select a place near the old one and thus avoid needless head-movements between reading and writing areas. > In my case, not only does CLUSTER-ing increase the number of concurrent > queries possible without disk thrashing by an integer factor, but the > number of buffers touched on a query that generates a cache misses is > greatly reduced as well. The problem is that CLUSTER-ing is costly and > index-organizing some of the tables would reduce the buffer needs, since > the index tuple in these cases are almost as large as the heap tuples > they reference. True, but my above suggestion would be much easier to implement near-term. It seems to be a nice incremental improvement just needs to touch places: 1. selecting where new tuples go : * updated ones go near old ones if not clustered and near the place CLUSTER would place them if clustered. * inserted ones go to the less than half-empty pages if not clustered and near the place CLUSTER would place them if clustered. 2. making reorganization code (CLUSTER and VACUUM FULL) to leave space in pages for clustered updates/inserts. the "half" above could of course mean anything from 10% to 95% depending on access patterns. --------------------- Hannu ---------------------------(end of broadcast)--------------------------- TIP 6: Have you searched our list archives? http://archives.postgresql.org