Jan Wieck <[EMAIL PROTECTED]> writes:
> What happens instead is that vacuum not only evicts the whole buffer
> cache by forcing all blocks of said table and its indexes in, it also
> dirties a substantial amount of that and leaves the dirt to be cleaned
> up by all the other backends.

[ thinks about that... ] Yeah, I believe you're right, because (plain) vacuum just does WriteBuffer() for any page that it modifies, which only marks the page dirty in buffer cache. It never does anything to force those pages to be written out to the kernel. So, if you have a large buffer cache, a lot of write work will be left over to be picked up by other backends.

I think that pre-WAL the system used to handle this stuff differently, in a way that made it more likely that VACUUM would issue its own writes. But optimizations intended to improve the behavior for non-VACUUM cases have made this not so good for VACUUM.

I like your idea of penalizing VACUUM-read blocks when they go back into the freelist. This seems only a partial solution though, since it doesn't directly ensure that VACUUM rather than some other process will issue the write kernel call for the dirtied page. Maybe we should resurrect a version of WriteBuffer() that forces an immediate kernel write, and use that in VACUUM.

Also, we probably need something similar for seqscan-read blocks, but with an intermediate priority (can we insert them to the middle of the freelist?)

			regards, tom lane
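
To make the two ideas above concrete, here is a minimal toy sketch in C. It is not PostgreSQL code: `ToyBuffer`, `FreeList`, `ReusePriority`, and every function name below are hypothetical and exist only for this illustration. It assumes a freelist whose first entry is recycled first, so a VACUUM-read buffer goes to the front, a seqscan-read buffer goes to the middle, and everything else goes to the back. It also contrasts merely marking a page dirty with writing it out immediately, so the process that dirtied the page is the one that pays for the write.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define NBUFFERS 8

/* Hypothetical toy types; none of these names exist in PostgreSQL. */
typedef enum { REUSE_LAST, REUSE_MIDDLE, REUSE_FIRST } ReusePriority;

typedef struct {
    bool dirty;
    int  writes_by_owner;   /* writes issued by the process that dirtied the page */
} ToyBuffer;

typedef struct {
    int    slots[NBUFFERS]; /* buffer ids; slots[0] is recycled first */
    size_t len;
} FreeList;

/* Only set the dirty flag: someone else may end up doing the write. */
static void toy_mark_dirty(ToyBuffer *b)
{
    b->dirty = true;
}

/* Write immediately: the dirtying process issues the write itself. */
static void toy_write_now(ToyBuffer *b)
{
    if (b->dirty) {
        b->writes_by_owner++;   /* stands in for a real kernel write call */
        b->dirty = false;
    }
}

static void freelist_init(FreeList *fl)
{
    fl->len = 0;
}

/* Return a buffer to the freelist at a position chosen by its priority. */
static bool freelist_release(FreeList *fl, int buf_id, ReusePriority prio)
{
    size_t pos;

    if (fl->len >= NBUFFERS)
        return false;           /* list is full */

    switch (prio) {
    case REUSE_FIRST:  pos = 0;           break;  /* e.g. VACUUM-read   */
    case REUSE_MIDDLE: pos = fl->len / 2; break;  /* e.g. seqscan-read  */
    default:           pos = fl->len;     break;  /* ordinary access    */
    }
    assert(pos <= fl->len);

    /* Shift later entries up by one to open a gap at pos. */
    for (size_t i = fl->len; i > pos; i--)
        fl->slots[i] = fl->slots[i - 1];
    fl->slots[pos] = buf_id;
    fl->len++;
    return true;
}

/* Take the next buffer to recycle, or -1 if the list is empty. */
static int freelist_take(FreeList *fl)
{
    int id;

    if (fl->len == 0)
        return -1;
    id = fl->slots[0];
    for (size_t i = 1; i < fl->len; i++)
        fl->slots[i - 1] = fl->slots[i];
    fl->len--;
    return id;
}
```

In this toy model, releasing buffers 1 and 2 with `REUSE_LAST`, then 3 with `REUSE_FIRST`, then 4 with `REUSE_MIDDLE` makes `freelist_take()` hand them back in the order 3, 4, 1, 2. A real buffer manager would use a linked list and locking rather than an array with shifting; the array is used here only to keep the sketch short.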