Jeff Janes wrote:
> Do you know where this competition is happening? Is it on the platters, or is it in the hard drive write cache (I thought high-end hardware had tagged writes to avoid that), or in the kernel?
Kernel. Linux systems with lots of memory will happily queue up gigabytes of dirty data in their write cache, only getting serious about writing it out to disk when an fsync demands it.
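One way to watch this build up on a Linux box is to read the Dirty and Writeback counters from /proc/meminfo. A minimal sketch in Python follows; the helper name parse_dirty_kb is made up for illustration, and the sample numbers are invented rather than measured on any real system.

```python
def parse_dirty_kb(meminfo_text):
    """Return the Dirty and Writeback values (in kB) from /proc/meminfo text."""
    wanted = {"Dirty", "Writeback"}
    result = {}
    for line in meminfo_text.splitlines():
        name, sep, rest = line.partition(":")
        if sep and name in wanted:
            # Lines look like "Dirty:     123456 kB"
            result[name] = int(rest.split()[0])
    return result


# Invented sample in /proc/meminfo format; on a real system, pass
# open("/proc/meminfo").read() instead.
SAMPLE = """MemTotal:       65843200 kB
Dirty:           2097152 kB
Writeback:          4096 kB
WritebackTmp:          0 kB
"""

dirty = parse_dirty_kb(SAMPLE)
```

Polling this during a checkpoint should show Dirty climbing until the sync phase starts, if the kernel really is where the writes are piling up.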
> This makes sense if we just need to append to a queue. But once the queue is full and we are about to do a backend fsync, might it make sense to do a little more work to look for dups?
One of the paths I'd like to follow is experimenting with both sorting writes by file and looking for duplication in the queues. I think a basic, simple sync spreading approach needs to get finished first, though; this sort of thing would then be an optimization on top of it.
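To make the two ideas concrete, here is a toy sketch in Python, not actual backend code. The FsyncRequest shape and the function names compact_queue and sort_by_file are assumptions made up for illustration; the real queue entries and their ordering rules may look quite different.

```python
from collections import namedtuple

# Hypothetical request shape: which file, and which segment of it, needs an fsync.
FsyncRequest = namedtuple("FsyncRequest", ["relfile", "segno"])


def compact_queue(queue):
    """Drop duplicate requests, keeping the first occurrence of each."""
    seen = set()
    compacted = []
    for req in queue:
        if req not in seen:
            seen.add(req)
            compacted.append(req)
    return compacted


def sort_by_file(queue):
    """Order requests so that entries for the same file sit together."""
    return sorted(queue, key=lambda r: (r.relfile, r.segno))


queue = [
    FsyncRequest("rel_a", 0),
    FsyncRequest("rel_b", 1),
    FsyncRequest("rel_a", 0),  # duplicate of the first entry
    FsyncRequest("rel_a", 1),
    FsyncRequest("rel_b", 1),  # duplicate of the second entry
]
compacted = compact_queue(queue)
ordered = sort_by_file(compacted)
```

The duplicate scan is a single pass over the queue, so doing it only when the queue is already full keeps the common append path cheap.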
-- Greg Smith 2ndQuadrant US g...@2ndquadrant.com Baltimore, MD PostgreSQL Training, Services and Support www.2ndQuadrant.us