On Tue, Apr 2, 2013 at 6:32 AM, Andres Freund <and...@2ndquadrant.com> wrote:
> On 2013-04-01 17:56:19 -0500, Jim Nasby wrote:
>> On 3/23/13 7:41 AM, Ants Aasma wrote:
>> >Yes, having bgwriter do the actual cleaning up seems like a good idea.
>> >The whole bgwriter infrastructure will need some serious tuning. There
>> >are many things that could be shifted to background if we knew it
>> >could keep up, like hint bit setting on dirty buffers being flushed
>> >out. But again, we have the issue of having good tests to see where
>> >the changes hurt.
>>
>> I think at some point we need to stop depending on just bgwriter for all
>> this stuff. I believe it would be much cleaner if we had separate procs for
>> everything we needed (although some synergies might exist; if we wanted to
>> set hint bits during write then bgwriter *is* the logical place to put that).
>>
>> In this case, I don't think keeping stuff on the free list is close enough
>> to checkpoints that we'd want bgwriter to handle both. At most we might want
>> them to pass some metrics back and forth.
>
> bgwriter isn't doing checkpoints anymore, there's the checkpointer since 9.2.
>
> In my personal experience and measurement bgwriter is pretty close to
> useless right now. I think - pretty similar to what Amit has done - it
> should perform part of a real clock sweep instead of just looking ahead
> of the current position without changing usagecounts and the sweep
> position and put enough buffers on the freelist to sustain the need till
> its next activity phase. I hacked around that one night in a hotel and
> got impressive speedups (and quite some breakage) for bigger than s_b
> workloads.
>
> That would reduce quite a bit of pain points:
> - fewer different processes/cpus looking at buffer headers ahead in the cycle
> - fewer cpus changing usagecounts
> - dirty pages are far more likely to be flushed out already when a new
>   page is needed
> - stuff like the relation extension lock which right now frequently have
>   to search far and wide for new pages while holding the extension lock
>   exclusively should finish quite a bit faster
>
> If the freelist lock is separated from the lock protecting the clock
> sweep this should get quite a bit of a scalability boost without having
> potential unfairness you can have with partitioning the lock or such.
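
To make the quoted proposal concrete, here is a minimal single-threaded sketch of a background pass that performs a real clock sweep (decaying usage counts and moving the sweep position) and stocks a freelist for backends. It is a toy model under stated assumptions, not PostgreSQL's buffer manager: BufferPool, bg_sweep, backend_get_buffer and the constants are hypothetical names, pinning is ignored, "flushing" is just clearing a flag, and the two locks discussed above are only marked in comments.

```c
#include <assert.h>
#include <stddef.h>

#define NBUFFERS  8   /* hypothetical pool size */
#define MAX_USAGE 5   /* hypothetical cap on usage_count */

typedef struct {
    int usage_count;  /* bumped on access, decayed by the sweep */
    int dirty;        /* page needs writing before reuse */
    int on_freelist;  /* already handed to the freelist */
} BufferDesc;

typedef struct {
    BufferDesc bufs[NBUFFERS];
    int sweep_pos;           /* clock hand; would sit under the sweep lock */
    int freelist[NBUFFERS];  /* would sit under a separate freelist lock */
    int free_count;
    int flushed;             /* dirty pages written by the background pass */
} BufferPool;

void pool_init(BufferPool *p)
{
    for (int i = 0; i < NBUFFERS; i++) {
        p->bufs[i].usage_count = 0;
        p->bufs[i].dirty = 0;
        p->bufs[i].on_freelist = 0;
    }
    p->sweep_pos = 0;
    p->free_count = 0;
    p->flushed = 0;
}

/*
 * Background pass: advance the clock hand until the freelist holds
 * `target` buffers or a bounded number of steps has been taken.
 * Returns how many buffers were added to the freelist.
 */
int bg_sweep(BufferPool *p, int target)
{
    int added = 0;
    int max_steps = NBUFFERS * (MAX_USAGE + 1);

    for (int step = 0; step < max_steps && p->free_count < target; step++) {
        int id = p->sweep_pos;
        BufferDesc *b = &p->bufs[id];

        p->sweep_pos = (p->sweep_pos + 1) % NBUFFERS;

        if (b->on_freelist)
            continue;               /* already available to backends */
        if (b->usage_count > 0) {
            b->usage_count--;       /* recently used: decay and move on */
            continue;
        }
        if (b->dirty) {
            b->dirty = 0;           /* stand-in for writing the page out */
            p->flushed++;
        }
        b->on_freelist = 1;
        p->freelist[p->free_count++] = id;
        added++;
    }
    return added;
}

/*
 * Backend side: take a clean buffer from the freelist without touching
 * the clock hand. Returns -1 if the freelist is empty, in which case a
 * real backend would have to fall back to sweeping on its own.
 */
int backend_get_buffer(BufferPool *p)
{
    if (p->free_count == 0)
        return -1;
    int id = p->freelist[--p->free_count];
    p->bufs[id].on_freelist = 0;
    p->bufs[id].usage_count = 1;
    return id;
}
```

In this model a backend only touches the freelist, and only the background pass changes usage counts and the sweep position, which is the split that would let the freelist lock and the clock sweep lock be separated.
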
I agree.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)