On Wed, Jun 5, 2013 at 3:24 PM, Fujii Masao <masao.fu...@gmail.com> wrote:
> OTOH, if we use max_wal_size as a hard limit, we can avoid such a PANIC
> error and long down time. Of course, in this case, once max_wal_size is
> reached, we cannot complete any query writing WAL until the checkpoint
> has completed and removed old WAL files. During that time, the database
> service appears to be down from a client's point of view, but its downtime
> is shorter than in the PANIC error case. So I'm thinking that some users
> might want a hard limit on pg_xlog size.
I wonder if we could tie this in with the recent proposal from the
Heroku guys to have a way to slow down WAL writing.  Maybe we have
several limits:

- When limit #1 is passed (or checkpoint_timeout elapses), we start a
spread checkpoint.

- If it looks like we're going to exceed limit #2 before the
checkpoint completes, we attempt to perform the checkpoint more
quickly, by reducing the delay between buffer writes.  If we actually
exceed limit #2, we try to complete the checkpoint as fast as
possible.

- If it looks like we're going to exceed limit #3 before the
checkpoint completes, we start exerting back-pressure on writers by
making them wait every time they write WAL, probably in proportion to
the number of bytes written.  We keep ratcheting up the wait until
we've slowed down writers enough that the checkpoint will finish
within limit #3.  As we reach limit #3, the wait goes to infinity;
only read-only operations can proceed until the checkpoint finishes.
(A rough sketch of that back-pressure curve is below.)

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
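P.S. To make that last bullet concrete, here is a minimal, self-contained
sketch of the kind of delay curve I have in mind.  This is plain C, not
actual backend code; all of the names (wal_throttle_delay_usec, wal_limit2,
wal_limit3, WAL_THROTTLE_BLOCKED) are made up for illustration:

/*
 * Sketch of the proposed back-pressure curve: no delay until limit #2
 * worth of WAL has accumulated, then a per-write delay proportional to
 * the bytes written, scaled up as we approach limit #3, and going to
 * "infinity" (writers fully blocked) once limit #3 is reached.
 */
#include <stdint.h>
#include <stdio.h>

#define WAL_THROTTLE_BLOCKED  (-1L)   /* sentinel: block until checkpoint done */

/* Microseconds to wait before a WAL write of nbytes, given current WAL size. */
static long
wal_throttle_delay_usec(uint64_t current_wal_bytes,
                        uint64_t wal_limit2,
                        uint64_t wal_limit3,
                        uint64_t nbytes)
{
    double  fraction;

    if (current_wal_bytes <= wal_limit2)
        return 0;                       /* below limit #2: no back-pressure */
    if (current_wal_bytes >= wal_limit3)
        return WAL_THROTTLE_BLOCKED;    /* at limit #3: only reads proceed */

    /* How far between limit #2 and limit #3 are we, as a fraction 0..1? */
    fraction = (double) (current_wal_bytes - wal_limit2) /
               (double) (wal_limit3 - wal_limit2);

    /*
     * Delay proportional to bytes written, ratcheting up as we approach
     * limit #3: fraction / (1 - fraction) diverges at limit #3.
     */
    return (long) (nbytes * (fraction / (1.0 - fraction)));
}

int
main(void)
{
    uint64_t limit2 = 8ULL << 30;       /* 8 GB, purely for illustration */
    uint64_t limit3 = 16ULL << 30;      /* 16 GB */
    uint64_t wal;

    for (wal = 6ULL << 30; wal <= limit3; wal += 2ULL << 30)
        printf("WAL at %2llu GB -> delay %ld usec for an 8 kB write\n",
               (unsigned long long) (wal >> 30),
               wal_throttle_delay_usec(wal, limit2, limit3, 8192));
    return 0;
}

With those example limits, an 8 kB write waits nothing at 8 GB of WAL,
about 8 ms at 12 GB, about 25 ms at 14 GB, and is blocked entirely at
16 GB, which is the "wait goes to infinity" behavior described above.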