On 21 January 2014 23:01, Jeff Janes <jeff.ja...@gmail.com> wrote:
> On Tue, Jan 21, 2014 at 9:35 AM, Tom Lane <t...@sss.pgh.pa.us> wrote:
>>
>> Simon Riggs <si...@2ndquadrant.com> writes:
>> > On 6 June 2013 16:00, Heikki Linnakangas <hlinnakan...@vmware.com>
>> > wrote:
>> >> The current situation is that if you run out of disk space while
>> >> writing WAL, you get a PANIC, and the server shuts down. That's awful.
>>
>> > I don't see we need to prevent WAL insertions when the disk fills. We
>> > still have the whole of wal_buffers to use up. When that is full, we
>> > will prevent further WAL insertions because we will be holding the
>> > WALwritelock to clear more space. So the rest of the system will lock
>> > up nicely, like we want, apart from read-only transactions.
>>
>> I'm not sure that "all writing transactions lock up hard" is really so
>> much better than the current behavior.
>>
>> My preference would be that we simply start failing writes with ERRORs
>> rather than PANICs. I'm not real sure ATM why this has to be a PANIC
>> condition. Probably the cause is that it's being done inside a critical
>> section, but could we move that?
>
>
> My understanding is that if it runs out of buffer space while in an
> XLogInsert, it will be holding one or more buffer content locks exclusively,
> and unless it can complete the xlog (or scrounge up the info to return that
> buffer to its previous state), it can never release that lock. There might
> be other paths were it could get by with an ERROR, but if no one can write
> xlog anymore, all of those paths must quickly converge to the one that
> cannot simply ERROR.

Agreed. You don't say it, but I presume you intend to point out that such
long-lived contention could easily have a knock-on effect on other
read-only statements. I'm pretty sure other databases work the same way.

Our choices are

1. Waiting
2. Abort transactions
3. Some kind of release-locks-then-wait-and-retry

(3) is a step too far for me, even though it is easier than you say,
since we write WAL before changing the data block, so a failure to
insert WAL could just result in a temporary drop lock, sleep and
retry.

I would go for (1), waiting for up to checkpoint_timeout, then (2), if we
think that is a problem.

-- 
 Simon Riggs                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services
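
To make the "(1) then (2)" policy concrete, here is a rough sketch of
waiting for WAL space up to a deadline and then aborting. It is not
PostgreSQL code: wait_for_wal_space, have_space and OutOfWalSpace are
hypothetical names made up for illustration, the timeout merely stands in
for checkpoint_timeout, and the sketch ignores the buffer content locks
discussed above, which is the hard part in the real server.

```python
import time
from typing import Callable


class OutOfWalSpace(Exception):
    """Hypothetical error: space did not appear in time (choice 2, abort)."""


def wait_for_wal_space(
    have_space: Callable[[], bool],   # hypothetical probe for free WAL space
    timeout_s: float,                 # stands in for checkpoint_timeout
    poll_interval_s: float = 0.1,
) -> float:
    """Choice (1): wait until have_space() is true.
    Choice (2): give up with an error once timeout_s has elapsed.
    Returns the number of seconds spent waiting."""
    start = time.monotonic()
    while not have_space():
        waited = time.monotonic() - start
        if waited >= timeout_s:
            raise OutOfWalSpace(f"no WAL space after {waited:.1f}s")
        # Never sleep past the deadline.
        time.sleep(min(poll_interval_s, timeout_s - waited))
    return time.monotonic() - start
```

The caller decides what "abort" means; here it is only an exception that
a transaction-level handler would turn into an ERROR rather than a PANIC.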