On 6/30/11 2:00 AM, Simon Riggs wrote:
>>> Manual (or scripted) intervention is always necessary if you reach disk
>>> 100% full.
>> >
>> > Wow, that's a pretty crappy failure mode... but I don't think we need
>> > to fix it just on account of this patch. It would be nice to fix, of
>> > course.
> How is that different to running out of space in the main database?
>
> If I try to pour a pint of milk into a small cup, I don't blame the cup.

I have to agree with Simon here. ;-)

We can do some things to make this easier for administrators, but there's no way to "solve" the problem. And the things we could do would have to be advanced optional modes which aren't on by default, so they wouldn't really help the DBA with poor planning skills. Here are my suggestions:

1) Have a utility (pg_archivecleanup?) which checks if we have more than a specific setting's worth of archive_logs, and breaks replication and deletes the archive logs if we hit that number. This would also require some way for the standby to stop replicating *without* becoming a standalone server, which I don't think we currently have.

2) Have a setting where, regardless of standby_delay settings, the standby will interrupt any running queries and start applying logs as fast as possible if it hits a certain number of unapplied archive logs. Of course, given the issues we had with standby_delay, I'm not sure I want to complicate it further.

I think we've already fixed the biggest issue in 9.1, since we now have a limit on the number of WALs the master will keep if archiving is failing ... yes? That's the only big *avoidable* failure mode we have, where a failing standby effectively shuts down the master.

-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com