Wes wrote:
On a nightly basis, we shut the database down and do a file system backup.

A short chronology of our database problem:
[snip]
Question:

How can we tell that a database is intact?  In the above example, pg_dumpall
worked on the 8/21 database.  Did it become corrupt between 8/21 and 8/23,
or was it already corrupt and got worse?  Pg_dumpall tells you nothing about
the condition of indexes.  Could a corrupt index corrupt data blocks?

I'm looking at doing a pg_dumpall on a weekly basis so that we have a point
in time where we know we have a recoverable database.  When the database
reaches several hundred GB and over over a billion rows, this isn't a great
solution, and doesn't address the overall database integrity.

Back to the original question...  How can I verify the complete integrity of
a database - especially a very large one where a reload or full index
rebuild could take on the order of days?

You shouldn't have to verify anything. PG's job is to never corrupt your data, and providing your hardware is good it should do so. If you are getting problems almost daily that would suggest a RAM/disk problem to me (sig 11 usually implies RAM). Can't guarantee it's not PG but it's record of reliability is pretty good.


Steps I'd take:
1. Check your version number against the release notes and see if you should upgrade. You don't mention your version, but it's always worth having the last dot-release (7.2.5, 7.3.7, 7.4.5)
2. Schedule time to run memory/disk tests against your hardware. Finding 48 hours might not be easy, but you need to know where you stand.
3. Setup slony or some other replication so I can schedule my downtime.


--
  Richard Huxton
  Archonet Ltd

---------------------------(end of broadcast)---------------------------
TIP 8: explain analyze is your friend

Reply via email to