On 12/18/12 3:17 AM, Simon Riggs wrote:
> Clearly part of the response could involve pg_dump on the damaged
> structure, at some point.

This is the main thing I want to try out more, once I have a decent corruption generation tool. If a single record is corrupted but you can still pg_dump the remainder, that seems like the best we can do to help people recover. Providing some documentation on how to figure out what rows are in the damaged block, presumably by using the contrib inspection tools, would be helpful too.
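
As a rough illustration of what that documentation might walk through, here is a minimal Python sketch that only builds the SQL text for looking at one heap block. It assumes the contrib pageinspect extension is installed (its functions are restricted to privileged roles), and the helper name block_inspection_queries and the table name "accounts" are made up for the example. The ctid query is only useful where the block can still be read, for instance on a standby that does not have the same damage.

```python
import re

# Assumption: only plain, lowercase, unqualified table names are accepted,
# so the name can be placed in the SQL text without extra quoting.
_SIMPLE_NAME = re.compile(r"[a-z_][a-z0-9_]*")


def block_inspection_queries(table, block):
    """Return SQL strings for inspecting a single heap block.

    Hypothetical helper for illustration; it builds query text and
    does not connect to any server.
    """
    if not _SIMPLE_NAME.fullmatch(table):
        raise ValueError("expected a simple lowercase table name")
    if not isinstance(block, int) or block < 0:
        raise ValueError("block number must be a non-negative integer")

    return {
        # pageinspect: dump the line pointers and tuple headers of the block.
        "line_pointers": (
            "SELECT lp, lp_flags, lp_len, t_xmin, t_xmax, t_ctid "
            f"FROM heap_page_items(get_raw_page('{table}', {block}));"
        ),
        # Plain SQL: list the visible rows whose ctid falls in the block.
        "rows_in_block": (
            f"SELECT ctid, * FROM {table} "
            f"WHERE ctid >= '({block},0)'::tid "
            f"AND ctid < '({block + 1},0)'::tid;"
        ),
    }


if __name__ == "__main__":
    for name, sql in block_inspection_queries("accounts", 7).items():
        print(f"-- {name}")
        print(sql)
```

The block number would come from the checksum error report. The generated queries can be pasted into psql.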

> Indexes are a good case, because we can/should report the block error, mark the
> index as invalid and then hint that it should be rebuilt.

Marking a whole index invalid because there's one bad entry has enough downsides that I'm not sure how much we'd want to automate that. Not having that index available could easily leave the system effectively down due to poor performance. The choices are uglier if the index is backing a unique constraint.

In general, what I hope people will be able to do is switch over to their standby server, and then investigate further. I think it's unlikely that people willing to pay for block checksums will only have one server. Having some way to nail down whether the same block is bad on a given standby seems like a useful interface we should offer, and it shouldn't take too much work. Ideally you won't find the same corruption there. I'd like a way to check the entirety of a standby for checksum issues, ideally run right after it becomes current. It seems the most likely way to see corruption on a standby is to replicate a corrupt block.

There is no good way to make the poor soul who has no standby server happy here. You're just choosing between bad alternatives. The first block error is often just that--the first one, to be joined by others soon afterward. My experience with how drives fail says a second error is a lot more likely once you've seen one.

--
Greg Smith   2ndQuadrant US    g...@2ndquadrant.com   Baltimore, MD
PostgreSQL Training, Services, and 24x7 Support www.2ndQuadrant.com

