On Wed, 2012-12-12 at 17:52 -0500, Greg Smith wrote:
> I can take this on, as part of the QA around checksums working as
> expected. The result would be a Python program; I don't have quite
> enough time to write this in C or re-learn Perl to do it right now. But
> this won't be a lot of code. If it's tossed one day as simply a
> prototype for something more permanent, I think it's still worth doing now.
>
> The UI I'm thinking of for what I'm going to call pg_corrupt is a CLI
> that asks for:
>
> -A relation name
> -Corruption type (an entry from this list)
> -How many blocks to touch
>
> I'll just loop based on the count, randomly selecting a block each time
> and messing with it in that way.
>
> The randomness seed should be printed as part of the output, so that
> it's possible re-create the damage exactly later. If the server doesn't
> handle it correctly, we'll want to be able to replicate the condition it
> choked on exactly later, just based on the tool's log output.
>
> Any other requests?
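To make the quoted proposal concrete, here is a rough sketch of the seeded loop, not a proposed implementation. Everything named in it is hypothetical: the corrupt() and main() functions, the "all_ones" and "transpose" type names (the two kinds of damage that come up in this thread), and the choice to take a path to a scratch copy of a relation file rather than a relation name. It assumes the default 8192-byte page size.

```python
import argparse
import random

BLOCK_SIZE = 8192  # assumes PostgreSQL's default page size (BLCKSZ)
KINDS = ("all_ones", "transpose")  # hypothetical names for two corruption types


def corrupt(data, kind, count, seed):
    """Return (damaged copy of data, log lines).

    The same seed, kind, and count always produce the same damage,
    so the log output alone is enough to replay a run.
    """
    rng = random.Random(seed)
    pages = bytearray(data)
    nblocks = len(pages) // BLOCK_SIZE
    if nblocks < 1 or (kind == "transpose" and nblocks < 2):
        raise ValueError("not enough blocks for this corruption type")
    log = [f"seed={seed} kind={kind} count={count} nblocks={nblocks}"]
    for _ in range(count):
        blk = rng.randrange(nblocks)
        lo = blk * BLOCK_SIZE
        if kind == "all_ones":
            # Overwrite the whole page with 0xFF bytes.
            pages[lo:lo + BLOCK_SIZE] = b"\xff" * BLOCK_SIZE
            log.append(f"all_ones block={blk}")
        elif kind == "transpose":
            # Pick a second, different block and swap the two pages.
            other = rng.randrange(nblocks - 1)
            if other >= blk:
                other += 1
            olo = other * BLOCK_SIZE
            first = bytes(pages[lo:lo + BLOCK_SIZE])
            second = bytes(pages[olo:olo + BLOCK_SIZE])
            pages[lo:lo + BLOCK_SIZE] = second
            pages[olo:olo + BLOCK_SIZE] = first
            log.append(f"transpose block={blk} with block={other}")
        else:
            raise ValueError(f"unknown corruption type: {kind}")
    return bytes(pages), log


def main(argv=None):
    """Hypothetical CLI wrapper: damages the file in place and prints the log."""
    parser = argparse.ArgumentParser(description="corrupt a scratch relation file")
    parser.add_argument("path", help="path to a scratch copy of a relation file")
    parser.add_argument("kind", choices=KINDS)
    parser.add_argument("count", type=int, help="how many blocks to touch")
    parser.add_argument("--seed", type=int, default=None)
    args = parser.parse_args(argv)
    seed = args.seed
    if seed is None:
        seed = random.SystemRandom().randrange(2**32)
    with open(args.path, "rb") as f:
        data = f.read()
    damaged, log = corrupt(data, args.kind, args.count, seed)
    with open(args.path, "wb") as f:
        f.write(damaged)
    for line in log:
        print(line)
```

Calling main(["/tmp/scratch_copy", "all_ones", "1"]) would damage one block of that (hypothetical) file and print the seed; passing the printed value back through --seed repeats the same damage. Mapping a relation name to its file, and running only against a stopped server or a throwaway copy, are left out of the sketch.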
After some thought, I don't see much value in introducing multiple
instances of corruption at a time. I would think that the smallest unit
of corruption would be the hardest to detect, so introducing many of
them in one pass makes detection easier and the test weaker.

For example, if we introduce an all-ones page, and also transpose two
pages, the all-ones error might be detected even if the transpose error
is not being detected properly. And we would not know that the transpose
error was being missed, because the error is raised as soon as the
all-ones page is read.

Does it make sense to have a separate executable (pg_corrupt) just for
corrupting the data as a test? Or should it be part of a
corruption-testing harness (pg_corruptiontester?) that introduces the
corruption and then verifies that it's properly detected?

Regards,
	Jeff Davis