Re: [HACKERS] silent data loss with ext4 / all current versions

Tomas Vondra Sun, 29 Nov 2015 06:34:46 -0800

Hi,

On 11/29/2015 02:38 PM, Craig Ringer wrote:

On 27 November 2015 at 21:28, Greg Stark <[email protected]> wrote:


    On Fri, Nov 27, 2015 at 11:17 AM, Tomas Vondra <[email protected]> wrote:
    > I plan to do more power failure testing soon, with more complex test
    > scenarios. I suspect there might be other similar issues (e.g. when we
    > rename a file before a checkpoint and don't fsync the directory - then the
    > rename won't be replayed and will be lost).

    I'm curious how you're doing this testing. The easiest way I can think
    of would be to run a database on an LVM volume and take a large number
    of LVM snapshots very rapidly and then see if the database can start
    up from each snapshot. Bonus points for keeping track of the committed
    transactions before each snapshot and ensuring they're still there I
    guess.
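
For reference, the rename scenario in the quoted text (a file renamed
without an fsync of its directory, so the rename can be lost on power
failure) corresponds to roughly the following pattern. This is only a
minimal sketch in Python, assuming a POSIX system where a directory can
be opened read-only and fsynced; durable_rename is a hypothetical helper
name, not a function from the PostgreSQL source.

```python
import os


def _fsync_path(path):
    """Open a file or directory read-only and fsync it."""
    fd = os.open(path, os.O_RDONLY)
    try:
        os.fsync(fd)
    finally:
        os.close(fd)


def durable_rename(src, dst):
    """Rename src to dst so that the rename should survive a power failure.

    Hypothetical helper for illustration only.
    """
    # Flush the file contents first, so the new name never refers to
    # data that has not reached stable storage.
    _fsync_path(src)

    os.rename(src, dst)

    # The rename only changes directory entries. Without an fsync of the
    # containing directory (or both directories, if they differ), the
    # new entry may still be missing after a crash.
    dirs = {
        os.path.dirname(os.path.abspath(src)),
        os.path.dirname(os.path.abspath(dst)),
    }
    for d in dirs:
        _fsync_path(d)
```

A power-failure test for this case would perform the rename, cut power
shortly afterwards, and check after reboot which of the two names
exists.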


I've had a few tries at implementing a qemu-based crashtester where it
hard kills the qemu instance at a random point then starts it back up.

I've tried to reproduce the issue by killing a qemu VM, and so far I've
been unsuccessful. On bare HW it was easily reproducible (I'd hit the
issue 9 out of 10 attempts), so either I'm doing something wrong or qemu
somehow interacts with the I/O.

I always got stuck on the validation part - actually ensuring that the
DB state is how we expect. I think I could probably get that right now,
it's been a while.

Well, I guess we can't really check all the details, but I guess the
checksums make checking the general consistency somewhat simpler. And
then you have to design the workload in a way that makes the check
easier - for example remembering the committed values etc.
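
To illustrate what "remembering the committed values" could look like,
here is a rough sketch in Python. It is not an existing tool; the names
CommitLedger and find_lost_commits are made up for this example. It
assumes the ledger file lives on storage that is not affected by the
simulated power failure (for example on the machine driving the test),
and that the workload only inserts unique keys, so a value recorded in
the ledger can never be legitimately overwritten later.

```python
import os


class CommitLedger:
    """Append-only record of commits the database has acknowledged.

    Hypothetical helper: the file at `path` is assumed to sit on storage
    that survives the crash being tested.
    """

    def __init__(self, path):
        self.path = path

    def record(self, key, value):
        # Call this only after COMMIT has returned successfully.
        with open(self.path, "a") as f:
            f.write(f"{key}\t{value}\n")
            f.flush()
            os.fsync(f.fileno())

    def expected(self):
        """Return {key: value} for every fully written ledger entry."""
        state = {}
        if not os.path.exists(self.path):
            return state
        with open(self.path) as f:
            for line in f:
                if not line.endswith("\n"):
                    break  # torn final line, ignore it
                key, value = line.rstrip("\n").split("\t", 1)
                state[key] = value
        return state


def find_lost_commits(ledger, actual):
    """Compare the ledger with the state read back after recovery.

    `actual` is a dict of key -> value (as strings) fetched from the
    database. Returns {key: (expected, found)} for every acknowledged
    commit that is missing or different.
    """
    return {
        key: (value, actual.get(key))
        for key, value in ledger.expected().items()
        if actual.get(key) != value
    }
```

Rows that are in the database but not in the ledger are fine (the crash
may have hit between COMMIT and the ledger write); only the opposite
direction indicates data loss.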


regards

--
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


--
Sent via pgsql-hackers mailing list ([email protected])
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
