[From a thread about WAL checkpoints...] On Wed, Mar 30, 2011 at 6:28 AM, Richard Hipp <[email protected]> wrote: > On Wed, Mar 30, 2011 at 8:46 AM, Mikael <[email protected]> wrote: >> Therefore I now wish to check with you, is SQLite implemented to somehow >> make checkpoints not be able to corrupt the main database file ever? (Say a >> checkpoint went halfway and then the OS ran out of memory, killed all >> processes and rebooted.) > > Yes. The WAL is not cleared or deleted until after the checkpoint > completes. If there is an OS crash or power loss in the middle of a > checkpoint, then the checkpoint will automatically start over again the next > time the database file is opened. > > This is where correct functioning of fsync() is critical. We use fsync() as > a write barrier operation. We require that all I/O operations that occur > before the fsync() (writes to the database file) must complete prior to any > I/O operations that occur after the fsync() (clearing the WAL file). If > fsync() is working correctly, then everything is safe. Unfortunately, on > many consumer-grade disk drives nowadays, fsync() does not work correctly > :-( But you say that fsync is working correctly on your system, so it > shouldn't be a problem for you.
Aha! I've been wondering about this (and too lazy/busy to really dig into the code). This implies that if you have a platform which is known to have periodic corruptions of the "fsync lied" sort, then using WAL narrows but does not entirely close the window. Since it relies on fsync working at the time of checkpoint, you can tune the frequency to reduce occurance. I've seen some other storage systems work around this (to some extent) by keeping overlap between merges of the log files. By tuning that overlap to represent a reasonable timeframe (like 15 minutes), if the fsync fails chances are the _previous_ fsync succeeded, simply by virtue of the system having been up for that 15 minutes, giving it more time to have written out those blocks. Unfortunately, in the cases I'm aware of this worked by virtue of the system using a collection of immutable files to represent the database, so it was easy to keep the overlapped data and figure things out at startup from the manifest. I can't offhand think of a way to apply it to update-in-place on the main SQLite database, but maybe someone else is smarter than me? -scott _______________________________________________ sqlite-users mailing list [email protected] http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users

