On Mon, Feb 6, 2012 at 2:27 PM, Roger Binns <rog...@rogerbinns.com> wrote:
> On 06/02/12 11:35, Nico Williams wrote:
>> Indeed, but if you'd do integrity protection at the application layer
>> then surely you'd have have a backup/restore strategy to deal with
>> lower-layer corruption.
>
> Only if you know about that corruption! It is perfectly possible for stray
> zeroes to be written into the btrees in such a way that they are still
> considered valid, but you have effectively amputated a row.

No, you can do something about that: you could store a hash of the XOR
of the hashes of all the rows (per-table).  To check integrity simply
get all the rows, compute the running XOR of hashes, then when you're
done hash the result and then check that against the stored value for
the whole DB.

(You could also rebuild indexes after checking DB integrity.)

>> However, if you're going to have SQLite3 do it
>
> Note that SQLite already has a way of reserving space on a per page basis.
>  It is currently used by the encryption extension.  It is possible to
> store a checksum there instead which would still be readable but not
> writeable by older versions of SQLite.

Interesting.

> Since the btree pages would also have checksums things should work in most
> scenarios.  Any solution would be better than the current no solution,
> especially as checksums let you discover corruption early.

Yes.

>> Well, one can hope.  (I don't keep up with btrfs; perhaps it will
>> mature and become common on Android devices.)
>
> Approximately zero percent of SQLite databases will be on btrfs for the
> foreseeable future, even if you and I start using btrfs.  Android 3 and
> above do support encrypting the filesystem which effectively gives you a
> lot of the checksumming, although filesystem implementations tend to care
> far more about metadata than file contents.

Encryption is not enough.  You really need block pointers to carry the
block checksum/hash/MAC/integrity tag.

> Windows 8 server will be coming with a filesystem that does integrity
> checking.  However again approximately zero percent of SQLite databases
> will be stored on that filesystem for the foreseeable future.

Right.

> SQLite having its own checksumming will solve the integrity issues today,
> for hundreds of millions of users, especially when developers consider
> data very important.  It will also solve some of the corruption issue
> finger pointing - if a page has a correct checksum but SQLite considers
> the contents corrupt then the most likely explanation is a bug in SQLite,
> while an invalid checksum points to the contents being different than what
> SQLite wrote hence points fingers at the operating system and storage.

Agreed.  For this, again, I'd prefer page-level hashing, with hash
values stored with page pointers.  This way the overhead of integrity
checking would be paid once per-I/O.

Nico
--
_______________________________________________
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users

Reply via email to