On Wed, Jun 29, 2016 at 10:36 AM, Simon Slavin <slav...@bigfraud.org> wrote:
> On 29 Jun 2016, at 5:45pm, Drago, William @ CSG - NARDA-MITEQ 
> <william.dr...@l-3com.com> wrote:
>> Aren't there things like that already built in to the hard disk controllers 
>> (CRC, Reed Solomon, etc.)?
>
> Yes.  But they operate at the level they understand.  For instance ...
>
> A change is made in a field which involves changing just one page of data.  
> In terms of the SQLite file format this would mean that a table page is 
> overwritten -- a one disk sector change.  If SQLite checksums existed then 
> this would mean that the checksum, stored in the table pointer page, would 
> also have to be updated.  Which would mean that another disk sector has to be 
> changed too.
>
> Now suppose there's a bug in the storage medium driver which means it 
> occasionally writes the correct data to the wrong sector on disk.  Without 
> checksums this fault would not be noticed: since the wrong sector on disk was 
> updated the wrong checksum on disk would be updated too.  The errors would 
> match.

I think the bigger problem is that delegating this means that you
assume the entire underlying stack is working correctly.  For
instance, the disk may have elaborate error-correction protocols that
are working correctly per sector, but SQLite's pages may span sectors.
Or the underlying disk may be perfect and the filesystem doesn't
provide the same guarantees.  Or someone is running things over NFS.
Having the page checksum embedded in the page at the SQLite level
would provide end-to-end confidence.
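
As a rough illustration of what "checksum embedded in the page" could
look like, here is a minimal Python sketch.  It is not SQLite's page
format; the 4096-byte page size, the 4-byte trailer, and the names
seal_page and verify_page are all assumptions made up for the example.
Seeding the CRC with the page number is also my own choice here, so that
a correct page written to the wrong location fails verification.

```python
import zlib

PAGE_SIZE = 4096       # assumed page size for this sketch
CHECKSUM_SIZE = 4      # CRC-32 trailer stored in the last 4 bytes


def seal_page(payload: bytes, page_no: int) -> bytes:
    """Return a full page: payload followed by a CRC-32 trailer.

    The page number seeds the CRC, so the same bytes stored under a
    different page number will not verify.
    """
    if len(payload) != PAGE_SIZE - CHECKSUM_SIZE:
        raise ValueError("payload must fill the page minus the trailer")
    crc = zlib.crc32(payload, page_no)
    return payload + crc.to_bytes(CHECKSUM_SIZE, "big")


def verify_page(page: bytes, page_no: int) -> bool:
    """Check a page read back from storage against its own trailer."""
    if len(page) != PAGE_SIZE:
        return False
    payload, trailer = page[:-CHECKSUM_SIZE], page[-CHECKSUM_SIZE:]
    return zlib.crc32(payload, page_no) == int.from_bytes(trailer, "big")
```

Because the check runs on what the application reads back, it covers
everything underneath it (driver, filesystem, network mount), no matter
how the page happens to be split across sectors.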

Chaining the checksums is a whole different level of assurance.  To
the best of my knowledge _all_ legitimately (1) corrupted databases
I've seen had pages which were individually valid, but not valid when
taken together.  Like an index page referred to a row which wasn't
present in the table page.  This implies that the atomicity guarantees
SQLite relies on were broken at the filesystem or disk level.
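
To make the distinction concrete, here is a small Python sketch of one
way checksums could be chained: a root digest computed over the per-page
digests.  This is only an illustration under my own assumptions (SHA-256,
a flat list of pages, the names page_digest and root_digest), not
something SQLite does.  The point is that a stale page left behind by a
lost write still passes its own check but no longer matches the root.

```python
import hashlib
from typing import Iterable


def page_digest(page: bytes) -> bytes:
    """Digest of a single page; on its own this only proves the page
    is internally intact, not that it is the current version."""
    return hashlib.sha256(page).digest()


def root_digest(pages: Iterable[bytes]) -> bytes:
    """Digest over all page digests, in page order.  Any page that is
    stale, missing, or out of place changes the result."""
    h = hashlib.sha256()
    for page in pages:
        h.update(page_digest(page))
    return h.digest()


# A transaction rewrites both the table page and the index page.
table_old = b"table: rows 1-3"
table_new = b"table: rows 1-4"
index_new = b"index: refs rows 1-4"

# Root recorded at commit time for the intended state.
committed_root = root_digest([table_new, index_new])

# The table write is lost; the index write lands.
on_disk = [table_old, index_new]
consistent = root_digest(on_disk) == committed_root
```

Here table_old is a perfectly valid page by itself, so a per-page check
has nothing to complain about; only the comparison against the
committed root shows that the pages do not belong together.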

-scott

(1) I consider a system where the filesystem is simply broken to not
be legitimate corruption.  For instance, if you get a page of random
data which doesn't appear to have ever been SQLite data in the first
place.  There's not much SQLite can do about that kind of thing.