Re: Karel, some followup Q:s on your RAID1C patch

Tinker Mon, 01 Feb 2016 02:13:32 -0800

On 2016-02-01 16:29, Janne Johansson wrote:

2016-01-31 9:24 GMT+01:00 Tinker <[email protected]>:
Q1:
My most important question to you is, the DATA that you CHECKSUM, do you include the SECTOR NUMBER (or other disk location info) of that data into your checksum function's inputs, so if the underlying storage's storage mapping table breaks down or by other reason disk WRITE:s go to the WRONG place, then when READ later on, those READS will FAIL?
Whenever any underlying storage does migrations, it would never change the OS view of the sector number, all filesystems (raid or not) would break if that happened.


Janne (and Karel),

The reason I suggested that location info, e.g. the sector number, be included in the checksum calculation's input data is that there is a real risk that a disk's logical-sector-to-physical-sector mapping table breaks down, either because of physical failure, or because of firmware errors in the disk controller or disk, or because of OS bugs, memory bugs, driver bugs, you name it.

While I agree that within RAID1C the probability is ridiculously small that such a failure would happen so that a certain sector X's location would be corrupted, *and* that its checksum in the checksums zone on the disk would be corrupted in a way symmetric with the first corruption, so that the checksum checks would not catch the problem either, on a level of (mathematical/system) symmetry it still makes sense that the checksum calculation uses the data location as input also.

ZFS does this to guarantee that the data read is the data that really belongs there.
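
To make the idea concrete, here is a minimal sketch in Python. It is not taken from the RAID1C patch: CRC32 as the checksum, the 512-byte sector size, and the names sector_checksum and verify_read are all assumptions made only for illustration. The point is just that the sector number is fed into the checksum ahead of the data, so the same bytes read back from a different location no longer verify.

```python
import struct
import zlib

SECTOR_SIZE = 512  # assumed sector size, for illustration only


def sector_checksum(sector_no: int, data: bytes) -> int:
    """CRC32 over the 64-bit sector number followed by the sector data.

    Hypothetical helper: the real patch may use a different checksum
    function and a different encoding of the location.
    """
    header = struct.pack("<Q", sector_no)  # little-endian 64-bit sector number
    # Seed the CRC with the header, then continue over the data.
    return zlib.crc32(data, zlib.crc32(header))


def verify_read(sector_no: int, data: bytes, stored: int) -> bool:
    """Return True only if data checksums correctly *for this location*."""
    return sector_checksum(sector_no, data) == stored


if __name__ == "__main__":
    payload = bytes(range(256)) * 2  # one 512-byte sector of sample data
    stored = sector_checksum(100, payload)

    # Normal case: data is read back from the sector it was written to.
    print(verify_read(100, payload, stored))

    # Misdirected case: the same bytes turn up at sector 101.
    print(verify_read(101, payload, stored))
```

With a plain checksum over the data alone, both reads above would verify; with the location mixed in, the second one fails.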

And I guess we're talking about something in the range of 50-100 extra CPU cycles per sector access to deliver this, and no extra storage need, so my spontaneous feeling about this is that it probably could be implemented on a "why-not" basis.


What do you say?

Tinker
