At 09/04/2016 12:29 PM, Christoph Anton Mitterer wrote:
On Mon, 2016-08-29 at 16:25 +0800, Qu Wenruo wrote:
Send will generate checksum for each command.
What does "command" mean here? Or better said how much data is secured
with one CRC32?

A command is one send stream command, containing all the info needed for an operation,
like the subvolume command, containing UUID and transid,
the chown command, containing gid/uid,
the chmod command, containing mode,
the utimes command, containing the a/c/m times, and a lot of others.

How much data is secured by one CRC32 depends on the size of the command.

A normal command is quite small; the exception is the write command.
There, more than 48K bytes can be secured by one CRC32.
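
To make the "one CRC32 per command" idea concrete, here is a minimal sketch in Python. It is not the real stream format: the header layout (length, command id, checksum), the command id and the payload layout are assumptions made up for illustration, and zlib.crc32 is only a stand-in for whatever CRC variant the stream actually uses. Check the btrfs-progs sources for the real definitions.

```python
import struct
import zlib

# Hypothetical header for this sketch: payload length (u32),
# command id (u16), checksum (u32), all little-endian, no padding.
HEADER = struct.Struct("<IHI")

def checksum(buf: bytes) -> int:
    # Stand-in checksum: zlib.crc32 is the IEEE CRC-32, used for illustration only.
    return zlib.crc32(buf) & 0xFFFFFFFF

def build_command(cmd_id: int, payload: bytes) -> bytes:
    # The checksum covers header + payload, with the checksum field zeroed.
    zeroed = HEADER.pack(len(payload), cmd_id, 0) + payload
    return HEADER.pack(len(payload), cmd_id, checksum(zeroed)) + payload

def verify_command(raw: bytes) -> bool:
    # Receiving side: recompute the checksum over the whole command.
    length, cmd_id, stored = HEADER.unpack_from(raw)
    payload = raw[HEADER.size:]
    if len(payload) != length:
        return False
    zeroed = HEADER.pack(length, cmd_id, 0) + payload
    return checksum(zeroed) == stored

# A "write"-style command: hypothetical payload of file offset (u64) + data.
WRITE = 15  # hypothetical command id
payload = struct.pack("<Q", 4096) + b"x" * 6144
cmd = build_command(WRITE, payload)

# Flip one bit inside the offset field, i.e. in metadata, not file data.
tampered = bytearray(cmd)
tampered[HEADER.size] ^= 0x01

print(verify_command(cmd), verify_command(bytes(tampered)))
```

Because the checksum spans the header and the whole payload, damage to the offset is caught the same way as damage to the file data.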



For the send stream, it's one CRC32 for the whole command.
And this is verified then on the receiving end?

Yes.



Wouldn't it be useful (if this is technically possible) to use the
checksums directly from the sent blocks? That way one could also catch
any errors on the receiving side, that occurred after the checksum from
the receive was verified (e.g. memory errors).

And couldn't one do something similar locally, when btrfs copies
blocks?

At least something like this would seem to me like the most native way:
- One wants checksum protection
- One copies data
- One has already checksums for that data

You can try my dump-send command branch, to verify how send/receive works:
https://github.com/adam900710/btrfs-progs/tree/dump_send_stream

After a few tries, you will find at least the following reasons:

1) Not all data has a checksum
   Only non-inlined data has its own checksum.
   Inlined data has none (it is protected by the leaf checksum instead)

2) Send doesn't follow sectorsize units for non-inlined data
   Just create a 6K file, send the subvolume out, and use dump-send
   to examine it.
   You'll find that the send stream contains exactly 6K of data, not 8K
   (2 * 4K, where 4K is the sectorsize).

   Data checksums, however, are all in sectorsize units.

3) We need to protect the whole command, not file data only.
   Even the write command contains metadata info, like the offset and length
   of the write.
   Since we need to protect the whole command anyway, why introduce the
   complexity of using TWO CRC32s, one for meta and one for data?
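
The mismatch in point 2 is plain arithmetic, sketched below in Python. The 4K sectorsize and the 6K file are the values from the example above; the helper name is made up for illustration.

```python
SECTORSIZE = 4096  # 4K sectorsize, as in the example above

def sector_aligned_size(nbytes: int, sectorsize: int = SECTORSIZE) -> int:
    # Round up to whole sectors, the unit in which data checksums are kept.
    sectors = -(-nbytes // sectorsize)  # ceiling division
    return sectors * sectorsize

file_size = 6 * 1024                     # the 6K file
covered = sector_aligned_size(file_size)

# Bytes in the send stream vs. bytes covered by per-sector checksums.
print(file_size, covered, covered - file_size)  # 6144 8192 2048
```

The on-disk checksum of the second sector covers 2048 bytes that never appear in the stream, so it could not be reused as-is to protect the 6K that is actually sent.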


=> thus that should be used, as it's the most canonical version of a
   checksum for that data... anything that is newly calculated could
   in the best case just be good, and in the worst add new errors
   (unnoticed),... e.g. when memory is broken and the new checksum is
   calculated over that.

However, most bugs are caused not by memory corruption, but by humans.
So the command checksum design seems quite good to me.
It's a unified, simple and expandable structure.

Thanks,
Qu



Cheers,
Chris.


