Christoph Anton Mitterer posted on Wed, 16 Dec 2015 22:59:01 +0100 as excerpted:
> On Wed, 2015-12-09 at 16:36 +0000, Duncan wrote:
>> But... as I've pointed out in other replies, in many cases including
>> this specific one (bittorrent), applications have already had to
>> develop their own integrity management features

> Well let's move discussion upon that into the "dear developers, can we
> have notdatacow + checksumming, plz?" where I showed in one of the more
> recent threads that bittorrent seems rather to be the only thing which
> does use that per default... while on the VM image front, nothing seems
> to support it, and on the DB front, some support it, but don't use it
> per default.
>
>> In the bittorrent case specifically, torrent chunks are already
>> checksummed, and if they don't verify upon download, the chunk is
>> thrown away and redownloaded.

> I'm not a bittorrent expert, because I don't use it, but that sounds to
> be more like the edonkey model, where - while there are checksums -
> these are only used until the download completes. Then you have the
> complete file, any checksum info thrown away, and the file again being
> "at risk" (i.e. not checksum protected).

[I'm breaking this into smaller replies again.]

Just to mention here, that I said "integrity management features", which
includes more than checksumming.  As Austin Hemmelgarn has been pointing
out, DBs and some VMs do COW, some DBs do checksumming or at least have
that option, and both VMs and DBs generally do at least some level of
consistency checking as they load.  Those are all "integrity management
features" at some level.

As for bittorrent, I /think/ the checksums are in the torrent files
themselves (and if I'm not mistaken, much as git, the chunks within the
file are actually IDed by checksum, not specific position, so as long as
the torrent is active, uploading or downloading, these will by definition
be retained).  As long as those are retained, the checksums should be
retained.

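To make that a bit more concrete: as far as I know, in the v1 torrent
format the torrent file carries a "pieces" field that is simply the
20-byte SHA-1 digests of each fixed-size piece, concatenated in piece
order, so the hash for a piece is looked up by its index.  Below is a
minimal sketch of re-checking a file against such a blob.  It assumes a
single-file torrent, and that the piece length and the pieces blob have
already been pulled out of the torrent file by some bencode parser (not
shown).  The function names are mine, purely for illustration, and not
from any real client.

```python
import hashlib

SHA1_LEN = 20  # one 20-byte SHA-1 digest per piece (v1 torrent format)


def hash_pieces(path, piece_length):
    """Return the concatenated SHA-1 digests of each piece of the file."""
    digests = []
    with open(path, "rb") as f:
        while True:
            piece = f.read(piece_length)  # last piece may be shorter
            if not piece:
                break
            digests.append(hashlib.sha1(piece).digest())
    return b"".join(digests)


def bad_pieces(path, piece_length, pieces_blob):
    """Return the indices of pieces whose digest differs from pieces_blob.

    pieces_blob is assumed to be the raw "pieces" value from the torrent
    file.  A missing or extra piece on either side also counts as bad.
    """
    actual = hash_pieces(path, piece_length)
    count = max(len(actual), len(pieces_blob)) // SHA1_LEN
    bad = []
    for i in range(count):
        span = slice(i * SHA1_LEN, (i + 1) * SHA1_LEN)
        if actual[span] != pieces_blob[span]:
            bad.append(i)
    return bad
```

An empty list from bad_pieces() means every piece still matches, which
is the same check a client does on a forced recheck, so it works just as
well on a copy of the file as on the original download.
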
And ideally, people will continue to torrent the files long after they've
finished downloading them, in which case they'll still need the torrent
files themselves, along with the checksum info.

And for longer term storage, people really should be copying/moving their
torrented files elsewhere, in such a way that they either eliminate the
fragmentation if the files weren't nocowed, or eliminate the nocow
attribute and get them checksum-protected as normal for files not
intended to be constantly randomly rewritten, which will be the case once
they're no longer being actively downloaded.

Of course that's at the slightly technically oriented user level, but
then, the whole nocow thing, or even caring about checksums and longer
term file integrity in the first place, is also technically oriented user
level.  Normal users will just download without worrying about the nocow
in the first place, and perhaps wonder why the disk is thrashing so, but
not be inclined to do anything about it except perhaps switch back to
their old filesystem, where it was faster and the disk didn't sound as
bad.  In doing so, they'll either automatically get the checksumming
along with the worse performance, or go back to a filesystem without the
checksumming, and think it's fine as they know no different.

Meanwhile, if they do it correctly there's no window without protection,
as the torrent file can also be used to double-verify the file once it
has been moved, before deleting it.

-- 
Duncan - List replies preferred.   No HTML msgs.
"Every nonfree program has a lord, a master --
and if you use the program, he is your master."  Richard Stallman