Craig Barratt wrote at about 15:12:07 -0700 on Wednesday, June 3, 2009:
 > Tino writes:
 >
 > > Well, we've already got MD4 checksums of file blocks. And if I
 > > understand everything correctly, we DO GET collisions, therefore the
 > > hash chains.
 >
 > These collisions are because the BackupPC digest is only computed
 > over the first part of the file.
 >
 > > Of course, this is for 256k blocks, IIRC. And "only" 128 bit hashes.
 > > But I don't like the idea of relying on probabilities. I've got enough
 > > uncertainties by flaky hardware, bugs etc.
 >
 > Lessfs uses 192 bit checksums on each file block. The chance of
 > a collision is vanishingly small (and vanishingly smaller than with a
 > 128 bit checksum). I don't know if it compares the actual data
 > (as BackupPC does). Perhaps that could be added as an additional
 > option for the paranoid.

Suppose you have 1 Petabyte (2^50 bytes) of *pool* data. (I doubt anybody
has more than that using BackupPC.) That means you have
2^50/2^18 = 2^32 256K blocks of storage. Note we are using pool data
since that puts an *upper* bound on the number of unique 256K blocks of
data.

As per my earlier posting, the chance of at least one collision with a
192-bit checksum is approximately:

    1 - e^(-2^32 * (2^32-1)/2^193) ~ 2^(-129) ~ 1/(6.8 x 10^38)

This is a 1 in 680 trillion trillion trillion chance!

If you think a Petabyte is too small and want to worry about an
Exabyte (2^60 bytes) of *pool* data, then you have 2^42 blocks and the
chance of at least one collision rises to a mere:

    ~ 2^(-109) ~ 1/(6.5 x 10^32)

This is merely a 1 in 650 million trillion trillion chance!

Again, this is just a generalization of the birthday problem...

If you have a problem trusting a 1 in 6.8 x 10^38 chance or even a
1 in 6.5 x 10^32 chance, then you must be truly PARANOID, or you must
have REALLY reliable hardware if this adds to your stability concerns.

Also, let me know when your pool directory starts exceeding
1 Petabyte or even 1 Exabyte in size...

 >
 > > I won't trust such a file system for backup data.
 >
 > Most commercial systems use these techniques.
 >
 > Craig
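
For anyone who wants to check the birthday-problem numbers above, here is a rough Python sketch. It assumes 256K blocks and a checksum that behaves like a uniformly random 192-bit value; the names collision_probability and BLOCK_SIZE are mine for illustration and are not taken from BackupPC or lessfs.

```python
import math


def collision_probability(num_blocks: int, hash_bits: int) -> float:
    """Birthday approximation: 1 - exp(-n(n-1) / 2^(bits+1))."""
    exponent = num_blocks * (num_blocks - 1) / 2 ** (hash_bits + 1)
    # expm1 keeps precision here; a plain 1 - exp(-x) would round to 0.0
    # for values of x this small.
    return -math.expm1(-exponent)


BLOCK_SIZE = 2 ** 18  # 256K blocks (assumption taken from the thread)

for label, pool_bytes in (("1 Petabyte", 2 ** 50), ("1 Exabyte", 2 ** 60)):
    blocks = pool_bytes // BLOCK_SIZE
    p = collision_probability(blocks, 192)
    print(f"{label}: {blocks} blocks, "
          f"p ~ 2^{math.log2(p):.1f}, about 1 in {1 / p:.2g}")
```

Changing the 192 to 128 gives the corresponding estimate for a 128-bit checksum under the same assumptions.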
_______________________________________________
BackupPC-users mailing list
BackupPC-users@lists.sourceforge.net
List:    https://lists.sourceforge.net/lists/listinfo/backuppc-users
Wiki:    http://backuppc.wiki.sourceforge.net
Project: http://backuppc.sourceforge.net/