Vetch wrote:
> I have a two site network, one in the US, and one in the UK.
> Our bandwidth is limited, though will be increasing at some point in the
> future, though I couldn't say how much...
> I want to backup my data from one site to the other...
> In order to assess whether that would be do-able, I went to an
> exhibition of backup technologies.
> One that caught my eye was a company called Data Domain, who claimed to
> de-duplicate data at the block level of 16KB chunks...
> Apparently, all they send are the changed chunks and the schema to
> retrieve the data.
BackupPC can use rsync to transfer the data. Rsync works by reading
through the file at both ends, exchanging block checksums to find the
changed parts.

> What I am wondering is would BackupPC be a suitable open source
> replacement for that technology...?
> Does it send the changed data down the line and then check to see if it
> already has a copy, or does it check then send?

It can do either, depending on whether you use the tar, smb, or rsync
transfer methods.

> Presumably it would save significant bandwidth if it checks first...
> The other thing is, can BackupPC de-duplicate at the block level or is
> it just file level?
> I'm thinking that block level might save considerable amounts of
> traffic, because we will need to send file dumps of Exchange databases
> over the wire...
> ... Which I assume will mean that we've got about 16GB at least to copy
> every day, since it'll be creating a new file daily...
>
> On the other hand, would 16KB blocks be duplicated that regularly - I
> imagine there is a fair amount of variability in 16KB of ones and zeros,
> and the chances of them randomly reoccurring without being part of the
> same file, I would say are slim...
>
> What do you think?

I think rsync will do it as well as it can be done. However, it is hard
to tell how much two different Exchange database dumps will have in
common. There is also the issue that you could reduce the size by
compressing the file, but doing so will make the common parts impossible
to find from one version to another. You can work around this by using
ssh compression, or something like an openvpn tunnel with lzo compression
enabled, and leaving the file itself uncompressed.

You can test the transfer efficiency locally first to get an idea of how
well the common blocks are handled. Use the command-line rsync program to
make a copy of one day's dump, then repeat the process the next day with
the same filename. Rsync will report the size of the file and the amount
of data actually transferred (a sketch of such a test follows below).

-- 
Les Mikesell
[EMAIL PROTECTED]
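As a rough sketch of that test, assuming the dump lands in
/backups/exchange.bak and the remote site is reachable as ukhost (both
the path and the hostname are placeholders, not anything from the
original posts):

  # Day 1: seed the remote copy; the first run transfers the whole file.
  rsync -av --stats /backups/exchange.bak ukhost:/backups/exchange.bak

  # Day 2: send the new dump to the same remote filename. In the --stats
  # output, compare "Matched data" (found via block checksums) with
  # "Literal data" (actually sent), or "Total bytes sent" with
  # "Total file size".
  rsync -av --stats /backups/exchange.bak ukhost:/backups/exchange.bak

  # The same test with ssh compression on the link, as suggested above:
  rsync -av --stats -e "ssh -C" /backups/exchange.bak ukhost:/backups/exchange.bak

If the second run's "Total bytes sent" is much smaller than the file
size, rsync is finding common blocks between the two days' dumps; if it
is close to the full size, the dumps simply have little in common for it
to exploit.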