You can always run some sort of disk de-duplicator after you copy without -H; a rough sketch of such a pass is below.
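For concreteness, here is a minimal Python sketch of what such a post-copy pass could look like. It hard-links files whose contents hash the same. The helper names are mine, it assumes everything under the given root lives on one filesystem, and a real tool (hardlink, rdfind, and friends) handles permissions, sparse files, and races that this ignores:

#!/usr/bin/env python3
"""Minimal post-copy de-duplication sketch: hard-link files with
identical contents. Illustrative only; assumes one filesystem."""
import hashlib
import os
import sys


def file_digest(path, chunk_size=1 << 20):
    """SHA-256 of the file contents, read in chunks to bound memory."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        while chunk := f.read(chunk_size):
            h.update(chunk)
    return h.hexdigest()


def dedup(root):
    seen = {}  # content digest -> first path seen with that content
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            if os.path.islink(path) or not os.path.isfile(path):
                continue
            digest = file_digest(path)
            if digest in seen:
                os.unlink(path)              # drop the duplicate copy...
                os.link(seen[digest], path)  # ...and re-link to the first
            else:
                seen[digest] = path


if __name__ == "__main__":
    dedup(sys.argv[1])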
On Sun, Dec 13, 2009 at 9:56 PM, Jeffrey J. Kosowsky <backu...@kosowsky.org> wrote:
> Robin Lee Powell wrote at about 20:18:55 -0800 on Sunday, December 13, 2009:
> >
> > I've only looked at the code briefly, but I believe this *should* be
> > possible. I don't know if I'll be implementing it, at least not
> > right away, but it shouldn't actually be that hard, so I wanted to
> > throw it out so someone else could run with it if ey wants.
> >
> > It's an idea I had about rsync resumption:
> >
> > Keep an array of all the things you haven't backed up yet, starting
> > with the initial arguments; let's say we're transferring "/a" and
> > "/b" from the remote machine.
> >
> > Start by putting "a/" and "b/" in the array. Then get the directory
> > listing for a/, and replace "a/" in the array with "a/d", "a/e", ...
> > for all files and directories in there. When each file is
> > transferred, it gets removed. Directories are replaced with their
> > contents.
> >
> > If the transfer breaks, you can resume with that list of things that
> > still need transferring or recursing through, without having to walk
> > the parts of the tree you've already walked.
> >
> > This should solve the SIGPIPE problem. In fact, it could even deal
> > with incrementals from things like laptops: if you have settings for
> > NumRetries and RetryDelay, you could, say, retry every 60 seconds
> > for a week if you wanted.
> >
> > On top of that, you could use the same retry system to
> > *significantly* limit the memory usage: stop rsyncing every N files
> > (where N is a config value). If you only do, say, 1000 files at a
> > time, the memory usage will be very low indeed.
> >
>
> Unfortunately, I don't think it is that simple. If it were, then rsync
> would have been written that way back in version 0.001. I mean, there
> is a reason that rsync memory usage increases as the number of files
> increases (even in 3.0), and it is not due to memory leaks or ignorant
> programmers. After all, your proposed fix is not exactly obscure.
>
> At least one reason is the need to keep track of inodes so that hard
> links can be copied properly. In fact, I believe that without the -H
> flag, rsync memory usage scales much better. Obviously, if you break up
> backups into smaller chunks, or allow resumes without keeping track of
> past inodes, then you have no way of tracking hard links across the
> filesystem. Maybe you don't care, but if so, you could probably do just
> about as well by dropping the --hard-links argument from RsyncArgs.
>
> I don't believe there is any easy way to get something for free here...
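To make the worklist idea quoted above concrete, here is a rough Python sketch. A local filesystem walk stands in for the remote side, and transfer(), the checkpoint file, and the batch/num_retries/retry_delay knobs are stand-ins of my own invention, not rsync or BackupPC internals:

#!/usr/bin/env python3
"""Sketch of the resumable worklist traversal described above.
Illustrative only; a local walk stands in for the remote side."""
import json
import os
import time

CHECKPOINT = "pending.json"  # persisted worklist; survives a broken run


def transfer(path):
    """Placeholder for actually copying one file's data."""
    print("copy", path)


def load_pending(roots):
    """Resume from the checkpoint if one exists, else start fresh."""
    if os.path.exists(CHECKPOINT):
        with open(CHECKPOINT) as f:
            return json.load(f)
    return list(roots)  # first run: the initial arguments, e.g. ["/a", "/b"]


def save_pending(pending):
    with open(CHECKPOINT, "w") as f:
        json.dump(pending, f)


def run(roots, batch=1000, num_retries=3, retry_delay=60):
    pending = load_pending(roots)
    processed = 0
    while pending:
        path = pending[0]
        for _attempt in range(num_retries):
            try:
                if os.path.isdir(path) and not os.path.islink(path):
                    # A directory is replaced by its contents, as proposed.
                    pending[1:1] = [os.path.join(path, e)
                                    for e in sorted(os.listdir(path))]
                else:
                    transfer(path)
                break
            except OSError:
                time.sleep(retry_delay)  # the NumRetries/RetryDelay idea
        pending.pop(0)  # done with this entry (or retries exhausted)
        processed += 1
        if processed % batch == 0:
            save_pending(pending)  # bound the work lost to a crash
    save_pending(pending)  # empty list: nothing left to resume


if __name__ == "__main__":
    import sys
    run(sys.argv[1:])

Note that the pending list only ever holds the unvisited frontier (roughly depth times fan-out), not the whole tree, which is where the memory saving would come from; the checkpoint bounds how much walking a resume has to redo.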
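And to illustrate Jeffrey's objection: with -H, the copier has to remember every multiply-linked inode it has seen, keyed by (device, inode), for the entire transfer. A minimal sketch, again only of the bookkeeping and not of rsync's actual data structures:

#!/usr/bin/env python3
"""Why -H needs whole-transfer state: hard-link detection keyed by
(device, inode). Illustrative only; rsync's real bookkeeping differs."""
import os
import shutil

seen = {}  # (st_dev, st_ino) -> destination path of the first copy


def copy_preserving_hardlinks(src, dst):
    st = os.stat(src, follow_symlinks=False)
    key = (st.st_dev, st.st_ino)
    if st.st_nlink > 1 and key in seen:
        os.link(seen[key], dst)  # recreate the link; copy no data
        return
    shutil.copyfile(src, dst)
    if st.st_nlink > 1:
        seen[key] = dst  # must persist until the *last* link is seen

# If the transfer is split into independent chunks and `seen` is
# discarded between them, two links to the same inode that land in
# different chunks come out as two unrelated files -- which is exactly
# why chunked resumes and --hard-links fight each other.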