On Monday, August 15, 2022 9:05:24 AM CEST Dale wrote:
> Rich Freeman wrote:
> > On Sun, Aug 14, 2022 at 6:44 PM Dale <rdalek1...@gmail.com> wrote:
> >> Right now, I'm using rsync, which doesn't compress files but does
> >> just update things that have changed. I'd like to find some way
> >> (software, but maybe there is already a tool I'm unaware of) to
> >> compress data and otherwise work a lot like rsync.
> >
> > So, how important is it that it work exactly like rsync?
> >
> > I use duplicity, in part because I've been using it forever. Restic
> > seems to be a similar program most people are using these days, which
> > I haven't looked at super-closely, but I'd look at that first if
> > starting out.
> >
> > Duplicity uses librsync, so it backs up exactly the same data as
> > rsync would, except instead of replicating entire files, it creates
> > streams of data, more like tar. So if you back up a million small
> > files you might get out 1-3 big files. It can compress and encrypt
> > the data as you wish. The downside is that you don't end up with
> > something that looks like your original files - you have to run the
> > restore process to extract them all back out. It is extremely
> > space-efficient though - if 1 byte changes in the middle of a 10GB
> > file, you'll end up backing up maybe just a kilobyte or so (whatever
> > the block size is), which is just like rsync.
> >
> > Typically you rely on metadata to find files that changed, which is
> > fast, but I'm guessing you can tell these programs to do a deep scan,
> > which of course requires reading the entire contents, and that will
> > discover anything that was modified without changing ctime/mtime.
> >
> > The output files can be split to any size, and the index info (the
> > metadata) is separate from the raw data. If you're storing to
> > offline/remote/cloud/whatever storage, you typically keep the
> > metadata cached locally to speed retrieval and to figure out what
> > files have changed for incrementals. However, if the local cache
> > isn't there, it will fetch just the indexes from wherever they are
> > stored (they're small).
> >
> > It has support for many cloud services - I store mine to AWS S3.
> >
> > There are also some options that are a little closer to rsync, like
> > rsnapshot and burp. Those don't store compressed (unless there is an
> > option for that or something), but they do let you rotate through
> > multiple backups, and they'll set up hard links/etc. so that they are
> > de-duplicated. Of course, hard links are at the file level, so if 1
> > byte inside a file changes you'll end up with two full copies. It
> > will still only transfer a single block, though, so the bandwidth
> > requirements are similar to rsync.
>
> Duplicity sounds interesting, except that I already have the drive
> encrypted. Keep in mind, these are external drives that I hook up long
> enough to complete the backups, then back in a fire safe they go. The
> reason I mentioned being like rsync is that I don't want to rebuild a
> backup from scratch each time, as that would be time consuming. I
> thought of using Kbackup ages ago, but it rebuilds from scratch each
> time; it does have the option of compressing, though. That might work
> for small stuff but not many TBs of it. Back in the early 90's, I
> remember using backup software that was incremental. It would only
> update files that had changed, could span several floppy disks, and
> compressed the data as well. Something like that is likely rare
> nowadays, if it exists at all, since floppies are long dead. I either
> need to split my backup into two pieces or compress my data. That is
> why I asked if there is a way to back up the first part of the
> alphabet in one command, switch disks, and then do the second part of
> the alphabet to another disk.
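
To make Rich's description concrete, a duplicity run along those lines
might look roughly like the sketch below. The paths, GPG key ID, and
volume size are only placeholders, and the exact target URL syntax
depends on the backend, so treat this as an illustration rather than a
recipe (see duplicity(1)):

  # First run: full backup, split into ~1 GB volumes, encrypted to a
  # GPG key (the key ID and paths here are placeholders)
  duplicity full --volsize 1000 --encrypt-key 0xDEADBEEF /home/dale file:///mnt/backup/dale

  # Later runs only ship the blocks that changed (librsync deltas)
  duplicity incremental --volsize 1000 --encrypt-key 0xDEADBEEF /home/dale file:///mnt/backup/dale

  # Getting files back is a separate restore step, as noted above
  duplicity restore file:///mnt/backup/dale /tmp/restored

The same commands work against a remote target URL (for example S3)
instead of file://, which is how the cloud storage Rich mentions fits in.
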
Actually, there still is a piece of software that does what you
describe: app-backup/dar. You can tell it to split the backups into
slices of a specific size.

--
Joost
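
For illustration, a dar run that does that might look something like
the sketch below. The archive names, paths, and the 1 GiB slice size
are made up, so check dar(1) for the real option details before relying
on this:

  # Full backup of /home, gzip-compressed, cut into 1 GiB slices
  # (produces home_full.1.dar, home_full.2.dar, ... on the target drive;
  # all names and sizes here are placeholders)
  dar -c /mnt/backup/home_full -R /home -z -s 1G

  # A later differential run that only stores what changed since the full one
  dar -c /mnt/backup/home_diff1 -R /home -z -s 1G -A /mnt/backup/home_full

  # Restore into an empty directory
  dar -x /mnt/backup/home_full -R /tmp/restore

If I remember right, adding -p makes dar pause before writing each new
slice so you can swap the destination disk, which would cover the
first-disk/second-disk idea without having to split the data by path at
all.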