On Monday, August 15, 2022 9:05:24 AM CEST Dale wrote:
> Rich Freeman wrote:
> > On Sun, Aug 14, 2022 at 6:44 PM Dale <rdalek1...@gmail.com> wrote:
> >> Right now, I'm using rsync, which doesn't compress files but does just
> >> update things that have changed.  I'd like to find some software, or
> >> maybe there is already a tool I'm unaware of, that compresses data and
> >> otherwise works a lot like rsync.
> > 
> > So, how important is it that it work exactly like rsync?
> > 
> > I use duplicity, in part because I've been using it forever.  Restic
> > seems to be a similar program most are using these days which I
> > haven't looked at super-closely but I'd look at that first if starting
> > out.
> > 
> > Duplicity uses librsync, so it backs up exactly the same data as rsync
> > would, except instead of replicating entire files, it creates streams
> > of data, more like tar.  So if you back up a million small files you
> > might end up with 1-3 big files.  It can compress and
> > encrypt the data as you wish.  The downside is that you don't end up
> > with something that looks like your original files - you have to run
> > the restore process to extract them all back out.  It is extremely
> > space-efficient though - if 1 byte changes in the middle of a 10GB
> > file you'll end up just backing up maybe a kilobyte or so (whatever
> > the block size is), which is just like rsync.
> > 
> > Typically you rely on metadata to find files that change which is
> > fast, but I'm guessing you can tell these programs to do a deep scan
> > which of course requires reading the entire contents, and that will
> > discover anything that was modified without changing ctime/mtime.
> > 
> > The output files can be split to any size, and the index info (the
> > metadata) is separate from the raw data.  If you're storing to
> > offline/remote/cloud/whatever storage typically you keep the metadata
> > cached locally to speed retrieval and to figure out what files have
> > changed for incrementals.  However, if the local cache isn't there
> > then it will fetch just the indexes from wherever it is stored
> > (they're small).
> > 
> > It has support for many cloud services - I store mine to AWS S3.
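
To make that concrete, a duplicity run along those lines might look
roughly like this.  This is only a sketch; the paths, GPG key ID and
volume size below are made-up placeholders, not anything from this
thread:

  # first run: full backup, gpg-encrypted, split into 200 MB volumes
  duplicity full --encrypt-key DEADBEEF --volsize 200 \
      /home/dale file:///mnt/backup/duplicity

  # later runs: only changed blocks get added to the archive
  duplicity incremental --encrypt-key DEADBEEF --volsize 200 \
      /home/dale file:///mnt/backup/duplicity

  # getting files back requires an explicit restore step
  duplicity restore file:///mnt/backup/duplicity /tmp/restore

The target directory then only holds encrypted volume and index files,
not a browsable copy of the tree, which is the trade-off Rich mentions.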
> > 
> > There are also some options that are a little closer to rsync, like
> > rsnapshot and burp.  Those don't store the data compressed (unless
> > there is an option for that), but they do let you rotate through
> > multiple backups and they'll set up hard links/etc so that they are
> > de-duplicated.  Of course hard links are at the file level so if 1
> > byte inside a file changes you'll end up with two full copies.  It
> > will still only transfer a single block so the bandwidth requirements
> > are similar to rsync.
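
For the hard-link style Rich describes, rsnapshot is driven by a small
config file.  A minimal sketch (retention counts and paths invented,
and note the fields must be separated by real TAB characters):

  # /etc/rsnapshot.conf -- fields are TAB separated
  config_version	1.2
  snapshot_root	/mnt/backup/snapshots/
  retain	daily	7
  retain	weekly	4
  backup	/home/	localhost/

  # then run from cron or by hand:
  rsnapshot daily

Each daily.N directory looks like a full copy, but unchanged files are
hard links into the previous snapshot, so only changed files take up
extra space.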
> 
> Duplicity sounds interesting except that I already have the drive
> encrypted.  Keep in mind, these are external drives that I hook up long
> enough to complete the backups, then back in a fire safe they go.  The
> reason I mentioned being like rsync is that I don't want to rebuild a
> backup from scratch each time, as that would be time consuming.  I
> thought of using Kbackup ages ago; it rebuilds from scratch each time,
> but it does have the option of compressing.  That might work for small
> stuff but not many TBs of it.  Back in the early '90s, I remember using
> a backup program that was incremental.  It would only update the files
> that changed, could spread the backup over several floppy disks, and
> compressed it as well.  Something like that is likely rare nowadays, if
> it exists at all, since floppies are long dead.  I either need to split
> my backup into two pieces or compress my data.  That is why I asked
> whether there is a way to back up the first part of the alphabet in one
> command, switch disks, and then do the second part of the alphabet to
> another disk.

Actually, there still is a piece of software that does this:
" app-backup/dar "
You can tell it to split the backups into slices of a specific size.
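Something along these lines, for example (a rough sketch only; the
slice size and paths are made up):

  # full backup of /home, gzip-compressed, cut into 4 GiB slices,
  # pausing between slices (-p) so you can swap disks
  dar -c /mnt/backup/full -R /home -z -s 4G -p

  # later: differential backup against the full archive, same slicing
  dar -c /mnt/backup/diff1 -A /mnt/backup/full -R /home -z -s 4G -p

dar writes the slices as full.1.dar, full.2.dar and so on, so when one
disk fills up you simply continue on the next.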

--
Joost


