On 15/8/22 06:44, Dale wrote:
Howdy,

With my new fiber internet, my poor disks are getting a workout, and
also filling up.  First casualty: my backup disk.  I have one directory
that is . . . well . . . huge.  It's about 7TB or so.  This is where it
stands right now, and it's still trying to pack in files.


/dev/mapper/8tb            7.3T  7.1T  201G  98% /mnt/8tb


Right now, I'm using rsync, which doesn't compress files but does
update only the things that have changed.  I'd like to find some
software, though maybe there is already a tool I'm unaware of, that
compresses the data but otherwise works a lot like rsync.  I looked in
app-backup and there are a lot of options, but I'm not sure which fits
best for what I want to do.  Again: back up a directory, compress, and
only update with changed or new files.  Generally it only adds files,
but sometimes a file gets replaced as well.  Same name but different size.

I was trying to go through the list in app-backup one by one but, to be
honest, most of the included links just go to github or something and
usually don't say anything about how the tool actually works.  As far
as seeing whether it does what I want, it's useless.  It sort of
reminds me of quite a few USE flag descriptions.

I plan to buy another hard drive pretty soon.  Next month is possible.
If there is nothing available that does what I want, is there a way to
set up rsync so it backs up files starting with "a" through "k" to one
spot and then backs up "l" through "z" to another?  I could then split
the files into two parts.  I use a script to do this now, if one could
call my little things scripts, so even a complicated command could
work; I just may need help figuring out the command.
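
For what it's worth, a hedged sketch of that kind of split using
rsync's filter rules (the paths are made up, and it assumes the
top-level names all start with a lower-case letter - widen the
character classes if not):

  rsync -av --include='/[a-k]*' --exclude='/*' /mnt/data/ /mnt/backup1/
  rsync -av --include='/[l-z]*' --exclude='/*' /mnt/data/ /mnt/backup2/

The leading "/" anchors each pattern at the top of the transfer, so
top-level files and directories get routed to one disk or the other,
while everything inside an included directory still comes along.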

Thoughts?  Ideas?

Dale

:-)  :-)

The questions you need to ask are: how compressible is the data, and how much duplication is in there?  Rsync's biggest disadvantage is that it doesn't keep history, so if you need to restore something from last week you are SOL.  Honestly, rsync is not a backup program and should only be used the way you use it for data you don't value; an rsync archive is a disaster waiting to happen from a backup point of view.

Look into dirvish - it uses hard links to keep files current but safe, and it is easy to restore from (each snapshot looks like an exact copy, so you just cp the files back if needed).  The downsides are that it hammers the hard disk and has no compression, so its only space saving is deduplication via history (my backups stabilised at about 2x the original size for ~2 years of history) - though you can put it on something like btrfs, which has filesystem-level compression.
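
If you're curious about the mechanics dirvish builds on, it's
essentially rsync's --link-dest option.  A minimal sketch, with
made-up dates and paths:

  # Unchanged files become hard links into yesterday's snapshot,
  # so each snapshot looks like a full copy but only changed
  # files consume new space.
  rsync -a --link-dest=/mnt/backup/2022-08-14 \
        /home/dale/data/ /mnt/backup/2022-08-15/

Dirvish just automates creating, naming and expiring those snapshots
for you.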

My current program is borgbackup, which is very sophisticated in how it stores data - it's probably your best bet, in fact.  I am storing literally tens of TB of raw data on a 4TB USB3 disk, going back years.  And yes, I do restore regularly - not just for disasters, but for space-efficient long-term storage I access only rarely.

e.g.:

A single host:

------------------------------------------------------------------------------
                       Original size      Compressed size Deduplicated size
All archives:                3.07 TB              1.96 TB            151.80 GB

                       Unique chunks         Total chunks
Chunk index:                 1026085             22285913
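
To give a concrete idea of the workflow (the paths and archive naming
here are hypothetical - see the borg docs for the details):

  borg init --encryption=repokey /mnt/backup/repo    # one-time repo setup
  borg create --compression auto,zstd,11 --stats \
        /mnt/backup/repo::'{hostname}-{now}' /home/dale/data
  borg prune --keep-daily 7 --keep-weekly 4 --keep-monthly 12 /mnt/backup/repo

Each borg create is incremental in practice: new data is split into
chunks and deduplicated against everything already in the repo, which
is where numbers like the ones above come from.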


Then there is my offline storage - it backs up ~15 hosts (in repos like the above) plus data storage like 22 years of email etc.  Each host backs up to its own repo, then the offline storage backs that up.  The deduplicated size is the actual on-disk size.  Compression varies, as it's whatever I used at the time each backup was taken - currently I have it set to "auto,zstd,11", but settings can be mixed in the same repo (a repo is a single backup set; you can nest repos, which is what I do - so ~45TB stored on a 4TB offline disk).  One advantage of a system like this is that chunked data rarely changes, so it's only the differences that get backed up (read the borgbackup docs - interesting).

------------------------------------------------------------------------------
                       Original size      Compressed size Deduplicated size
All archives:               28.69 TB             28.69 TB              3.81 TB

                       Unique chunks         Total chunks
Chunk index:
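
Restores are straightforward too - a sketch with a made-up archive name:

  borg list /mnt/backup/repo                        # show archive names
  borg extract /mnt/backup/repo::myhost-2022-08-15  # extracts into the current directory

Or mount a repo read-only with "borg mount" and copy files out like
any other filesystem.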


