On 9/2/23 12:15, Michel Verdier wrote:
> On 2023-09-02, Stefan Monnier wrote:
>> I switched to Bup a few years ago and saw a significant reduction in the
>> size of my backups that is partly due to the deduplication *between*
>> machines (I back up several Debian machines to the same backup
>> repository) as well as because the deduplication occurs even when I move
>> files around (most obvious when I move directories filled with large
>> files like videos or music).
>
> I set up deduplication between hosts with rsnapshot as you do. But it was
> a small gain in my case as the larger part was user data, logs and the
> like, which always differ between hosts. I gain only on system
> files, mainly /etc, as I don't back up binaries and libs.
>
> I almost never move large directories. But if needed it's easy to move
> them in the rsnapshot directories as well.
I have a SOHO LAN:
* My primary workstation is Debian Xfce on a 60GB 2.5" SATA SSD with 1G
boot, 1G swap, and 12G root partitions. It has one user (myself) with
minimal home data (e-mail and CVS working directories). I back up boot
and root.
* I keep the vast majority of my data on a FreeBSD server with Samba and
the CVS repository (via SSH) on a ZFS stripe of two mirrors containing
two 3TB 3.5" SATA HDDs each (i.e. 6TB RAID10). I back up the Samba data.
* I run rsync(1) and homebrew shell/Perl scripts on the server to
back up the various LAN sources to a backup destination file system tree on
the server. I have enabled ZFS compression on the pool and enabled
deduplication on the backup tree.
I ran some statistics for the daily driver backups in March. The
results were 4.9 GB backup size, 258 backups, 1.2 TB apparent total
backup storage, and 29.0 GB actual total backup storage. So, a savings
of about 42:1:
https://www.mail-archive.com/debian-user@lists.debian.org/msg789807.html
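For anyone who wants to reproduce the ratio arithmetic, here is a minimal sketch. The function names to_gib and savings_ratio are made up for this example, and it assumes the sizes use binary (1024-based) K/M/G/T suffixes as printed by du -h:

```shell
#!/bin/sh
# to_gib SIZE: convert a du-style size such as "1.2T" or "29.0G" to GiB.
# Assumption: suffixes are binary (1024-based) and limited to K, M, G, T.
to_gib() {
    awk -v s="$1" 'BEGIN {
        unit = substr(s, length(s), 1)
        num = substr(s, 1, length(s) - 1) + 0
        if (unit == "T") f = 1024
        else if (unit == "G") f = 1
        else if (unit == "M") f = 1 / 1024
        else if (unit == "K") f = 1 / (1024 * 1024)
        else { print "unknown unit: " unit > "/dev/stderr"; exit 1 }
        printf "%.4f\n", num * f
    }'
}

# savings_ratio APPARENT ACTUAL: print apparent/actual, rounded, as "N:1".
savings_ratio() {
    apparent=$(to_gib "$1") || return 1
    actual=$(to_gib "$2") || return 1
    awk -v a="$apparent" -v b="$actual" 'BEGIN { printf "%.0f:1\n", a / b }'
}

# March workstation figures: 1.2 TB apparent, 29.0 GB actual.
savings_ratio 1.2T 29.0G   # prints 42:1
```

The same function can be fed any pair of apparent and actual sizes from du or zfs output.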
Today, I collected some statistics for the backups of my data on the
file server:
2023-09-02 14:10:30 toor@f3 ~
# du -hsx /jail/samba/var/local/samba/dpchrist
693G /jail/samba/var/local/samba/dpchrist
2023-09-02 14:11:09 toor@f3 ~
# ls /jail/samba/var/local/samba/dpchrist/.zfs/snapshot | wc -l
98
2023-09-02 14:13:50 toor@f3 ~
# du -hs /jail/samba/var/local/samba/dpchrist/.zfs/snapshot
67T /jail/samba/var/local/samba/dpchrist/.zfs/snapshot
2023-09-02 14:19:24 toor@f3 ~
# zfs get compression,compressratio,dedup,used,usedbydataset,usedbysnapshots p3/ds2/samba/dpchrist | sort
NAME                   PROPERTY         VALUE  SOURCE
p3/ds2/samba/dpchrist  compression      lz4    inherited from p3
p3/ds2/samba/dpchrist  compressratio    1.02x  -
p3/ds2/samba/dpchrist  dedup            off    default
p3/ds2/samba/dpchrist  used             777G   -
p3/ds2/samba/dpchrist  usedbydataset    693G   -
p3/ds2/samba/dpchrist  usedbysnapshots  84.2G  -
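So, 693 GB backup size, 98 backups, 67 TB apparent total backup storage,
and 777 GB actual total backup storage, for a savings of about 88:1. With
dedup off and a compressratio of only 1.02x, nearly all of that comes from
the snapshots sharing unchanged blocks with the live dataset.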
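As a cross-check on the zfs figures, here is a rough sketch that parses the three space properties and shows how little the snapshots add on top of the live data. The function name snapshot_overhead is made up for this example, the rows are copied from the output above, and it assumes every value carries a G suffix:

```shell
#!/bin/sh
# Property rows copied from the `zfs get` output in this message.
zfs_output='p3/ds2/samba/dpchrist used 777G -
p3/ds2/samba/dpchrist usedbydataset 693G -
p3/ds2/samba/dpchrist usedbysnapshots 84.2G -'

# snapshot_overhead: read "NAME PROPERTY VALUE SOURCE" rows on stdin and
# print dataset+snapshots, the reported total, and the snapshot space as a
# percentage of the live dataset.
# Assumption: all values end in G (true for the rows above).
snapshot_overhead() {
    awk '
        { sub(/G$/, "", $3); v[$2] = $3 + 0 }
        END {
            printf "dataset+snapshots=%.1fG used=%.0fG overhead=%.1f%%\n",
                   v["usedbydataset"] + v["usedbysnapshots"], v["used"],
                   100 * v["usedbysnapshots"] / v["usedbydataset"]
        }'
}

printf '%s\n' "$zfs_output" | snapshot_overhead
```

On a live system, zfs get -Hp prints script-friendly rows with exact byte counts, which would avoid the suffix assumption, but the awk field handling would need adjusting for that.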
What statistics are other readers seeing for similar use-cases and their
backup solutions?
David