On 04/11/19 15:41, deloptes wrote:
Not sure if true - for example you make daily, weekly and monthly backups
(classical). Let's focus on the daily part. On day 3 the file is broken.
You have to recover from day 2. The file is not broken for day 2 - correct?!
If I'm not wrong, deduplication "is a technique for eliminating duplicate
copies of repeating data".
I'm not a borg expert, but as far as I know it performs deduplication at
the data chunk level.
Suppose that you back up 2000 files in a day and inside this backup one
chunk is deduplicated and referenced by 300 files. If that stored chunk
is broken, I think you will lose it in all 300 files that reference it.
This is not good for me.
If your main dataset has a broken file, no problem: you can recover it
from backups.
If your saved deduplicated chunk is broken, all the files that reference
it could be broken. I also think that the same chunk will be reused by
successive backups (again because of deduplication), so this single chunk
could be shared from backup1 to backupN.
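
To make the concern concrete, here is a minimal Python sketch of a toy
chunk store. It is only an illustration, not how borg is implemented: the
fixed 4-byte chunk size, the function names and the file names are all
made up for the example. It stores 300 files that share one chunk, damages
that single stored chunk, and counts how many files no longer restore
correctly.

```python
import hashlib

CHUNK_SIZE = 4  # tiny fixed-size chunks, chosen only for this illustration


def store_file(data, chunks):
    """Split data into chunks, keep each unique chunk once, return its chunk ids."""
    ids = []
    for i in range(0, len(data), CHUNK_SIZE):
        piece = data[i:i + CHUNK_SIZE]
        cid = hashlib.sha256(piece).hexdigest()
        chunks.setdefault(cid, piece)  # deduplication: an existing chunk is reused
        ids.append(cid)
    return ids


def restore_file(ids, chunks):
    """Rebuild a file by concatenating the chunks it references."""
    return b"".join(chunks[cid] for cid in ids)


# 300 hypothetical files that all start with the same 4 bytes, plus one unrelated file.
originals = {f"file{n}": b"AAAA" + f"{n:04d}".encode() for n in range(300)}
originals["other"] = b"ZZZZ"

chunks = {}
index = {name: store_file(data, chunks) for name, data in originals.items()}
unique_chunks = len(chunks)

# Simulate damage to the one stored copy of the shared chunk.
chunks[hashlib.sha256(b"AAAA").hexdigest()] = b"AAAX"

broken = [name for name, ids in index.items()
          if restore_file(ids, chunks) != originals[name]]
print(unique_chunks, len(broken))  # prints: 302 300
```

In this toy model one damaged chunk breaks every file that references it,
while the unrelated file still restores fine.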
It also has an integrity check, but I don't know if it covers this case.
I also read that an integrity check on a large dataset could take too
much time.
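
As a rough idea of why such a check can be slow, here is a small Python
sketch of a generic verification pass over a content-addressed chunk
store. This is an assumption about how a check of this kind could work,
not a description of borg's actual check: the verify function and the
sample chunks are invented for the example. The point is that every stored
byte has to be read and hashed again, so the cost grows with the size of
the repository.

```python
import hashlib


def verify(chunks):
    """Re-hash every stored chunk; return (ids of damaged chunks, bytes read)."""
    damaged = []
    bytes_read = 0
    for cid, piece in chunks.items():
        bytes_read += len(piece)  # the whole store must be read once
        if hashlib.sha256(piece).hexdigest() != cid:
            damaged.append(cid)
    return damaged, bytes_read


# Hypothetical store: chunk id (SHA-256 of the content) -> chunk content.
pieces = [b"alpha", b"bravo", b"charlie"]
chunks = {hashlib.sha256(p).hexdigest(): p for p in pieces}

# Simulate silent corruption of one stored chunk.
bad_id = hashlib.sha256(b"bravo").hexdigest()
chunks[bad_id] = b"brav0"

damaged, bytes_read = verify(chunks)
print(damaged == [bad_id], bytes_read)  # prints: True 17
```

A check like this can tell you that a chunk is damaged, but it cannot
repair it unless another good copy exists somewhere.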
In my mind a backup is a copy of a file at a point in time, and if needed
another copy can be picked from another point in time, but it should not
be a reference to a previous copy. Today there are people who make
backups on tape (expensive) for reliability. I run backups on disks.
Disks are cheap, so compression (which costs time on backup and restore)
and deduplication (which adds complexity) are not needed for me. They
don't really matter for my free disk space, because I can add a disk.
Rsnapshot uses hardlinks, which is similar: unchanged files share one
copy on disk across snapshots.
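
A small Python sketch of the hardlink case, using only the standard
library. The directory names daily.0 and daily.1 follow rsnapshot's naming
style, but the file name and contents are invented, and this does not run
rsnapshot itself. It shows that two hardlinked "snapshots" of an unchanged
file are one inode, so damaging the data in place damages both.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as root:
    day1 = os.path.join(root, "daily.1")  # older snapshot
    day0 = os.path.join(root, "daily.0")  # newer snapshot
    os.mkdir(day1)
    os.mkdir(day0)

    old = os.path.join(day1, "report.txt")
    with open(old, "w") as f:
        f.write("good data")

    # The file did not change, so the newer snapshot only gets a hardlink.
    new = os.path.join(day0, "report.txt")
    os.link(old, new)

    same_inode = os.stat(old).st_ino == os.stat(new).st_ino
    link_count = os.stat(old).st_nlink

    # Simulate in-place damage through one of the two names.
    with open(new, "r+") as f:
        f.write("BAD!")

    with open(old) as f:
        old_content = f.read()

print(same_inode, link_count, old_content)  # prints: True 2 BAD! data
```

The older snapshot shows the damage too, because both names point to the
same data on disk.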
All these solutions are valid if they fit your needs. You must decide how
important the data inside your backups is, and whether losing a
deduplicated chunk could damage your backup dataset across its timeline.
Ah, if you have multiple servers to back up, I prefer bacula because it
can pull data from the hosts and can back up multiple servers from a
single point (maybe using a separate bacula-sd daemon with dedicated
storage for each client).