Hi Guillermo!
On 2024-08-21 12:40, Guillermo Rozas wrote:
> 4) ... but there is no means for BackupPC itself to *react* on the issue;
> it simply warns, but there is no recovery.
>
> Instead, even on a new backup when the original file is still available
> on one of the hosts, the server copy of the file is just taken to be "fine".
> I.e., BackupPC detects that a file with the same hash already exists in
> the pool, so it happily assumes there's no need to retransfer the file. It's
> simply kept as is, although it was detected as broken in step 3) - this
> information is not propagated to or used within subsequent backup jobs.
>
> *If* this is really correct, it's a serious resilience problem with
> BackupPC. At least, it fails any "principle of least surprise" check.
>
> Of course, BackupPC can't be expected to recreate data that is lost on
> both server and clients.
> But *some* remedial action can and should be taken. One somewhat sane
> reaction, IIUC, would be to move the broken file from the pool into some
> "quarantine/damaged" area (it could still be "mostly" okay and contain useful
> information after all, if anything else is lost).
>
> At the very least, the pool file should be marked as "suspicious" such
> that, if it is found on some host during a backup again, a fresh copy will be
> created in the pool. Or, if you are concerned about hash collisions, a new
> copy with _1 appended should be recreated and used for the new backups from
> this point onwards.
>
> The same approach should be taken for attrib files. Same logic: if the
> folder on the host remained unchanged, a new backup should recover any
> information that BackupPC_fsck detected as lost on the server.
>
> I'd totally understand if some manual intervention is required (stopping
> BackupPC, running some rescue commands etc.) - but from what I understand
> from Ghislain, there's nothing to help apart from microsurgery or creating an
> entirely new BackupPC instance, losing all history. And that's the opposite
> of rock-solid.
> I've yet to confirm, but my own experience from the last couple of weeks
> seems to support this observation.
>
>
> From https://backuppc.github.io/backuppc/BackupPC.html:
>
> <quote>
> "An rsync "full" backup now uses --checksum (instead of --ignore-times),
> which is much more efficient on the server side - the server just needs to
> check the full-file checksum computed by the client, together with the mtime,
> nlinks, size attributes, to see if the file has changed. If you want a more
> conservative approach, you can change it back to --ignore-times, which
> requires the server to send block checksums to the client."
> </quote>
>
> By default V4 uses --checksum for full backups, but that has the (slight)
> risk of missing file corruption on the server because it trusts the hash
> calculated the first time it got the file (that's why I wrote the script I
> mentioned before).
Haha, thanks for the

    # Thanks
    To Alexander Kobel that originally gave me the idea of the script. To
    Craig Barratt for the great piece of software that is BackupPC.

there! ;-)
> If you change it back to --ignore-times it will re-test the server files by
> comparing block-by-block checksums with the ones on the client. If a file is
> corrupted on the server, this will detect the difference and update it as a
> "new version" of the file from then on. However, --ignore-times is MUCH
> slower than --checksum, so you may want to run it only in response to a
> corruption suspicion and not regularly.
That's quite a reasonable explanation, yes. Nevertheless, I wonder whether the
proper reaction for BackupPC_fsck shouldn't be to move an obviously invalid
file out of the way (i.e. to a "quarantine" location or similar) to enforce
that check - after all, the corruption has already been detected and confirmed.
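For anyone who wants to flip that switch: if I read the v4 config.pl
correctly, the relevant knob is $Conf{RsyncFullArgsExtra} - please
double-check against your own config before relying on this:

```
# BackupPC v4 default: trust the client's full-file checksum on fulls.
$Conf{RsyncFullArgsExtra} = ['--checksum'];

# Conservative alternative: block-by-block comparison, which re-reads
# every file on the server side - much slower, but it detects and
# repairs silent pool corruption as Guillermo describes.
#$Conf{RsyncFullArgsExtra} = ['--ignore-times'];
```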
@ Ghislain: So the assumption is that either running a backup with
--ignore-times (instead of --checksum) or deleting the corrupted pool files
should serve the same purpose (recreation from the host).
I can confirm with my installation that deleted pool files are properly
recreated (at least the regular data files; I'm not sure about attrib files);
perhaps you can now confirm what happens for a backup with --ignore-times...
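For the record, my "confirm" test was essentially the following sketch. It
assumes the uncompressed pool/ (cpool/ files would need BackupPC_zcat first)
and that a v4 pool file is named after the MD5 digest of its contents - both
my assumptions, so take it with a grain of salt (e.g. collision chains with a
numeric suffix would show up as false positives):

```shell
# Hypothetical integrity walk: flag every pool file whose name does not
# match the MD5 digest of its contents. Flagged files are candidates for
# deletion/quarantine, so the next backup re-fetches them from the host.
verify_pool() {
    pool=$1
    find "$pool" -type f | while read -r f; do
        name=$(basename "$f")
        sum=$(md5sum "$f" | cut -d' ' -f1)
        [ "$name" = "$sum" ] || echo "MISMATCH: $f (md5 $sum)"
    done
}
```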
Cheers,
Alex
_______________________________________________
BackupPC-users mailing list
[email protected]
List: https://lists.sourceforge.net/lists/listinfo/backuppc-users
Wiki: https://github.com/backuppc/backuppc/wiki
Project: https://backuppc.github.io/backuppc/