On Wed, Nov 22, 2023 at 04:16:00AM +0100, i...@tutanota.com wrote:
> Running disks in RAID1 or RAID5 (pick your poison) with softraid.
>
> Then for every important big file use something like par2cmdline to
> create parity data.
>
> par2cmdline can be used to verify and re-create files.
>
> I would perhaps also create simple checksums for files as well, because
> that's faster to run through a script, checking all files, than
> par2verify.
>
> For smaller files, perhaps put them into a version control system with
> integrity checking and parity rather than the above.
>
> Of course backup is essential, it's not about that.
>
> Running a script that checks all checksums is a "poor man's" version of
> ZFS scrubbing. If bit rot is found, repair the file with par2 parity.
>
> For send/receive, if needed, I think rsync is adequate as it also uses
> checksums to validate the transfer of files.
>
> Any feedback? Do you do something similar on OpenBSD?
We have been doing "something similar", in fact much simpler, on OpenBSD
and other unix-like systems for > 25 years. It's trivially simple to
protect your data, and you, along with 99.999% of other people, seem to
be overthinking it.

1. Once data is no longer "work in progress", archive it to write-once
   media and take it out of the regular backup loop. In most cases this
   drastically reduces the volume of data you need to manage. Feel free
   to keep a local on-line copy on a regular disk too for faster access.

2. Write scripts to copy data that matters elsewhere automatically.
   This can be another drive in the local machine, or even another
   partition on the same disk. This takes the place of your "RAID-1 or
   RAID-5", and is much, much less error-prone because it's just
   copying files around.

3. Write a script to verify the copy against the original version and
   highlight changes. (Ours is 18 lines of shell script.)

4. Write a script to create and verify a file of checksums in the
   current directory. (Also not complicated - ours is 15 lines of
   shell script.)

We have kept many TB of data intact and free of bitrot for decades using
simple methods like this. No need for fancy filesystems or command-line
parity tools: just use tar, sha256 and, crucially, a little bit of
intelligence, and the problem is solved.

And yes, we have certainly seen bit flips when reading from disks,
reading from SSDs (which overall seem _worse_ for random unreported bit
flipping), and from bad system RAM corrupting data in the buffer cache.
All of these threats are easily mitigated with tar and sha256, plus the
aforementioned application of some intelligence to the problem. The only
problem is that it doesn't have a flashy name like "ZFS".
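To make step 1 concrete: a sketch of archiving a finished directory with
tar and recording a checksum so the archive itself can be verified later.
The function names are my own, and I use GNU sha256sum here; on OpenBSD
the native equivalent is sha256(1).

```shell
# Bundle a finished project directory into a compressed tar archive
# and write a sha256 checksum alongside it (hypothetical helper names).
archive_dir() {
    dir=$1 out=$2
    tar -czf "$out" -C "$(dirname "$dir")" "$(basename "$dir")"
    sha256sum "$out" > "$out.sha256"
}

# Later (or periodically): confirm the archive still matches its checksum.
verify_archive() {
    sha256sum -c --quiet "$1.sha256"
}
```

Usage: `archive_dir ~/projects/foo /archive/foo.tar.gz`, then
`verify_archive /archive/foo.tar.gz` any time you want to check it.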
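The poster's 18-line verification script (step 3) isn't shown, but the
idea might look something like this sketch; the function name and output
message are mine, not theirs.

```shell
# Compare every file under $1 against its counterpart under $2 and
# report anything that differs or is missing from the copy.
verify_copy() {
    ( cd "$1" && find . -type f | sort ) | while read -r f; do
        cmp -s "$1/$f" "$2/$f" || echo "DIFFERS OR MISSING: $f"
    done
}
```

Silence means the copy is intact; any output names a file to investigate.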
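And a sketch of step 4, the per-directory checksum file. Again the names
are hypothetical and I use GNU sha256sum; on OpenBSD you would reach for
sha256(1) and "sha256 -c" instead.

```shell
# Create a CHECKSUM file covering every file in a directory tree
# (excluding the CHECKSUM file itself).
make_sums() {
    ( cd "$1" && find . -type f ! -name CHECKSUM -exec sha256sum {} + \
        > CHECKSUM )
}

# Verify the directory against its CHECKSUM file; nonzero exit and a
# FAILED line for any file that has rotted.
check_sums() {
    ( cd "$1" && sha256sum -c --quiet CHECKSUM )
}
```

Running `check_sums` over each archived directory from cron gives you the
"poor man's scrub" the original poster described.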