On 2023-10-12, Chris Lamb wrote:
>> In the meantime, I worked on a naive implementation of this, using
>> debmirror and btrfs snapshots (zfs or xfs are other likely candidates
>> for filesystem-level snapshots). It is working better than I expected!
> […]
>> Currently weighing in at about 550GB, each snapshot of the archive for
>> amd64+all+source is weighing in under 330GB if I recall correctly... so
>> that is over a month worth of snapshots for the cost of about two full
>> snapshots. Obviously, adding more architectures would dramatically
>> increase the space used (Would probably add arm64, armhf, i386, ppc64el
>> and riscv64 if I were to do this again).
>
> This sounds like great progress. :) Do you have any updates since you
> posted your message?

It's still running! And now I have one running on an xfs filesystem,
and one on btrfs. Only the xfs one is publicly available, via:

  http://snapshot.reproducible-builds.org/snapshot-experiment

That one only started earlier this month, but in theory it could pull
in the updates from the btrfs snapshots for a little more redundancy.

I also managed to backfill some older generations from
snapshot.debian.org, maybe as far back as July? Those are only
available on the btrfs implementation, which is currently not publicly
accessible. I could probably set up some proxy to make the ones on
btrfs available publicly too.

> (Are you snapshotting after each dinstall and labelling them with some
> timestamp…? Or perhaps you have some other, cleverer, scheme?)

The timestamp I am using is the most recent timestamp from any relevant
Release file. This way, the timestamp for any given mirror state
(presuming you are mirroring the same distributions and architectures)
should match if you had two snapshots running independently.

For the main repositories (e.g. not security or incoming), I am syncing
from a leaf mirror that happens to be very close on the network, so I
just schedule it to run from cron roughly when I expect the leaf mirror
to be finished updating, and then run a second pass some hours later
just in case, so we are less likely to miss a snapshot generation or
get an incomplete generation.

I really want to avoid missing snapshots or partial snapshots; that
could certainly use some more solid error checking, as it mostly relies
on debmirror doing the right thing.

It is also currently missing debian-installer images, though I *think*
that would be reasonably easy to add by passing more arguments to
debmirror. For the first proof-of-concept I focused on .deb and .udeb
packages, to be able to rebuild packages.

live well,
  vagrant
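
For illustration, the timestamp scheme described above (take the most
recent timestamp from any relevant Release file) could be sketched
roughly like this. It is only a sketch under assumptions, not the code
running on the experiment: the function names are hypothetical, the
YYYYMMDDTHHMMSSZ label format is an assumption, and it assumes each
Release file carries a "Date:" field in the usual RFC 2822-style form.

```python
from datetime import timezone
from email.utils import parsedate_to_datetime


def release_date(release_text):
    """Return the Date: field of one Release file as a UTC datetime, or None."""
    for line in release_text.splitlines():
        if line.startswith("Date:"):
            parsed = parsedate_to_datetime(line[len("Date:"):].strip())
            if parsed.tzinfo is None:
                # Assumption: a Date: without a usable zone is treated as UTC.
                parsed = parsed.replace(tzinfo=timezone.utc)
            return parsed.astimezone(timezone.utc)
    return None


def snapshot_label(release_texts):
    """Label a mirror state by the newest Date: among its Release files.

    The label format used here is an assumption for illustration only.
    """
    dates = [d for d in map(release_date, release_texts) if d is not None]
    if not dates:
        raise ValueError("no Release file carried a Date: field")
    return max(dates).strftime("%Y%m%dT%H%M%SZ")
```

Because the label depends only on the contents of the Release files, two
independent mirrors of the same distributions and architectures would
compute the same label for the same mirror state, whenever each one
happened to run its sync.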