Paul Koning via cctalk writes:
> Anything worth having around deserves backup.  Which makes me wonder
> -- how is Wikipedia backed up?  I guess it has a fork, which isn't
> quite the same thing.  I know Bitsavers is replicated in a number of
> places.  And one argument in favor of GIT is that every workspace is a
> full backup of the original, history and all.
>
> One should worry for smaller scale efforts, though.

This is a problem I think about a lot. In the early 2000s I worked on
the LOCKSS program at Stanford University. LOCKSS stands for "Lots Of
Copies Keep Stuff Safe", and is a distributed network of servers that
replicate backup copies of electronic academic journals. It stemmed
from a research project that looked at how to design an
attack-resistant peer-to-peer digital archival network.

Each node in the network keeps a copy of the original journal content,
computes a cryptographic hash of each resource (HTML page, image, PDF,
etc.), and participates in a steady stream of polls with all the other
nodes, in which they vote on the hashes. If a minority of nodes loses
a poll, their content is assumed to be damaged, missing, or bad, and
they re-replicate the content from the winners of the poll.

It's designed as a "dark" archive, meaning the data is there, but
nobody tries to access it unless the original web content disappears.
Then the servers act as transparent web proxies, so when you hit the
original URL or URI, they serve up the content that's now missing from
the real public Internet.

It's a neat idea. It's also open source, and unencumbered by patents.
I've always thought a similar model could be used to archive and
replicate just about anything, but it's just one of those things that
nobody's ever gotten around to doing.

> paul

-Seth

-- 
Seth Morabito
Poulsbo, WA, USA
w...@loomcom.com
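P.S. For the curious, the poll-and-repair idea at the heart of the
design can be sketched in a few lines of Python. This is only a toy
illustration of majority-hash voting over one resource, not the actual
LOCKSS protocol, and all the names in it are invented:

```python
import hashlib
from collections import Counter

def sha256(data: bytes) -> str:
    """Cryptographic hash each node computes over its copy of a resource."""
    return hashlib.sha256(data).hexdigest()

def run_poll(nodes: dict[str, bytes]) -> dict[str, bytes]:
    """Each node 'votes' with the hash of its copy; nodes whose hash
    loses the poll repair their copy from a node in the majority."""
    votes = Counter(sha256(copy) for copy in nodes.values())
    winning_hash, _ = votes.most_common(1)[0]
    # Any node holding the winning content can serve as a repair source.
    good_copy = next(c for c in nodes.values() if sha256(c) == winning_hash)
    return {name: (copy if sha256(copy) == winning_hash else good_copy)
            for name, copy in nodes.items()}

# Three nodes hold the same journal page; one copy has rotted on disk.
nodes = {
    "node-a": b"<html>Journal of Examples, Vol. 1</html>",
    "node-b": b"<html>Journal of Examples, Vol. 1</html>",
    "node-c": b"<html>Jxurnal of Examples, Vol. 1</html>",  # bit rot
}
repaired = run_poll(nodes)
assert repaired["node-c"] == repaired["node-a"]
```

The real system repeats polls like this continuously over every
resource, which is what lets damage be detected and healed long after
it happens.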