Just putting out a little feeler about a package I started writing last night. I'm wondering about its usefulness, whether something like it already exists, and overall interest. It's designed for mod_perl use; it doesn't make much sense otherwise.
I don't want to go into too many details here, but File::Redundant takes some unique word (hopefully guaranteed unique through a database: a mailbox, a username, a website, etc.), which I call a "thing", a pool of dirs, and the number of $copies you would like to maintain. From the pool of dirs, $copies good dirs are chosen, ordered by percent full on the given partition.

When you open a file with my open method (along with close, the only override method I have written so far), you get a file handle. Do what you like with it. When you close the file handle with my close method, I CORE::close the file and use Rob Brown's File::DirSync to sync it to all the directories. DirSync uses timestamps to sync changes between directory trees very quickly. When a dir can't be reached (the box is down, or what have you), $copies good dirs are re-chosen and the dirsync happens from the good old data to the new good dirs. If too much goes down at once, you're sorta out of luck, but you would have been without my system anyway. I would write override methods for everything (within reason) you do to a file: open, close, unlink, rename, stat, etc.

So who cares? Well, this system would make it quite easy to keep track of an arbitrarily large amount of data. The pool of dirs could be mounts from any number of boxes, remote or otherwise, and you could sync accordingly. If File::DirSync gets to the point where it can sync over ftp or scp, all the better.

There are race conditions all over the place, and I plan on transactionalizing where I can. The whole system depends on how long the dirsync takes, and in my experience dirsync is very fast. Likely I would have dirsync'ing daemon(s), dirsync'ing as fast as they can. If the daemons keep up, the most data that would ever get lost is whatever changed during one dirsync (usually less than a second, even for very large amounts of data), and that loss would only happen if you were making changes on a dir as the dir went down.
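To make the interface concrete, here is a sketch of how calling code might look. File::Redundant is unreleased, so the constructor arguments and method names below (new, open, close) are my guesses at the API described above, not a real interface:

```perl
use strict;
use warnings;

# Hypothetical usage -- File::Redundant is unreleased, so everything
# here is an assumption based on the description in the post.
use File::Redundant;

my $r = File::Redundant->new(
    thing  => 'jdoe-mailbox',                       # the unique word
    pool   => [qw(/mnt/disk1 /mnt/disk2 /mnt/disk3 /mnt/disk4)],
    copies => 2,                                    # copies to maintain
);

# open() would pick $copies good dirs and hand back an ordinary
# file handle; you use it like any other handle.
my $fh = $r->open('inbox/msg.12345', '>')
    or die "open failed: $!";
print $fh "some mail data\n";

# close() would CORE::close the file, then run File::DirSync to
# propagate the change to the other chosen dirs.
$r->close($fh);
```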
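The "ordered by percent full" selection could be as simple as parsing `df` output. A minimal sketch (the pool paths are made up, and real code would also do a test write to confirm each dir is actually usable, not just present):

```perl
use strict;
use warnings;

# Pick $copies dirs from the pool, least-full partition first.
# Parses `df -P` output; skips dirs that can't be reached.
sub choose_dirs {
    my ($pool, $copies) = @_;
    my %pct;
    for my $dir (@$pool) {
        next unless -d $dir;                      # skip unreachable dirs
        my ($line) = (`df -P $dir`)[1] or next;   # data is on the 2nd line
        my ($use)  = $line =~ /(\d+)%/;           # the "Use%" column
        $pct{$dir} = $use if defined $use;
    }
    my @good = sort { $pct{$a} <=> $pct{$b} } keys %pct;
    $copies = @good if $copies > @good;           # don't overrun the list
    return @good[0 .. $copies - 1];
}

my @chosen = choose_dirs([qw(/mnt/disk1 /mnt/disk2 /mnt/disk3)], 2);
print "chosen: @chosen\n";
```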
I would try to deal with boxes coming back up, and keep everything clean as best I could. So it would be a work in progress, and hopefully it would get better as I went, but I would at least like to give it a shot.

Earl