On Tue, Jul 12, 2005 at 04:18:28PM -0600, Chris Saunders wrote:
>
> I will be the only one doing these backups and I would like the process
> to be as automated as possible. Our linux boxes run traffic simulation
> models that can take 25 to 30 hours to run. Our engineers relentlessly
> run these boxes almost non-stop. Our simulations are extremely heavy on
> the I/O, so running a backup while a simulation is running is out of the
> question. And since these simulations take so long to run, telling my
> engineers they cannot use the machine over the weekend because of a
> scheduled backup would not work at all.
>
> So I need a smart backup plan that can adjust to different schedules
> based on machine availability, but at the end of the week have a dump of
> each machine. Any ideas?
> Is there a way for AMANDA to determine if a machine is busy or idle and
> do the backup only when it is idle? On the same token can it pause its
> backup of that box if another simulation is started?
Amanda does not have the ability to work around other processing happening on the clients. I'm not aware of any backup software that does.

The typical method of handling a compute farm such as this is to arrange for the compute nodes to not have any data on them that needs to be backed up in the first place. Input data for the jobs gets copied from a remote fileserver to local disk, the job runs, the output gets copied back to the fileserver, and the input and intermediate files get removed from the local disk.

Failure recovery of the compute nodes is handled by an auto-install setup (kickstart, autoyast, jumpstart, etc.). Disk died? Replace disk, boot from network, and reinstall hands off. The key here is getting the postinstall scripts complete to the point where you can kick off the install and forget about it, knowing that the system will get configured exactly the way it needs to be without any further intervention.
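
For what it's worth, the copy-in, run, copy-out, clean-up cycle described above could be wrapped around each simulation with something like the following. This is only a rough sketch, not anything Amanda provides: the function name run_staged_job, its arguments, and the input/ and output/ subdirectory layout are all made up for illustration, and it assumes the fileserver is mounted on the compute node (NFS or similar) and that Python 3.8 or later is available.

```python
import shutil
import subprocess
import tempfile
from pathlib import Path


def run_staged_job(input_dir, output_dir, command, scratch_root=None):
    """Stage inputs to local disk, run the job, copy results back, clean up.

    All names here are hypothetical. input_dir and output_dir stand for
    directories on the remote fileserver; command is the simulation's
    argument list, run with the scratch directory as its working directory
    and expected to read from ./input and write to ./output.
    """
    input_dir = Path(input_dir)
    output_dir = Path(output_dir)

    # Private scratch directory on the compute node's local disk.
    scratch = Path(tempfile.mkdtemp(prefix="job-", dir=scratch_root))
    try:
        local_in = scratch / "input"
        local_out = scratch / "output"

        # 1. Copy input data from the fileserver to local disk.
        shutil.copytree(input_dir, local_in)
        local_out.mkdir()

        # 2. Run the job against local disk only; raises if it exits non-zero.
        subprocess.run(command, cwd=scratch, check=True)

        # 3. Copy the output back to the fileserver.
        shutil.copytree(local_out, output_dir, dirs_exist_ok=True)
    finally:
        # 4. Remove input and intermediate files, even if the job failed,
        #    so nothing left on the node needs to be backed up.
        shutil.rmtree(scratch, ignore_errors=True)
```

With a wrapper along these lines, only the fileserver needs to be in the Amanda disklist, and it can be dumped on a normal schedule without touching the compute nodes' local disks. Note that a failed job loses its partial output in this sketch; whether that is acceptable for a 30-hour run is a local decision.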