On Wednesday 08 January 2014 06:42:36 pm Alan Smith wrote:
> So now its down to rsync via chron, or software raid...
Now that I understand, I'd recommend you do what I do, or something along these lines: software RAID-1, with a twist.

Each physical disk has 3 partitions: one swap, one very basic minimal system, one everything else. This way disk reads are faster (the RAID driver can balance reads across both disks) but disk writes are slower (the same data must be written twice). Swap is striped across both disks, so swap is faster (if you're short on RAM). Either disk can function completely as a non-RAID stand-alone device, and/or can be accessed directly without a RAID driver in the kernel, in the event that becomes necessary. (Some day, some way, it will.)

The basic system is one very small RAID-1 partition (about 200 MB): /boot, /bin, /sbin, /etc, /lib, a number of mount points, and not much else. Things that essentially never change. That partition is also backed up on a bootable CD.

Everything else goes on a second RAID-1 partition that houses files that can, do, or might change with some frequency. That md partition is rsync'd to another machine as a backup.

RAID IS **NOT** A BACKUP! RAID will not help you if the motherboard fails, for instance. Periodically (daily, weekly) removing one of the physical disks, putting it on a shelf, and replacing it with another good disk does leave you with a backup on the shelf. rsync can do the same job, but it requires another machine on the network to house that backup.

The kernel maintains mdstat in /proc, which reflects the current health of the RAID md devices and is updated constantly. A cron job compares a recorded copy of mdstat to the live /proc/mdstat file every 5 minutes or so. If they are not identical, something has changed, and I want to know very quickly. The system starts screaming for attention, sending e-mails, flashing the screen, beeping the speaker... but it keeps on running off of the non-failed device.

If the disks were purchased and installed at the same time, you've got about a week to deal with it. Disks manufactured at the same time, with the same run time on them, tend to fail within about 8 days of each other on average. Maybe longer, but I wouldn't count on it.

Handled that way, I can replace the failed drive, reboot, and let the system rebuild the RAID with about 5 minutes of total off-line down time. Rebuilding a 2 terabyte RAID takes hours, but with a little planning the system is up and running while it happens.

If there is a spare disk in the machine, the RAID driver can swap out the failed disk all by itself, but that's a little advanced.

Rough sketches of each of those pieces follow; the device names, paths, and schedules in them are just examples, not gospel.
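First, a sketch of how that partition layout might be assembled with mdadm. The device names and partition numbers are assumptions, and I'm showing swap striped by giving both swap partitions equal priority; an md RAID-0 swap device would do the same job.

  # Sketch only: assumes /dev/sda and /dev/sdb are partitioned identically,
  # with sdX1 = swap, sdX2 = small system, sdX3 = everything else.

  # The small "never changes" system partition, mirrored.
  # metadata 1.0 keeps the md superblock at the end of the partition,
  # so either member can still be mounted or booted on its own.
  mdadm --create /dev/md0 --level=1 --raid-devices=2 --metadata=1.0 \
        /dev/sda2 /dev/sdb2

  # Everything else, mirrored the same way.
  mdadm --create /dev/md1 --level=1 --raid-devices=2 --metadata=1.0 \
        /dev/sda3 /dev/sdb3

  # Swap on both disks with equal priority, so the kernel stripes across them.
  mkswap /dev/sda1 && mkswap /dev/sdb1
  swapon -p 1 /dev/sda1
  swapon -p 1 /dev/sdb1
  # In /etc/fstab:  /dev/sda1  none  swap  sw,pri=1  0 0   (and the same for sdb1)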
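The rsync backup of the big partition to another machine can be a one-line cron job. The schedule, mount point, and remote host here are placeholders; aim it at wherever your changing data actually lives.

  # /etc/cron.d/raid-rsync : nightly push of the "everything else" filesystem
  # to another box.  Path, host name, and time are examples.
  10 3 * * *  root  rsync -aHx --delete /srv/ backuphost:/backups/thisbox/srv/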
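A minimal version of the mdstat watchdog might look like the script below: keep a known-good copy of /proc/mdstat, compare it to the live file from cron, and make noise if they differ. The file locations and the particular ways it screams are my choices, not a fixed recipe; mdadm's own --monitor mode can send the mail for you as well.

  #!/bin/sh
  # check-mdstat.sh : compare a saved copy of mdstat to the live one.
  # Run it from cron, for example:
  #   */5 * * * *  root  /usr/local/sbin/check-mdstat.sh
  # Paths and the alert address are examples.

  GOOD=/var/lib/raid-check/mdstat.good
  LIVE=/proc/mdstat

  # First run: record the current (presumed healthy) state and stop there.
  if [ ! -f "$GOOD" ]; then
      mkdir -p "$(dirname "$GOOD")"
      cp "$LIVE" "$GOOD"
      exit 0
  fi

  # Anything different means a disk dropped out, a rebuild started, etc.
  # Scream, but let the machine keep running on the surviving device.
  if ! cmp -s "$GOOD" "$LIVE"; then
      diff -u "$GOOD" "$LIVE" | mail -s "RAID state changed on $(hostname)" root
      logger -p daemon.crit "RAID state changed, check /proc/mdstat"
      printf '\a' > /dev/console 2>/dev/null    # beep the console if we can
  fi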
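And the replace-and-rebuild step, roughly. Here /dev/sdb stands in for whichever disk failed, and the partition-table copy assumes old-style MBR disks.

  # Mark the dying disk's partitions failed and pull them out of the arrays:
  mdadm /dev/md0 --fail /dev/sdb2 --remove /dev/sdb2
  mdadm /dev/md1 --fail /dev/sdb3 --remove /dev/sdb3

  # Power down, swap in the new disk, copy the survivor's partition table,
  # then add the new partitions back and let the kernel rebuild in the background:
  sfdisk -d /dev/sda | sfdisk /dev/sdb
  mdadm /dev/md0 --add /dev/sdb2
  mdadm /dev/md1 --add /dev/sdb3

  # Watch the rebuild; the machine stays up while it runs:
  cat /proc/mdstat

  # A hot spare, if there's a bay for one: adding an extra partition to a
  # healthy array leaves it as a spare, and md pulls it in automatically
  # when a member fails.
  mdadm /dev/md1 --add /dev/sdc3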