I'm running a 2-node failover cluster with a fairly traditional configuration: DRBD to synchronize data across the two nodes, and Pacemaker to fail over Xen virtual machines. Underneath it all, each node runs software RAID10.
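For context, the per-node layering is roughly as follows (a sketch only; disk names, device paths, and the resource name are placeholders, not my actual layout):

```shell
# Four local disks -> md RAID10 -> DRBD -> Xen VM disk.
mdadm --create /dev/md0 --level=10 --raid-devices=4 \
    /dev/sda1 /dev/sdb1 /dev/sdc1 /dev/sdd1

# /etc/drbd.d/r0.res then points DRBD at the md device, e.g.:
#   resource r0 {
#     device /dev/drbd0;
#     disk   /dev/md0;
#     ...
#   }
drbdadm create-md r0 && drbdadm up r0
```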
I've been thinking of adding a couple more servers to my rack and looking for ways to set up a more general migration/failover scheme, and I keep coming back to DRBD as a limitation: even with a cluster file system, without a SAN it looks like synchronizing two nodes is my limit. That leads me to wonder about spreading a software RAID volume (md) across drives published via AoE or iSCSI, configured to survive a node failure that takes multiple drives offline -- which in turn raises several questions about how to assemble and manage the resulting RAID device.

Assuming a separate GigE or 10GigE network linking all the nodes, onto which all nodes publish their drives:

1. The simple case would be to assemble and mount the RAID device on the same node as the VM, then use Pacemaker to manage the md device. There seem to be several resource agents for managing RAID devices, but each one appears to be either hedged with lots of caveats, deprecated, or both. So... has anybody done this? Does it work? How's the performance?

2. An alternate model would be to use a cluster file system: keep all the components mounted on all nodes in the cluster, with failover happening automatically if a node and its drives die. But I'm at a loss on how to proceed here. Suggestions? Pointers to learning resources?

Thanks much,

Miles Fidelman

-- 
In theory, there is no difference between theory and practice.
In practice, there is. .... Yogi Berra

_______________________________________________
Linux-HA mailing list
Linux-HA@lists.linux-ha.org
http://lists.linux-ha.org/mailman/listinfo/linux-ha
See also: http://linux-ha.org/ReportingProblems
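To make question 1 concrete, here is the rough shape I have in mind, using AoE as the transport (a sketch only: interface names, shelf/slot numbers, and device paths are placeholders, and ocf:heartbeat:Raid1 is one of the caveat-laden resource agents I mentioned):

```shell
# On each storage node: publish a local disk over AoE on the storage network.
# vblade <shelf> <slot> <interface> <device> -- numbers and devices are placeholders.
vblade 0 1 eth1 /dev/sdb

# On the node hosting the VM: the exported drives appear as /dev/etherd/eX.Y.
# Assemble an md RAID10 across drives from different nodes, so losing one node
# (and its drive) leaves the array degraded but running.
mdadm --create /dev/md1 --level=10 --raid-devices=4 \
    /dev/etherd/e0.1 /dev/etherd/e1.1 /dev/etherd/e2.1 /dev/etherd/e3.1
mdadm --examine --scan >> /etc/mdadm.conf

# Hand the array to Pacemaker via the Raid1 resource agent (despite the name,
# it starts/stops mdadm-assembled arrays generally):
crm configure primitive p_md1 ocf:heartbeat:Raid1 \
    params raidconf="/etc/mdadm.conf" raiddev="/dev/md1" \
    op monitor interval=30s
```

The md device would then be grouped/ordered with the VM's filesystem and Xen resources so the whole stack fails over together.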