Kubota, the "Veritas Storage Foundation™ Cluster File System Administrator's Guide" explains it all; read the sections on "Split-brain and jeopardy handling" and "Fencing".
Thanks.

On Tue, Jan 18, 2011 at 6:26 AM, Kubota, Harald <harald.kub...@baml.com> wrote:

> Hi,
>
> We have thousands of clusters in our company and use VCS a lot, and in at
> least 2 out of 3 regions there is no I/O fencing happening at all.
>
> We rely solely, and so far quite successfully, on heartbeat links.
>
> We recently had an issue which would have been handled better if we had
> I/O fencing implemented: in one cluster the CPU got too busy, so no network
> replies were received. The rest of the cluster thought it was dead and
> brought online all resources of the service groups running on the busy
> node, et voila: 2 nodes mounting the same filesystem. The busy node was not
> too busy to write some stuff to disk. Once it was less busy, the expected
> concurrency violation appeared and all was restored, but not until the
> filesystem got corrupted. Restore from tape fixed it, but that was not fun
> and very time intensive.
>
> That sounds like I/O fencing is THE way to go, except this is the very
> first time this was seen, and I wonder if adding I/O fencing to all
> clusters makes sense: while it reduces the (small) risk of this happening,
> it adds a (small) complexity to a cluster design, which potentially causes
> a lot of unnecessary reboots.
>
> What is the general recommendation for I/O fencing via SCSI reservations
> (which I understand is what VCS implements)?
>
> Recommended to do? Optional? Obsolete with sufficient heartbeat links?
> Dangerous and not recommended?
>
> Harald
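
To make the scenario quoted above concrete, here is a rough toy model in Python. It is only a sketch under simplifying assumptions, not how VCS or SCSI reservations are actually implemented: the SharedDisk class, its register/preempt/write methods, and the node names are all made up for illustration. It shows why a lost heartbeat alone cannot stop a slow-but-alive node from writing, and how ejecting that node's key from the shared disk would block the stale write.

```python
# Toy model only: class and method names are hypothetical, not a VCS or SCSI API.

class SharedDisk:
    """A disk that, with fencing on, accepts writes only from registered nodes."""

    def __init__(self, fencing_enabled):
        self.fencing_enabled = fencing_enabled
        self.registered = set()   # keys of nodes currently allowed to write
        self.writes = []          # (node, data) pairs that reached the disk

    def register(self, node):
        self.registered.add(node)

    def preempt(self, winner, loser):
        """Winner ejects loser's key; only a registered node may do this."""
        if winner in self.registered:
            self.registered.discard(loser)

    def write(self, node, data):
        if self.fencing_enabled and node not in self.registered:
            return False          # write rejected: node was fenced off
        self.writes.append((node, data))
        return True


def simulate(fencing_enabled):
    disk = SharedDisk(fencing_enabled)
    for node in ("node_a", "node_b"):
        disk.register(node)

    # node_a is too busy to answer heartbeats, so node_b declares it dead.
    if fencing_enabled:
        disk.preempt(winner="node_b", loser="node_a")

    # node_b takes over the service group and starts writing.
    disk.write("node_b", "takeover write")
    # node_a was never actually dead and flushes its pending data.
    stale_ok = disk.write("node_a", "stale write")

    writers = {node for node, _ in disk.writes}
    return stale_ok, writers


if __name__ == "__main__":
    for enabled in (False, True):
        stale_ok, writers = simulate(enabled)
        print(f"fencing={enabled}: stale write accepted={stale_ok}, "
              f"writers={sorted(writers)}")
```

Without fencing both nodes end up writing to the same disk, which is the corruption case described in the quoted message; with fencing the stale write from the busy node is refused.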
_______________________________________________
Veritas-ha maillist - Veritas-ha@mailman.eng.auburn.edu
http://mailman.eng.auburn.edu/mailman/listinfo/veritas-ha