>>> Jehan-Guillaume de Rorthais <j...@dalibo.com> wrote on 22.07.2021 at 12:05 in
message <20210722120537.0d65c2a1@firost>:
> On Wed, 21 Jul 2021 22:02:21 -0400
> "Frank D. Engel, Jr." <fde...@fjrhome.net> wrote:
>
>> In OpenVMS, the kernel is aware of the cluster. As is mentioned in that
>> presentation, it actually stops processes from running and blocks access
>> to clustered storage when quorum is lost, and resumes them appropriately
>> once it is re-established.
>>
>> In other words... no reboot, no "death" of the cluster node or special
>> arrangements with storage hardware... If connectivity is restored, the
>> services are simply resumed.
>
> Well, when losing the quorum, by default Pacemaker stops its local resources.

But when a node without quorum performs any action, it may corrupt data (e.g.
writing to a non-shared filesystem like ext3 on a shared medium like iSCSI or
FC SAN). IMHO the only safe action when losing quorum is to stop any action
immediately. That does NOT mean to STOP resources; instead it means "immediate
death", probably even without syncing disks.

> Considering a clustered storage, the resources are the lock manager, iscsi or
> some other means, FS etc.
>
> However, if the resources' stop actions don't succeed, THEN the node resets
> itself. Should your cluster have active fencing, the node might be reset by
> some external means.
>
> As Digimer wrote, «Quorum is a tool for when things are working predictably».
> To do some rewording in regard to the current topic: if Pacemaker is able to
> stop its resources after a quorum loss, it will not reboot, no "death" either.
>
>> I had a 3-node OpenVMS cluster running virtualized at one point on the
>> hobbyist license and my cluster storage for that setup was simply to
>> mirror the disks across the three nodes (via software which is
>> integrated into OpenVMS); almost like RAID 1 across the network. If I
>> "broke" the cluster and one of the servers lost quorum (due to
>> connectivity) it would just sit and wait for the connectivity to be
>> restored, then resync the storage and pick up essentially where it left off.
>
> I believe this might be possible using a Pacemaker stack. However, I never
> built such a cluster. So hopefully some other people around here with more
> experience on clustered FS will confirm or refute this with some more details.

I think Pacemaker would need kernel support for that (cease all disk operations,
then invalidate all disk buffers and re-read them).
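
For reference, the sequence described in the quoted text (on quorum loss, stop the local resources; reset the node only if a stop action fails) can be sketched roughly as below. This is a toy Python model, not Pacemaker code: the function `on_quorum_change`, the `Outcome` names, the resource names and the `stop` callback are all hypothetical and exist only to illustrate the decision flow.

```python
from enum import Enum
from typing import Callable, Iterable


class Outcome(Enum):
    KEEP_RUNNING = "quorum held, nothing to do"
    STOPPED_CLEANLY = "resources stopped, node stays up"
    NODE_RESET = "a stop failed, node must be reset (self or external fencing)"


def on_quorum_change(
    has_quorum: bool,
    resources: Iterable[str],
    stop: Callable[[str], bool],
) -> Outcome:
    """Toy model of the flow described in the thread (hypothetical API).

    `stop(name)` is assumed to return True when the resource stopped cleanly.
    """
    if has_quorum:
        return Outcome.KEEP_RUNNING
    for name in resources:
        if not stop(name):
            # A failed stop means the node can no longer be trusted:
            # escalate to a reset instead of trying further stops.
            return Outcome.NODE_RESET
    return Outcome.STOPPED_CLEANLY


# Hypothetical stack, stopped top-down: filesystem, lock manager, iSCSI.
stack = ["fs", "lock_manager", "iscsi"]
clean = on_quorum_change(False, stack, lambda name: True)
failed = on_quorum_change(False, stack, lambda name: name != "lock_manager")
```

Note that this models only the quoted "stop, then reset on failure" behaviour. The "immediate death" approach argued for above would skip the stop loop entirely and go straight to the reset outcome.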

Regards,
Ulrich

>
> Regards,

_______________________________________________
Manage your subscription:
https://lists.clusterlabs.org/mailman/listinfo/users

ClusterLabs home: https://www.clusterlabs.org/