I'm having trouble setting up what seems like it should be a straightforward NFS-HA design. It's similar to what Christoforos Christoforou attempted earlier in 2020 (https://www.mail-archive.com/users@clusterlabs.org/msg09671.html).

My goal is to balance multiple NFS exports across two nodes for an effectively "active-active" configuration. Each export should be available from only one node at a time, but the exports should be able to fail over independently between the two nodes so the load stays balanced.

I'm also hoping to isolate each exported filesystem on its own set of underlying disks, so that heavy I/O on one export doesn't affect another. Each filesystem to be exported is therefore backed by its own volume group.

I've set up two nodes with fencing, an ethmonitor clone, and the following two resource groups.

"""
  * Resource Group: ha1:
    * alice_lvm    (ocf::heartbeat:LVM-activate):    Started host1
    * alice_xfs    (ocf::heartbeat:Filesystem):    Started host1
    * alice_nfs    (ocf::heartbeat:nfsserver):    Started host1
    * alice_ip    (ocf::heartbeat:IPaddr2):    Started host1
    * alice_nfsnotify    (ocf::heartbeat:nfsnotify):    Started host1
    * alice_login01    (ocf::heartbeat:exportfs):    Started host1
    * alice_login02    (ocf::heartbeat:exportfs):    Started host1
  * Resource Group: ha2:
    * bob_lvm    (ocf::heartbeat:LVM-activate):    Started host2
    * bob_xfs    (ocf::heartbeat:Filesystem):    Started host2
    * bob_nfs    (ocf::heartbeat:nfsserver):    Started host2
    * bob_ip    (ocf::heartbeat:IPaddr2):    Started host2
    * bob_nfsnotify    (ocf::heartbeat:nfsnotify):    Started host2
    * bob_login01    (ocf::heartbeat:exportfs):    Started host2
    * bob_login02    (ocf::heartbeat:exportfs):    Started host2
"""

We had an older storage appliance that used Red Hat HA on RHEL 6 (back when it still used RGManager rather than Pacemaker), and it was capable of load-balanced NFS-HA like this.

The problem with this approach under Pacemaker is that the "nfsserver" resource agent expects only one instance per host. During a failover, both "nfsserver" RAs try to bind-mount their NFS shared-info directory onto /var/lib/nfs/, and only one can claim the directory.
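As I understand the agent, each nfsserver start effectively does something like the following, which is why two instances on one host collide (paths illustrative, taken from the sketch above):

```shell
# Roughly what each nfsserver RA start amounts to (illustrative only):
mount --bind /srv/alice/nfsinfo /var/lib/nfs   # first instance claims /var/lib/nfs
mount --bind /srv/bob/nfsinfo   /var/lib/nfs   # second instance mounts over it -> conflict
```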

If I convert everything into a single resource group, as Christoforos did, the cluster becomes active-passive and all the resources fail over as a single unit. Having one node serve every export while the other sits idle is far from ideal.

I'd like to eventually have something like this:

"""
  * Resource Group: ha1:
    * alice_lvm    (ocf::heartbeat:LVM-activate):    Started host1
    * alice_xfs    (ocf::heartbeat:Filesystem):    Started host1
    * charlie_lvm    (ocf::heartbeat:LVM-activate):    Started host1
    * charlie_xfs    (ocf::heartbeat:Filesystem):    Started host1
    * ha1_nfs    (ocf::heartbeat:nfsserver):    Started host1
    * alice_ip    (ocf::heartbeat:IPaddr2):    Started host1
    * charlie_ip    (ocf::heartbeat:IPaddr2):    Started host1
    * ha1_nfsnotify    (ocf::heartbeat:nfsnotify):    Started host1
    * alice_login01    (ocf::heartbeat:exportfs):    Started host1
    * alice_login02    (ocf::heartbeat:exportfs):    Started host1
    * charlie_login01    (ocf::heartbeat:exportfs):    Started host1
    * charlie_login02    (ocf::heartbeat:exportfs):    Started host1
  * Resource Group: ha2:
    * bob_lvm    (ocf::heartbeat:LVM-activate):    Started host2
    * bob_xfs    (ocf::heartbeat:Filesystem):    Started host2
    * david_lvm    (ocf::heartbeat:LVM-activate):    Started host2
    * david_xfs    (ocf::heartbeat:Filesystem):    Started host2
    * ha2_nfs    (ocf::heartbeat:nfsserver):    Started host2
    * bob_ip    (ocf::heartbeat:IPaddr2):    Started host2
    * david_ip    (ocf::heartbeat:IPaddr2):    Started host2
    * ha2_nfsnotify    (ocf::heartbeat:nfsnotify):    Started host2
    * bob_login01    (ocf::heartbeat:exportfs):    Started host2
    * bob_login02    (ocf::heartbeat:exportfs):    Started host2
    * david_login01    (ocf::heartbeat:exportfs):    Started host2
    * david_login02    (ocf::heartbeat:exportfs):    Started host2
"""

Or even this:

"""
  * Resource Group: alice_research:
    * alice_lvm    (ocf::heartbeat:LVM-activate):    Started host1
    * alice_xfs    (ocf::heartbeat:Filesystem):    Started host1
    * alice_nfs    (ocf::heartbeat:nfsserver):    Started host1
    * alice_ip    (ocf::heartbeat:IPaddr2):    Started host1
    * alice_nfsnotify    (ocf::heartbeat:nfsnotify):    Started host1
    * alice_login01    (ocf::heartbeat:exportfs):    Started host1
    * alice_login02    (ocf::heartbeat:exportfs):    Started host1
  * Resource Group: charlie_research:
    * charlie_lvm    (ocf::heartbeat:LVM-activate):    Started host1
    * charlie_xfs    (ocf::heartbeat:Filesystem):    Started host1
    * charlie_nfs    (ocf::heartbeat:nfsserver):    Started host1
    * charlie_ip    (ocf::heartbeat:IPaddr2):    Started host1
    * charlie_nfsnotify    (ocf::heartbeat:nfsnotify):    Started host1
    * charlie_login01    (ocf::heartbeat:exportfs):    Started host1
    * charlie_login02    (ocf::heartbeat:exportfs):    Started host1
  * Resource Group: bob_research:
    * bob_lvm    (ocf::heartbeat:LVM-activate):    Started host2
    * bob_xfs    (ocf::heartbeat:Filesystem):    Started host2
    * bob_nfs    (ocf::heartbeat:nfsserver):    Started host2
    * bob_ip    (ocf::heartbeat:IPaddr2):    Started host2
    * bob_nfsnotify    (ocf::heartbeat:nfsnotify):    Started host2
    * bob_login01    (ocf::heartbeat:exportfs):    Started host2
    * bob_login02    (ocf::heartbeat:exportfs):    Started host2
  * Resource Group: david_research:
    * david_lvm    (ocf::heartbeat:LVM-activate):    Started host2
    * david_xfs    (ocf::heartbeat:Filesystem):    Started host2
    * david_nfs    (ocf::heartbeat:nfsserver):    Started host2
    * david_ip    (ocf::heartbeat:IPaddr2):    Started host2
    * david_nfsnotify    (ocf::heartbeat:nfsnotify):    Started host2
    * david_login01    (ocf::heartbeat:exportfs):    Started host2
    * david_login02    (ocf::heartbeat:exportfs):    Started host2
"""

Is there a way to build a load-balanced NFS-HA solution with Pacemaker/Corosync? Can I make a clone set of the nfsserver resource while the rest of the resources fail back and forth, or is there some other workaround? Or do I need to modify the existing resource agent?
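To make the clone-set question concrete, I was imagining something along these lines (untested sketch; resource and group names are hypothetical):

```shell
# Untested idea: one cloned nfsserver per node, ordered before the export groups.
pcs resource create shared_nfs ocf:heartbeat:nfsserver \
    nfs_shared_infodir=/var/lib/nfs/shared
pcs resource clone shared_nfs clone-max=2 clone-node-max=1
pcs constraint order shared_nfs-clone then ha1
pcs constraint order shared_nfs-clone then ha2
```

My worry is that a cloned nfsserver loses the per-export lock/state migration on failover, which may defeat the purpose.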


Thanks,
Billy