Hi,

we run an HA stack based on Pacemaker/Corosync with ZFS and Lustre resource agents in production, using the setup provided by Laura together with ZFS multi-mount protection. The main advantage is that resources are moved automatically when there is a problem with a server or with Lustre RPC. The main disadvantage we have experienced is that the ZFS resource agents sometimes misbehave when bigger ZFS pools are remounted at the same time. The agents call 'zpool' commands so often that this sometimes causes a lock; the operation then has to time out and the resource goes into the Pacemaker 'failed' state. Afterwards we need to clean up the HA resources manually so that Pacemaker re-detects the current state and mounts the pools. So in our case failover is not always fully automatic.
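
For illustration, the manual cleanup step could be scripted roughly as below. This is only a sketch, not what we actually run: it assumes the standard Pacemaker tool `crm_resource` is on the PATH, and the resource IDs and function names are made up for the example.

```python
import subprocess
from typing import Callable, List, Sequence


def cleanup_command(resource_id: str) -> List[str]:
    """Build the crm_resource call that clears the failed state of one resource."""
    return ["crm_resource", "--cleanup", "--resource", resource_id]


def cleanup_resources(
    resource_ids: Sequence[str],
    run: Callable[..., object] = subprocess.run,
) -> List[List[str]]:
    """Run a cleanup for each resource in turn and return the commands issued.

    Resources are handled one at a time on purpose, so that Pacemaker does not
    re-probe all pools at the same moment. `run` is injectable so the function
    can be tried without a cluster.
    """
    issued = []
    for resource_id in resource_ids:
        cmd = cleanup_command(resource_id)
        # check=True raises CalledProcessError if crm_resource exits non-zero
        run(cmd, check=True)
        issued.append(cmd)
    return issued


# Hypothetical usage on a cluster node (resource IDs are placeholders):
# cleanup_resources(["zfs-ost0", "lustre-ost0"])
```

After the cleanup Pacemaker probes the resources again, which is the state re-detection mentioned above.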

Dominika

-- 
Dominika Wanat
Dział Pamięci Masowych (Mass Storage Department)
ACK Cyfronet AGH
tel.: +48 12 632 33 55 ext. 704
_______________________________________________ lustre-discuss mailing list lustre-discuss@lists.lustre.org http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org