Dear All,

I can contribute a few simple scripts to coordinate the start / stop of a whole Lustre file system. Everyone is welcome to use them or modify them to fit your own system. Sorry that I did not prepare complete documentation for these scripts; here I only describe their usage briefly. If you are interested in more details, I will be happy to answer here.
- server:/opt/lustre/etc/cfs-chome: The configuration file, where the Lustre file system is named "chome". The head node is named "server" and is also one of the Lustre clients. This file lists all the MGS, MDS, OSS, and Lustre clients. If the MGS and MDS have both Ethernet and InfiniBand networks, you can specify their IPs explicitly. If the MDTs or OSTs were formatted with ZFS, you can list them as well.

- server:/opt/lustre/etc/cfsd: The main script to coordinate the start / stop / shutdown (emergency shutdown) of the Lustre system, run on the head node. The usage is:

  # cd /opt/lustre/etc/
  # ./cfsd start chome
  # ./cfsd stop chome
  # ./cfsd shutdown

  When doing "start", it performs the following procedure (the script will ssh into each file server and client to do the mounts):

  1. If some of the MDTs/OSTs are based on ZFS, start ZFS on those servers first.
  2. Mount the MGT, MDT, and OSTs, in that order.
  3. Mount all the clients.

  When doing "stop", it reverses the above procedure to unmount everything. "shutdown" is usually used when the air conditioner of the computer room is broken and the whole room is in an emergency state, so that we need to shut down the whole system as fast as possible:

  1. Shut down all the clients immediately (for the head node, only unmount Lustre without shutting down).
  2. Unmount all the OSTs, MDT, and MGT, then shut down these servers.
  3. Shut down the head node.

- client:/etc/init.d/lustre_mnt: Sometimes the clients have to be rebooted, and we want them to mount Lustre automatically, and to unmount Lustre correctly during shutdown. This script does that work. It reads /opt/lustre/etc/cfs-chome to check whether all the file servers are alive, determines whether it should mount Lustre over Ethernet or InfiniBand, and does the mount. On unmount, it also unloads all the Lustre kernel modules.
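The start ordering described above can be sketched roughly as follows. This is a hypothetical minimal outline, not the attached cfsd script: the host names, device paths, and the run() helper are made up for illustration, and run() just prints the plan instead of actually ssh-ing into each node.

```sh
#!/bin/sh
# Sketch of the ordering a coordinator like cfsd enforces on "start".
# All names below are illustrative assumptions, not from the real script.

MGS=mgs1                  # node hosting the MGT (in this sketch)
MDS=mds1                  # node hosting the MDT
OSS_LIST="oss1 oss2"
CLIENTS="server client1"  # "server" is the head node, also a client

# The real script would ssh into each node; here run() only prints
# the command plan so the ordering is visible.
run() { echo "ssh $1 -- $2"; }

cfs_start() {
    # 1. If some MDT/OST are ZFS-backed, bring up their pools first.
    run "$MDS" "zpool import -a"
    # 2. Mount MGT, MDT, and OSTs, in that order.
    run "$MGS" "mount -t lustre /dev/mgt /mnt/mgt"
    run "$MDS" "mount -t lustre /dev/mdt /mnt/mdt"
    for oss in $OSS_LIST; do
        run "$oss" "mount -t lustre /dev/ost0 /mnt/ost0"
    done
    # 3. Mount all the clients last.
    for c in $CLIENTS; do
        run "$c" "mount -t lustre ${MGS}@tcp:/chome /chome"
    done
}

cfs_start
```

"stop" simply walks the same lists in reverse: unmount the clients first, then the OSTs, MDT, and MGT.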
  The usage is:

  # /etc/init.d/lustre_mnt start
  # /etc/init.d/lustre_mnt stop

- client:/etc/systemd/system/sysinit.target.wants/lustre_mnt.service: If the client has an InfiniBand network, it is very annoying that systemd stops OpenIB quite quickly, before the Lustre mounts are stopped, and then the system hangs without powering off. Hence, this file tells systemd to wait for "/etc/init.d/lustre_mnt stop" before proceeding with the shutdown of OpenIB.

Please note that these scripts may have bugs when used in a variety of environments. Also note that they do not handle the case of Lustre HA (because we don't have it). If you have any suggestions, I would greatly appreciate them. I am also very happy if you find these scripts useful.

Cheers,
T.H. Hsieh

Bertschinger, Thomas Andrew Hjorth via lustre-discuss <lustre-discuss@lists.lustre.org> wrote on Thu, Dec 7, 2023 at 12:01 AM:

> Hello Jan,
>
> You can use the Pacemaker / Corosync high-availability software stack for
> this: specifically, ordering constraints [1] can be used.
>
> Unfortunately, Pacemaker is probably over-the-top if you don't need HA --
> its configuration is complex and difficult to get right, and it
> significantly complicates system administration. One downside of Pacemaker
> is that it is not easy to decouple the Pacemaker service from the Lustre
> services, meaning if you stop the Pacemaker service, it will try to stop
> all of the Lustre services. This might make it inappropriate for use cases
> that don't involve HA.
>
> Given those downsides, if others in the community have suggestions on
> simpler means to accomplish this, I'd love to see other tools that can be
> used here (especially officially supported ones, if they exist).
>
> [1]
> https://clusterlabs.org/pacemaker/doc/2.1/Pacemaker_Explained/html/constraints.html#specifying-the-order-in-which-resources-should-start-stop
>
> - Thomas Bertschinger
>
> ________________________________________
> From: lustre-discuss <lustre-discuss-boun...@lists.lustre.org> on behalf
> of Jan Andersen <j...@comind.io>
> Sent: Wednesday, December 6, 2023 3:27 AM
> To: lustre
> Subject: [EXTERNAL] [lustre-discuss] Coordinating cluster start and
> shutdown?
>
> Are there any tools for coordinating the start and shutdown of lustre
> filesystem, so that the OSS systems don't attempt to mount disks before the
> MGT and MDT are online?
> _______________________________________________
> lustre-discuss mailing list
> lustre-discuss@lists.lustre.org
> http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org
>
Attachments:
- cfs-chome
- cfsd
- lustre_mnt
- lustre_mnt.service
_______________________________________________
lustre-discuss mailing list
lustre-discuss@lists.lustre.org
http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org