Re: [gpfsug-discuss] wait for mount during gpfs startup
On Thu, Apr 30, 2020 at 12:50:27PM +0200, Ulrich Sibiller wrote:
> Am 28.04.20 um 15:57 schrieb Skylar Thompson:
>>> Have you looked at the mmaddcallback command and specifically the file
>>> system mount callbacks?
>> We use callbacks successfully to ensure Linux auditd rules are only loaded
>> after GPFS is mounted. It was easy to set up, and there are very
>> fine-grained events that you can trigger on:
>
> Thanks. But how do you set this up for a systemd service? Disable the
> dependent service and start it from the callback? Create some kind of state
> file in the callback and let the dependent systemd service check that flag
> file in a busy loop? Use inotify for the flag file?

In the pre-systemd days, I would say just disable the service and let the
callback handle it. I do see your point, though, that you lose the other
systemd ordering benefits if you start the service from the callback.
Assuming you're still able to start the service via systemctl, I would
probably just leave it disabled and let the callback handle it. In the case
of auditd rules, it's not actually a service (just a command that needs to
be run), so we didn't run into this specific problem.

-- Skylar Thompson (skyl...@u.washington.edu)
-- Genome Sciences Department, System Administrator
-- Foege Building S046, (206)-685-7354
-- University of Washington School of Medicine

___
gpfsug-discuss mailing list
gpfsug-discuss at spectrumscale.org
http://gpfsug.org/mailman/listinfo/gpfsug-discuss
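For what it's worth, a minimal version of the "leave the unit disabled and
start it from the callback" approach Skylar describes might look like the
sketch below. The script path, unit name, and file system name are invented
for illustration, and the mmaddcallback registration is shown only as a
comment; check the mmaddcallback documentation for the exact event names and
%-variables on your release.

```shell
#!/bin/bash
# /usr/local/sbin/start-after-mount.sh  (hypothetical path)
#
# Intended to be registered as a GPFS mount callback, e.g. (sketch only):
#   mmaddcallback startAfterMount \
#       --command /usr/local/sbin/start-after-mount.sh \
#       --event mount --parms "%fsName"
#
# The dependent unit stays disabled; this hook starts it once the file
# system we care about is mounted.

WANTED_FS=${WANTED_FS:-gpfs0}        # illustrative file system name
UNIT=${UNIT:-myapp.service}          # illustrative dependent unit
SYSTEMCTL=${SYSTEMCTL:-systemctl}    # overridable, e.g. for dry runs

start_after_mount() {
    local fsname=$1
    # ignore mount events for other file systems
    if [ "$fsname" != "$WANTED_FS" ]; then
        return 0
    fi
    "$SYSTEMCTL" start "$UNIT"
}

start_after_mount "$@"
```

As discussed above, the drawback remains that systemd no longer knows about
the ordering; the unit simply gets started from outside.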
Re: [gpfsug-discuss] wait for mount during gpfs startup
I now better understand the functionality you are aiming to achieve: you
want anything in systemd that depends on GPFS file systems being mounted to
block until they are mounted. Currently we do not offer such a feature,
though as Carl Zetie noted there is an RFE for it, RFE 125955
(https://www.ibm.com/developerworks/rfe/execute?use_case=viewRfe&CR_ID=125955).

For mmaddcallback, what I was thinking could resolve your problem was for
you to create either a "startup" callback or "mount" callbacks for your file
systems. You could use those callbacks to track the file systems of interest
and then use the appropriate means to integrate that information into the
flow of systemd. I have never done this, so perhaps it is not possible.

Regards, The Spectrum Scale (GPFS) team

--
If you feel that your question can benefit other users of Spectrum Scale
(GPFS), then please post it to the public IBM developerWorks Forum at
https://www.ibm.com/developerworks/community/forums/html/forum?id=----0479 .
If your query concerns a potential software error in Spectrum Scale (GPFS)
and you have an IBM software maintenance contract, please contact
1-800-237-5511 in the United States or your local IBM Service Center in
other countries. The forum is informally monitored as time permits and
should not be used for priority messages to the Spectrum Scale (GPFS) team.

From: Ulrich Sibiller
To: gpfsug-discuss@spectrumscale.org
Date: 04/30/2020 06:57 AM
Subject: [EXTERNAL] Re: [gpfsug-discuss] wait for mount during gpfs startup
Sent by: gpfsug-discuss-boun...@spectrumscale.org

Am 28.04.20 um 15:57 schrieb Skylar Thompson:
>> Have you looked at the mmaddcallback command and specifically the file
>> system mount callbacks?
> We use callbacks successfully to ensure Linux auditd rules are only loaded
> after GPFS is mounted. It was easy to set up, and there are very
> fine-grained events that you can trigger on:

Thanks. But how do you set this up for a systemd service? Disable the
dependent service and start it from the callback? Create some kind of state
file in the callback and let the dependent systemd service check that flag
file in a busy loop? Use inotify for the flag file?

Uli

--
Science + Computing AG
Vorstandsvorsitzender/Chairman of the board of management: Dr. Martin Matzke
Vorstand/Board of Management: Matthias Schempp, Sabine Hohenstein
Vorsitzender des Aufsichtsrats/Chairman of the Supervisory Board: Philippe Miltin
Aufsichtsrat/Supervisory Board: Martin Wibbe, Ursula Morgenstern
Sitz/Registered Office: Tuebingen
Registergericht/Registration Court: Stuttgart
Registernummer/Commercial Register No.: HRB 382196
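One way to wire callback output into the flow of systemd, as the reply above
suggests, is to have a mount callback touch a flag file and let a systemd
path unit watch for it; path units use inotify internally, so no busy loop
or inotify code of your own is needed. The flag-file path and unit names
below are invented for illustration:

```ini
# /etc/systemd/system/gpfs-mounted.path  (hypothetical)
# Activates the unit of the same name (gpfs-mounted.service) as soon as
# the flag file created by the GPFS callback appears.
[Unit]
Description=Watch for the flag file created by the GPFS mount callback

[Path]
PathExists=/run/gpfs-mounted

[Install]
WantedBy=multi-user.target

# /etc/systemd/system/gpfs-mounted.service  (hypothetical)
# Dependent services can then use Wants=/After=gpfs-mounted.service.
[Unit]
Description=GPFS file systems are mounted

[Service]
Type=oneshot
RemainAfterExit=yes
ExecStart=/bin/true
```

This keeps the ordering knowledge inside systemd, at the cost of the flag
file Ulrich objects to elsewhere in the thread.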
Re: [gpfsug-discuss] wait for mount during gpfs startup
Am 28.04.20 um 13:38 schrieb Jonathan Buzzard:
> Yuck, and double yuck. There are many things you can say about systemd
> (and I have a choice few) but one of them is that it makes this sort of
> hackery obsolete. At least that is one of its goals.
>
> A systemd way to do it would be via one or more helper units. So let's
> assume your GPFS file system is mounted on /gpfs, then create a file
> called ismounted.txt on it and then create a unit called say
> gpfs_mounted.target that looks like
>
> # gpfs_mounted.target
> [Unit]
> TimeoutStartSec=infinity
> ConditionPathExists=/gpfs/ismounted.txt
> ExecStart=/usr/bin/sleep 10
> RemainAfterExit=yes
>
> Then the main unit gets
>
> Wants=gpfs_mounted.target
> After=gpfs_mounted.target
>
> If you are using scripts in systemd you are almost certainly doing it
> wrong :-)

Yes, that's the right direction, but still not the way I'd like it to be.
First, I don't really like the flag file stuff. Imagine the mess you'd
create if multiple services required flag files... Second, I am looking for
an all_local target. That one cannot be solved using this approach, right?
(Same for all_remote or all.)

Uli
Re: [gpfsug-discuss] wait for mount during gpfs startup
Am 28.04.20 um 13:55 schrieb Hannappel, Juergen:
> a gpfs.mount target should be automatically created at boot by the
> systemd-fstab-generator from the fstab entry, so no need for hackery like
> ismounted.txt...

A generic gpfs.mount target does not seem to exist on my system. There are
only specific mount units for the mounted GPFS filesystems, so I'd need to
configure each dependent service individually on each system with the
filesystem to wait for. My approach was more general, in just waiting for
all_local GPFS filesystems, so I can use the same configuration everywhere.

Besides, I once tested these targets and found them unusable because of some
oddities, but unfortunately I don't remember the details. The outcome was my
script from the initial post. Maybe it was that there's no automatic mount
target for all_local, the same problem as above.

Uli
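For the per-filesystem case described above, systemd can at least derive the
dependency from the mount path itself via RequiresMountsFor=, so the
generated .mount unit never has to be named explicitly. A sketch for a
hypothetical dependent unit myapp.service needing /gpfs:

```ini
# Drop-in for a dependent service (unit name and path are examples):
# /etc/systemd/system/myapp.service.d/gpfs.conf
[Unit]
# Adds Requires= and After= on the .mount unit(s) covering /gpfs, as
# generated by systemd-fstab-generator from the fstab entry.
RequiresMountsFor=/gpfs
```

This only covers explicitly listed mount points, though; it does not give an
all_local equivalent, which is exactly the gap described above.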
Re: [gpfsug-discuss] wait for mount during gpfs startup
Am 28.04.20 um 15:57 schrieb Skylar Thompson:
>> Have you looked at the mmaddcallback command and specifically the file
>> system mount callbacks?
> We use callbacks successfully to ensure Linux auditd rules are only loaded
> after GPFS is mounted. It was easy to set up, and there are very
> fine-grained events that you can trigger on:

Thanks. But how do you set this up for a systemd service? Disable the
dependent service and start it from the callback? Create some kind of state
file in the callback and let the dependent systemd service check that flag
file in a busy loop? Use inotify for the flag file?

Uli
Re: [gpfsug-discuss] wait for mount during gpfs startup (Ulrich Sibiller)
I’ve also voted and commented on the ticket, but I’ll say this here: if the
amount of time I spent on this alone is any indication (and I like to think
I’m pretty good with this sort of thing, and am somewhat of a systemd
evangelist when the opportunity presents itself), this has caused a lot of
people a lot of pain, including time spent when their kludge to make this
work causes some other problem, or having to reboot nodes in a much more
manual way at times to ensure one of these nodes doesn’t dump work while it
has no FS, etc.

--
|| \\UTGERS,     |---*O*---
||_// the State  | Ryan Novosielski - novos...@rutgers.edu
|| \\ University | Sr. Technologist - 973/972.0922 (2x0922) ~*~ RBHS Campus
||  \\of NJ      | Office of Advanced Research Computing - MSB C630, Newark
     `'

> On Apr 28, 2020, at 8:10 AM, Carl Zetie - ca...@us.ibm.com wrote:
>
> There’s an RFE related to this: RFE 125955
> (https://www.ibm.com/developerworks/rfe/execute?use_case=viewRfe&CR_ID=125955)
>
> I recommend that people add their votes and comments there as well as
> discussing it here in the UG.
Re: [gpfsug-discuss] wait for mount during gpfs startup
Has anyone confirmed this? At one point, I mucked around with this somewhat
endlessly to try to get something sane and systemd-based to work, and
ultimately surrendered and inserted a 30-second delay. I didn’t try the
“check for the presence of a file” thing, as I’m allergic to that sort of
thing (at least more allergic than I am to a time-based delay). I believe
everything that I tried happens before the mount is complete.

> On Apr 28, 2020, at 7:55 AM, Hannappel, Juergen wrote:
>
> Hi,
> a gpfs.mount target should be automatically created at boot by the
> systemd-fstab-generator from the fstab entry, so no need for hackery like
> ismounted.txt...
>
> ----- Original Message -----
>> From: "Jonathan Buzzard"
>> To: gpfsug-discuss@spectrumscale.org
>> Sent: Tuesday, 28 April, 2020 13:38:01
>> Subject: Re: [gpfsug-discuss] wait for mount during gpfs startup
>
>> Yuck, and double yuck. There are many things you can say about systemd
>> (and I have a choice few) but one of them is that it makes this sort of
>> hackery obsolete. At least that is one of its goals.
>>
>> A systemd way to do it would be via one or more helper units. So let's
>> assume your GPFS file system is mounted on /gpfs, then create a file
>> called ismounted.txt on it and then create a unit called say
>> gpfs_mounted.target that looks like
>>
>> # gpfs_mounted.target
>> [Unit]
>> TimeoutStartSec=infinity
>> ConditionPathExists=/gpfs/ismounted.txt
>> ExecStart=/usr/bin/sleep 10
>> RemainAfterExit=yes
>>
>> Then the main unit gets
>>
>> Wants=gpfs_mounted.target
>> After=gpfs_mounted.target
>>
>> If you are using scripts in systemd you are almost certainly doing it
>> wrong :-)
>>
>> JAB.
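As a middle ground between a fixed 30-second delay and a flag file, an
ExecStartPost-style poll of the kernel's own mount table is possible. Below
is a self-contained sketch of such a wait loop; in real use the mountpoint
list would come from mmlsfs (as in Ulrich's script), while here it is simply
passed as arguments:

```shell
# wait_for_mounts: poll /proc/mounts until every mountpoint given as an
# argument is present, or give up after TIMEOUT seconds (default 30).
wait_for_mounts() {
    local timeout=${TIMEOUT:-30} missing m
    while :; do
        missing=""
        for m in "$@"; do
            # field 2 of /proc/mounts is the mountpoint
            if ! awk -v mp="$m" '$2 == mp {found=1} END {exit !found}' /proc/mounts; then
                missing="$missing $m"
            fi
        done
        if [ -z "$missing" ]; then
            return 0
        fi
        if [ "$timeout" -le 0 ]; then
            break
        fi
        timeout=$((timeout - 5))
        sleep 5
    done
    echo "ERROR: not mounted in time:$missing" >&2
    return 2
}
```

Like any poll this still races against GPFS startup, so it shares the
caveats discussed in this thread; it merely avoids both a flag file and an
unconditional delay.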
Re: [gpfsug-discuss] wait for mount during gpfs startup
We use callbacks successfully to ensure Linux auditd rules are only loaded
after GPFS is mounted. It was easy to set up, and there are very
fine-grained events that you can trigger on:

https://www.ibm.com/support/knowledgecenter/STXKQY_5.0.4/com.ibm.spectrum.scale.v5r04.doc/bl1adm_mmaddcallback.htm

On Tue, Apr 28, 2020 at 11:30:38AM +, Frederick Stock wrote:
> Have you looked at the mmaddcallback command and specifically the file
> system mount callbacks?
Re: [gpfsug-discuss] wait for mount during gpfs startup (Ulrich Sibiller)
There’s an RFE related to this: RFE 125955
(https://www.ibm.com/developerworks/rfe/execute?use_case=viewRfe&CR_ID=125955)

I recommend that people add their votes and comments there as well as
discussing it here in the UG.

Carl Zetie
Program Director, Offering Management
Spectrum Scale
(919) 473 3318 ][ Research Triangle Park
ca...@us.ibm.com
Re: [gpfsug-discuss] wait for mount during gpfs startup
Hi,
a gpfs.mount target should be automatically created at boot by the
systemd-fstab-generator from the fstab entry, so no need for hackery like
ismounted.txt...

----- Original Message -----
> From: "Jonathan Buzzard"
> To: gpfsug-discuss@spectrumscale.org
> Sent: Tuesday, 28 April, 2020 13:38:01
> Subject: Re: [gpfsug-discuss] wait for mount during gpfs startup

> Yuck, and double yuck. There are many things you can say about systemd
> (and I have a choice few) but one of them is that it makes this sort of
> hackery obsolete. At least that is one of its goals.
>
> A systemd way to do it would be via one or more helper units. So let's
> assume your GPFS file system is mounted on /gpfs, then create a file
> called ismounted.txt on it and then create a unit called say
> gpfs_mounted.target that looks like
>
> # gpfs_mounted.target
> [Unit]
> TimeoutStartSec=infinity
> ConditionPathExists=/gpfs/ismounted.txt
> ExecStart=/usr/bin/sleep 10
> RemainAfterExit=yes
>
> Then the main unit gets
>
> Wants=gpfs_mounted.target
> After=gpfs_mounted.target
>
> If you are using scripts in systemd you are almost certainly doing it
> wrong :-)
>
> JAB.
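For reference, the generator works from an fstab line like the sketch below
(device name, mountpoint, and options are examples, not from this thread);
systemd derives the mount unit name from the mountpoint by path escaping, so
/gpfs becomes gpfs.mount and /gpfs/scratch would become gpfs-scratch.mount.
There is one generated unit per mountpoint and no aggregate target:

```ini
# Example /etc/fstab entry for a GPFS file system (sketch):
#   /dev/gpfs0  /gpfs  gpfs  rw,mtime,atime,dev=gpfs0,noauto  0 0
#
# Roughly the unit the generator produces from it; inspect the real one
# on your system with "systemctl cat gpfs.mount".
[Unit]
SourcePath=/etc/fstab
Documentation=man:fstab(5) man:systemd-fstab-generator(8)

[Mount]
Where=/gpfs
What=/dev/gpfs0
Type=gpfs
Options=rw,mtime,atime,dev=gpfs0,noauto
```

Whether that unit reliably reaches active state only after GPFS has really
mounted the file system is exactly what is questioned elsewhere in this
thread.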
Re: [gpfsug-discuss] wait for mount during gpfs startup
On 28/04/2020 11:57, Ulrich Sibiller wrote:
> Hi,
> when the gpfs systemd service returns from startup the filesystems are
> usually not mounted. So having another service depending on gpfs is not
> feasible if you require the filesystem(s). Therefore we have added a
> script to the systemd gpfs service that waits for all local gpfs
> filesystems being mounted. We have added that script via ExecStartPost:

Yuck, and double yuck. There are many things you can say about systemd (and
I have a choice few) but one of them is that it makes this sort of hackery
obsolete. At least that is one of its goals.

A systemd way to do it would be via one or more helper units. So let's
assume your GPFS file system is mounted on /gpfs, then create a file called
ismounted.txt on it and then create a unit called, say, gpfs_mounted.target
that looks like

# gpfs_mounted.target
[Unit]
TimeoutStartSec=infinity
ConditionPathExists=/gpfs/ismounted.txt
ExecStart=/usr/bin/sleep 10
RemainAfterExit=yes

Then the main unit gets

Wants=gpfs_mounted.target
After=gpfs_mounted.target

If you are using scripts in systemd you are almost certainly doing it
wrong :-)

JAB.

--
Jonathan A. Buzzard                         Tel: +44141-5483420
HPC System Administrator, ARCHIE-WeSt.
University of Strathclyde, John Anderson Building, Glasgow. G4 0NG
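As an aside on the example above: ExecStart=, RemainAfterExit=, and
TimeoutStartSec= are [Service] directives and have no effect in a .target
unit, so systemd would ignore them as written. A version of the same idea
that systemd will accept would be a oneshot service (the unit name and
flag-file path below are just the ones from the example):

```ini
# /etc/systemd/system/gpfs_mounted.service  (sketch)
[Unit]
Description=Wait for the GPFS flag file to appear

[Service]
Type=oneshot
RemainAfterExit=yes
TimeoutStartSec=infinity
# Crude but self-contained: poll until the flag file shows up.
ExecStart=/bin/sh -c 'until test -e /gpfs/ismounted.txt; do sleep 5; done'
```

The dependent unit then gets Wants=gpfs_mounted.service and
After=gpfs_mounted.service, exactly as in the .target variant.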
Re: [gpfsug-discuss] wait for mount during gpfs startup
Have you looked at the mmaddcallback command and specifically the file
system mount callbacks?

Fred
__________________________________________________
Fred Stock | IBM Pittsburgh Lab | 720-430-8821
sto...@us.ibm.com

----- Original message -----
From: Ulrich Sibiller
Sent by: gpfsug-discuss-boun...@spectrumscale.org
To: gpfsug-disc...@gpfsug.org
Cc:
Subject: [EXTERNAL] [gpfsug-discuss] wait for mount during gpfs startup
Date: Tue, Apr 28, 2020 7:05 AM

[quoted text identical to the original post below]
[gpfsug-discuss] wait for mount during gpfs startup
Hi,

when the gpfs systemd service returns from startup the filesystems are
usually not mounted. So having another service depending on gpfs is not
feasible if you require the filesystem(s). Therefore we have added a script
to the systemd gpfs service that waits for all local gpfs filesystems being
mounted. We have added that script via ExecStartPost:

# cat /etc/systemd/system/gpfs.service.d/waitmount.conf
[Service]
ExecStartPost=/usr/local/sc-gpfs/sbin/wait-for-all_local-mounts.sh
TimeoutStartSec=200

The script itself is not doing much:

-------------------------------------------------------------
#!/bin/bash
#
# wait until all _local_ gpfs filesystems are mounted. It ignores
# filesystems where mmlsfs -A does not report "yes".
#
# returns 0 if all fs are mounted (or none are found in gpfs configuration)
# returns non-0 otherwise

# wait for max. TIMEOUT seconds
TIMEOUT=180

# leading space is required!
FS=" $(/usr/lpp/mmfs/bin/mmlsfs all_local -Y 2>/dev/null | grep :automaticMountOption:yes: | cut -d: -f7 | xargs; exit ${PIPESTATUS[0]})"

# RC=1 and no output means there are no such filesystems configured in GPFS
[ $? -eq 1 ] && [ "$FS" = " " ] && exit 0

# uncomment this line for testing
#FS="$FS gpfsdummy"

while [ $TIMEOUT -gt 0 ]; do
    for fs in ${FS}; do
        if findmnt $fs -n &>/dev/null; then
            FS=${FS/ $fs/}
            continue 2
        fi
    done
    [ -z "${FS// /}" ] && break
    (( TIMEOUT -= 5 ))
    sleep 5
done

if [ -z "${FS// /}" ]; then
    exit 0
else
    echo >&2 "ERROR: filesystem(s) not found in time:${FS}"
    exit 2
fi
-------------------------------------------------------------

This works without problems on _most_ of our clusters. However, not on all.
Some of them show what I believe is a race condition and fail to start up
after a reboot:

--
# journalctl -u gpfs
-- Logs begin at Fri 2020-04-24 17:11:26 CEST, end at Tue 2020-04-28 12:47:34 CEST. --
Apr 24 17:12:13 myhost systemd[1]: Starting General Parallel File System...
Apr 24 17:12:17 myhost mmfs[5720]: [X] Cannot open configuration file /var/mmfs/gen/mmfs.cfg.
Apr 24 17:13:44 myhost systemd[1]: gpfs.service start-post operation timed out. Stopping.
Apr 24 17:13:44 myhost mmremote[8966]: Shutting down!
Apr 24 17:13:48 myhost mmremote[8966]: Unloading modules from /lib/modules/3.10.0-1062.18.1.el7.x86_64/extra
Apr 24 17:13:48 myhost mmremote[8966]: Unloading module mmfs26
Apr 24 17:13:48 myhost mmremote[8966]: Unloading module mmfslinux
Apr 24 17:13:48 myhost systemd[1]: Failed to start General Parallel File System.
Apr 24 17:13:48 myhost systemd[1]: Unit gpfs.service entered failed state.
Apr 24 17:13:48 myhost systemd[1]: gpfs.service failed.
--

The mmfs.log shows a bit more:

--
# less /var/adm/ras/mmfs.log.previous
2020-04-24_17:12:14.609+0200: runmmfs starting (4254)
2020-04-24_17:12:14.622+0200: [I] Removing old /var/adm/ras/mmfs.log.* files:
2020-04-24_17:12:14.658+0200: runmmfs: [I] Unloading modules from /lib/modules/3.10.0-1062.18.1.el7.x86_64/extra
2020-04-24_17:12:14.692+0200: runmmfs: [I] Unloading module mmfs26
2020-04-24_17:12:14.901+0200: runmmfs: [I] Unloading module mmfslinux
2020-04-24_17:12:15.018+0200: runmmfs: [I] Unloading module tracedev
2020-04-24_17:12:15.057+0200: runmmfs: [I] Loading modules from /lib/modules/3.10.0-1062.18.1.el7.x86_64/extra
Module                  Size  Used by
mmfs26               2657452  0
mmfslinux             809734  1 mmfs26
tracedev               48618  2 mmfs26,mmfslinux
2020-04-24_17:12:16.720+0200: Node rebooted. Starting mmautoload...
2020-04-24_17:12:17.011+0200: [I] This node has a valid standard license
2020-04-24_17:12:17.011+0200: [I] Initializing the fast condition variables at 0x5561DFC365C0 ...
2020-04-24_17:12:17.011+0200: [I] mmfsd initializing. {Version: 5.0.4.2   Built: Jan 27 2020 12:13:06} ...
2020-04-24_17:12:17.011+0200: [I] Cleaning old shared memory ...
2020-04-24_17:12:17.012+0200: [I] First pass parsing mmfs.cfg ...
2020-04-24_17:12:17.013+0200: [X] Cannot open configuration file /var/mmfs/gen/mmfs.cfg.
2020-04-24_17:12:20.667+0200: mmautoload: Starting GPFS ...
2020-04-24_17:13:44.846+0200: mmremote: Initiating GPFS shutdown ...
2020-04-24_17:13:47.861+0200: mmremote: Starting the mmsdrserv daemon ...
2020-04-24_17:13:47.955+0200: mmremote: Unloading GPFS kernel modules ...
2020-04-24_17:13:48.165+0200: mmremote: Completing GPFS shutdown ...
--

Starting the gpfs service again manually then works without problems.
Interestingly the missing mmfs.cfg _is there_ after the shutdown, it gets
created shortly after the failu