Have you looked at the mmaddcallback command and specifically the file system mount callbacks?
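Roughly like this (untested sketch; the callback identifier and the callback script path are just
placeholders, nothing that ships with GPFS). The callback script could e.g. touch a per-filesystem
flag file that a waiting service checks, instead of polling findmnt from ExecStartPost; mmlscallback
shows what is registered:
------------------------------------------------------------
# /usr/lpp/mmfs/bin/mmaddcallback localMountNotify \
    --command /usr/local/sbin/fs-mounted-callback.sh --event mount --parms "%eventName %fsName"
# /usr/lpp/mmfs/bin/mmlscallback
------------------------------------------------------------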
Fred
__________________________________________________
Fred Stock | IBM Pittsburgh Lab | 720-430-8821
sto...@us.ibm.com
----- Original message -----
From: Ulrich Sibiller <u.sibil...@science-computing.de>
Sent by: gpfsug-discuss-boun...@spectrumscale.org
To: gpfsug-disc...@gpfsug.org
Cc:
Subject: [EXTERNAL] [gpfsug-discuss] wait for mount during gpfs startup
Date: Tue, Apr 28, 2020 7:05 AM
Hi,
when the gpfs systemd service returns from startup, the filesystems are usually not yet mounted. So
having another service depend on gpfs is not feasible if you require the filesystem(s).
Therefore we have added a script to the systemd gpfs service that waits for all local GPFS
filesystems to be mounted. We have added that script via ExecStartPost:
------------------------------------------------------------
# cat /etc/systemd/system/gpfs.service.d/waitmount.conf
[Service]
ExecStartPost=/usr/local/sc-gpfs/sbin/wait-for-all_local-mounts.sh
TimeoutStartSec=200
-------------------------------------------------------------
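(For completeness: a service that needs the filesystems then simply orders itself after gpfs; the
unit name below is only an example:)
------------------------------------------------------------
# /etc/systemd/system/myapp.service (excerpt)
[Unit]
Requires=gpfs.service
After=gpfs.service
------------------------------------------------------------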
The script itself is not doing much:
-------------------------------------------------------------
#!/bin/bash
#
# wait until all _local_ gpfs filesystems are mounted. It ignores
# filesystems where mmlsfs -A does not report "yes".
#
# returns 0 if all fs are mounted (or none are found in the gpfs configuration)
# returns non-0 otherwise
# waits for at most TIMEOUT seconds
TIMEOUT=180

# leading space is required!
FS=" $(/usr/lpp/mmfs/bin/mmlsfs all_local -Y 2>/dev/null | grep :automaticMountOption:yes: | cut -d: -f7 | xargs; exit ${PIPESTATUS[0]})"
# RC=1 and no output means there are no such filesystems configured in GPFS
[ $? -eq 1 ] && [ "$FS" = " " ] && exit 0

# uncomment this line for testing
#FS="$FS gpfsdummy"

while [ $TIMEOUT -gt 0 ]; do
    for fs in ${FS}; do
        if findmnt "$fs" -n &>/dev/null; then
            FS=${FS/ $fs/}
            continue 2
        fi
    done
    [ -z "${FS// /}" ] && break
    (( TIMEOUT -= 5 ))
    sleep 5
done

if [ -z "${FS// /}" ]; then
    exit 0
else
    echo >&2 "ERROR: filesystem(s) not mounted in time:${FS}"
    exit 2
fi
--------------------------------------------------
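For testing, the script can also be run by hand; it returns 0 as soon as everything is mounted:
------------------------------------------------------------
# /usr/local/sc-gpfs/sbin/wait-for-all_local-mounts.sh; echo "rc=$?"
------------------------------------------------------------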
This works without problems on _most_ of our clusters. However, not on all. Some of them show what I
believe is a race condition and fail to start up after a reboot:
----------------------------------------------------------------------
# journalctl -u gpfs
-- Logs begin at Fri 2020-04-24 17:11:26 CEST, end at Tue 2020-04-28 12:47:34 CEST. --
Apr 24 17:12:13 myhost systemd[1]: Starting General Parallel File System...
Apr 24 17:12:17 myhost mmfs[5720]: [X] Cannot open configuration file /var/mmfs/gen/mmfs.cfg.
Apr 24 17:13:44 myhost systemd[1]: gpfs.service start-post operation timed out. Stopping.
Apr 24 17:13:44 myhost mmremote[8966]: Shutting down!
Apr 24 17:13:48 myhost mmremote[8966]: Unloading modules from /lib/modules/3.10.0-1062.18.1.el7.x86_64/extra
Apr 24 17:13:48 myhost mmremote[8966]: Unloading module mmfs26
Apr 24 17:13:48 myhost mmremote[8966]: Unloading module mmfslinux
Apr 24 17:13:48 myhost systemd[1]: Failed to start General Parallel File System.
Apr 24 17:13:48 myhost systemd[1]: Unit gpfs.service entered failed state.
Apr 24 17:13:48 myhost systemd[1]: gpfs.service failed.
----------------------------------------------------------------------
The mmfs.log shows a bit more:
----------------------------------------------------------------------
# less /var/adm/ras/mmfs.log.previous
2020-04-24_17:12:14.609+0200: runmmfs starting (4254)
2020-04-24_17:12:14.622+0200: [I] Removing old /var/adm/ras/mmfs.log.* files:
2020-04-24_17:12:14.658+0200: runmmfs: [I] Unloading modules from /lib/modules/3.10.0-1062.18.1.el7.x86_64/extra
2020-04-24_17:12:14.692+0200: runmmfs: [I] Unloading module mmfs26
2020-04-24_17:12:14.901+0200: runmmfs: [I] Unloading module mmfslinux
2020-04-24_17:12:15.018+0200: runmmfs: [I] Unloading module tracedev
2020-04-24_17:12:15.057+0200: runmmfs: [I] Loading modules from /lib/modules/3.10.0-1062.18.1.el7.x86_64/extra
Module Size Used by
mmfs26 2657452 0
mmfslinux 809734 1 mmfs26
tracedev 48618 2 mmfs26,mmfslinux
2020-04-24_17:12:16.720+0200: Node rebooted. Starting mmautoload...
2020-04-24_17:12:17.011+0200: [I] This node has a valid standard license
2020-04-24_17:12:17.011+0200: [I] Initializing the fast condition variables at 0x5561DFC365C0 ...
2020-04-24_17:12:17.011+0200: [I] mmfsd initializing. {Version: 5.0.4.2 Built: Jan 27 2020 12:13:06} ...
2020-04-24_17:12:17.011+0200: [I] Cleaning old shared memory ...
2020-04-24_17:12:17.012+0200: [I] First pass parsing mmfs.cfg ...
2020-04-24_17:12:17.013+0200: [X] Cannot open configuration file /var/mmfs/gen/mmfs.cfg.
2020-04-24_17:12:20.667+0200: mmautoload: Starting GPFS ...
2020-04-24_17:13:44.846+0200: mmremote: Initiating GPFS shutdown ...
2020-04-24_17:13:47.861+0200: mmremote: Starting the mmsdrserv daemon ...
2020-04-24_17:13:47.955+0200: mmremote: Unloading GPFS kernel modules ...
2020-04-24_17:13:48.165+0200: mmremote: Completing GPFS shutdown ...
--------------------------------------------------------------------------
Starting the gpfs service again manually afterwards works without problems. Interestingly, the missing
mmfs.cfg _is there_ after the shutdown; it gets created shortly after the failure. That's why I am
assuming a race condition:
--------------------------------------------------------------------------
# stat /var/mmfs/gen/mmfs.cfg
File: ‘/var/mmfs/gen/mmfs.cfg’
Size: 408 Blocks: 8 IO Block: 4096 regular file
Device: fd00h/64768d Inode: 268998265 Links: 1
Access: (0644/-rw-r--r--) Uid: ( 0/ root) Gid: ( 0/ root)
Context: system_u:object_r:var_t:s0
Access: 2020-04-27 17:12:19.801060073 +0200
Modify: 2020-04-24 17:12:17.617823441 +0200
Change: 2020-04-24 17:12:17.659823405 +0200
Birth: -
--------------------------------------------------------------------------
Now, the interesting part:
- removing the ExecStartPost script makes the issue vanish. A reboot then always starts gpfs successfully.
- reducing the ExecStartPost script to a single line ("exit 0") makes the issue stay. gpfs startup always
fails.
Unfortunately IBM is refusing support because "the script is not coming with gpfs".
So I am searching for a solution that makes the script work on those servers again, or for a better way
to wait for all local GPFS mounts to be ready. Has anyone written something like that already?
Thank you,
Uli
--
Science + Computing AG
Vorstandsvorsitzender/Chairman of the board of management:
Dr. Martin Matzke
Vorstand/Board of Management:
Matthias Schempp, Sabine Hohenstein
Vorsitzender des Aufsichtsrats/
Chairman of the Supervisory Board:
Philippe Miltin
Aufsichtsrat/Supervisory Board:
Martin Wibbe, Ursula Morgenstern
Sitz/Registered Office: Tuebingen
Registergericht/Registration Court: Stuttgart
Registernummer/Commercial Register No.: HRB 382196
_______________________________________________
gpfsug-discuss mailing list
gpfsug-discuss at spectrumscale.org
http://gpfsug.org/mailman/listinfo/gpfsug-discuss