You need to run writeconf on all targets at the same time, and mount in a specific order. That is documented in the Lustre Operations Manual.
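As a rough illustration of the ordering described above, here is a minimal Python sketch that only builds and prints the command sequence; it executes nothing. It assumes the usual sequence of stopping every target, running `tunefs.lustre --writeconf` on the MDT first and then the OSTs, and remounting MGS/MDT before OSTs, with all clients already unmounted. The `Target` class, the `writeconf_plan` helper, and the MDT device and mount point (`/dev/mapper/lustre-mdt0`, `/mnt/mdt0`) are hypothetical; only the OST device and mount point come from the thread. Check the manual for the authoritative procedure before running anything.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Target:
    """One Lustre server target (hypothetical helper type for this sketch)."""
    role: str        # "mdt" (here assumed to be a combined MGS/MDT) or "ost"
    device: str
    mountpoint: str


def writeconf_plan(targets):
    """Return shell commands in the assumed order: stop all, writeconf all, remount."""
    mdts = [t for t in targets if t.role == "mdt"]
    osts = [t for t in targets if t.role == "ost"]
    ordered = mdts + osts  # MDT(s) first, then OSTs

    plan = []
    # 1. Everything must be stopped before any writeconf (clients first, not shown).
    for t in ordered:
        plan.append(f"umount {t.mountpoint}")
    # 2. Regenerate the configuration logs on every target while all are down.
    for t in ordered:
        plan.append(f"tunefs.lustre --writeconf {t.device}")
    # 3. Remount in order: MGS/MDT first, then OSTs (clients last, not shown).
    for t in ordered:
        plan.append(f"mount -t lustre {t.device} {t.mountpoint}")
    return plan


if __name__ == "__main__":
    targets = [
        # OST device and mount point as quoted in the thread.
        Target("ost", "/dev/mapper/lustre-oss0", "/mnt/oss0"),
        # Hypothetical MDT device and mount point, for illustration only.
        Target("mdt", "/dev/mapper/lustre-mdt0", "/mnt/mdt0"),
    ]
    for command in writeconf_plan(targets):
        print(command)
```

Printing the plan rather than executing it is deliberate: the commands run on different server nodes, and the point is only to make the required ordering explicit.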
Cheers, Andreas

On Jan 18, 2023, at 03:49, Edmondson, Edward via lustre-discuss <lustre-discuss@lists.lustre.org> wrote:

Hi all,

I'm struggling to get my OSS mounts online after a less than clean shutdown. I'm on lustre 2.12.9. Plenty of googling etc doesn’t bring up anything that seems particular to the problem I’m having unfortunately.

lnet seems to be up, pings ok both ways, communications clearly happen between the nodes judging by the logs.

I've been through the log reconfiguration process with --writeconf on everything, step by step as in the manual.

On the OSS node when I try to mount:

mount.lustre: mount /dev/mapper/lustre-oss0 at /mnt/oss0 failed: No such file or directory
Is the MGS specification correct?
Is the filesystem name correct?
If upgrading, is the copied client log valid? (see upgrade docs)

In logs:

Jan 18 10:27:56 nas-0-4 kernel: LustreError: 31015:0:(ldlm_lib.c:494:client_obd_setup()) can't add initial connection
Jan 18 10:27:56 nas-0-4 kernel: LustreError: 31015:0:(lwp_dev.c:125:lwp_setup()) lustre-MDT0000-lwp-OST0000: client obd setup error: rc = -2
Jan 18 10:27:56 nas-0-4 kernel: LustreError: 31015:0:(lwp_dev.c:273:lwp_init0()) lustre-MDT0000-lwp-OST0000: setup lwp failed. -2
Jan 18 10:27:56 nas-0-4 kernel: LustreError: 31015:0:(obd_config.c:559:class_setup()) setup lustre-MDT0000-lwp-OST0000 failed (-2)
Jan 18 10:27:56 nas-0-4 kernel: LustreError: 31015:0:(obd_mount.c:202:lustre_start_simple()) lustre-MDT0000-lwp-OST0000 setup error -2
Jan 18 10:27:56 nas-0-4 kernel: LustreError: 31015:0:(obd_mount_server.c:671:lustre_lwp_setup()) lustre-MDT0000-lwp-OST0000: setup up failed: rc -2
Jan 18 10:27:56 nas-0-4 kernel: LustreError: 15c-8: MGC10.3.255.200@o2ib: The configuration from log 'lustre-client' failed (-2). This may be the result of communication errors between this node and the MGS, a bad configuration, or other errors. See the syslog for more information.
Jan 18 10:27:56 nas-0-4 kernel: LustreError: 30961:0:(obd_mount_server.c:1414:server_start_targets()) lustre-OST0000: failed to start LWP: -2
Jan 18 10:27:56 nas-0-4 kernel: LustreError: 30961:0:(obd_mount_server.c:1992:server_fill_super()) Unable to start targets: -2
Jan 18 10:27:56 nas-0-4 kernel: Lustre: Failing over lustre-OST0000
Jan 18 10:27:57 nas-0-4 kernel: LustreError: 30961:0:(ldlm_lockd.c:3203:ldlm_cleanup()) ldlm still has namespaces; clean these up first.
Jan 18 10:27:57 nas-0-4 kernel: LustreError: 30961:0:(ldlm_lockd.c:2862:ldlm_put_ref()) ldlm_cleanup failed: -16
Jan 18 10:27:57 nas-0-4 kernel: Lustre: server umount lustre-OST0000 complete
Jan 18 10:27:57 nas-0-4 kernel: LustreError: 30961:0:(obd_mount.c:1604:lustre_fill_super()) Unable to mount (-2)

On the MGS/MDT node (which has now mounted the MGS and MDT fine):

Jan 18 10:27:56 nas-0-3 kernel: Lustre: MGS: Connection restored to 24758df3-a11a-f5db-18a5-2e0e35f2099d (at 10.3.255.199@o2ib)
Jan 18 10:27:56 nas-0-3 kernel: Lustre: MGS: Regenerating lustre-OST0000 log by user request: rc = 0
Jan 18 10:27:56 nas-0-3 kernel: Lustre: Found index 0 for lustre-OST0000, updating log
Jan 18 10:27:56 nas-0-3 kernel: Lustre: Client log for lustre-OST0000 was not updated; writeconf the MDT first to regenerate it.

The MDT has absolutely been writeconfed so that last message isn't terribly helpful. fscks are clean, so there's not a problem there.

Any advice hugely appreciated!

--
Dr Edd Edmondson
HPC Systems Manager
Dept of Physics and Astronomy
University College London
(he/him)

During remote working email is the best way to contact me. If needed I am available by phone on 0203 108 1399, by Microsoft Teams, or other methods by arrangement.

_______________________________________________
lustre-discuss mailing list
lustre-discuss@lists.lustre.org
http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org
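For reading the kernel log quoted above: the trailing values such as -2 and -16 are negated Linux errno codes, which is why the mount error reports "No such file or directory". The following is a small Python sketch, not part of any Lustre tooling, that pulls those values out of a log line and names them using the standard `errno` module. The `decode_rc` helper and its regular expression are hypothetical and were only checked against the line shapes in this thread.

```python
import errno
import os
import re

# A minus sign followed by digits, not glued to a preceding word character or
# hyphen, so host names like "nas-0-4" and UUIDs are not mistaken for codes.
RC_PATTERN = re.compile(r"(?<![\w-])-(\d+)\b")


def decode_rc(line):
    """Return (rc, errno_name, message) for each negative return code in a log line."""
    results = []
    for match in RC_PATTERN.finditer(line):
        code = int(match.group(1))
        name = errno.errorcode.get(code, "unknown")
        results.append((-code, name, os.strerror(code)))
    return results


if __name__ == "__main__":
    # Two lines copied from the log in this thread.
    samples = [
        "Jan 18 10:27:56 nas-0-4 kernel: LustreError: 30961:0:"
        "(obd_mount_server.c:1414:server_start_targets()) "
        "lustre-OST0000: failed to start LWP: -2",
        "Jan 18 10:27:57 nas-0-4 kernel: LustreError: 30961:0:"
        "(ldlm_lockd.c:2862:ldlm_put_ref()) ldlm_cleanup failed: -16",
    ]
    for sample in samples:
        for rc, name, message in decode_rc(sample):
            print(f"rc {rc}: {name} ({message})")
```

On Linux, -2 decodes to ENOENT and -16 to EBUSY. The names only say which error was returned, not which object was missing or busy.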