> Now what are the messages about "deleting orphaned objects"? Are they normal
> also?
Yeah, this is kind of normal, and I'm even thinking we should lower the message verbosity...
Andreas, do you agree that it could become a simple CDEBUG(D_HA, ...) instead of LCONSOLE(D_INFO, ...)?

Aurélien

________________________________
From: lustre-discuss <lustre-discuss-boun...@lists.lustre.org> on behalf of Audet, Martin via lustre-discuss <lustre-discuss@lists.lustre.org>
Sent: Monday, December 4, 2023 20:26
To: Andreas Dilger <adil...@whamcloud.com>
Cc: lustre-discuss@lists.lustre.org <lustre-discuss@lists.lustre.org>
Subject: Re: [lustre-discuss] Error messages (ex: not available for connect from 0@lo) on server boot with Lustre 2.15.3 and 2.15.4-RC1

Hello Andreas,

Thanks for your response. I'm happy to learn that the "errors" I was reporting aren't really errors.

I now understand that the 3 messages about LDISKFS were only normal messages resulting from mounting the file systems (I was fooled by vim showing these messages in red, like important error messages, but this is simply a false positive from its syntax highlighting rules, probably triggered by the "errors=" string, which is only a mount option...).

Now what are the messages about "deleting orphaned objects"? Are they normal also? We always boot the client VMs after the server is ready, and we shut down the clients cleanly well before the vlfs Lustre server is (also cleanly) shut down. Is it a sign of corruption? How can this happen if shutdowns are clean?

Thanks (and sorry for the beginner's questions),

Martin

________________________________
From: Andreas Dilger <adil...@whamcloud.com>
Sent: December 4, 2023 5:25 AM
To: Audet, Martin
Cc: lustre-discuss@lists.lustre.org
Subject: Re: [lustre-discuss] Error messages (ex: not available for connect from 0@lo) on server boot with Lustre 2.15.3 and 2.15.4-RC1

It wasn't clear from your mail which message(s) you are concerned about. These look like normal mount message(s) to me.

The "error" is pretty normal; it just means there were multiple services starting at once and one wasn't yet ready for the other.

LustreError: 137-5: lustrevm-MDT0000_UUID: not available for connect from 0@lo (no target). If you are running an HA pair check that the target is mounted on the other server.

It probably makes sense to quiet this message right at mount time to avoid this.

Cheers, Andreas

On Dec 1, 2023, at 10:24, Audet, Martin via lustre-discuss <lustre-discuss@lists.lustre.org> wrote:

Hello Lustre community,

Has anyone ever seen messages like these in "/var/log/messages" on a Lustre server?

Dec 1 11:26:30 vlfs kernel: Lustre: Lustre: Build Version: 2.15.4_RC1
Dec 1 11:26:30 vlfs kernel: LDISKFS-fs (sdd): mounted filesystem with ordered data mode. Opts: errors=remount-ro,no_mbcache,nodelalloc
Dec 1 11:26:30 vlfs kernel: LDISKFS-fs (sdc): mounted filesystem with ordered data mode. Opts: errors=remount-ro,no_mbcache,nodelalloc
Dec 1 11:26:30 vlfs kernel: LDISKFS-fs (sdb): mounted filesystem with ordered data mode. Opts: user_xattr,errors=remount-ro,no_mbcache,nodelalloc
Dec 1 11:26:36 vlfs kernel: LustreError: 137-5: lustrevm-MDT0000_UUID: not available for connect from 0@lo (no target). If you are running an HA pair check that the target is mounted on the other server.
Dec 1 11:26:36 vlfs kernel: Lustre: lustrevm-OST0001: Imperative Recovery not enabled, recovery window 300-900
Dec 1 11:26:36 vlfs kernel: Lustre: lustrevm-OST0001: deleting orphan objects from 0x0:227 to 0x0:513

This happens on every boot on a Lustre server named vlfs (an AlmaLinux 8.9 VM hosted on VMware) playing the role of both MGS and OSS (it hosts an MDT and two OSTs using "virtual" disks). We chose LDISKFS and not ZFS. Note that this happens at every boot, well before the clients (AlmaLinux 9.3 or 8.9 VMs) connect, and even when the clients are powered off.
The network connecting the clients and the server is a "virtual" 10GbE network (of course there is no virtual IB). We also had the same messages previously with Lustre 2.15.3 using an AlmaLinux 8.8 server and AlmaLinux 8.8 / 9.2 clients (also using VMs).

Note also that we compile the Lustre RPMs ourselves from the sources in the git repository. We also chose to use a patched kernel. Our build procedure for RPMs seems to work well, because our real cluster runs fine on CentOS 7.9 with Lustre 2.12.9 and IB (MOFED) networking.

So has anyone seen these messages? Are they problematic? If yes, how do we avoid them? We would like to make sure our small test system using VMs works well before we upgrade our real cluster.

Thanks in advance!

Martin Audet

_______________________________________________
lustre-discuss mailing list
lustre-discuss@lists.lustre.org
http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org
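For anyone who wants to check a server log against the messages discussed in this thread, here is a minimal sketch in Python. The pattern list is an assumption built only from the log excerpt quoted above, and the notes simply restate what was said in the replies; it is not an official list of benign Lustre messages, and the names EXPECTED_AT_MOUNT, classify and unexpected_lines are made up for the example.

```python
import re

# Patterns copied from the log excerpt quoted in this thread. The list and
# the notes are illustrative assumptions, not an official Lustre catalogue.
EXPECTED_AT_MOUNT = [
    (re.compile(r"Lustre: Build Version:"),
     "version banner"),
    (re.compile(r"LDISKFS-fs \(\w+\): mounted filesystem"),
     "ldiskfs mount message; 'errors=' is only a mount option"),
    (re.compile(r"LustreError: 137-5: \S+: not available for connect from 0@lo"),
     "services starting at once, one not yet ready for the other"),
    (re.compile(r"Imperative Recovery not enabled"),
     "listed among the normal mount messages in this thread"),
    (re.compile(r"deleting orphan objects from"),
     "described in this thread as 'kind of normal'"),
]


def classify(line):
    """Return a short note if the line matches a known mount-time message, else None."""
    for pattern, note in EXPECTED_AT_MOUNT:
        if pattern.search(line):
            return note
    return None


def unexpected_lines(lines):
    """Keep only Lustre/LDISKFS lines that match none of the patterns above."""
    return [
        line for line in lines
        if ("Lustre" in line or "LDISKFS" in line) and classify(line) is None
    ]
```

Passing the lines of "/var/log/messages" to unexpected_lines() would leave only the Lustre and LDISKFS messages that are not covered by this thread, which are the ones worth a closer look. Anything it filters out should still be checked against your own setup before being ignored.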