Have you tried running "lfs check servers" on the login node?
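If it helps, here is a minimal sketch of how that check could be scripted on the login node so that only the problem targets are printed. It assumes that "lfs check servers" prints one line per MDT/OST ending in "active." when the connection is healthy, and that anything else (for example an error line for an unreachable target) is worth a look. The function names find_inactive and check_servers are made up for this example and are not part of Lustre.

```python
import subprocess
from typing import List


def find_inactive(output: str) -> List[str]:
    """Return the lines that do not report a target as active.

    Assumption: a healthy target is reported as '<target-name> active.'.
    Any other non-empty line is treated as suspect.
    """
    suspect = []
    for line in output.splitlines():
        line = line.strip()
        if line and not line.endswith("active."):
            suspect.append(line)
    return suspect


def check_servers() -> List[str]:
    """Run 'lfs check servers' on this client and return suspect lines.

    stdout and stderr are both inspected, on the assumption that errors
    for unreachable targets may be written to stderr.
    """
    result = subprocess.run(
        ["lfs", "check", "servers"],
        capture_output=True,
        text=True,
        check=False,  # a non-zero exit is expected when a target is down
    )
    return find_inactive(result.stdout + result.stderr)
```

Calling check_servers() on the login node (as a user allowed to run lfs) returns an empty list when every target reports active, and otherwise the raw lines to investigate.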
On Oct 11, 2021, at 2:58 AM, Sid Young via lustre-discuss <lustre-discuss@lists.lustre.org> wrote:

I'm having trouble diagnosing where the problem lies in my Lustre installation. Clients are 2.12.6, and I have /home and /lustre filesystems using Lustre. /home has 4 OSTs and /lustre is made up of 6 OSTs. lfs df shows all OSTs as ACTIVE.

The /lustre file system appears fine; I can ls into every directory. When people log into the login node, it appears to lock up. I have shut down everything and remounted the OSTs and MDTs etc. in order, with no errors reported, but the lockup issue comes back soon after a few people log in.

The backend network is 100G Ethernet using ConnectX5 cards and the OS is CentOS 7.9. Everything was installed as RPMs and updates are disabled in yum.conf.

Two questions to start with:

Is there a command line tool to check each OST individually?

Apart from /var/log/messages, is there a Lustre-specific log I can monitor on the login node to see errors when I hit /home...

Sid Young

_______________________________________________
lustre-discuss mailing list
lustre-discuss@lists.lustre.org
http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org