Hi! I wanted to give an early warning (and a cry for help) in case others are facing similar issues.
Coincidence or not, our Lustre setup became unstable soon after we started migrating nodes from RHEL 9.5 to RHEL 9.6. The key symptom is high load on the metadata servers: processes such as ldlm_cn03_017 take all available CPU time. These processes are kernel "daemons" of the distributed lock manager. Memory hogging also occurred yesterday, which crashed the servers completely.

Best regards,
--
- Simppa -
Mr. Simppa Äkäslompolo
High performance computing specialist
Doctor of Science (Tech.)
Aalto Scientific Computing
School of Science, Aalto University, Finland
+358-50-5311327
https://scicomp.aalto.fi/
_______________________________________________
lustre-discuss mailing list
[email protected]
http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org
