On Jul 26, 2019, at 04:28, Thomas Roth <t.r...@gsi.de> wrote:

Hi all,

this morning one of our MDTs went 'unhealthy':

Jul 26 10:15:13 lxmds20 kernel: LustreError: 9510:0:(service.c:3285:ptlrpc_svcpt_health_check()) mdt: unhealthy - request has been waiting 1017s

However, somewhat later,

lxmds20:~# cat /sys/fs/lustre/health_check
healthy

and all Lustre operations seem to be good, too.

This means that an RPC was stuck for a long time (1017s in the log message 
above), but if the RPC eventually completes then there is no reason for the 
MDS to be reported as "unhealthy" anymore.
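
If you want to track this, here is a minimal Python sketch (not part of Lustre, just an illustration). It pulls the service name and wait time out of a message like the one above and reads the health file shown above. The function names and the regular expression are hypothetical, and treating anything other than the exact string "healthy" as not healthy is an assumption.

```python
import re
from pathlib import Path

# Path as shown in the message above; pass another path when testing.
HEALTH_FILE = Path("/sys/fs/lustre/health_check")

# Matches the "mdt: unhealthy - request has been waiting 1017s" part of the
# kernel message. The pattern is an assumption based on that single line.
WAIT_RE = re.compile(r"(\w+): unhealthy - request has been waiting (\d+)s")


def parse_unhealthy(line):
    """Return (service, seconds_waiting), or None if the line does not match."""
    m = WAIT_RE.search(line)
    if m is None:
        return None
    return m.group(1), int(m.group(2))


def is_healthy(path=HEALTH_FILE):
    """True if the health file currently reads exactly 'healthy'."""
    return Path(path).read_text().strip() == "healthy"
```

A monitoring script could call parse_unhealthy() on each kernel log line and then call is_healthy() some time later, so that a stuck RPC which has since completed is not reported as an ongoing problem.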

Cheers, Andreas
--
Andreas Dilger
Principal Lustre Architect
Whamcloud

_______________________________________________
lustre-discuss mailing list
lustre-discuss@lists.lustre.org
http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org
