I'm seeing quite a few of these errors:

May 2 11:33:29 holy-slurm01 slurmctld[47253]: error: slurm_receive_msg: Zero Bytes were transmitted or received
May 2 11:33:29 holy-slurm01 slurmctld[47253]: error: slurm_receive_msg: Zero Bytes were transmitted or received

I know that this can be caused by a node or client that is in a bad state, but I can't figure out how to trace the error back to the one responsible. Does anyone have any tricks for tracing this sort of error back to its source? I turned on the Protocol debug flag, but none of the additional debug statements leads to the culprit.

-Paul Edmon-
