Re: [Linux-cluster] Cluster failure, dlm overload

wsfax alu.es Wed, 11 Apr 2012 08:31:27 -0700

Here is an update on this problem.

We see that the loop that causes the overload of "dlm" is:



   1. Node 1 sends a "lookup" message, related to some filesystem and
   inode, to the master node (node 3), asking for the current owner of this
   element.
   2. Node 3 replies "the owner of this element is now node 4".
   3. Node 1 sends a "request" message to node 4.
   4. Node 4 replies "I do not have it" (error code EBADR = -53).
   5. Go to step 1.
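
The steps above can be sketched as a toy model in Python. This is only an
illustration of the message pattern, not the real DLM code: the names Node,
LookupNode, owner_table, acquire and max_rounds are all hypothetical, and
max_rounds exists only so that the simulation stops (the loop we observe has
no such limit).

```python
# Toy model of the lookup/request loop; NOT real DLM code, all names hypothetical.
EBADR = 53  # Linux errno value; the reply carries it negated, i.e. -53


class Node:
    def __init__(self, node_id):
        self.node_id = node_id
        self.owned = set()  # resources this node really holds

    def handle_request(self, resource):
        # Step 4: a node that does not hold the resource rejects the request.
        return 0 if resource in self.owned else -EBADR


class LookupNode(Node):
    """Plays the role of node 3: answers "who owns this resource?"."""

    def __init__(self, node_id):
        super().__init__(node_id)
        self.owner_table = {}  # resource -> node id believed to be the owner

    def handle_lookup(self, resource):
        # Step 2: answer with whatever the (possibly stale) table says.
        return self.owner_table[resource]


def acquire(resource, lookup_node, nodes, max_rounds):
    """Run steps 1-5 from the requester's side.

    Returns (result code, number of messages exchanged).
    """
    messages = 0
    for _ in range(max_rounds):
        owner_id = lookup_node.handle_lookup(resource)  # steps 1 and 2
        messages += 2                                   # lookup + reply
        rc = nodes[owner_id].handle_request(resource)   # steps 3 and 4
        messages += 2                                   # request + reply
        if rc == 0:
            return rc, messages
        # Step 5: rc is -EBADR, so go back to the lookup.
    return -EBADR, messages
```

In this model, if node 3's table says node 4 owns a resource that node 4 does
not hold, every round costs four messages and can never succeed, because
nothing in the loop corrects the table.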

This loop happens several hundred times per second, multiplied by every
filesystem and inode with this problem. In total, that is tens of
thousands of DLM messages, and it continues until the cluster is restarted.

Kind regards.

--
Linux-cluster mailing list
[email protected]
https://www.redhat.com/mailman/listinfo/linux-cluster
