On Wed, 2008-06-04 at 14:47 +0200, Alain Moulle wrote:
> Hi
>
> About my problem of a node entering a loop:
>
> Jun 3 15:54:49 [EMAIL PROTECTED] qdiskd[22256]: <notice> Writing eviction notice for node 1
> Jun 3 15:54:50 [EMAIL PROTECTED] qdiskd[22256]: <notice> Node 1 evicted
> Jun 3 15:54:51 [EMAIL PROTECTED] qdiskd[22256]: <crit> Node 1 is undead.
>
> I notice that just before entering this loop, I get the message:
>
> Jun 3 15:54:47 [EMAIL PROTECTED] fenced[22327]: fencing node "xn1"
> Jun 3 15:54:48 [EMAIL PROTECTED] qdiskd[22256]: <info> Assuming master role
>
> but never the message:
>
> Jun 3 15:54:47 [EMAIL PROTECTED] fenced[22327]: fence "xn1" success
>
> Nevertheless, the service on xn1 is correctly failed over to xn2, but after
> the reboot of xn1 we cannot start CS5 again because of the endless
> "Node is undead" loop on xn2.
>
> When everything works correctly, both messages:
>
> fencing node "xn1"
> fence "xn1" success
>
> appear in succession (about 30 s apart).
>
> So my question is: could this endless "Node is undead" loop be
> systematically caused by a failed fencing of xn1 by xn2?
>
> PS: note that I have applied this patch:
> http://sources.redhat.com/git/?p=cluster.git;a=commit;h=b2686ffe984c517110b949d604c54a71800b67c9
Yes. If qdiskd thinks the node is dead and the node starts writing to the disk again (which is exactly what fencing is supposed to prevent), it will display those messages.

-- Lon

--
Linux-cluster mailing list
[EMAIL PROTECTED]
https://www.redhat.com/mailman/listinfo/linux-cluster
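To illustrate Lon's point, here is a minimal sketch of the mechanism he describes: each node periodically bumps a heartbeat counter on the shared quorum disk; the master evicts a node whose counter stalls, and if the counter advances again after eviction (meaning fencing never actually completed), the node is flagged "undead". This is an illustrative simulation, not the real qdiskd code from cluster.git; all class and method names are hypothetical.

```python
# Hypothetical simulation of qdiskd-style "undead" detection.
# Not the actual cluster.git implementation; names are illustrative.

class QuorumDisk:
    """Simulated shared quorum disk: one heartbeat counter per node."""
    def __init__(self, nodes):
        self.counters = {n: 0 for n in nodes}

    def write_heartbeat(self, node):
        # A live (or un-fenced!) node increments its counter each cycle.
        self.counters[node] += 1

    def read_heartbeat(self, node):
        return self.counters[node]


class Master:
    """Master node's view: evicts stalled nodes, flags 'undead' writers."""
    def __init__(self, disk):
        self.disk = disk
        self.last_seen = dict(disk.counters)
        self.evicted = set()
        self.log = []

    def poll(self, node):
        current = self.disk.read_heartbeat(node)
        if node in self.evicted:
            if current != self.last_seen[node]:
                # Node was declared dead but is still writing to the disk:
                # fencing evidently did not succeed.
                self.log.append(f"<crit> Node {node} is undead.")
        elif current == self.last_seen[node]:
            # Heartbeat stalled: declare the node dead.
            self.log.append(f"<notice> Writing eviction notice for node {node}")
            self.log.append(f"<notice> Node {node} evicted")
            self.evicted.add(node)
        self.last_seen[node] = current


disk = QuorumDisk(["xn1"])
master = Master(disk)

disk.write_heartbeat("xn1")   # xn1 alive: counter advances
master.poll("xn1")            # nothing logged
master.poll("xn1")            # counter stalled -> eviction
disk.write_heartbeat("xn1")   # xn1 rebooted without being fenced, writes again
master.poll("xn1")            # -> "<crit> Node xn1 is undead."
print(master.log)
```

This matches the log sequence in the original report: once fencing of xn1 fails, the rebooted xn1 resumes writing to the quorum disk and xn2's master loops on the "undead" message until fencing actually succeeds.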
