On Wed, 2008-06-04 at 14:47 +0200, Alain Moulle wrote:
> Hi
>
> About my problem of a node entering a loop:
>
> Jun 3 15:54:49 [EMAIL PROTECTED] qdiskd[22256]: <notice> Writing eviction notice for node 1
> Jun 3 15:54:50 [EMAIL PROTECTED] qdiskd[22256]: <notice> Node 1 evicted
> Jun 3 15:54:51 [EMAIL PROTECTED] qdiskd[22256]: <crit> Node 1 is undead.
>
> I notice that just before entering this loop, I get the message:
>
> Jun 3 15:54:47 [EMAIL PROTECTED] fenced[22327]: fencing node "xn1"
> Jun 3 15:54:48 [EMAIL PROTECTED] qdiskd[22256]: <info> Assuming master role
>
> but never the message:
>
> Jun 3 15:54:47 [EMAIL PROTECTED] fenced[22327]: fence "xn1" success
>
> Nevertheless, the service on xn1 is correctly failed over to xn2, but after
> the reboot of xn1 we cannot start CS5 again because of the endless
> "Node is undead" loop on xn2.
>
> When everything works correctly, both messages:
>
> fencing node "xn1"
> fence "xn1" success
>
> appear in succession (about 30 s apart).
>
> So my question is: could this endless "Node is undead" loop be
> systematically caused by a failed fencing of xn1 by xn2?
>
> PS: note that I have applied this patch:
> http://sources.redhat.com/git/?p=cluster.git;a=commit;h=b2686ffe984c517110b949d604c54a71800b67c9
Yes. If qdiskd thinks the node is dead and the node starts writing to the disk again (which is exactly what fencing is supposed to prevent), it will display those messages.

-- Lon

--
Linux-cluster mailing list
[EMAIL PROTECTED]
https://www.redhat.com/mailman/listinfo/linux-cluster
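To illustrate Lon's point, here is a minimal sketch of the mechanism he describes: each node periodically bumps a heartbeat counter on the shared quorum disk; the master evicts a node whose counter stalls, and if the counter advances again after eviction (meaning fencing never actually completed), the node is flagged "undead". This is an illustrative simulation, not the real qdiskd code from cluster.git; all class and method names are hypothetical.

```python
# Hypothetical simulation of qdiskd-style "undead" detection.
# Not the actual cluster.git implementation; names are illustrative.

class QuorumDisk:
    """Simulated shared quorum disk: one heartbeat counter per node."""
    def __init__(self, nodes):
        self.counters = {n: 0 for n in nodes}

    def write_heartbeat(self, node):
        # A live (or un-fenced!) node increments its counter each cycle.
        self.counters[node] += 1

    def read_heartbeat(self, node):
        return self.counters[node]


class Master:
    """Master node's view: evicts stalled nodes, flags 'undead' writers."""
    def __init__(self, disk):
        self.disk = disk
        self.last_seen = dict(disk.counters)
        self.evicted = set()
        self.log = []

    def poll(self, node):
        current = self.disk.read_heartbeat(node)
        if node in self.evicted:
            if current != self.last_seen[node]:
                # Node was declared dead but is still writing to the disk:
                # fencing evidently did not succeed.
                self.log.append(f"<crit> Node {node} is undead.")
        elif current == self.last_seen[node]:
            # Heartbeat stalled: declare the node dead.
            self.log.append(f"<notice> Writing eviction notice for node {node}")
            self.log.append(f"<notice> Node {node} evicted")
            self.evicted.add(node)
        self.last_seen[node] = current


disk = QuorumDisk(["xn1"])
master = Master(disk)

disk.write_heartbeat("xn1")   # xn1 alive: counter advances
master.poll("xn1")            # nothing logged
master.poll("xn1")            # counter stalled -> eviction
disk.write_heartbeat("xn1")   # xn1 rebooted without being fenced, writes again
master.poll("xn1")            # -> "<crit> Node xn1 is undead."
print(master.log)
```

This matches the log sequence in the original report: once fencing of xn1 fails, the rebooted xn1 resumes writing to the quorum disk and xn2's master loops on the "undead" message until fencing actually succeeds.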
