Re: [ClusterLabs] Too quick node reboot leads to failed corosync assert on other node(s)

Jan Friesse Fri, 19 Feb 2016 00:21:07 -0800

Michal Koutný wrote:

On 02/18/2016 10:40 AM, Christine Caulfield wrote:

I definitely remember looking into this, or something very like it, ages
ago. I can't find anything in the commit logs for either corosync or
cman that looks relevant though. If you're seeing it on recent builds
then it's obviously still a problem anyway and we ought to look into it!

Thanks for your replies.


So far this has happened only once and we have only done a "post mortem";
alas, there is no reproducer available. If I have time, I'll try to reproduce it

OK. I actually tried to reproduce it and was not successful (current master). The steps I used:

- 2 nodes, token set to 30 sec
- execute cpgbench on node2
- pause corosync on node1 (ctrl+z), then kill it (kill -9 %1)
- wait until corosync on node2 moves into "entering GATHER state from..."
- execute corosync on node1
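
The pause/kill and wait steps above could be scripted roughly as follows. This is only a sketch under assumptions, not the procedure actually used: the helper names and the log path argument are hypothetical, SIGSTOP stands in for ctrl+z (which sends SIGTSTP from the terminal), and pause_then_kill would run on node1 while wait_for_gather would run on node2.

```python
import os
import signal
import time

# Log text quoted in the steps above.
GATHER_MARKER = "entering GATHER state from"


def saw_gather(log_lines):
    """Return True if any log line reports the move into the GATHER state."""
    return any(GATHER_MARKER in line for line in log_lines)


def pause_then_kill(pid, pause_seconds=0.0):
    """Freeze a process, then kill it without a clean shutdown (run on node1).

    SIGSTOP is a stand-in for ctrl+z; SIGKILL is what `kill -9` sends.
    """
    os.kill(pid, signal.SIGSTOP)
    time.sleep(pause_seconds)
    os.kill(pid, signal.SIGKILL)


def wait_for_gather(log_path, timeout=60.0, poll=0.5):
    """Poll a log file until the GATHER marker appears or the timeout expires.

    log_path is an assumption: wherever corosync on node2 writes its log.
    """
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        with open(log_path, errors="replace") as log:
            if saw_gather(log):
                return True
        time.sleep(poll)
    return False
```

Note that wait_for_gather scans the whole file, so an older GATHER line from a previous membership change would also match; starting from a fresh log avoids that.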

Basically, during recovery the new node's trans list was never sent (and/or it was ignored by node2).

I'm going to try testing v1.4.7, but it's also possible that the bug is fixed by other commits (my favorites are cfbb021e130337603fe5b545d1e377296ecb92ea, 4ee84c51fa73c4ec7cbee922111a140a3aaf75df, f135b680967aaef1d466f40170c75ae3e470e147).


Regards,
  Honza

locally and check whether it exists in the current version.

Michal



_______________________________________________
Users mailing list: Users@clusterlabs.org
http://clusterlabs.org/mailman/listinfo/users

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org

