On Tue, Oct 25, 2011 at 4:08 AM, Proskurin Kirill <k.prosku...@corp.mail.ru> wrote:
> Hello.
>
> corosync-1.4.1
> pacemaker-1.1.5
> pacemaker runs with "ver: 1"
>
> I have run into a strange problem. I hope someone can help me.
>
> I have a 9-node cluster. All was fine until I needed to reboot a node.
> After the reboot it will not rejoin the cluster and fails with a "not in our
> membership" error.
>
> It happens with 2 other nodes in this cluster as well.
>
> Network is fine.
> rm -rf /var/lib/heartbeat/crm/* does not help.
>
> I asked for help on IRC and we did this:
> I ran one node with debug enabled for a few seconds and straced the cib
> process. Both are in the links below.
> In the debug logs we found a "cib not connected" error but cannot understand
> the reason for it.
>
> Debug logs: http://dl.dropbox.com/u/1932700/corosync.log.debug.gz
> cib strace: http://dl.dropbox.com/u/1932700/cib-starce.log.gz
>
> P.S. I have the same problem on another cluster and "fixed" it by shutting
> down all nodes (corosync + pacemaker), running rm -rf /var/lib/heartbeat/crm/*,
> and starting all nodes again. But that is not really an option. :-)
Removing the contents of /var/lib/heartbeat/crm won't achieve much. It's the restart that's getting corosync unstuck.

> --
> Best regards,
> Proskurin Kirill
>
> _______________________________________________
> Pacemaker mailing list: Pacemaker@oss.clusterlabs.org
> http://oss.clusterlabs.org/mailman/listinfo/pacemaker
>
> Project Home: http://www.clusterlabs.org
> Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
> Bugs:
> http://developerbugs.linux-foundation.org/enter_bug.cgi?product=Pacemaker