Hi,

As I wrote in another post[1], I failed to upgrade a 2-node cluster to 1.1.8.

Before the upgrade, both nodes were running CentOS 6.3, corosync 1.4.1-7 and pacemaker-1.1.7. I followed the rolling upgrade process: I stopped pacemaker and then corosync on node1, and upgraded it to CentOS 6.4. The OS upgrade also upgrades pacemaker to 1.1.8-7 and corosync to 1.4.1-15. The rpm upgrade went smoothly; I knew about the crmsh issue, so I had made sure the crmsh rpm was available in my repos.

Corosync started without any problems and both nodes could see each other[2]. But for some reason node2 never received a reply to the join offer from node1, and node1 never joined the cluster. Node1 formed a new cluster because it never got a reply from node2, so I ended up with a split-brain situation.

Logs of node1 can be found here:
https://dl.dropboxusercontent.com/u/1773878/pacemaker-issue/node1.log
and of node2 here:
https://dl.dropboxusercontent.com/u/1773878/pacemaker-issue/node2.log

I have found this thread[3], which could be related to my problem, but the bug that caused the join failure in that case is solved in 1.1.8.

Any ideas?

Cheers,
Pavlos

[1] Subject: Different value on cluster-infrastructure between 2 nodes
[2] https://dl.dropboxusercontent.com/u/1773878/pacemaker-issue/corosync.status
[3] http://comments.gmane.org/gmane.linux.highavailability.pacemaker/13185
_______________________________________________
Pacemaker mailing list: Pacemaker@oss.clusterlabs.org
http://oss.clusterlabs.org/mailman/listinfo/pacemaker

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org