On 28/10/2010 17:55, Pavlos Parissis wrote:
On 28 October 2010 16:09, Guillaume Chanaud
<guillaume.chan...@connecting-nature.com> wrote:
Hello,
I have a cluster of two master/slave DRBD servers (filer1 and filer2)
running in a VLAN (the machines are dedicated servers).
I added a third node (server1, a "blank node" for the moment) to the
cluster, and it joined correctly.
When I add a fourth node (server2, which is a "mirror" of server1), this
node starts as standalone. Here is the message.log:
Oct 28 15:59:27 ns209045 corosync[16543]: [TOTEM ] A processor joined or left the membership and a new membership was formed.
Oct 28 15:59:28 ns209045 corosync[16543]: [pcmk ] notice: pcmk_peer_update: Transitional membership event on ring 945392: memb=1, new=0, lost=0
Oct 28 15:59:28 ns209045 corosync[16543]: [pcmk ] info: pcmk_peer_update: memb: server2 16820416
Oct 28 15:59:28 ns209045 corosync[16543]: [pcmk ] notice: pcmk_peer_update: Stable membership event on ring 945392: memb=1, new=0, lost=0
Oct 28 15:59:28 ns209045 corosync[16543]: [pcmk ] info: pcmk_peer_update: MEMB: server2 16820416
Oct 28 15:59:28 ns209045 corosync[16543]: [TOTEM ] A processor joined or left the membership and a new membership was formed.
Oct 28 15:59:29 ns209045 corosync[16543]: [pcmk ] notice: pcmk_peer_update: Transitional membership event on ring 945416: memb=1, new=0, lost=0
Oct 28 15:59:29 ns209045 corosync[16543]: [pcmk ] info: pcmk_peer_update: memb: server2 16820416
Oct 28 15:59:29 ns209045 corosync[16543]: [pcmk ] notice: pcmk_peer_update: Stable membership event on ring 945416: memb=1, new=0, lost=0
Oct 28 15:59:29 ns209045 corosync[16543]: [pcmk ] info: pcmk_peer_update: MEMB: server2 16820416
[...] These messages repeat many, many times.
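
For anyone who wants to summarise a log like the excerpt above, here is a
minimal Python sketch (not part of the original exchange). It extracts the
pcmk_peer_update membership events and decodes the numeric node id. The
decoding rests on two assumptions: that no nodeid was set in the totem
section, so corosync derived it from the ring 0 IPv4 address, and that the
host is little-endian. The function names and the SAMPLE string are
illustrative only.

```python
import ipaddress
import re
import struct

# \s+ between fields lets the patterns match even when the mail client
# wrapped a log entry across two lines.
EVENT_RE = re.compile(
    r"(Transitional|Stable) membership event on ring (\d+):"
    r"\s+memb=(\d+),\s+new=(\d+),\s+lost=(\d+)"
)
MEMBER_RE = re.compile(r"pcmk_peer_update:\s+(?:memb|MEMB):\s+(\S+)\s+(\d+)")


def membership_events(log_text):
    """Return (kind, ring, memb, new, lost) for each membership event."""
    return [
        (kind, int(ring), int(memb), int(new), int(lost))
        for kind, ring, memb, new, lost in EVENT_RE.findall(log_text)
    ]


def members(log_text):
    """Return the set of (node name, node id) pairs listed as members."""
    return {(name, int(node_id)) for name, node_id in MEMBER_RE.findall(log_text)}


def nodeid_to_ipv4(node_id):
    """Decode an auto-generated node id, ASSUMING a little-endian host."""
    return str(ipaddress.IPv4Address(struct.pack("<I", node_id)))


# Two entries copied from the excerpt above.
SAMPLE = (
    "Oct 28 15:59:28 ns209045 corosync[16543]: [pcmk ] notice: "
    "pcmk_peer_update: Transitional membership event on ring 945392: "
    "memb=1, new=0, lost=0\n"
    "Oct 28 15:59:28 ns209045 corosync[16543]: [pcmk ] info: "
    "pcmk_peer_update: memb: server2 16820416\n"
)

if __name__ == "__main__":
    for event in membership_events(SAMPLE):
        print(event)
    for name, node_id in sorted(members(SAMPLE)):
        print(name, node_id, nodeid_to_ipv4(node_id))
```

In the excerpt every event reports memb=1 with server2 as the only member,
so the node only ever sees itself. Running the same script on the log of
server1 and comparing the node ids could show whether the two "mirrored"
machines announce the same id; that is only a guess, the log above does not
confirm it.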
Now, if I stop server1 and start server2, server2 starts correctly and is
added to the cluster. But when I then start server1, the same thing
happens: the roles are inverted but the result is the same. Whenever one
of the serverX nodes is running, the other can't join.
My corosync.conf is configured for broadcast, not multicast. I have had
lots of problems with multicast, because many bridged VMs on the VLAN
don't see the multicast packets or don't join the multicast group
correctly.
Any hints on this?
Are the corosync and auth files the same on server2?
Yes, of course :D (copied with scp). As I said, server1 can join when
server2 is offline, and server2 can join when server1 is offline, but
if one is online, the other can't join and logs the messages above in a
loop.
In fact I have lots of problems with corosync/pacemaker:
multicast/broadcast between physical and virtual servers, lots of
different shit everywhere, and the error logs are always different
depending on what I try.
The strange thing is that filer1, filer2, server1 and server2 are all
running the same distro (Gentoo) with the same tools and are on the same
VLAN (which works for lots of services like NFS).
_______________________________________________
Pacemaker mailing list: Pacemaker@oss.clusterlabs.org
http://oss.clusterlabs.org/mailman/listinfo/pacemaker
Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://developerbugs.linux-foundation.org/enter_bug.cgi?product=Pacemaker