Re: [Linux-cluster] [cman] can't join cluster after reboot

2013-11-08 Thread Yuriy Demchenko
Thanks, the problem was indeed multicast. Switching to udpu brought the 
cluster back to normal operation.
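For anyone hitting the same symptom, the udpu switch is a one-attribute change in cluster.conf (a sketch only; the cluster name, node names, and ids are taken from this thread, the rest of the file is omitted, and config_version must be bumped and the file propagated to all nodes):

```xml
<cluster name="ocluster" config_version="11">
  <!-- unicast UDP instead of multicast -->
  <cman transport="udpu"/>
  <clusternodes>
    <clusternode name="node-1.spb.stone.local" nodeid="1"/>
    <clusternode name="node-2.spb.stone.local" nodeid="2"/>
    <clusternode name="vnode-3.spb.stone.local" nodeid="3"/>
  </clusternodes>
</cluster>
```

With udpu every node must be listed explicitly as a clusternode, since there is no multicast group for members to discover each other through.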


Any tips on how to fix multicast operation? igmp snooping on the switch is 
disabled, and the firewall is disabled too.
In fact, what confuses me is that the node can't join the cluster after a 
reboot no matter how long I wait, and no matter how many times I run 
"service cman restart" on that node - it just doesn't work until cman is 
restarted on some other node.
Another strange thing: I used tcpdump to capture UDP traffic, and there 
was no UDP traffic at all from node-1 after the reboot, and none after 
service restarts. But as soon as the service was restarted on another 
node, UDP traffic to the multicast address appeared from node-1.


I've also tried switching igmp snooping on, but that left the cluster not 
working at all - each node saw only itself. On the switch I saw that the 
multicast group was created and each corresponding port became a member of 
that group, but packet statistics showed only a few "report v3" packets, 
and no query/leave/error packets.
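To take corosync out of the picture when debugging this kind of thing, multicast delivery can be checked with a small socket test (a sketch; the group is this cluster's 239.192.8.19, the port is corosync's default 5405, and the loopback self-test below only exercises the local stack - for a real check, run the receiver on one node and the sender on another, using the nodes' 192.168.220.x addresses as IFACE instead of 127.0.0.1):

```python
import socket
import struct

GROUP, PORT = "239.192.8.19", 5405  # group from cman_tool status, default corosync port
IFACE = "127.0.0.1"                 # loopback self-test; use the node's cluster IP for a real test

# Receiver: bind the port and join the multicast group on the chosen interface.
rx = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
rx.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
rx.bind(("", PORT))
mreq = struct.pack("4s4s", socket.inet_aton(GROUP), socket.inet_aton(IFACE))
rx.setsockopt(socket.IPPROTO_IP, socket.IP_ADD_MEMBERSHIP, mreq)
rx.settimeout(5)

# Sender: force the outgoing interface; loop the packet back so one host can self-test.
tx = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
tx.setsockopt(socket.IPPROTO_IP, socket.IP_MULTICAST_IF, socket.inet_aton(IFACE))
tx.setsockopt(socket.IPPROTO_IP, socket.IP_MULTICAST_LOOP, 1)
sent = tx.sendto(b"mcast-probe", (GROUP, PORT))

try:
    data, addr = rx.recvfrom(1024)
    print("received %r from %s" % (data, addr[0]))
except socket.timeout:
    print("no packet received - multicast is being dropped somewhere")
```

If packets sent this way never arrive at a receiver on another node, the problem is in the network path (switch snooping/querier, firewall), not in cman itself.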



Yuriy Demchenko

On 11/07/2013 05:47 PM, Christine Caulfield wrote:

On 07/11/13 12:04, Yuriy Demchenko wrote:

[...]

Re: [Linux-cluster] [cman] can't join cluster after reboot

2013-11-07 Thread Christine Caulfield

On 07/11/13 12:04, Yuriy Demchenko wrote:

[...]

Re: [Linux-cluster] [cman] can't join cluster after reboot

2013-11-07 Thread Yuriy Demchenko
Nope, nothing in the logs suggests the node is fenced while rebooting. 
Moreover, the same behaviour persists with pacemaker started - and I 
explicitly put the node into standby in pacemaker before the reboot.
The same behaviour persists with stonith-enabled=false, and with a manual 
node fence via "stonith_admin --reboot node-1.spb.stone.local". So I 
suppose fencing isn't the issue here.


Yuriy Demchenko

On 11/07/2013 05:11 PM, Vishesh kumar wrote:
My understanding is the node is fenced while rebooting. I suggest you look 
into the fencing logs as well. If your fencing logs are not detailed, use 
the following in cluster.conf to enable logging:

<logging debug="on"/>
Thanks


On Thu, Nov 7, 2013 at 5:34 PM, Yuriy Demchenko wrote:


[...]

Re: [Linux-cluster] [cman] can't join cluster after reboot

2013-11-07 Thread Vishesh kumar
My understanding is the node is fenced while rebooting. I suggest you look 
into the fencing logs as well. If your fencing logs are not detailed, use 
the following in cluster.conf to enable logging:

<logging debug="on"/>

Thanks


On Thu, Nov 7, 2013 at 5:34 PM, Yuriy Demchenko wrote:

> [...]

[Linux-cluster] [cman] can't join cluster after reboot

2013-11-07 Thread Yuriy Demchenko

Hi,

I'm trying to set up a 3-node cluster (2 nodes + 1 standby node for 
quorum) with the cman+pacemaker stack, everything according to this quickstart 
article: http://clusterlabs.org/quickstart-redhat.html


The cluster starts, all nodes see each other, quorum is gained, stonith is 
working, but I've run into a problem with cman: a node can't join the cluster 
after a reboot - cman starts, and cman_tool nodes reports only that node as 
a cluster member, while on the other 2 nodes it reports 2 nodes as 
cluster members and the 3rd as offline. cman stop/start/restart on the 
problem node has no effect - it still sees only itself - but if I do a 
cman restart on one of the working nodes, everything goes back to 
normal: all 3 nodes join the cluster, and subsequent cman service 
restarts on any node work fine - the node leaves the cluster and rejoins 
successfully. But again - only until the node's OS reboots.


For example:
[1] Working cluster:

[root@node-1 ~]# cman_tool nodes
Node  Sts   Inc   Joined               Name
   1   M    592   2013-11-07 15:20:54  node-1.spb.stone.local
   2   M    760   2013-11-07 15:20:54  node-2.spb.stone.local
   3   M    760   2013-11-07 15:20:54  vnode-3.spb.stone.local
[root@node-1 ~]# cman_tool status
Version: 6.2.0
Config Version: 10
Cluster Name: ocluster
Cluster Id: 2059
Cluster Member: Yes
Cluster Generation: 760
Membership state: Cluster-Member
Nodes: 3
Expected votes: 3
Total votes: 3
Node votes: 1
Quorum: 2
Active subsystems: 7
Flags:
Ports Bound: 0
Node name: node-1.spb.stone.local
Node ID: 1
Multicast addresses: 239.192.8.19
Node addresses: 192.168.220.21

The picture is the same on all 3 nodes (except for node name and id) - same 
cluster name, cluster id, multicast address.


[2] I rebooted node-1. After the reboot completed, "cman_tool 
nodes" on node-2 and vnode-3 shows this:

Node  Sts   Inc   Joined               Name
   1   X    760                        node-1.spb.stone.local
   2   M    588   2013-11-07 15:11:23  node-2.spb.stone.local
   3   M    760   2013-11-07 15:20:54  vnode-3.spb.stone.local
[root@node-2 ~]# cman_tool status
Version: 6.2.0
Config Version: 10
Cluster Name: ocluster
Cluster Id: 2059
Cluster Member: Yes
Cluster Generation: 764
Membership state: Cluster-Member
Nodes: 2
Expected votes: 3
Total votes: 2
Node votes: 1
Quorum: 2
Active subsystems: 7
Flags:
Ports Bound: 0
Node name: node-2.spb.stone.local
Node ID: 2
Multicast addresses: 239.192.8.19
Node addresses: 192.168.220.22

But on the rebooted node-1 it shows this:

Node  Sts   Inc   Joined               Name
   1   M    764   2013-11-07 15:49:01  node-1.spb.stone.local
   2   X      0                        node-2.spb.stone.local
   3   X      0                        vnode-3.spb.stone.local
[root@node-1 ~]# cman_tool status
Version: 6.2.0
Config Version: 10
Cluster Name: ocluster
Cluster Id: 2059
Cluster Member: Yes
Cluster Generation: 776
Membership state: Cluster-Member
Nodes: 1
Expected votes: 3
Total votes: 1
Node votes: 1
Quorum: 2 Activity blocked
Active subsystems: 7
Flags:
Ports Bound: 0
Node name: node-1.spb.stone.local
Node ID: 1
Multicast addresses: 239.192.8.19
Node addresses: 192.168.220.21

So: same cluster name, cluster id, and multicast address - but it can't see 
the other nodes. And there is nothing in /var/log/messages or 
/var/log/cluster/corosync.log on the other two nodes - they don't seem to 
notice node-1 coming back online at all; the last records are about node-1 
leaving the cluster.
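A split view like the one in [2] is easy to spot programmatically. As an illustration (a hypothetical helper, not part of cman), a small parser over `cman_tool nodes` output can list which members a node considers offline:

```python
def offline_nodes(cman_tool_nodes_output):
    """Return names of nodes marked 'X' (not a member) in `cman_tool nodes` output."""
    offline = []
    for line in cman_tool_nodes_output.splitlines():
        fields = line.split()
        # Data rows are: node-id, status (M/X), incarnation, [join time,] name.
        if len(fields) >= 4 and fields[0].isdigit() and fields[1] in ("M", "X"):
            if fields[1] == "X":
                offline.append(fields[-1])
    return offline

# Sample output as seen on node-2 after node-1 rebooted (from this thread):
sample = """\
Node  Sts   Inc   Joined               Name
   1   X    760                        node-1.spb.stone.local
   2   M    588   2013-11-07 15:11:23  node-2.spb.stone.local
   3   M    760   2013-11-07 15:20:54  vnode-3.spb.stone.local
"""
print(offline_nodes(sample))  # → ['node-1.spb.stone.local']
```

Running this on every node and comparing the results would immediately show the asymmetry described here: node-2 and vnode-3 report node-1 as offline, while node-1 reports the other two as offline.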


[3] If I now do "service cman restart" on node-2 or vnode-3, everything 
goes back to normal operation as in [1].
In the logs this shows as node-2 leaving the cluster (service stop) and 
then node-2 and node-1 joining simultaneously (service start):

Nov  7 11:47:06 vnode-3 corosync[26692]: [QUORUM] Members[2]: 2 3
Nov  7 11:47:06 vnode-3 corosync[26692]:   [TOTEM ] A processor joined 
or left the membership and a new membership was formed.

Nov  7 11:47:06 vnode-3 kernel: dlm: closing connection to node 1
Nov  7 11:47:06 vnode-3 corosync[26692]:   [CPG   ] chosen downlist: 
sender r(0) ip(192.168.220.22) ; members(old:3 left:1)
Nov  7 11:47:06 vnode-3 corosync[26692]:   [MAIN  ] Completed service 
synchronization, ready to provide service.

Nov  7 11:53:28 vnode-3 corosync[26692]:   [QUORUM] Members[1]: 3
Nov  7 11:53:28 vnode-3 corosync[26692]:   [TOTEM ] A processor joined 
or left the membership and a new membership was formed.
Nov  7 11:53:28 vnode-3 corosync[26692]:   [CPG   ] chosen downlist: 
sender r(0) ip(192.168.220.14) ; members(old:2 left:1)
Nov  7 11:53:28 vnode-3 corosync[26692]:   [MAIN  ] Completed service 
synchronization, ready to provide service.

Nov  7 11:53:28 vnode-3 kernel: dlm: closing connection to node 2
Nov  7 11:53:30 vnode-3 corosync[26692]:   [TOTEM ] A processor joined 
or left the membership and a new membership was formed.

Nov  7 11:53:30 vnode-3 corosync[26692]:   [QUORUM] Members[2]: 1 3
Nov  7 11:53:30 vnode-3 corosync[26692]:   [QUORUM] Members[2]: 1 3
Nov  7 11:53:30 vnode-3 corosync[26692]:   [QUORUM] Members[3]: 1 2 3