Re: [Pacemaker] TOTEM Retransmit list in logs when a node gets up

2014-11-14 Thread Christine Caulfield

On 14/11/14 11:01, Daniel Dehennin wrote:

Christine Caulfield  writes:


[...]


If it's only happening at startup it could be the switch/router
learning the ports for the nodes and building its routing
tables. Switching to udpu will then get rid of the message if it's
annoying.


Switching to udpu makes it work correctly.



Ah, that's good. It sounds like it was something multicast-related (if 
not exactly what I thought it might have been) ... these things usually are!


Chrissie


___
Pacemaker mailing list: Pacemaker@oss.clusterlabs.org
http://oss.clusterlabs.org/mailman/listinfo/pacemaker

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org


Re: [Pacemaker] TOTEM Retransmit list in logs when a node gets up

2014-11-14 Thread Daniel Dehennin
Christine Caulfield  writes:


[...]

> If it's only happening at startup it could be the switch/router
> learning the ports for the nodes and building its routing
> tables. Switching to udpu will then get rid of the message if it's
> annoying.

Switching to udpu makes it work correctly.

Thanks.
-- 
Daniel Dehennin
Retrieve my GPG key: gpg --recv-keys 0xCC1E9E5B7A6FE2DF
Fingerprint: 3E69 014E 5C23 50E8 9ED6  2AAD CC1E 9E5B 7A6F E2DF




Re: [Pacemaker] TOTEM Retransmit list in logs when a node gets up

2014-11-14 Thread Daniel Dehennin
Christine Caulfield  writes:


[...]

> If it's only happening at startup it could be the switch/router
> learning the ports for the nodes and building its routing
> tables. Switching to udpu will then get rid of the message if it's
> annoying.

When I start the corosync process on the VM, nothing happens in the logs
except the new membership; here are the logs from two bare-metal hosts
(nebula3 is the DC):

#+begin_src
Nov 14 08:29:29 nebula1 corosync[4242]:   [TOTEM ] A new membership (192.168.231.70:80696) was formed. Members joined: 108489
Nov 14 08:29:29 nebula1 corosync[4242]:   [QUORUM] Members[5]: 1084811078 1084811079 1084811080 108488 108489
Nov 14 08:29:29 nebula1 pacemakerd[4251]:   notice: crm_update_peer_state: pcmk_quorum_notification: Node one-frontend[108489] - state is now member (was lost)
Nov 14 08:29:29 nebula1 crmd[4258]:   notice: crm_update_peer_state: pcmk_quorum_notification: Node one-frontend[108489] - state is now member (was lost)
Nov 14 08:29:29 nebula1 corosync[4242]:   [MAIN  ] Completed service synchronization, ready to provide service.
#+end_src

#+begin_src
Nov 14 08:29:29 nebula3 corosync[5345]:   [TOTEM ] A new membership (192.168.231.70:80696) was formed. Members joined: 108489
Nov 14 08:29:29 nebula3 corosync[5345]:   [QUORUM] Members[5]: 1084811078 1084811079 1084811080 108488 108489
Nov 14 08:29:29 nebula3 corosync[5345]:   [MAIN  ] Completed service synchronization, ready to provide service.
Nov 14 08:29:29 nebula3 crmd[5423]:   notice: crm_update_peer_state: pcmk_quorum_notification: Node one-frontend[108489] - state is now member (was lost)
Nov 14 08:29:29 nebula3 pacemakerd[5416]:   notice: crm_update_peer_state: pcmk_quorum_notification: Node one-frontend[108489] - state is now member (was lost)
#+end_src

I let the corosync process run for some time and then started pacemaker;
the log messages below begin when pacemaker starts:

#+begin_src
Nov 14 09:26:30 nebula1 corosync[4242]:   [TOTEM ] Retransmit List: 24
Nov 14 09:26:30 nebula1 corosync[4242]: message repeated 8 times: [   [TOTEM ] Retransmit List: 24 ]
Nov 14 09:26:30 nebula1 corosync[4242]:   [TOTEM ] Retransmit List: 24 25 26
Nov 14 09:26:46 nebula1 corosync[4242]: message repeated 101 times: [   [TOTEM ] Retransmit List: 24 25 26 ]
Nov 14 09:26:46 nebula1 corosync[4242]:   [TOTEM ] Retransmit List: 24 25 26 28 29
Nov 14 09:26:46 nebula1 corosync[4242]:   [TOTEM ] Retransmit List: 24 25 26 28 29 2a 2b
Nov 14 09:26:46 nebula1 corosync[4242]: message repeated 44 times: [   [TOTEM ] Retransmit List: 24 25 26 28 29 2a 2b ]
Nov 14 09:26:46 nebula1 corosync[4242]:   [TOTEM ] Retransmit List: 24 25 26 28 29 2a 2b 34 35 36
Nov 14 09:26:46 nebula1 corosync[4242]:   [TOTEM ] Retransmit List: 24 25 26 28 29 2a 2b 34 35 36 38
Nov 14 09:26:47 nebula1 corosync[4242]: message repeated 35 times: [   [TOTEM ] Retransmit List: 24 25 26 28 29 2a 2b 34 35 36 38 ]
Nov 14 09:26:47 nebula1 corosync[4242]:   [TOTEM ] Retransmit List: 24 25 26 28 29 2a 2b 34 35 36 38 3a
Nov 14 09:26:47 nebula1 corosync[4242]:   [TOTEM ] Retransmit List: 24 25 26 28 29 2a 2b 34 35 36 38 3a 3b 3c
Nov 14 09:26:48 nebula1 corosync[4242]: message repeated 43 times: [   [TOTEM ] Retransmit List: 24 25 26 28 29 2a 2b 34 35 36 38 3a 3b 3c ]
Nov 14 09:26:48 nebula1 corosync[4242]:   [TOTEM ] Retransmit List: 24 25 26 28 29 2a 2b 34 35 36 38 3a 3b 3c 3f 40
Nov 14 09:26:48 nebula1 corosync[4242]:   [TOTEM ] Retransmit List: 24 25 26 28 29 2a 2b 34 35 36 38 3a 3b 3c 3f 40 41 42
Nov 14 09:27:08 nebula1 crmd[4258]:   notice: do_state_transition: State transition S_NOT_DC -> S_ELECTION [ input=I_ELECTION cause=C_FSA_INTERNAL origin=do_election_count_vote ]
Nov 14 09:27:08 nebula1 crmd[4258]:   notice: do_state_transition: State transition S_ELECTION -> S_PENDING [ input=I_PENDING cause=C_FSA_INTERNAL origin=do_election_count_vote ]
Nov 14 09:27:08 nebula1 corosync[4242]: message repeated 119 times: [   [TOTEM ] Retransmit List: 24 25 26 28 29 2a 2b 34 35 36 38 3a 3b 3c 3f 40 41 42 ]
Nov 14 09:27:08 nebula1 corosync[4242]:   [TOTEM ] Retransmit List: 24 25 26 28 29 2a 2b 34 35 36 38 3a 3b 3c 3f 40 41 42 44 45 46
Nov 14 09:27:08 nebula1 corosync[4242]:   [TOTEM ] Retransmit List: 24 25 26 28 29 2a 2b 34 35 36 38 3a 3b 3c 3f 40 41 42 44 45 46 47 48 49 4a 4b 4c 4d 4e
Nov 14 09:27:08 nebula1 corosync[4242]:   [TOTEM ] Retransmit List: 24 25 26 28 29 2a 2b 34 35 36 38 3a 3b 3c 3f 40 41 42 44 45 46 47 48 49 4a 4b 4c 4d 4e 4f
Nov 14 09:27:28 nebula1 crmd[4258]:  warning: do_log: FSA: Input I_DC_TIMEOUT from crm_timer_popped() received in state S_PENDING
Nov 14 09:27:28 nebula1 crmd[4258]:   notice: do_state_transition: State transition S_ELECTION -> S_PENDING [ input=I_PENDING cause=C_FSA_INTERNAL origin=do_election_count_vote ]
Nov 14 09:27:48 nebula1 crmd[4258]:  warning: do_log: FSA: Input I_DC_TIMEOUT from crm_timer_popped() received in state S_PENDING
Nov 14 09:27:48 nebula1 crmd[4258]:   notice: do_state

Re: [Pacemaker] TOTEM Retransmit list in logs when a node gets up

2014-11-14 Thread Christine Caulfield

On 13/11/14 17:54, Daniel Dehennin wrote:

Digimer  writes:


This generally happens if the network is slow or congested. It is
corosync saying it needs to resend some messages. It is not uncommon
for it to happen now and then, but that is a fairly large amount of
retransmits.


Thanks for the explanation.


Is your network slow or saturated often? It might be that the traffic
from the join is enough to push a congested network to the edge.


Not really:

- two physical hosts with:
   + one 1Gb/s network card for OS (corosync network)
   + three 1Gb/s network cards in LACP bonding included in an Open
 vSwitch

- one physical host with:
   + one 10Gb/s network card for the OS (corosync network)
   + three 10Gb/s network cards in LACP bonding included in an Open
 vSwitch

- one KVM guest (quorum node) with:
   + one virtio card (corosync network)

- one KVM guest with:
   + one virtio card for service (OpenNebula web frontend)
   + one virtio card for corosync communications

With tcpdump I can see packets flying around on all nodes, but it looks
like there is something wrong with my two-card KVM guest: when I start
pacemaker on it, I begin to see Retransmit messages in the other nodes' logs.

Is there a way to know which node is responsible for the resending of
these messages?



If it's only happening at startup it could be the switch/router learning 
the ports for the nodes and building its routing tables. Switching to 
udpu will then get rid of the message if it's annoying.
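
[Editor's note: for anyone landing here later, a minimal corosync 2.x sketch
of that udpu switch might look like the following. A nodelist is required
with udpu; the addresses follow the 192.168.231.0/24 network visible in the
logs in this thread, but the nodeids and exact values are hypothetical.]

#+begin_src
totem {
    version: 2
    transport: udpu              # unicast UDP instead of multicast
    interface {
        ringnumber: 0
        bindnetaddr: 192.168.231.0   # corosync network (from the logs)
        mcastport: 5405              # port base, still used with udpu
    }
}

nodelist {
    node {
        ring0_addr: 192.168.231.70   # hypothetical node addresses
        nodeid: 1
    }
    node {
        ring0_addr: 192.168.231.71
        nodeid: 2
    }
}
#+end_src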


Chrissie






Re: [Pacemaker] TOTEM Retransmit list in logs when a node gets up

2014-11-13 Thread Daniel Dehennin
Digimer  writes:

> This generally happens if the network is slow or congested. It is
> corosync saying it needs to resend some messages. It is not uncommon
> for it to happen now and then, but that is a fairly large amount of
> retransmits.

Thanks for the explanation.

> Is your network slow or saturated often? It might be that the traffic
> from the join is enough to push a congested network to the edge.

Not really:

- two physical hosts with:
  + one 1Gb/s network card for OS (corosync network)
  + three 1Gb/s network cards in LACP bonding included in an Open
vSwitch

- one physical host with:
  + one 10Gb/s network card for the OS (corosync network)
  + three 10Gb/s network cards in LACP bonding included in an Open
vSwitch
  
- one KVM guest (quorum node) with:
  + one virtio card (corosync network)

- one KVM guest with:
  + one virtio card for service (OpenNebula web frontend)
  + one virtio card for corosync communications

With tcpdump I can see packets flying around on all nodes, but it looks
like there is something wrong with my two-card KVM guest: when I start
pacemaker on it, I begin to see Retransmit messages in the other nodes' logs.

Is there a way to know which node is responsible for the resending of
these messages?
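
[Editor's note: a capture along these lines, run on each node, can show which
host the retransmitted datagrams originate from. The interface name and the
conventional corosync default port 5405 are assumptions:]

#+begin_src
# Run on each node and compare who is (re)sending on the corosync port.
# eth0 and udp port 5405 (the usual corosync default) are assumptions.
tcpdump -ni eth0 'udp port 5405'
#+end_src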

Regards.
-- 
Daniel Dehennin
Retrieve my GPG key: gpg --recv-keys 0xCC1E9E5B7A6FE2DF
Fingerprint: 3E69 014E 5C23 50E8 9ED6  2AAD CC1E 9E5B 7A6F E2DF




Re: [Pacemaker] TOTEM Retransmit list in logs when a node gets up

2014-11-13 Thread Digimer
This generally happens if the network is slow or congested. It is 
corosync saying it needs to resend some messages. It is not uncommon for 
it to happen now and then, but that is a fairly large amount of retransmits.


Is your network slow or saturated often? It might be that the traffic 
from the join is enough to push a congested network to the edge.
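
[Editor's note: a quick way to see whether corosync itself considers the ring
healthy is the standard corosync-cfgtool status query; the exact output format
varies by version:]

#+begin_src
# Print the local node's ring status; "no faults" means corosync
# currently sees the ring as healthy on this node.
corosync-cfgtool -s
#+end_src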


On 13/11/14 08:07 AM, Daniel Dehennin wrote:

Hello,

My cluster seems to work correctly, but when I start corosync and
pacemaker on one of the nodes[1] I start to see TOTEM logs like this:

#+begin_src
Nov 13 14:00:10 nebula3 corosync[5345]:   [TOTEM ] Retransmit List: 4e 4f
Nov 13 14:00:10 nebula3 corosync[5345]:   [TOTEM ] Retransmit List: 46 47 48 49 4a 4b 4c 4d 4e 4f
Nov 13 14:00:10 nebula3 corosync[5345]:   [TOTEM ] Retransmit List: 4b 4c 4d 4e 4f
Nov 13 14:00:30 nebula3 corosync[5345]:   [TOTEM ] Retransmit List: 47 48 49 4a 4b 4c 4d 4e 4f
Nov 13 14:00:30 nebula3 corosync[5345]:   [TOTEM ] Retransmit List: 47 48 49 4a 4b 4c 4d 4e 4f
Nov 13 14:00:30 nebula3 corosync[5345]:   [TOTEM ] Retransmit List: 4e 4f
Nov 13 14:00:30 nebula3 corosync[5345]:   [TOTEM ] Retransmit List: 4e 4f
Nov 13 14:00:30 nebula3 corosync[5345]:   [TOTEM ] Retransmit List: 4b 4c 4d 4e 4f
Nov 13 14:00:30 nebula3 corosync[5345]:   [TOTEM ] Retransmit List: 4e 4f
Nov 13 14:00:35 nebula3 corosync[5345]:   [TOTEM ] Retransmit List: 4c 4d 4e 4f
Nov 13 14:00:35 nebula3 corosync[5345]:   [TOTEM ] Retransmit List: 4c 4d 4e 4f
Nov 13 14:00:35 nebula3 corosync[5345]:   [TOTEM ] Retransmit List: 4d 4e 4f
Nov 13 14:00:35 nebula3 corosync[5345]:   [TOTEM ] Retransmit List: 47 48 49 4a 4b 4c 4d 4e 4f
Nov 13 14:00:35 nebula3 corosync[5345]:   [TOTEM ] Retransmit List: 4a 4b 4c 4d 4e 4f
Nov 13 14:00:35 nebula3 corosync[5345]:   [TOTEM ] Retransmit List: 4b 4c 4d 4e 4f
#+end_src

I do not understand what is happening; do you have any hints?

Regards.

Footnotes:
[1]  the VM using two cards 
http://oss.clusterlabs.org/pipermail/pacemaker/2014-November/022962.html







--
Digimer
Papers and Projects: https://alteeve.ca/w/
What if the cure for cancer is trapped in the mind of a person without 
access to education?




[Pacemaker] TOTEM Retransmit list in logs when a node gets up

2014-11-13 Thread Daniel Dehennin
Hello,

My cluster seems to work correctly, but when I start corosync and
pacemaker on one of the nodes[1] I start to see TOTEM logs like this:

#+begin_src
Nov 13 14:00:10 nebula3 corosync[5345]:   [TOTEM ] Retransmit List: 4e 4f
Nov 13 14:00:10 nebula3 corosync[5345]:   [TOTEM ] Retransmit List: 46 47 48 49 4a 4b 4c 4d 4e 4f
Nov 13 14:00:10 nebula3 corosync[5345]:   [TOTEM ] Retransmit List: 4b 4c 4d 4e 4f
Nov 13 14:00:30 nebula3 corosync[5345]:   [TOTEM ] Retransmit List: 47 48 49 4a 4b 4c 4d 4e 4f
Nov 13 14:00:30 nebula3 corosync[5345]:   [TOTEM ] Retransmit List: 47 48 49 4a 4b 4c 4d 4e 4f
Nov 13 14:00:30 nebula3 corosync[5345]:   [TOTEM ] Retransmit List: 4e 4f
Nov 13 14:00:30 nebula3 corosync[5345]:   [TOTEM ] Retransmit List: 4e 4f
Nov 13 14:00:30 nebula3 corosync[5345]:   [TOTEM ] Retransmit List: 4b 4c 4d 4e 4f
Nov 13 14:00:30 nebula3 corosync[5345]:   [TOTEM ] Retransmit List: 4e 4f
Nov 13 14:00:35 nebula3 corosync[5345]:   [TOTEM ] Retransmit List: 4c 4d 4e 4f
Nov 13 14:00:35 nebula3 corosync[5345]:   [TOTEM ] Retransmit List: 4c 4d 4e 4f
Nov 13 14:00:35 nebula3 corosync[5345]:   [TOTEM ] Retransmit List: 4d 4e 4f
Nov 13 14:00:35 nebula3 corosync[5345]:   [TOTEM ] Retransmit List: 47 48 49 4a 4b 4c 4d 4e 4f
Nov 13 14:00:35 nebula3 corosync[5345]:   [TOTEM ] Retransmit List: 4a 4b 4c 4d 4e 4f
Nov 13 14:00:35 nebula3 corosync[5345]:   [TOTEM ] Retransmit List: 4b 4c 4d 4e 4f
#+end_src

I do not understand what is happening; do you have any hints?

Regards.

Footnotes: 
[1]  the VM using two cards 
http://oss.clusterlabs.org/pipermail/pacemaker/2014-November/022962.html

-- 
Daniel Dehennin
Retrieve my GPG key: gpg --recv-keys 0xCC1E9E5B7A6FE2DF
Fingerprint: 3E69 014E 5C23 50E8 9ED6  2AAD CC1E 9E5B 7A6F E2DF

