thanks jan. but i am using the newest rhel release and i still have the issue.
i could improve it with the following script:

#!/bin/bash
echo 1 > /sys/class/net/virbr0/bridge/multicast_querier
echo 0 > /sys/class/net/virbr0/bridge/multicast_snooping
echo "cat /sys/class/net/virbr0/bridge/multicast_snooping"
cat /sys/class/net/virbr0/bridge/multicast_snooping
echo "cat /sys/class/net/virbr0/bridge/multicast_querier"
cat /sys/class/net/virbr0/bridge/multicast_querier
echo 1 > /sys/class/net/br0/bridge/multicast_querier
echo 0 > /sys/class/net/br0/bridge/multicast_snooping
echo "cat /sys/class/net/br0/bridge/multicast_snooping"
cat /sys/class/net/br0/bridge/multicast_snooping
echo "cat /sys/class/net/br0/bridge/multicast_querier"
cat /sys/class/net/br0/bridge/multicast_querier
echo 1 > /sys/class/net/br1/bridge/multicast_querier
echo 0 > /sys/class/net/br1/bridge/multicast_snooping
echo "cat /sys/class/net/br1/bridge/multicast_snooping"
cat /sys/class/net/br1/bridge/multicast_snooping
echo "cat /sys/class/net/br1/bridge/multicast_querier"
cat /sys/class/net/br1/bridge/multicast_querier

but after a few days the cluster fences the other node -> network failure....

info: ais_mark_unseen_peer_dead: Node .com was not seen in the previous transition
Mar 04 19:23:33 corosync [pcmk ] info: update_member: Node 352321546/u.com is now: lost
Mar 04 19:23:33 corosync [pcmk ] info: send_member_notification: Sending membership update 780 to 2 children
2014-02-17 10:17 GMT+01:00 Jan Friesse <jfrie...@redhat.com>:

> Beo,
> this looks like a known (and already fixed) problem in the kernel. Take a look
> at https://bugzilla.redhat.com/show_bug.cgi?id=880035 and especially
> comment 21. A kernel update helped that time.
>
> Honza
>
> Beo Banks wrote:
>
>> hi stefan,
>>
>> it seems more stable now, but after 2 minutes the issue is back again.
>> hopefully it isn't a bug, because i can reproduce it:
>> node2 receives only unicast after sequence 256...
>>
>> node1
>>
>> omping 10.0.0.22 10.0.0.21
>>
>> 10.0.0.22 : unicast, seq=257, size=69 bytes, dist=0, time=0.666ms
>> 10.0.0.22 : multicast, seq=257, size=69 bytes, dist=0, time=0.677ms
>> 10.0.0.22 : unicast, seq=258, size=69 bytes, dist=0, time=0.600ms
>> 10.0.0.22 : multicast, seq=258, size=69 bytes, dist=0, time=0.610ms
>> 10.0.0.22 : unicast, seq=259, size=69 bytes, dist=0, time=0.693ms
>> 10.0.0.22 : multicast, seq=259, size=69 bytes, dist=0, time=0.702ms
>> 10.0.0.22 : unicast, seq=260, size=69 bytes, dist=0, time=0.674ms
>> 10.0.0.22 : multicast, seq=260, size=69 bytes, dist=0, time=0.685ms
>> 10.0.0.22 : unicast, seq=261, size=69 bytes, dist=0, time=0.658ms
>> 10.0.0.22 : multicast, seq=261, size=69 bytes, dist=0, time=0.669ms
>> 10.0.0.22 : unicast, seq=262, size=69 bytes, dist=0, time=0.834ms
>> 10.0.0.22 : multicast, seq=262, size=69 bytes, dist=0, time=0.845ms
>> 10.0.0.22 : unicast, seq=263, size=69 bytes, dist=0, time=0.666ms
>> 10.0.0.22 : multicast, seq=263, size=69 bytes, dist=0, time=0.677ms
>> 10.0.0.22 : unicast, seq=264, size=69 bytes, dist=0, time=0.675ms
>> 10.0.0.22 : multicast, seq=264, size=69 bytes, dist=0, time=0.687ms
>> 10.0.0.22 : waiting for response msg
>> 10.0.0.22 : server told us to stop
>> ^C
>> 10.0.0.22 : unicast, xmt/rcv/%loss = 264/264/0%, min/avg/max/std-dev = 0.542/0.663/0.860/0.035
>> 10.0.0.22 : multicast, xmt/rcv/%loss = 264/264/0%, min/avg/max/std-dev = 0.553/0.675/0.876/0.035
>>
>> node2:
>>
>> 10.0.0.21 : multicast, seq=251, size=69 bytes, dist=0, time=0.703ms
>> 10.0.0.21 : unicast, seq=252, size=69 bytes, dist=0, time=0.714ms
>> 10.0.0.21 : multicast, seq=252, size=69 bytes, dist=0, time=0.725ms
>> 10.0.0.21 : unicast, seq=253, size=69 bytes, dist=0, time=0.662ms
>> 10.0.0.21 : multicast, seq=253, size=69 bytes, dist=0, time=0.672ms
>> 10.0.0.21 : unicast, seq=254, size=69 bytes, dist=0, time=0.662ms
>> 10.0.0.21 : multicast, seq=254, size=69 bytes, dist=0, time=0.673ms
>> 10.0.0.21 : unicast, seq=255, size=69 bytes, dist=0, time=0.668ms
>> 10.0.0.21 : multicast, seq=255, size=69 bytes, dist=0, time=0.679ms
>> 10.0.0.21 : unicast, seq=256, size=69 bytes, dist=0, time=0.674ms
>> 10.0.0.21 : multicast, seq=256, size=69 bytes, dist=0, time=0.687ms
>> 10.0.0.21 : unicast, seq=257, size=69 bytes, dist=0, time=0.618ms
>> 10.0.0.21 : unicast, seq=258, size=69 bytes, dist=0, time=0.659ms
>> 10.0.0.21 : unicast, seq=259, size=69 bytes, dist=0, time=0.705ms
>> 10.0.0.21 : unicast, seq=260, size=69 bytes, dist=0, time=0.682ms
>> 10.0.0.21 : unicast, seq=261, size=69 bytes, dist=0, time=0.760ms
>> 10.0.0.21 : unicast, seq=262, size=69 bytes, dist=0, time=0.665ms
>> 10.0.0.21 : unicast, seq=263, size=69 bytes, dist=0, time=0.711ms
>> ^C
>> 10.0.0.21 : unicast, xmt/rcv/%loss = 263/263/0%, min/avg/max/std-dev = 0.539/0.661/0.772/0.037
>> 10.0.0.21 : multicast, xmt/rcv/%loss = 263/256/2%, min/avg/max/std-dev = 0.583/0.674/0.786/0.033
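(annotation: a short interactive run like the one above only barely catches the point where multicast stops. A longer, fixed-length run makes the drop easier to see and gives a final loss summary; just a sketch, assuming the stock omping package and the same test addresses, run on both nodes at the same time:

omping -c 600 -i 1 -q 10.0.0.21 10.0.0.22
# -c 600 : send 600 probes, roughly 10 minutes at one per second
# -i 1   : one probe per second
# -q     : only print the final unicast/multicast loss statistics
)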
>> 2014-02-14 9:59 GMT+01:00 Stefan Bauer <stefan.ba...@cubewerk.de>:
>>
>>> you have to disable all offloading features (rx, tx, tso...)
>>>
>>> Kind regards,
>>>
>>> Stefan Bauer
>>> --
>>> Cubewerk GmbH
>>> Herzog-Otto-Straße 32
>>> 83308 Trostberg
>>> 08621 - 99 60 237
>>> HRB 22195 AG Traunstein
>>> GF Stefan Bauer
>>>
>>> On 14.02.2014 at 09:40, "Beo Banks" <beo.ba...@googlemail.com> wrote:
>>>
>>> ethtool -K eth0 tx off
>>> ethtool -K eth1 tx off
>>>
>>> same result... retransmit issue
>>>
>>> 2014-02-14 9:31 GMT+01:00 Beo Banks <beo.ba...@googlemail.com>:
>>>
>>>> i have also tried
>>>> "No more delay when you disable multicast snooping on the host:"
>>>>
>>>> echo 0 > /sys/devices/virtual/net/br1/bridge/multicast_router
>>>> echo 0 > /sys/devices/virtual/net/br1/bridge/multicast_snooping
>>>>
>>>> 2014-02-14 9:28 GMT+01:00 Beo Banks <beo.ba...@googlemail.com>:
>>>>
>>>>> @jan and stefan
>>>>>
>>>>> must i set it for both bridges
>>>>> eth1 (br1) eth0 (br0) on the host or the guest?
>>>>>
>>>>> 2014-02-14 9:06 GMT+01:00 Jan Friesse <jfrie...@redhat.com>:
>>>>>
>>>>>> Beo,
>>>>>> are you experiencing cluster splits? If the answer is no, then you don't
>>>>>> need to do anything; maybe the network buffer is just filled. But if the
>>>>>> answer is yes, try reducing the MTU size (netmtu in the configuration) to
>>>>>> a value like 1000.
>>>>>>
>>>>>> Regards,
>>>>>> Honza
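(for reference: with the plugin-based corosync 1.x stack that the [pcmk] log entries suggest, netmtu is set in the totem section of /etc/corosync/corosync.conf. A minimal sketch, everything except the netmtu line is only a placeholder for whatever your config already contains, and corosync has to be restarted on all nodes afterwards:

totem {
    version: 2
    # keep your existing interface/transport settings here
    netmtu: 1000
}
)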
>>>>>>
>>>>>> Beo Banks wrote:
>>>>>>
>>>>>>> Hi,
>>>>>>>
>>>>>>> i have a fresh 2-node cluster (kvm host1 -> guest = nodeA | kvm host2 ->
>>>>>>> guest = NodeB) and it seems to work, but from time to time i have a lot
>>>>>>> of errors like
>>>>>>>
>>>>>>> Feb 13 13:41:04 corosync [TOTEM ] Retransmit List: 196 198 184 185 186 187
>>>>>>> 188 189 18a 18b 18c 18d 18e 18f 190 191 192 193 194 195 197 199
>>>>>>> Feb 13 13:41:04 corosync [TOTEM ] Retransmit List: 197 199 184 185 186 187
>>>>>>> 188 189 18a 18b 18c 18d 18e 18f 190 191 192 193 194 195 196 198
>>>>>>> Feb 13 13:41:04 corosync [TOTEM ] Retransmit List: 196 198 184 185 186 187
>>>>>>> 188 189 18a 18b 18c 18d 18e 18f 190 191 192 193 194 195 197 199
>>>>>>> Feb 13 13:41:04 corosync [TOTEM ] Retransmit List: 197 199 184 185 186 187
>>>>>>> 188 189 18a 18b 18c 18d 18e 18f 190 191 192 193 194 195 196 198
>>>>>>> Feb 13 13:41:04 corosync [TOTEM ] Retransmit List: 196 198 184 185 186 187
>>>>>>> 188 189 18a 18b 18c 18d 18e 18f 190 191 192 193 194 195 197 199
>>>>>>> Feb 13 13:41:04 corosync [TOTEM ] Retransmit List: 197 199 184 185 186 187
>>>>>>> 188 189 18a 18b 18c 18d 18e 18f 190 191 192 193 194 195 196 198
>>>>>>>
>>>>>>> i used the newest rhel 6.5 version.
>>>>>>>
>>>>>>> i have also already tried to solve the issue with
>>>>>>> echo 1 > /sys/class/net/virbr0/bridge/multicast_querier (host system)
>>>>>>> but no luck...
>>>>>>>
>>>>>>> i have disabled iptables and selinux... same issue.
>>>>>>>
>>>>>>> how can i solve it?
>>>>>>>
>>>>>>> thanks, beo
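(one more data point that might help others hitting this: whether multicast_querier=1 is actually doing anything can be checked by watching for IGMP membership queries on the bridge. Just a sketch, assuming the bridge is br0 and tcpdump is available on the host:

tcpdump -i br0 -nn igmp
# with multicast_querier=1 you should see periodic IGMP membership queries
# coming from the bridge; if no queries appear at all, snooping may age out
# the group memberships and multicast (and corosync with it) can stop
# working after a few minutes
)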
_______________________________________________
Pacemaker mailing list: Pacemaker@oss.clusterlabs.org
http://oss.clusterlabs.org/mailman/listinfo/pacemaker

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org