Hi Honza,
I carried out the tests that you suggested.
If there is a problem with how I ran a test, please point it out.
(Test1)
- Block communication on local nodes via iptables (so drop all UDP
traffic, something like "iptables -A INPUT ! -i lo -p udp -j DROP &&
iptables -A OUTPUT ! -o lo -p udp -j DROP") - and then remove this
rules, does corosync create membership correctly?
(Result1 - OK : corosync creates the membership correctly)
* node1
[root@bl460g6a ~]# corosync-cmapctl | grep joined
runtime.totem.pg.mrp.srp.members.157657280.status (str) = joined
runtime.totem.pg.mrp.srp.members.174434496.status (str) = joined
runtime.totem.pg.mrp.srp.members.191211712.status (str) = joined
[root@bl460g6a ~]# iptables -A INPUT ! -i lo -p udp -j DROP && iptables -A OUTPUT ! -o lo -p udp -j DROP
[root@bl460g6a ~]# corosync-cmapctl | grep joined
runtime.totem.pg.mrp.srp.members.157657280.status (str) = joined -----> Lone state.
[root@bl460g6a ~]# iptables -F
[root@bl460g6a ~]# corosync-cmapctl | grep joined
runtime.totem.pg.mrp.srp.members.157657280.status (str) = joined -----> Reconfiguration state.
runtime.totem.pg.mrp.srp.members.174434496.status (str) = joined
runtime.totem.pg.mrp.srp.members.191211712.status (str) = joined
* node2
[root@bl460g6b ~]# corosync-cmapctl | grep joined
runtime.totem.pg.mrp.srp.members.157657280.status (str) = joined
runtime.totem.pg.mrp.srp.members.174434496.status (str) = joined
runtime.totem.pg.mrp.srp.members.191211712.status (str) = joined
[root@bl460g6b ~]# iptables -A INPUT ! -i lo -p udp -j DROP && iptables -A OUTPUT ! -o lo -p udp -j DROP
[root@bl460g6b ~]# corosync-cmapctl | grep joined
runtime.totem.pg.mrp.srp.members.174434496.status (str) = joined -----> Lone state.
[root@bl460g6b ~]# iptables -F
[root@bl460g6b ~]# corosync-cmapctl | grep joined
runtime.totem.pg.mrp.srp.members.157657280.status (str) = joined -----> Reconfiguration state.
runtime.totem.pg.mrp.srp.members.174434496.status (str) = joined
runtime.totem.pg.mrp.srp.members.191211712.status (str) = joined
* node3
[root@bl460g6c ~]# corosync-cmapctl | grep joined
runtime.totem.pg.mrp.srp.members.157657280.status (str) = joined
runtime.totem.pg.mrp.srp.members.174434496.status (str) = joined
runtime.totem.pg.mrp.srp.members.191211712.status (str) = joined
[root@bl460g6c ~]# iptables -A INPUT ! -i lo -p udp -j DROP && iptables -A OUTPUT ! -o lo -p udp -j DROP
[root@bl460g6c ~]# corosync-cmapctl | grep joined
runtime.totem.pg.mrp.srp.members.191211712.status (str) = joined -----> Lone state.
[root@bl460g6c ~]# iptables -F
[root@bl460g6c ~]# corosync-cmapctl | grep joined
runtime.totem.pg.mrp.srp.members.157657280.status (str) = joined -----> Reconfiguration state.
runtime.totem.pg.mrp.srp.members.174434496.status (str) = joined
runtime.totem.pg.mrp.srp.members.191211712.status (str) = joined
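The manual check used in Test1 could be automated roughly as follows. This is only a sketch: the function names count_joined and wait_for_members are hypothetical, and the 60-second default timeout is an assumption. It counts the "joined" lines of corosync-cmapctl in the same way as the grep above.

```shell
# Count the members reported as "joined" in corosync-cmapctl output (read from stdin).
count_joined() {
    # grep -c exits non-zero when the count is 0, so ignore its exit status.
    grep -c '\.status (str) = joined' || true
}

# Poll corosync-cmapctl until the expected number of members has joined.
# $1 = expected member count, $2 = timeout in seconds (default 60, an assumption).
wait_for_members() {
    expected=$1
    timeout=${2:-60}
    elapsed=0
    while [ "$elapsed" -lt "$timeout" ]; do
        joined=$(corosync-cmapctl | count_joined)
        if [ "$joined" -eq "$expected" ]; then
            return 0
        fi
        sleep 1
        elapsed=$((elapsed + 1))
    done
    return 1
}
```

For this three-node cluster, running "wait_for_members 3" after "iptables -F" would return 0 once all three nodes are listed as joined again, and 1 if that does not happen before the timeout.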
(Test2)
- Unplug cables (please make sure to NOT configure network via
networkmanager. Networkmanager does ifdown and corosync doesn't work
correctly with ifdown). Then plug cables again. Is membership
reconstructed correctly?
(Result2 - OK : corosync creates the membership correctly)
* node1
[root@bl460g6a ~]# chkconfig --list NetworkManager
NetworkManager 0:off 1:off 2:off 3:off 4:off 5:off 6:off
[root@bl460g6a ~]# corosync-cmapctl | grep joined
runtime.totem.pg.mrp.srp.members.157657280.status (str) = joined
runtime.totem.pg.mrp.srp.members.174434496.status (str) = joined
runtime.totem.pg.mrp.srp.members.191211712.status (str) = joined
[root@bl460g6a ~]# corosync-cmapctl | grep joined
runtime.totem.pg.mrp.srp.members.157657280.status (str) = joined
runtime.totem.pg.mrp.srp.members.174434496.status (str) = joined
runtime.totem.pg.mrp.srp.members.191211712.status (str) = joined
[root@bl460g6a ~]# corosync-cmapctl | grep joined
runtime.totem.pg.mrp.srp.members.157657280.status (str) = joined -----> Lone state.
[root@bl460g6a ~]# corosync-cmapctl | grep joined
runtime.totem.pg.mrp.srp.members.157657280.status (str) = joined -----> Reconfiguration state.
runtime.totem.pg.mrp.srp.members.174434496.status (str) = joined
runtime.totem.pg.mrp.srp.members.191211712.status (str) = joined
* node2
[root@bl460g6b ~]# chkconfig --list NetworkManager
NetworkManager 0:off 1:off 2:off 3:off 4:off 5:off 6:off
[root@bl460g6b ~]# corosync-cmapctl | grep joined
runtime.totem.pg.mrp.srp.members.157657280.status (str) = joined
runtime.totem.pg.mrp.srp.members.174434496.status (str) = joined
runtime.totem.pg.mrp.srp.members.191211712.status (str) = joined
[root@bl460g6b ~]# corosync-cmapctl | grep joined
runtime.totem.pg.mrp.srp.members.174434496.status (str) = joined -----> Lone state.
[root@bl460g6b ~]# corosync-cmapctl | grep joined
runtime.totem.pg.mrp.srp.members.157657280.status (str) = joined -----> Reconfiguration state.
runtime.totem.pg.mrp.srp.members.174434496.status (str) = joined
runtime.totem.pg.mrp.srp.members.191211712.status (str) = joined
* node3
[root@bl460g6c ~]# chkconfig --list NetworkManager
NetworkManager 0:off 1:off 2:off 3:off 4:off 5:off 6:off
[root@bl460g6c ~]# corosync-cmapctl | grep joined
runtime.totem.pg.mrp.srp.members.157657280.status (str) = joined
runtime.totem.pg.mrp.srp.members.174434496.status (str) = joined
runtime.totem.pg.mrp.srp.members.191211712.status (str) = joined
[root@bl460g6c ~]# corosync-cmapctl | grep joined
runtime.totem.pg.mrp.srp.members.191211712.status (str) = joined -----> Lone state.
[root@bl460g6c ~]# corosync-cmapctl | grep joined
runtime.totem.pg.mrp.srp.members.157657280.status (str) = joined -----> Reconfiguration state.
runtime.totem.pg.mrp.srp.members.174434496.status (str) = joined
runtime.totem.pg.mrp.srp.members.191211712.status (str) = joined
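A note on reading the member IDs in this output: they appear to be IPv4 addresses stored as 32-bit integers, least-significant byte first. 157657280 decodes to 192.168.101.9, which matches the eth1 address of bl460g6a in the "ip addr show" output quoted further down in this thread. That corosync derives the ID from the ring0 address when no nodeid is configured is an assumption here, and the function name nodeid_to_ip is hypothetical.

```shell
# Convert a corosync member ID to dotted-quad form, assuming the ID is an
# IPv4 address stored least-significant byte first.
nodeid_to_ip() {
    printf '%d.%d.%d.%d\n' \
        $(( $1 & 255 )) \
        $(( ($1 >> 8) & 255 )) \
        $(( ($1 >> 16) & 255 )) \
        $(( ($1 >> 24) & 255 ))
}
```

"nodeid_to_ip 157657280" prints 192.168.101.9. By the same arithmetic, 174434496 and 191211712 decode to 192.168.101.10 and 192.168.101.11.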
> If result of both of test cases is correct membership then problem is in
> switch. If so, you can try either corosync UDPU mode (it's slightly
> slower, but as long as GFS is not used, it's acceptable, especially for
> 3 nodes environment) or you can try change switch configuration.
Really? Is the problem in the switch?
I think that whether the phenomenon occurs depends on how the corosync network
is cut.
I do not think that it is a problem of the switch (SW).
The network cut that I reported is as follows.
* An X marks a cut link.
   -------------------------------
   |             SW1             |
   -------------------------------
     |            |            |
     X            |            |
     |            |            |
------------ ------------ ------------
|  node1   | |  node2   | |  node3   |
------------ ------------ ------------
     |            |            |
     |            X            |
     |            |            |
   -------------------------------
   |             SW2             |
   -------------------------------
* Through SW1, node3 can communicate with node2.
* Through SW2, node3 can communicate with node1.
With this kind of failure, the corosync control messages of the two networks
cross each other. Does that not cause a problem?
Could it not be the reason why the cluster cannot be formed?
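The reachability described above could be confirmed per interface with a small probe such as the following. This is a sketch under assumptions: probe_ring and the PING override variable are hypothetical names, and the peer addresses in the usage note are guesses based on the 192.168.101.0/24 (eth1) and 192.168.102.0/24 (eth2) networks. It uses ICMP, so it only shows link-level reachability and says nothing about UDP multicast delivery.

```shell
# Report which peers answer on one interface.
# $1 = interface name, remaining arguments = peer addresses.
# PING can be set to substitute another probe command (hypothetical hook).
probe_ring() {
    iface=$1
    shift
    for peer in "$@"; do
        # One echo request, one-second wait, bound to the given interface.
        if ${PING:-ping} -c 1 -W 1 -I "$iface" "$peer" >/dev/null 2>&1; then
            echo "$iface $peer reachable"
        else
            echo "$iface $peer unreachable"
        fi
    done
}
```

On node3, for example, "probe_ring eth1 192.168.101.9 192.168.101.10" and "probe_ring eth2 192.168.102.9 192.168.102.10" (assumed addresses) would be expected to show a different unreachable peer on each interface if the cut is as drawn.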
Best Regards,
Hideo Yamauchi.
--- On Thu, 2013/6/13, [email protected] <[email protected]>
wrote:
> Hi Honza,
>
> Thank you for comment.
> I try the test that you suggested and report a result.
>
> Many Thanks!
> Hideo Yamauchi.
>
> --- On Wed, 2013/6/12, Jan Friesse <[email protected]> wrote:
>
> > Hideo,
> > can you please try to test following things:
> >
> > - Block communication on local nodes via iptables (so drop all UDP
> > traffic, something like "iptables -A INPUT ! -i lo -p udp -j DROP &&
> > iptables -A OUTPUT ! -o lo -p udp -j DROP") - and then remove this
> > rules, does corosync create membership correctly?
> > - Unplug cables (please make sure to NOT configure network via
> > networkmanager. Networkmanager does ifdown and corosync doesn't work
> > correctly with ifdown). Then plug cables again. Is membership
> > reconstructed correctly?
> >
> > If result of both of test cases is correct membership then problem is in
> > switch. If so, you can try either corosync UDPU mode (it's slightly
> > slower, but as long as GFS is not used, it's acceptable, especially for
> > 3 nodes environment) or you can try change switch configuration.
> >
> > Regards,
> > Honza
> >
> > [email protected] napsal(a):
> > > Hi Honza,
> > >
> > > Thank you for comments.
> > >
> > >> can you please tell me exact reproducer for physical hw? (because brctl
> > >> delif is I believe not valid in hw at all).
> > >
> > > The physical environment in which I reproduced the problem for my second
> > > report is as follows.
> > >
> > > -------------------------
> > > Enclosure : BladeSystem c7000 Enclosure
> > > node1, node2, node3 : HP ProLiant BL460c G6(CPU:Xeon E5540,Mem:16G) --- Blade
> > > NIC:Flex-10 Embedded Ethernet x 1(2Port)
> > > NIC:NC325m Quad Port 1Gb NIC for c-Class BladeSystem(4Port)
> > > SW : GbE2c Ethernet Blade Switch x 6
> > > -------------------------
> > >
> > > In addition, I cut the interfaces at the switch.
> > > * In the second report, I did not run the brctl command.
> > >
> > > Do you need more detailed HW information?
> > > If so, I will send it.
> > >
> > > Best Regards,
> > > Hideo Yamauchi.
> > >
> > >
> > > --- On Wed, 2013/6/12, Jan Friesse <[email protected]> wrote:
> > >
> > >> Hideo,
> > >> can you please tell me exact reproducer for physical hw? (because brctl
> > >> delif is I believe not valid in hw at all).
> > >>
> > >> Thanks,
> > >> Honza
> > >>
> > >> [email protected] napsal(a):
> > >>> Hi Fabio,
> > >>>
> > >>> Thank you for comment.
> > >>>
> > >>>> I'll let Honza look at it, I don't have enough physical hardware to
> > >>>> reproduce.
> > >>>
> > >>> All right.
> > >>>
> > >>> Many Thanks!
> > >>> Hideo Yamauchi.
> > >>>
> > >>>
> > >>> --- On Tue, 2013/6/11, Fabio M. Di Nitto <[email protected]> wrote:
> > >>>
> > >>>> Hi Yamauchi-san,
> > >>>>
> > >>>> I'll let Honza look at it, I don't have enough physical hardware to
> > >>>> reproduce.
> > >>>>
> > >>>> Fabio
> > >>>>
> > >>>> On 06/11/2013 01:15 AM, [email protected] wrote:
> > >>>>> Hi Fabio,
> > >>>>>
> > >>>>> Thank you for comments.
> > >>>>>
> > >>>>> We confirmed this problem in a physical environment as well.
> > >>>>> corosync communicates over eth1 and eth2.
> > >>>>>
> > >>>>> -------------------------------------------------------
> > >>>>> [root@bl460g6a ~]# ip addr show
> > >>>>> (snip)
> > >>>>> 3: eth1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP qlen 1000
> > >>>>> link/ether f4:ce:46:b3:fe:3c brd ff:ff:ff:ff:ff:ff
> > >>>>> inet 192.168.101.9/24 brd 192.168.101.255 scope global eth1
> > >>>>> inet6 fe80::f6ce:46ff:feb3:fe3c/64 scope link
> > >>>>> valid_lft forever preferred_lft forever
> > >>>>> 4: eth2: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP qlen 1000
> > >>>>> link/ether 18:a9:05:78:6c:f0 brd ff:ff:ff:ff:ff:ff
> > >>>>> inet 192.168.102.9/24 brd 192.168.102.255 scope global eth2
> > >>>>> inet6 fe80::1aa9:5ff:fe78:6cf0/64 scope link
> > >>>>> valid_lft forever preferred_lft forever
> > >>>>> (snip)
> > >>>>> 8: virbr0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UNKNOWN
> > >>>>> link/ether 52:54:00:7f:f3:0a brd ff:ff:ff:ff:ff:ff
> > >>>>> inet 192.168.122.1/24 brd 192.168.122.255 scope global virbr0
> > >>>>> 9: virbr0-nic: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN qlen 500
> > >>>>> link/ether 52:54:00:7f:f3:0a brd ff:ff:ff:ff:ff:ff
> > >>>>> -----------------------------------------------
> > >>>>>
> > >>>>> I think that it is not a problem specific to the virtual environment.
> > >>>>>
> > >>>>> Just to make sure, I attach the logs that I collected on three
> > >>>>> Blades (RHEL6.4).
> > >>>>> * I cut the communication at a network switch.
> > >>>>>
> > >>>>> The phenomenon is similar: one node keeps cycling through states
> > >>>>> that include OPERATIONAL, while the two other nodes never reach the
> > >>>>> OPERATIONAL state.
> > >>>>>
> > >>>>> Is this, after all, the same problem as the bug that you pointed out?
> > >>>>>> Check this thread as reference:
> > >>>>>> http://lists.linuxfoundation.org/pipermail/openais/2013-April/016792.html
> > >>>>>
> > >>>>>
> > >>>>> Best Regards,
> > >>>>> Hideo Yamauchi.
> > >>>>>
> > >>>>>
> > >>>>>
> > >>>>> --- On Fri, 2013/5/31, Fabio M. Di Nitto <[email protected]> wrote:
> > >>>>>
> > >>>>>> On 5/31/2013 7:12 AM, [email protected] wrote:
> > >>>>>>> Hi All,
> > >>>>>>>
> > >>>>>>> We found a problem with the network communication of
> > >>>>>>> corosync.
> > >>>>>>>
> > >>>>>>> We built a three-node corosync cluster on KVM.
> > >>>>>>>
> > >>>>>>> Step 1) Start the corosync service on all nodes.
> > >>>>>>>
> > >>>>>>> Step 2) Confirm that the cluster is formed from all nodes and
> > >>>>>>> has reached the OPERATIONAL state.
> > >>>>>>>
> > >>>>>>> Step 3) Cut off the network of node1(rh64-coro1) and
> > >>>>>>> node2(rh64-coro2) from the KVM host.
> > >>>>>>>
> > >>>>>>> [root@kvm-host ~]# brctl delif virbr3 vnet5;brctl delif virbr2 vnet1
> > >>>>>>>
> > >>>>>>> Step 4) Because a problem occurred, we stopped all nodes.
> > >>>>>>>
> > >>>>>>>
> > >>>>>>> The problem occurs at step 3.
> > >>>>>>>
> > >>>>>>> One node(rh64-coro1) keeps changing state even after it reaches
> > >>>>>>> the OPERATIONAL state.
> > >>>>>>>
> > >>>>>>> Two nodes(rh64-coro2 and rh64-coro3) keep changing state.
> > >>>>>>> They never seem to reach the OPERATIONAL state while the first
> > >>>>>>> node is running.
> > >>>>>>>
> > >>>>>>> This means that the two nodes(rh64-coro2 and rh64-coro3) cannot
> > >>>>>>> complete cluster formation.
> > >>>>>>> When this network failure happens in a setup where corosync is
> > >>>>>>> combined with Pacemaker, corosync cannot notify Pacemaker of the
> > >>>>>>> change in cluster membership.
> > >>>>>>>
> > >>>>>>>
> > >>>>>>> Question 1) Are there any parameters in corosync.conf that
> > >>>>>>> solve this problem?
> > >>>>>>> * We think that it can be solved by bonding the interfaces and
> > >>>>>>> specifying "rrp_mode:none", but we do not want to specify
> > >>>>>>> "rrp_mode:none".
> > >>>>>>>
> > >>>>>>> Question 2) Is this a bug? Or is it the specified behavior of
> > >>>>>>> corosync communication?
> > >>>>>>
> > >>>>>> We already checked this specific test, and it appears to be a bug in
> > >>>>>> the kernel bridge code when handling multicast traffic (groups are
> > >>>>>> not joined correctly and traffic is not forwarded).
> > >>>>>>
> > >>>>>> Check this thread as reference:
> > >>>>>> http://lists.linuxfoundation.org/pipermail/openais/2013-April/016792.html
> > >>>>>>
> > >>>>>> Thanks
> > >>>>>> Fabio
> > >>>>>>
> > >>>>>>
> > >>>>>> _______________________________________________
> > >>>>>> discuss mailing list
> > >>>>>> [email protected]
> > >>>>>> http://lists.corosync.org/mailman/listinfo/discuss
> > >>>>>>
> > >>>>
> > >>>>
> > >>>
> > >>
> > >>
> >
> >
>
>