Hello Honza,

I corrected the config, but it didn't change much. The cluster is not forming properly. I shut down iptables. Log:
Nov 25 17:58:05 corosync [CPG   ] chosen downlist: sender r(0) ip(10.10.10.1) ; members(old:1 left:0)
Nov 25 17:58:05 corosync [MAIN  ] Completed service synchronization, ready to provide service.
Nov 25 17:58:07 corosync [TOTEM ] A processor failed, forming new configuration.
Nov 25 17:58:08 corosync [TOTEM ] A processor joined or left the membership and a new membership was formed.
Nov 25 17:58:08 corosync [TOTEM ] A processor failed, forming new configuration.
Nov 25 17:58:09 corosync [TOTEM ] A processor joined or left the membership and a new membership was formed.
Nov 25 17:58:11 corosync [TOTEM ] A processor failed, forming new configuration.

But right now I do see both members on each end:

pbx01*CLI> corosync show members

=============================================================
=== Cluster members =========================================
=============================================================
===
=== Node 1
=== --> Group: asterisk
=== --> Address 1: 10.10.10.1
=== Node 2
=== --> Group: asterisk
=== --> Address 1: 10.10.10.2
===
=============================================================

And this message is still flooding the asterisk log:

[2013-11-25 12:02:18] WARNING[2057]: res_corosync.c:316 ast_event_cb: CPG mcast failed (6)
[2013-11-25 12:02:18] WARNING[2057]: res_corosync.c:316 ast_event_cb: CPG mcast failed (6)

When I do a ping from asterisk it shows the MAC of eth0 and not eth3.

pbx01*CLI> corosync ping
[2013-11-25 12:03:38] NOTICE[2057]: res_corosync.c:303 ast_event_cb: (ast_event_cb) Got event PING from server with EID: 'mac of eth0'

Slava.

----- Original Message -----

From: "Jan Friesse" <[email protected]>
To: "Slava Bendersky" <[email protected]>, "Steven Dake" <[email protected]>
Cc: [email protected]
Sent: Monday, November 25, 2013 3:10:51 AM
Subject: Re: [corosync] information request

Slava Bendersky wrote:
> Hello Steven,
> Here are the testing results.
> Iptables is stopped on both ends.
>
> [root@eusipgw01 ~]# iptables -L -nv -x
> Chain INPUT (policy ACCEPT 474551 packets, 178664760 bytes)
>     pkts bytes target prot opt in out source destination
>
> Chain FORWARD (policy ACCEPT 0 packets, 0 bytes)
>     pkts bytes target prot opt in out source destination
>
> Chain OUTPUT (policy ACCEPT 467510 packets, 169303071 bytes)
>     pkts bytes target prot opt in out source destination
> [root@eusipgw01 ~]#
>
> First case is udpu transport and rrp: none
>
> totem {
>         version: 2
>         token: 160
>         token_retransmits_before_loss_const: 3
>         join: 250
>         consensus: 300
>         vsftype: none
>         max_messages: 20
>         threads: 0
>         nodeid: 2
>         rrp_mode: none
>         interface {
>                 member {
>                         memberaddr: 10.10.10.1
>                 }

^^^ This is the problem. You must define BOTH nodes (not only the remote one) on BOTH sides.

>                 ringnumber: 0
>                 bindnetaddr: 10.10.10.0
>                 mcastport: 5405
>         }
>         transport: udpu
> }
>
> Error:
>
> Nov 24 14:25:29 corosync [MAIN ] Totem is unable to form a cluster because of
> an operating system or network fault. The most common cause of this message
> is that the local firewall is configured improperly.

This is because you defined only the remote node, not the local one, in the member section(s).

Regards,
Honza
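For illustration, a udpu interface section that lists both nodes would look roughly like this (a sketch using the addresses from this thread; the same member list goes on both sides, only nodeid differs per node):

totem {
        # (version, token, consensus, etc. as in the config above)
        nodeid: 2                # 1 on the other node
        rrp_mode: none
        transport: udpu
        interface {
                member {
                        memberaddr: 10.10.10.1
                }
                member {
                        memberaddr: 10.10.10.2
                }
                ringnumber: 0
                bindnetaddr: 10.10.10.0
                mcastport: 5405
        }
}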
> pbx01*CLI> corosync show members
>
> =============================================================
> === Cluster members =========================================
> =============================================================
> ===
> ===
> =============================================================
>
> And the same with rrp: passive. I think unicast is more likely related to some
> incompatibility with vmware? Only multicast is going through, but even then it
> is not forming the cluster completely.
>
> Slava.
>
> ----- Original Message -----
>
> From: "Steven Dake" <[email protected]>
> To: "Slava Bendersky" <[email protected]>, "Digimer" <[email protected]>
> Cc: [email protected]
> Sent: Sunday, November 24, 2013 12:01:09 PM
> Subject: Re: [corosync] information request
>
> On 11/23/2013 11:20 PM, Slava Bendersky wrote:
>
> Hello Digimer,
> Here is what I see from the asterisk box:
>
> pbx01*CLI> corosync show members
>
> =============================================================
> === Cluster members =========================================
> =============================================================
> ===
> === Node 1
> === --> Group: asterisk
> === --> Address 1: 10.10.10.1
> === Node 2
> === --> Group: asterisk
> === --> Address 1: 10.10.10.2
> ===
> =============================================================
>
> [2013-11-24 01:12:43] WARNING[2057]: res_corosync.c:316 ast_event_cb: CPG mcast failed (6)
> [2013-11-24 01:12:43] WARNING[2057]: res_corosync.c:316 ast_event_cb: CPG mcast failed (6)
>
> These errors come from asterisk via the cpg libraries because corosync cannot
> get a proper configuration. The first message on this thread contains the
> scenarios under which those occur. In a past log you had the error indicating
> a network fault. This network fault error IIRC indicates the firewall is enabled.
> The error from asterisk is expected if your firewall is enabled. This was
> suggested before by Digimer, but can you confirm you totally disabled your
> firewall on the box (rather than just configured it as you thought was correct)?
>
> Turn off the firewall - which will help us eliminate that as a source of the
> problem.
>
> Next, use UDPU mode without RRP - confirm whether that works.
>
> Next use UDPU _passive_ rrp mode - confirm whether that works.
>
> One thing at a time in each step please.
>
> Regards
> -steve
>
> Is it possible that the message is related to the permissions of the user
> running corosync or asterisk?
>
> And another point: when I send a ping I see the MAC address of eth0, which is
> the default gateway, and not the cluster interface.
>
> Corosync does not use the gateway address in any of its routing calculations.
> Instead it physically binds to the interface specified as detailed in
> corosync.conf.5. By physically binding, it avoids the gateway entirely.
>
> Regards
> -steve
>
> pbx01*CLI> corosync ping
> [2013-11-24 01:16:54] NOTICE[2057]: res_corosync.c:303 ast_event_cb: (ast_event_cb) Got event PING from server with EID: 'MAC address of the eth0'
> [2013-11-24 01:16:54] WARNING[2057]: res_corosync.c:316 ast_event_cb: CPG mcast failed (6)
>
> Slava.
>
> ----- Original Message -----
>
> From: "Slava Bendersky" <[email protected]>
> To: "Digimer" <[email protected]>
> Cc: [email protected]
> Sent: Sunday, November 24, 2013 12:26:40 AM
> Subject: Re: [corosync] information request
>
> Hello Digimer,
> I am trying to find information about vmware multicast problems. On tcpdump I
> see multicast traffic from the remote end, but I can't confirm whether the
> packets arrive as they should.
> Can you please confirm that memberaddr: is the IP address of the second node?
>
> 06:05:02.408204 IP (tos 0x0, ttl 1, id 0, offset 0, flags [DF], proto UDP (17), length 221)
>     10.10.10.1.5404 > 226.94.1.1.5405: [udp sum ok] UDP, length 193
> 06:05:02.894935 IP (tos 0x0, ttl 1, id 0, offset 0, flags [DF], proto UDP (17), length 221)
>     10.10.10.2.5404 > 226.94.1.1.5405: [bad udp cksum 1a8c!] UDP, length 193
>
> Slava.
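For reference, a capture like the one above can be taken on the cluster NIC with something along these lines (a sketch; the interface name is assumed from earlier in the thread):

tcpdump -n -v -i eth1 'udp and portrange 5404-5405'

A "bad udp cksum" on packets the capturing host itself sends is usually just TX checksum offload on that host, not real corruption on the wire.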
>
> ----- Original Message -----
>
> From: "Digimer" <[email protected]>
> To: "Slava Bendersky" <[email protected]>
> Cc: [email protected]
> Sent: Saturday, November 23, 2013 11:54:55 PM
> Subject: Re: [corosync] information request
>
> If I recall correctly, VMWare doesn't do multicast properly. I'm not
> sure though, I don't use it.
>
> Try unicast with no RRP. See if that works.
>
> On 23/11/13 23:16, Slava Bendersky wrote:
>> Hello Digimer,
>> All machines are rhel 6.4, based on vmware; there is no physical switch,
>> only the vmware one. I set rrp to none and the cluster formed.
>> With this config I am getting constant error messages.
>>
>> [root@eusipgw01 ~]# cat /etc/redhat-release
>> Red Hat Enterprise Linux Server release 6.4 (Santiago)
>>
>> [root@eusipgw01 ~]# rpm -qa | grep corosync
>> corosync-1.4.1-15.el6.x86_64
>> corosynclib-1.4.1-15.el6.x86_64
>>
>> [2013-11-23 22:46:20] WARNING[2057] res_corosync.c: CPG mcast failed (6)
>> [2013-11-23 22:46:20] WARNING[2057] res_corosync.c: CPG mcast failed (6)
>>
>> iptables:
>>
>> -A INPUT -i eth1 -p udp -m state --state NEW -m udp --dport 5404:5407 -j NFLOG --nflog-prefix "dmz_ext2fw: " --nflog-group 2
>> -A INPUT -i eth1 -m pkttype --pkt-type multicast -j NFLOG --nflog-prefix "dmz_ext2fw: " --nflog-group 2
>> -A INPUT -i eth1 -m pkttype --pkt-type unicast -j NFLOG --nflog-prefix "dmz_ext2fw: " --nflog-group 2
>> -A INPUT -i eth1 -p igmp -j NFLOG --nflog-prefix "dmz_ext2fw: " --nflog-group 2
>> -A INPUT -j ACCEPT
>>
>> ------------------------------------------------------------------------
>> *From: *"Digimer" <[email protected]>
>> *To: *"Slava Bendersky" <[email protected]>
>> *Cc: *[email protected]
>> *Sent: *Saturday, November 23, 2013 10:34:00 PM
>> *Subject: *Re: [corosync] information request
>>
>> I don't think you ever said what OS you have. I've never had to set
>> anything in sysctl.conf on RHEL/CentOS 6. Did you try disabling RRP
>> entirely? If you have a managed switch, make sure persistent multicast
>> groups are enabled, or try a different switch entirely.
>>
>> *Something* is interrupting your network traffic. What does
>> iptables-save show? Are these physical or virtual machines?
>>
>> The more information about your environment that you can share, the
>> better we can help.
>>
>> On 23/11/13 22:29, Slava Bendersky wrote:
>>> Hello Digimer,
>>> As an idea, might it be some settings in sysctl.conf?
>>>
>>> Slava.
>>>
>>> ------------------------------------------------------------------------
>>> *From: *"Slava Bendersky" <[email protected]>
>>> *To: *"Digimer" <[email protected]>
>>> *Cc: *[email protected]
>>> *Sent: *Saturday, November 23, 2013 10:27:22 PM
>>> *Subject: *Re: [corosync] information request
>>>
>>> Hello Digimer,
>>> Yes, I set it to passive, and selinux is disabled.
>>>
>>> [root@eusipgw01 ~]# sestatus
>>> SELinux status: disabled
>>> [root@eusipgw01 ~]# cat /etc/corosync/corosync.conf
>>> totem {
>>>         version: 2
>>>         token: 160
>>>         token_retransmits_before_loss_const: 3
>>>         join: 250
>>>         consensus: 300
>>>         vsftype: none
>>>         max_messages: 20
>>>         threads: 0
>>>         nodeid: 2
>>>         rrp_mode: passive
>>>         interface {
>>>                 ringnumber: 0
>>>                 bindnetaddr: 10.10.10.0
>>>                 mcastaddr: 226.94.1.1
>>>                 mcastport: 5405
>>>         }
>>> }
>>>
>>> logging {
>>>         fileline: off
>>>         to_stderr: yes
>>>         to_logfile: yes
>>>         to_syslog: off
>>>         logfile: /var/log/cluster/corosync.log
>>>         debug: off
>>>         timestamp: on
>>>         logger_subsys {
>>>                 subsys: AMF
>>>                 debug: off
>>>         }
>>> }
>>>
>>> Slava.
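Two quick ways to test the vmware-multicast suspicion directly (a sketch; assumes the cluster NIC is eth1, the mcastaddr above, and that the omping package is available on both nodes):

ip maddr show dev eth1          # has the NIC actually joined 226.94.1.1?
# omping exercises both multicast and unicast delivery between nodes;
# run the same command on both 10.10.10.1 and 10.10.10.2 at once:
omping 10.10.10.1 10.10.10.2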
>>>
>>> ------------------------------------------------------------------------
>>> *From: *"Digimer" <[email protected]>
>>> *To: *"Slava Bendersky" <[email protected]>
>>> *Cc: *"Steven Dake" <[email protected]>, [email protected]
>>> *Sent: *Saturday, November 23, 2013 7:04:43 PM
>>> *Subject: *Re: [corosync] information request
>>>
>>> First up, I'm not Steven. Secondly, did you follow Steven's
>>> recommendation to not use active RRP? Does the cluster form with no RRP
>>> at all? Is selinux enabled?
>>>
>>> On 23/11/13 18:29, Slava Bendersky wrote:
>>>> Hello Steven,
>>>> In multicast the log fills with this message:
>>>>
>>>> Nov 24 00:26:28 corosync [TOTEM ] A processor failed, forming new
>>>> configuration.
>>>> Nov 24 00:26:28 corosync [TOTEM ] A processor joined or left the
>>>> membership and a new membership was formed.
>>>> Nov 24 00:26:31 corosync [CPG ] chosen downlist: sender r(0)
>>>> ip(10.10.10.1) ; members(old:2 left:0)
>>>> Nov 24 00:26:31 corosync [MAIN ] Completed service synchronization,
>>>> ready to provide service.
>>>>
>>>> In udpu it is not working at all.
>>>>
>>>> Slava.
>>>>
>>>> ------------------------------------------------------------------------
>>>> *From: *"Digimer" <[email protected]>
>>>> *To: *"Slava Bendersky" <[email protected]>
>>>> *Cc: *"Steven Dake" <[email protected]>, [email protected]
>>>> *Sent: *Saturday, November 23, 2013 6:05:56 PM
>>>> *Subject: *Re: [corosync] information request
>>>>
>>>> So multicast works with the firewall disabled?
>>>>
>>>> On 23/11/13 17:28, Slava Bendersky wrote:
>>>>> Hello Steven,
>>>>> I disabled iptables and there is no difference; the error message is the
>>>>> same, but at least in multicast it wasn't generating the error.
>>>>>
>>>>> Slava.
>>>>>
>>>>> ------------------------------------------------------------------------
>>>>> *From: *"Digimer" <[email protected]>
>>>>> *To: *"Slava Bendersky" <[email protected]>, "Steven Dake" <[email protected]>
>>>>> *Cc: *[email protected]
>>>>> *Sent: *Saturday, November 23, 2013 4:37:36 PM
>>>>> *Subject: *Re: [corosync] information request
>>>>>
>>>>> Does either mcast or unicast work if you disable the firewall? If so,
>>>>> then at least you know for sure that iptables is the problem.
>>>>>
>>>>> The link here shows the iptables rules I use (for corosync in mcast and
>>>>> other apps):
>>>>>
>>>>> https://alteeve.ca/w/AN!Cluster_Tutorial_2#Configuring_iptables
>>>>>
>>>>> digimer
>>>>>
>>>>> On 23/11/13 16:12, Slava Bendersky wrote:
>>>>>> Hello Steven,
>>>>>> This is what I see when set up through UDPU:
>>>>>>
>>>>>> Nov 23 22:08:13 corosync [MAIN ] Compatibility mode set to whitetank.
>>>>>> Using V1 and V2 of the synchronization engine.
>>>>>> Nov 23 22:08:13 corosync [TOTEM ] adding new UDPU member {10.10.10.1}
>>>>>> Nov 23 22:08:16 corosync [MAIN ] Totem is unable to form a cluster
>>>>>> because of an operating system or network fault. The most common cause
>>>>>> of this message is that the local firewall is configured improperly.
>>>>>>
>>>>>> Might I be missing some firewall rules? I allowed unicast.
>>>>>>
>>>>>> Slava.
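The simplest way to rule the firewall out entirely on RHEL 6, per the "turn off the firewall" suggestions above (a minimal sketch; standard commands, nothing thread-specific assumed):

service iptables stop
chkconfig iptables off   # keep it off across reboots while testing
iptables -L -nv          # all chains should be empty with policy ACCEPT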
>>>>>>
>>>>>> ------------------------------------------------------------------------
>>>>>> *From: *"Steven Dake" <[email protected]>
>>>>>> *To: *"Slava Bendersky" <[email protected]>
>>>>>> *Cc: *[email protected]
>>>>>> *Sent: *Saturday, November 23, 2013 10:33:31 AM
>>>>>> *Subject: *Re: [corosync] information request
>>>>>>
>>>>>> On 11/23/2013 08:23 AM, Slava Bendersky wrote:
>>>>>>
>>>>>> Hello Steven,
>>>>>>
>>>>>> My setup:
>>>>>>
>>>>>> 10.10.10.1 primary server ----- EoIP tunnel, vpn ipsec ----- dr server 10.10.10.2
>>>>>>
>>>>>> On both servers there are 2 interfaces: eth0, which is the default gw out,
>>>>>> and eth1, where corosync lives.
>>>>>>
>>>>>> Iptables:
>>>>>>
>>>>>> -A INPUT -i eth1 -p udp -m state --state NEW -m udp --dport 5404:5407
>>>>>> -A INPUT -i eth1 -m pkttype --pkt-type multicast
>>>>>> -A INPUT -i eth1 -p igmp
>>>>>>
>>>>>> Corosync.conf:
>>>>>>
>>>>>> totem {
>>>>>>         version: 2
>>>>>>         token: 160
>>>>>>         token_retransmits_before_loss_const: 3
>>>>>>         join: 250
>>>>>>         consensus: 300
>>>>>>         vsftype: none
>>>>>>         max_messages: 20
>>>>>>         threads: 0
>>>>>>         nodeid: 2
>>>>>>         rrp_mode: active
>>>>>>         interface {
>>>>>>                 ringnumber: 0
>>>>>>                 bindnetaddr: 10.10.10.0
>>>>>>                 mcastaddr: 226.94.1.1
>>>>>>                 mcastport: 5405
>>>>>>         }
>>>>>> }
>>>>>>
>>>>>> Join message:
>>>>>>
>>>>>> [root@eusipgw01 ~]# corosync-objctl | grep member
>>>>>> runtime.totem.pg.mrp.srp.members.2.ip=r(0) ip(10.10.10.2)
>>>>>> runtime.totem.pg.mrp.srp.members.2.join_count=1
>>>>>> runtime.totem.pg.mrp.srp.members.2.status=joined
>>>>>> runtime.totem.pg.mrp.srp.members.1.ip=r(0) ip(10.10.10.1)
>>>>>> runtime.totem.pg.mrp.srp.members.1.join_count=254
>>>>>> runtime.totem.pg.mrp.srp.members.1.status=joined
>>>>>>
>>>>>> Is it possible that the ping is sent out of the wrong interface?
>>>>>>
>>>>>> Slava,
>>>>>>
>>>>>> I wouldn't expect so.
>>>>>>
>>>>>> Which version?
>>>>>>
>>>>>> Have you tried udpu instead? If not, it is preferable to multicast
>>>>>> unless you want absolute performance on cpg groups. In most cases the
>>>>>> performance difference is very small and not worth the trouble of
>>>>>> setting up multicast in your network.
>>>>>>
>>>>>> Fabio had indicated rrp active mode is broken. I don't know the
>>>>>> details, but try passive RRP - it is actually better than active IMNSHO :)
>>>>>>
>>>>>> Regards
>>>>>> -steve
>>>>>>
>>>>>> Slava.
>>>>>>
>>>>>> ------------------------------------------------------------------------
>>>>>> *From: *"Steven Dake" <[email protected]>
>>>>>> *To: *"Slava Bendersky" <[email protected]>, [email protected]
>>>>>> *Sent: *Saturday, November 23, 2013 6:01:11 AM
>>>>>> *Subject: *Re: [corosync] information request
>>>>>>
>>>>>> On 11/23/2013 12:29 AM, Slava Bendersky wrote:
>>>>>>
>>>>>> Hello Everyone,
>>>>>> Corosync runs on a box with 2 Ethernet interfaces.
>>>>>> I am getting this message:
>>>>>> CPG mcast failed (6)
>>>>>>
>>>>>> Any information, thank you in advance.
>>>>>>
>>>>>> https://github.com/corosync/corosync/blob/master/include/corosync/corotypes.h#L84
>>>>>>
>>>>>> This can occur because:
>>>>>> a) firewall is enabled - there should be something in the logs
>>>>>> telling you to properly configure the firewall
>>>>>> b) a config change is in progress - this is a normal response, and
>>>>>> you should try the request again
>>>>>> c) a bug in the synchronization code is resulting in a blocked
>>>>>> unsynced cluster
>>>>>>
>>>>>> c is very unlikely at this point.
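In the corotypes.h linked above, error 6 is CS_ERR_TRY_AGAIN, which is case (b). A minimal sketch of how a CPG client would typically cope with it, by retrying with a short back-off (a hypothetical helper for illustration, not the actual res_corosync.c code; handle setup is assumed done elsewhere):

/* Retry a CPG multicast while corosync reports CS_ERR_TRY_AGAIN (6),
 * which it returns while a configuration change is in progress. */
#include <unistd.h>
#include <sys/uio.h>
#include <corosync/corotypes.h>
#include <corosync/cpg.h>

static cs_error_t mcast_with_retry(cpg_handle_t handle,
                                   const struct iovec *iov,
                                   unsigned int iov_len)
{
        cs_error_t err;
        int retries = 10;

        do {
                err = cpg_mcast_joined(handle, CPG_TYPE_AGREED, iov, iov_len);
                if (err != CS_ERR_TRY_AGAIN)
                        break;          /* definitive success or failure */
                usleep(100000);         /* wait for sync to complete */
        } while (--retries > 0);

        return err;
}

If it keeps returning 6 indefinitely, as in this thread, the cluster never finishes synchronizing, which points back at cases (a) or (c).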
>>>>>>
>>>>>> 2 ethernet interfaces = rrp mode, bonding, or something else?
>>>>>>
>>>>>> Digimer needs moar infos :)
>>>>>>
>>>>>> Regards
>>>>>> -steve
_______________________________________________
discuss mailing list
[email protected]
http://lists.corosync.org/mailman/listinfo/discuss
