Hello Honza,

I corrected the config, but it didn't change much. The cluster is not forming properly. I shut down iptables. Log:
Nov 25 17:58:05 corosync [CPG   ] chosen downlist: sender r(0) ip(10.10.10.1) ; members(old:1 left:0)
Nov 25 17:58:05 corosync [MAIN  ] Completed service synchronization, ready to provide service.
Nov 25 17:58:07 corosync [TOTEM ] A processor failed, forming new configuration.
Nov 25 17:58:08 corosync [TOTEM ] A processor joined or left the membership and a new membership was formed.
Nov 25 17:58:08 corosync [TOTEM ] A processor failed, forming new configuration.
Nov 25 17:58:09 corosync [TOTEM ] A processor joined or left the membership and a new membership was formed.
Nov 25 17:58:11 corosync [TOTEM ] A processor failed, forming new configuration.

But right now I do see both members on each end:

pbx01*CLI> corosync show members

=============================================================
=== Cluster members =========================================
=============================================================
===
=== Node 1
=== --> Group: asterisk
=== --> Address 1: 10.10.10.1
=== Node 2
=== --> Group: asterisk
=== --> Address 1: 10.10.10.2
===
=============================================================

And this message is still flooding the asterisk log:

[2013-11-25 12:02:18] WARNING[2057]: res_corosync.c:316 ast_event_cb: CPG mcast failed (6)
[2013-11-25 12:02:18] WARNING[2057]: res_corosync.c:316 ast_event_cb: CPG mcast failed (6)

When I do a ping from asterisk it shows the MAC of eth0 and not eth3.

pbx01*CLI> corosync ping
[2013-11-25 12:03:38] NOTICE[2057]: res_corosync.c:303 ast_event_cb: (ast_event_cb) Got event PING from server with EID: 'mac of eth0'

Slava.

----- Original Message -----

From: "Jan Friesse" <[email protected]>
To: "Slava Bendersky" <[email protected]>, "Steven Dake" <[email protected]>
Cc: [email protected]
Sent: Monday, November 25, 2013 3:10:51 AM
Subject: Re: [corosync] information request

Slava Bendersky wrote:
> Hello Steven,
> Here are the testing results.
> Iptables is stopped on both ends.
>
> [root@eusipgw01 ~]# iptables -L -nv -x
> Chain INPUT (policy ACCEPT 474551 packets, 178664760 bytes)
>     pkts bytes target prot opt in out source destination
>
> Chain FORWARD (policy ACCEPT 0 packets, 0 bytes)
>     pkts bytes target prot opt in out source destination
>
> Chain OUTPUT (policy ACCEPT 467510 packets, 169303071 bytes)
>     pkts bytes target prot opt in out source destination
> [root@eusipgw01 ~]#
>
> First case is udpu transport and rrp: none
>
> totem {
>         version: 2
>         token: 160
>         token_retransmits_before_loss_const: 3
>         join: 250
>         consensus: 300
>         vsftype: none
>         max_messages: 20
>         threads: 0
>         nodeid: 2
>         rrp_mode: none
>         interface {
>                 member {
>                         memberaddr: 10.10.10.1
>                 }

^^^ This is the problem. You must define BOTH nodes (not only the remote one) on BOTH sides.

>                 ringnumber: 0
>                 bindnetaddr: 10.10.10.0
>                 mcastport: 5405
>         }
>         transport: udpu
> }
>
> Error:
>
> Nov 24 14:25:29 corosync [MAIN ] Totem is unable to form a cluster because of
> an operating system or network fault. The most common cause of this message
> is that the local firewall is configured improperly.

This is because you defined only the remote node, not the local one, in the member section(s).

Regards,
Honza
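For illustration, a udpu interface section that lists both nodes would look roughly like this (a sketch using the addresses from this thread; the same member list goes on both sides, only nodeid differs per node):

totem {
        # (version, token, consensus, etc. as in the config above)
        nodeid: 2                # 1 on the other node
        rrp_mode: none
        transport: udpu
        interface {
                member {
                        memberaddr: 10.10.10.1
                }
                member {
                        memberaddr: 10.10.10.2
                }
                ringnumber: 0
                bindnetaddr: 10.10.10.0
                mcastport: 5405
        }
}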
> pbx01*CLI> corosync show members
>
> =============================================================
> === Cluster members =========================================
> =============================================================
> ===
> ===
> =============================================================
>
> And the same with rrp: passive. I think unicast is more likely related to some
> incompatibility with vmware? Only multicast is going through, but even then it
> is not forming the cluster completely.
>
> Slava.
>
> ----- Original Message -----
>
> From: "Steven Dake" <[email protected]>
> To: "Slava Bendersky" <[email protected]>, "Digimer" <[email protected]>
> Cc: [email protected]
> Sent: Sunday, November 24, 2013 12:01:09 PM
> Subject: Re: [corosync] information request
>
> On 11/23/2013 11:20 PM, Slava Bendersky wrote:
>
> Hello Digimer,
> Here is what I see from the asterisk box:
>
> pbx01*CLI> corosync show members
>
> =============================================================
> === Cluster members =========================================
> =============================================================
> ===
> === Node 1
> === --> Group: asterisk
> === --> Address 1: 10.10.10.1
> === Node 2
> === --> Group: asterisk
> === --> Address 1: 10.10.10.2
> ===
> =============================================================
>
> [2013-11-24 01:12:43] WARNING[2057]: res_corosync.c:316 ast_event_cb: CPG mcast failed (6)
> [2013-11-24 01:12:43] WARNING[2057]: res_corosync.c:316 ast_event_cb: CPG mcast failed (6)
>
> These errors come from asterisk via the cpg libraries because corosync cannot
> get a proper configuration. The first message on this thread contains the
> scenarios under which those occur. In a past log you had the error indicating
> a network fault. This network fault error IIRC indicates the firewall is enabled.
> The error from asterisk is expected if your firewall is enabled. This was
> suggested before by Digimer, but can you confirm you totally disabled your
> firewall on the box (rather than just configured it as you thought was correct)?
>
> Turn off the firewall - which will help us eliminate that as a source of the
> problem.
>
> Next, use UDPU mode without RRP - confirm whether that works.
>
> Next use UDPU _passive_ rrp mode - confirm whether that works.
>
> One thing at a time in each step please.
>
> Regards
> -steve
>
> Is it possible that the message is related to the permissions of the user
> running corosync or asterisk?
>
> And another point: when I send a ping I see the MAC address of eth0, which is
> the default gateway, and not the cluster interface.
>
> Corosync does not use the gateway address in any of its routing calculations.
> Instead it physically binds to the interface specified as detailed in
> corosync.conf.5. By physically binding, it avoids the gateway entirely.
>
> Regards
> -steve
>
> pbx01*CLI> corosync ping
> [2013-11-24 01:16:54] NOTICE[2057]: res_corosync.c:303 ast_event_cb: (ast_event_cb) Got event PING from server with EID: 'MAC address of the eth0'
> [2013-11-24 01:16:54] WARNING[2057]: res_corosync.c:316 ast_event_cb: CPG mcast failed (6)
>
> Slava.
>
> ----- Original Message -----
>
> From: "Slava Bendersky" <[email protected]>
> To: "Digimer" <[email protected]>
> Cc: [email protected]
> Sent: Sunday, November 24, 2013 12:26:40 AM
> Subject: Re: [corosync] information request
>
> Hello Digimer,
> I am trying to find information about vmware multicast problems. On tcpdump I
> see multicast traffic from the remote end, but I can't confirm whether the
> packets arrive as they should.
> Can you please confirm that memberaddr: is the IP address of the second node?
>
> 06:05:02.408204 IP (tos 0x0, ttl 1, id 0, offset 0, flags [DF], proto UDP (17), length 221)
>     10.10.10.1.5404 > 226.94.1.1.5405: [udp sum ok] UDP, length 193
> 06:05:02.894935 IP (tos 0x0, ttl 1, id 0, offset 0, flags [DF], proto UDP (17), length 221)
>     10.10.10.2.5404 > 226.94.1.1.5405: [bad udp cksum 1a8c!] UDP, length 193
>
> Slava.
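For reference, a capture like the one above can be taken on the cluster NIC with something along these lines (a sketch; the interface name is assumed from earlier in the thread):

tcpdump -n -v -i eth1 'udp and portrange 5404-5405'

A "bad udp cksum" on packets the capturing host itself sends is usually just TX checksum offload on that host, not real corruption on the wire.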
>
> ----- Original Message -----
>
> From: "Digimer" <[email protected]>
> To: "Slava Bendersky" <[email protected]>
> Cc: [email protected]
> Sent: Saturday, November 23, 2013 11:54:55 PM
> Subject: Re: [corosync] information request
>
> If I recall correctly, VMWare doesn't do multicast properly. I'm not
> sure though, I don't use it.
>
> Try unicast with no RRP. See if that works.
>
> On 23/11/13 23:16, Slava Bendersky wrote:
>> Hello Digimer,
>> All machines are rhel 6.4, based on vmware; there is no physical switch,
>> only the vmware one. I set rrp to none and the cluster formed.
>> With this config I am getting constant error messages.
>>
>> [root@eusipgw01 ~]# cat /etc/redhat-release
>> Red Hat Enterprise Linux Server release 6.4 (Santiago)
>>
>> [root@eusipgw01 ~]# rpm -qa | grep corosync
>> corosync-1.4.1-15.el6.x86_64
>> corosynclib-1.4.1-15.el6.x86_64
>>
>> [2013-11-23 22:46:20] WARNING[2057] res_corosync.c: CPG mcast failed (6)
>> [2013-11-23 22:46:20] WARNING[2057] res_corosync.c: CPG mcast failed (6)
>>
>> iptables:
>>
>> -A INPUT -i eth1 -p udp -m state --state NEW -m udp --dport 5404:5407 -j NFLOG --nflog-prefix "dmz_ext2fw: " --nflog-group 2
>> -A INPUT -i eth1 -m pkttype --pkt-type multicast -j NFLOG --nflog-prefix "dmz_ext2fw: " --nflog-group 2
>> -A INPUT -i eth1 -m pkttype --pkt-type unicast -j NFLOG --nflog-prefix "dmz_ext2fw: " --nflog-group 2
>> -A INPUT -i eth1 -p igmp -j NFLOG --nflog-prefix "dmz_ext2fw: " --nflog-group 2
>> -A INPUT -j ACCEPT
>>
>> ------------------------------------------------------------------------
>> *From: *"Digimer" <[email protected]>
>> *To: *"Slava Bendersky" <[email protected]>
>> *Cc: *[email protected]
>> *Sent: *Saturday, November 23, 2013 10:34:00 PM
>> *Subject: *Re: [corosync] information request
>>
>> I don't think you ever said what OS you have. I've never had to set
>> anything in sysctl.conf on RHEL/CentOS 6. Did you try disabling RRP
>> entirely? If you have a managed switch, make sure persistent multicast
>> groups are enabled, or try a different switch entirely.
>>
>> *Something* is interrupting your network traffic. What does
>> iptables-save show? Are these physical or virtual machines?
>>
>> The more information about your environment that you can share, the
>> better we can help.
>>
>> On 23/11/13 22:29, Slava Bendersky wrote:
>>> Hello Digimer,
>>> As an idea, might it be some settings in sysctl.conf?
>>>
>>> Slava.
>>>
>>> ------------------------------------------------------------------------
>>> *From: *"Slava Bendersky" <[email protected]>
>>> *To: *"Digimer" <[email protected]>
>>> *Cc: *[email protected]
>>> *Sent: *Saturday, November 23, 2013 10:27:22 PM
>>> *Subject: *Re: [corosync] information request
>>>
>>> Hello Digimer,
>>> Yes, I set it to passive, and selinux is disabled.
>>>
>>> [root@eusipgw01 ~]# sestatus
>>> SELinux status: disabled
>>> [root@eusipgw01 ~]# cat /etc/corosync/corosync.conf
>>> totem {
>>>         version: 2
>>>         token: 160
>>>         token_retransmits_before_loss_const: 3
>>>         join: 250
>>>         consensus: 300
>>>         vsftype: none
>>>         max_messages: 20
>>>         threads: 0
>>>         nodeid: 2
>>>         rrp_mode: passive
>>>         interface {
>>>                 ringnumber: 0
>>>                 bindnetaddr: 10.10.10.0
>>>                 mcastaddr: 226.94.1.1
>>>                 mcastport: 5405
>>>         }
>>> }
>>>
>>> logging {
>>>         fileline: off
>>>         to_stderr: yes
>>>         to_logfile: yes
>>>         to_syslog: off
>>>         logfile: /var/log/cluster/corosync.log
>>>         debug: off
>>>         timestamp: on
>>>         logger_subsys {
>>>                 subsys: AMF
>>>                 debug: off
>>>         }
>>> }
>>>
>>> Slava.
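Two quick ways to test the vmware-multicast suspicion directly (a sketch; assumes the cluster NIC is eth1, the mcastaddr above, and that the omping package is available on both nodes):

ip maddr show dev eth1          # has the NIC actually joined 226.94.1.1?
# omping exercises both multicast and unicast delivery between nodes;
# run the same command on both 10.10.10.1 and 10.10.10.2 at once:
omping 10.10.10.1 10.10.10.2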
>>>
>>> ------------------------------------------------------------------------
>>> *From: *"Digimer" <[email protected]>
>>> *To: *"Slava Bendersky" <[email protected]>
>>> *Cc: *"Steven Dake" <[email protected]>, [email protected]
>>> *Sent: *Saturday, November 23, 2013 7:04:43 PM
>>> *Subject: *Re: [corosync] information request
>>>
>>> First up, I'm not Steven. Secondly, did you follow Steven's
>>> recommendation to not use active RRP? Does the cluster form with no RRP
>>> at all? Is selinux enabled?
>>>
>>> On 23/11/13 18:29, Slava Bendersky wrote:
>>>> Hello Steven,
>>>> In multicast the log fills with this message:
>>>>
>>>> Nov 24 00:26:28 corosync [TOTEM ] A processor failed, forming new
>>>> configuration.
>>>> Nov 24 00:26:28 corosync [TOTEM ] A processor joined or left the
>>>> membership and a new membership was formed.
>>>> Nov 24 00:26:31 corosync [CPG ] chosen downlist: sender r(0)
>>>> ip(10.10.10.1) ; members(old:2 left:0)
>>>> Nov 24 00:26:31 corosync [MAIN ] Completed service synchronization,
>>>> ready to provide service.
>>>>
>>>> In udpu it is not working at all.
>>>>
>>>> Slava.
>>>>
>>>> ------------------------------------------------------------------------
>>>> *From: *"Digimer" <[email protected]>
>>>> *To: *"Slava Bendersky" <[email protected]>
>>>> *Cc: *"Steven Dake" <[email protected]>, [email protected]
>>>> *Sent: *Saturday, November 23, 2013 6:05:56 PM
>>>> *Subject: *Re: [corosync] information request
>>>>
>>>> So multicast works with the firewall disabled?
>>>>
>>>> On 23/11/13 17:28, Slava Bendersky wrote:
>>>>> Hello Steven,
>>>>> I disabled iptables and there is no difference; the error message is the
>>>>> same, but at least in multicast it wasn't generating the error.
>>>>>
>>>>> Slava.
>>>>>
>>>>> ------------------------------------------------------------------------
>>>>> *From: *"Digimer" <[email protected]>
>>>>> *To: *"Slava Bendersky" <[email protected]>, "Steven Dake" <[email protected]>
>>>>> *Cc: *[email protected]
>>>>> *Sent: *Saturday, November 23, 2013 4:37:36 PM
>>>>> *Subject: *Re: [corosync] information request
>>>>>
>>>>> Does either mcast or unicast work if you disable the firewall? If so,
>>>>> then at least you know for sure that iptables is the problem.
>>>>>
>>>>> The link here shows the iptables rules I use (for corosync in mcast and
>>>>> other apps):
>>>>>
>>>>> https://alteeve.ca/w/AN!Cluster_Tutorial_2#Configuring_iptables
>>>>>
>>>>> digimer
>>>>>
>>>>> On 23/11/13 16:12, Slava Bendersky wrote:
>>>>>> Hello Steven,
>>>>>> This is what I see when set up through UDPU:
>>>>>>
>>>>>> Nov 23 22:08:13 corosync [MAIN ] Compatibility mode set to whitetank.
>>>>>> Using V1 and V2 of the synchronization engine.
>>>>>> Nov 23 22:08:13 corosync [TOTEM ] adding new UDPU member {10.10.10.1}
>>>>>> Nov 23 22:08:16 corosync [MAIN ] Totem is unable to form a cluster
>>>>>> because of an operating system or network fault. The most common cause
>>>>>> of this message is that the local firewall is configured improperly.
>>>>>>
>>>>>> Might I be missing some firewall rules? I allowed unicast.
>>>>>>
>>>>>> Slava.
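The simplest way to rule the firewall out entirely on RHEL 6, per the "turn off the firewall" suggestions above (a minimal sketch; standard commands, nothing thread-specific assumed):

service iptables stop
chkconfig iptables off   # keep it off across reboots while testing
iptables -L -nv          # all chains should be empty with policy ACCEPT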
>>>>>>
>>>>>> ------------------------------------------------------------------------
>>>>>> *From: *"Steven Dake" <[email protected]>
>>>>>> *To: *"Slava Bendersky" <[email protected]>
>>>>>> *Cc: *[email protected]
>>>>>> *Sent: *Saturday, November 23, 2013 10:33:31 AM
>>>>>> *Subject: *Re: [corosync] information request
>>>>>>
>>>>>> On 11/23/2013 08:23 AM, Slava Bendersky wrote:
>>>>>>
>>>>>> Hello Steven,
>>>>>>
>>>>>> My setup:
>>>>>>
>>>>>> 10.10.10.1 primary server ----- EoIP tunnel, vpn ipsec ----- dr server 10.10.10.2
>>>>>>
>>>>>> On both servers there are 2 interfaces: eth0, which is the default gw out,
>>>>>> and eth1, where corosync lives.
>>>>>>
>>>>>> Iptables:
>>>>>>
>>>>>> -A INPUT -i eth1 -p udp -m state --state NEW -m udp --dport 5404:5407
>>>>>> -A INPUT -i eth1 -m pkttype --pkt-type multicast
>>>>>> -A INPUT -i eth1 -p igmp
>>>>>>
>>>>>> Corosync.conf:
>>>>>>
>>>>>> totem {
>>>>>>         version: 2
>>>>>>         token: 160
>>>>>>         token_retransmits_before_loss_const: 3
>>>>>>         join: 250
>>>>>>         consensus: 300
>>>>>>         vsftype: none
>>>>>>         max_messages: 20
>>>>>>         threads: 0
>>>>>>         nodeid: 2
>>>>>>         rrp_mode: active
>>>>>>         interface {
>>>>>>                 ringnumber: 0
>>>>>>                 bindnetaddr: 10.10.10.0
>>>>>>                 mcastaddr: 226.94.1.1
>>>>>>                 mcastport: 5405
>>>>>>         }
>>>>>> }
>>>>>>
>>>>>> Join message:
>>>>>>
>>>>>> [root@eusipgw01 ~]# corosync-objctl | grep member
>>>>>> runtime.totem.pg.mrp.srp.members.2.ip=r(0) ip(10.10.10.2)
>>>>>> runtime.totem.pg.mrp.srp.members.2.join_count=1
>>>>>> runtime.totem.pg.mrp.srp.members.2.status=joined
>>>>>> runtime.totem.pg.mrp.srp.members.1.ip=r(0) ip(10.10.10.1)
>>>>>> runtime.totem.pg.mrp.srp.members.1.join_count=254
>>>>>> runtime.totem.pg.mrp.srp.members.1.status=joined
>>>>>>
>>>>>> Is it possible that the ping is sent out of the wrong interface?
>>>>>>
>>>>>> Slava,
>>>>>>
>>>>>> I wouldn't expect so.
>>>>>>
>>>>>> Which version?
>>>>>>
>>>>>> Have you tried udpu instead? If not, it is preferable to multicast
>>>>>> unless you want absolute performance on cpg groups. In most cases the
>>>>>> performance difference is very small and not worth the trouble of
>>>>>> setting up multicast in your network.
>>>>>>
>>>>>> Fabio had indicated rrp active mode is broken. I don't know the
>>>>>> details, but try passive RRP - it is actually better than active IMNSHO :)
>>>>>>
>>>>>> Regards
>>>>>> -steve
>>>>>>
>>>>>> Slava.
>>>>>>
>>>>>> ------------------------------------------------------------------------
>>>>>> *From: *"Steven Dake" <[email protected]>
>>>>>> *To: *"Slava Bendersky" <[email protected]>, [email protected]
>>>>>> *Sent: *Saturday, November 23, 2013 6:01:11 AM
>>>>>> *Subject: *Re: [corosync] information request
>>>>>>
>>>>>> On 11/23/2013 12:29 AM, Slava Bendersky wrote:
>>>>>>
>>>>>> Hello Everyone,
>>>>>> Corosync runs on a box with 2 Ethernet interfaces.
>>>>>> I am getting this message:
>>>>>> CPG mcast failed (6)
>>>>>>
>>>>>> Any information, thank you in advance.
>>>>>>
>>>>>> https://github.com/corosync/corosync/blob/master/include/corosync/corotypes.h#L84
>>>>>>
>>>>>> This can occur because:
>>>>>> a) firewall is enabled - there should be something in the logs
>>>>>> telling you to properly configure the firewall
>>>>>> b) a config change is in progress - this is a normal response, and
>>>>>> you should try the request again
>>>>>> c) a bug in the synchronization code is resulting in a blocked
>>>>>> unsynced cluster
>>>>>>
>>>>>> c is very unlikely at this point.
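In the corotypes.h linked above, error 6 is CS_ERR_TRY_AGAIN, which is case (b). A minimal sketch of how a CPG client would typically cope with it, by retrying with a short back-off (a hypothetical helper for illustration, not the actual res_corosync.c code; handle setup is assumed done elsewhere):

/* Retry a CPG multicast while corosync reports CS_ERR_TRY_AGAIN (6),
 * which it returns while a configuration change is in progress. */
#include <unistd.h>
#include <sys/uio.h>
#include <corosync/corotypes.h>
#include <corosync/cpg.h>

static cs_error_t mcast_with_retry(cpg_handle_t handle,
                                   const struct iovec *iov,
                                   unsigned int iov_len)
{
        cs_error_t err;
        int retries = 10;

        do {
                err = cpg_mcast_joined(handle, CPG_TYPE_AGREED, iov, iov_len);
                if (err != CS_ERR_TRY_AGAIN)
                        break;          /* definitive success or failure */
                usleep(100000);         /* wait for sync to complete */
        } while (--retries > 0);

        return err;
}

If it keeps returning 6 indefinitely, as in this thread, the cluster never finishes synchronizing, which points back at cases (a) or (c).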
>>>>>>
>>>>>> 2 ethernet interfaces = rrp mode, bonding, or something else?
>>>>>>
>>>>>> Digimer needs moar infos :)
>>>>>>
>>>>>> Regards
>>>>>> -steve
_______________________________________________
discuss mailing list
[email protected]
http://lists.corosync.org/mailman/listinfo/discuss
