Re: [Pacemaker] CoroSync's UDPu transport for public IP addresses?
Yes, now I have a clear experiment. Sorry, I misinformed you about "adding new UDPU member": when I use DNS names in ringX_addr, I don't see such messages (for now). But anyway, DNS names in ringX_addr do not seem to work, and there are no relevant messages in the default logs. Maybe add some validation for ringX_addr?

I have resolvable DNS names:

root@node1:/etc/corosync# ping -c1 -W100 node1 | grep from
64 bytes from node1 (127.0.1.1): icmp_seq=1 ttl=64 time=0.039 ms
root@node1:/etc/corosync# ping -c1 -W100 node2 | grep from
64 bytes from node2 (188.166.54.190): icmp_seq=1 ttl=55 time=88.3 ms
root@node1:/etc/corosync# ping -c1 -W100 node3 | grep from
64 bytes from node3 (128.199.116.218): icmp_seq=1 ttl=51 time=252 ms

With the corosync.conf below, nothing works:

...
nodelist {
  node {
    ring0_addr: node1
  }
  node {
    ring0_addr: node2
  }
  node {
    ring0_addr: node3
  }
}
...

Jan 14 10:47:44 node1 corosync[15061]: [MAIN ] Corosync Cluster Engine ('2.3.3'): started and ready to provide service.
Jan 14 10:47:44 node1 corosync[15061]: [MAIN ] Corosync built-in features: dbus testagents rdma watchdog augeas pie relro bindnow
Jan 14 10:47:44 node1 corosync[15062]: [TOTEM ] Initializing transport (UDP/IP Unicast).
Jan 14 10:47:44 node1 corosync[15062]: [TOTEM ] Initializing transmit/receive security (NSS) crypto: aes256 hash: sha1
Jan 14 10:47:44 node1 corosync[15062]: [TOTEM ] The network interface [a.b.c.d] is now up.
Jan 14 10:47:44 node1 corosync[15062]: [SERV ] Service engine loaded: corosync configuration map access [0]
Jan 14 10:47:44 node1 corosync[15062]: [QB] server name: cmap
Jan 14 10:47:44 node1 corosync[15062]: [SERV ] Service engine loaded: corosync configuration service [1]
Jan 14 10:47:44 node1 corosync[15062]: [QB] server name: cfg
Jan 14 10:47:44 node1 corosync[15062]: [SERV ] Service engine loaded: corosync cluster closed process group service v1.01 [2]
Jan 14 10:47:44 node1 corosync[15062]: [QB] server name: cpg
Jan 14 10:47:44 node1 corosync[15062]: [SERV ] Service engine loaded: corosync profile loading service [4]
Jan 14 10:47:44 node1 corosync[15062]: [WD] No Watchdog, try modprobe a watchdog
Jan 14 10:47:44 node1 corosync[15062]: [WD] no resources configured.
Jan 14 10:47:44 node1 corosync[15062]: [SERV ] Service engine loaded: corosync watchdog service [7]
Jan 14 10:47:44 node1 corosync[15062]: [QUORUM] Using quorum provider corosync_votequorum
Jan 14 10:47:44 node1 corosync[15062]: [QUORUM] Quorum provider: corosync_votequorum failed to initialize.
Jan 14 10:47:44 node1 corosync[15062]: [SERV ] Service engine 'corosync_quorum' failed to load for reason 'configuration error: nodelist or quorum.expected_votes must be configured!'
Jan 14 10:47:44 node1 corosync[15062]: [MAIN ] Corosync Cluster Engine exiting with status 20 at service.c:356.

But with IP addresses specified in ringX_addr, everything works:

...
nodelist {
  node {
    ring0_addr: 104.236.71.79
  }
  node {
    ring0_addr: 188.166.54.190
  }
  node {
    ring0_addr: 128.199.116.218
  }
}
...

Jan 14 10:48:28 node1 corosync[15155]: [MAIN ] Corosync Cluster Engine ('2.3.3'): started and ready to provide service.
Jan 14 10:48:28 node1 corosync[15155]: [MAIN ] Corosync built-in features: dbus testagents rdma watchdog augeas pie relro bindnow
Jan 14 10:48:28 node1 corosync[15156]: [TOTEM ] Initializing transport (UDP/IP Unicast).
Jan 14 10:48:28 node1 corosync[15156]: [TOTEM ] Initializing transmit/receive security (NSS) crypto: aes256 hash: sha1
Jan 14 10:48:28 node1 corosync[15156]: [TOTEM ] The network interface [a.b.c.d] is now up.
Jan 14 10:48:28 node1 corosync[15156]: [SERV ] Service engine loaded: corosync configuration map access [0]
Jan 14 10:48:28 node1 corosync[15156]: [QB] server name: cmap
Jan 14 10:48:28 node1 corosync[15156]: [SERV ] Service engine loaded: corosync configuration service [1]
Jan 14 10:48:28 node1 corosync[15156]: [QB] server name: cfg
Jan 14 10:48:28 node1 corosync[15156]: [SERV ] Service engine loaded: corosync cluster closed process group service v1.01 [2]
Jan 14 10:48:28 node1 corosync[15156]: [QB] server name: cpg
Jan 14 10:48:28 node1 corosync[15156]: [SERV ] Service engine loaded: corosync profile loading service [4]
Jan 14 10:48:28 node1 corosync[15156]: [WD] No Watchdog, try modprobe a watchdog
Jan 14 10:48:28 node1 corosync[15156]: [WD] no resources configured.
Jan 14 10:48:28 node1 corosync[15156]: [SERV ] Service engine loaded: corosync watchdog service [7]
Jan 14 10:48:28 node1 corosync[15156]: [QUORUM] Using quorum provider corosync_votequorum
Jan 14 10:48:28 node1 corosync[15156]: [SERV ] Service engine loaded: corosync vote quorum service v1.0 [5]
Jan 14 10:48:28 node1 corosync[15156]: [QB] server name: votequorum
Jan 14 10:48:28 node1 corosync[15156]: [SERV ] Service engine loaded: corosync cluster quorum service v0.1 [3]
Jan 14 10:48:28 node1 corosync[15156]: [QB]
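The message above asks for some validation of ringX_addr. As a rough illustration only (this is not corosync's own parser, and the helper names are made up for this sketch), the following Python snippet pulls the ringX_addr values out of a corosync.conf-style text and reports which ones are hostnames rather than IP literals, that is, which entries depend on how the local resolver answers.

```python
import ipaddress
import re

# Matches entries such as "ring0_addr: node1" or "ring1_addr: 10.0.0.1".
# Hypothetical helper pattern, not taken from corosync itself.
RING_ADDR = re.compile(r"\bring(\d+)_addr\s*:\s*([^\s}]+)")


def ring_addresses(conf_text):
    """Return (ring_number, value) pairs for every ringX_addr entry."""
    return [(int(n), v) for n, v in RING_ADDR.findall(conf_text)]


def is_ip_literal(value):
    """True if value parses as an IPv4 or IPv6 address, False for a name."""
    try:
        ipaddress.ip_address(value)
        return True
    except ValueError:
        return False


def hostnames_in_nodelist(conf_text):
    """List ringX_addr values that are names and so depend on the resolver."""
    return [v for _, v in ring_addresses(conf_text) if not is_ip_literal(v)]


# Sample input modelled on the nodelist from the message above.
SAMPLE = """
nodelist {
  node { ring0_addr: node1 }
  node { ring0_addr: 188.166.54.190 }
}
"""

if __name__ == "__main__":
    print(hostnames_in_nodelist(SAMPLE))
```

Entries flagged by such a check are not necessarily wrong; they are simply the ones worth testing with the resolver on every node.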
Re: [Pacemaker] CoroSync's UDPu transport for public IP addresses?
Dmitry,

> Yes, now I have the clear experiment. Sorry, I misinformed you about adding new UDPU member - when I use DNS names in ringX_addr, I don't see

This is good to know.

> such messages (for now). But, anyway, DNS names in ringX_addr seem not working, and no relevant messages are in default logs. Maybe add some validations for ringX_addr?
>
> I'm having resolvable DNS names:
>
> root@node1:/etc/corosync# ping -c1 -W100 node1 | grep from
> 64 bytes from node1 (127.0.1.1): icmp_seq=1 ttl=64 time=0.039 ms

This is the problem. Resolving node1 to a loopback address (127.0.1.1 here) is simply wrong. The names you want to use in corosync.conf should resolve to the interface address. I believe the other nodes have a similar setting (so node2, resolved on node2, is again a loopback address).

Please try to fix this problem first, and let's see if this solves the issue you are hitting.

Regards,
Honza

> root@node1:/etc/corosync# ping -c1 -W100 node2 | grep from
> 64 bytes from node2 (188.166.54.190): icmp_seq=1 ttl=55 time=88.3 ms
> root@node1:/etc/corosync# ping -c1 -W100 node3 | grep from
> 64 bytes from node3 (128.199.116.218): icmp_seq=1 ttl=51 time=252 ms
>
> With corosync.conf below, nothing works:
>
> ...
> nodelist {
>   node {
>     ring0_addr: node1
>   }
>   node {
>     ring0_addr: node2
>   }
>   node {
>     ring0_addr: node3
>   }
> }
> ...
>
> Jan 14 10:47:44 node1 corosync[15061]: [MAIN ] Corosync Cluster Engine ('2.3.3'): started and ready to provide service.
> Jan 14 10:47:44 node1 corosync[15061]: [MAIN ] Corosync built-in features: dbus testagents rdma watchdog augeas pie relro bindnow
> Jan 14 10:47:44 node1 corosync[15062]: [TOTEM ] Initializing transport (UDP/IP Unicast).
> Jan 14 10:47:44 node1 corosync[15062]: [TOTEM ] Initializing transmit/receive security (NSS) crypto: aes256 hash: sha1
> Jan 14 10:47:44 node1 corosync[15062]: [TOTEM ] The network interface [a.b.c.d] is now up.
> Jan 14 10:47:44 node1 corosync[15062]: [SERV ] Service engine loaded: corosync configuration map access [0]
> Jan 14 10:47:44 node1 corosync[15062]: [QB] server name: cmap
> Jan 14 10:47:44 node1 corosync[15062]: [SERV ] Service engine loaded: corosync configuration service [1]
> Jan 14 10:47:44 node1 corosync[15062]: [QB] server name: cfg
> Jan 14 10:47:44 node1 corosync[15062]: [SERV ] Service engine loaded: corosync cluster closed process group service v1.01 [2]
> Jan 14 10:47:44 node1 corosync[15062]: [QB] server name: cpg
> Jan 14 10:47:44 node1 corosync[15062]: [SERV ] Service engine loaded: corosync profile loading service [4]
> Jan 14 10:47:44 node1 corosync[15062]: [WD] No Watchdog, try modprobe a watchdog
> Jan 14 10:47:44 node1 corosync[15062]: [WD] no resources configured.
> Jan 14 10:47:44 node1 corosync[15062]: [SERV ] Service engine loaded: corosync watchdog service [7]
> Jan 14 10:47:44 node1 corosync[15062]: [QUORUM] Using quorum provider corosync_votequorum
> Jan 14 10:47:44 node1 corosync[15062]: [QUORUM] Quorum provider: corosync_votequorum failed to initialize.
> Jan 14 10:47:44 node1 corosync[15062]: [SERV ] Service engine 'corosync_quorum' failed to load for reason 'configuration error: nodelist or quorum.expected_votes must be configured!'
> Jan 14 10:47:44 node1 corosync[15062]: [MAIN ] Corosync Cluster Engine exiting with status 20 at service.c:356.
>
> But with IP addresses specified in ringX_addr, everything works:
>
> ...
> nodelist {
>   node {
>     ring0_addr: 104.236.71.79
>   }
>   node {
>     ring0_addr: 188.166.54.190
>   }
>   node {
>     ring0_addr: 128.199.116.218
>   }
> }
> ...
>
> Jan 14 10:48:28 node1 corosync[15155]: [MAIN ] Corosync Cluster Engine ('2.3.3'): started and ready to provide service.
> Jan 14 10:48:28 node1 corosync[15155]: [MAIN ] Corosync built-in features: dbus testagents rdma watchdog augeas pie relro bindnow
> Jan 14 10:48:28 node1 corosync[15156]: [TOTEM ] Initializing transport (UDP/IP Unicast).
> Jan 14 10:48:28 node1 corosync[15156]: [TOTEM ] Initializing transmit/receive security (NSS) crypto: aes256 hash: sha1
> Jan 14 10:48:28 node1 corosync[15156]: [TOTEM ] The network interface [a.b.c.d] is now up.
> Jan 14 10:48:28 node1 corosync[15156]: [SERV ] Service engine loaded: corosync configuration map access [0]
> Jan 14 10:48:28 node1 corosync[15156]: [QB] server name: cmap
> Jan 14 10:48:28 node1 corosync[15156]: [SERV ] Service engine loaded: corosync configuration service [1]
> Jan 14 10:48:28 node1 corosync[15156]: [QB] server name: cfg
> Jan 14 10:48:28 node1 corosync[15156]: [SERV ] Service engine loaded: corosync cluster closed process group service v1.01 [2]
> Jan 14 10:48:28 node1 corosync[15156]: [QB] server name: cpg
> Jan 14 10:48:28 node1 corosync[15156]: [SERV ] Service engine loaded: corosync profile loading service [4]
> Jan 14 10:48:28 node1 corosync[15156]: [WD] No Watchdog, try modprobe a watchdog
> Jan 14 10:48:28 node1 corosync[15156]: [WD] no resources configured.
> Jan 14 10:48:28 node1 corosync[15156]: [SERV
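The diagnosis in this reply (a node name that resolves to a loopback address on the node itself) can be checked mechanically. Below is a minimal Python sketch, assuming the names from ringX_addr are already collected in a list. The function names are hypothetical, and the check uses only the standard `socket` and `ipaddress` modules, so it reflects what the system resolver returns on the machine where it runs and has to be repeated on every node.

```python
import ipaddress
import socket


def loopback_addresses(addresses):
    """Return the subset of addresses that fall in a loopback range."""
    result = []
    for addr in addresses:
        # Drop an IPv6 scope suffix such as "%eth0" before parsing.
        if ipaddress.ip_address(addr.split("%")[0]).is_loopback:
            result.append(addr)
    return result


def resolve(name):
    """Resolve name through the system resolver (/etc/hosts, DNS, ...)."""
    infos = socket.getaddrinfo(name, None)
    # Each entry is (family, type, proto, canonname, sockaddr);
    # sockaddr[0] is the textual IP address.
    return sorted({info[4][0] for info in infos})


def check_ring_names(names, resolver=resolve):
    """Map each name to the loopback addresses it resolves to, if any."""
    problems = {}
    for name in names:
        try:
            bad = loopback_addresses(resolver(name))
        except socket.gaierror:
            continue  # unresolvable names are a separate problem
        if bad:
            problems[name] = bad
    return problems


if __name__ == "__main__":
    # Names as used in the thread; results depend on the local resolver.
    print(check_ring_names(["node1", "node2", "node3"]))
```

On a host set up like node1 in the thread, one would expect node1 to be reported with 127.0.1.1, while node2 and node3 would not be reported.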
Re: [Pacemaker] CoroSync's UDPu transport for public IP addresses?
Dmitry,

> Sure, in logs I see adding new UDPU member {IP_ADDRESS} (so DNS names are definitely resolved), but in practice the cluster does not work, as I said above. So validations of ringX_addr in corosync.conf would be very helpful in corosync.

That's weird, because once a DNS name is resolved, corosync works only with the IP address. This means the code path is exactly the same with an IP address or with a DNS name. Do you have logs from corosync?

Honza

> On Fri, Jan 2, 2015 at 2:49 PM, Jan Friesse jfrie...@redhat.com wrote:
>
> > Dmitry,
> >
> > > No, I meant that if you pass a domain name in ring0_addr, there are no errors in logs, corosync even seems to find nodes (based on its logs), And crm_node -l shows them, but in practice nothing really works. A verbose error message would be very helpful in such case.
> >
> > This sounds weird. Are you sure that DNS names really maps to correct IP address? In logs there should be something like adding new UDPU member {IP_ADDRESS}.
> >
> > Regards,
> > Honza
> >
> > > On Tuesday, December 30, 2014, Daniel Dehennin daniel.dehen...@baby-gnu.org wrote:
> > >
> > > > Dmitry Koterov dmitry.kote...@gmail.com writes:
> > > >
> > > > > Oh, seems I've found the solution! At least two mistakes was in my corosync.conf (BTW logs did not say about any errors, so my conclusion is based on my experiments only).
> > > > >
> > > > > 1. nodelist.node MUST contain only IP addresses. No hostnames! They simply do not work, crm status shows no nodes. And no warnings are in logs regarding this.
> > > >
> > > > You can add name like this:
> > > >
> > > > nodelist {
> > > >   node {
> > > >     ring0_addr: public-ip-address-of-the-first-machine
> > > >     name: node1
> > > >   }
> > > >   node {
> > > >     ring0_addr: public-ip-address-of-the-second-machine
> > > >     name: node2
> > > >   }
> > > > }
> > > >
> > > > I used it on Ubuntu Trusty with udpu.
> > > >
> > > > Regards.
> > > > --
> > > > Daniel Dehennin
> > > > Récupérer ma clef GPG: gpg --recv-keys 0xCC1E9E5B7A6FE2DF
> > > > Fingerprint: 3E69 014E 5C23 50E8 9ED6 2AAD CC1E 9E5B 7A6F E2DF

___
Pacemaker mailing list: Pacemaker@oss.clusterlabs.org
http://oss.clusterlabs.org/mailman/listinfo/pacemaker
Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org
Re: [Pacemaker] CoroSync's UDPu transport for public IP addresses?
Sure, in the logs I see "adding new UDPU member {IP_ADDRESS}" (so DNS names are definitely resolved), but in practice the cluster does not work, as I said above. So validation of ringX_addr in corosync.conf would be very helpful.

On Fri, Jan 2, 2015 at 2:49 PM, Jan Friesse jfrie...@redhat.com wrote:

> Dmitry,
>
> > No, I meant that if you pass a domain name in ring0_addr, there are no errors in logs, corosync even seems to find nodes (based on its logs), And crm_node -l shows them, but in practice nothing really works. A verbose error message would be very helpful in such case.
>
> This sounds weird. Are you sure that DNS names really maps to correct IP address? In logs there should be something like adding new UDPU member {IP_ADDRESS}.
>
> Regards,
> Honza
>
> > On Tuesday, December 30, 2014, Daniel Dehennin daniel.dehen...@baby-gnu.org wrote:
> >
> > > Dmitry Koterov dmitry.kote...@gmail.com writes:
> > >
> > > > Oh, seems I've found the solution! At least two mistakes was in my corosync.conf (BTW logs did not say about any errors, so my conclusion is based on my experiments only).
> > > >
> > > > 1. nodelist.node MUST contain only IP addresses. No hostnames! They simply do not work, crm status shows no nodes. And no warnings are in logs regarding this.
> > >
> > > You can add name like this:
> > >
> > > nodelist {
> > >   node {
> > >     ring0_addr: public-ip-address-of-the-first-machine
> > >     name: node1
> > >   }
> > >   node {
> > >     ring0_addr: public-ip-address-of-the-second-machine
> > >     name: node2
> > >   }
> > > }
> > >
> > > I used it on Ubuntu Trusty with udpu.
> > >
> > > Regards.
> > > --
> > > Daniel Dehennin
> > > Récupérer ma clef GPG: gpg --recv-keys 0xCC1E9E5B7A6FE2DF
> > > Fingerprint: 3E69 014E 5C23 50E8 9ED6 2AAD CC1E 9E5B 7A6F E2DF
Re: [Pacemaker] CoroSync's UDPu transport for public IP addresses?
Dmitry,

> No, I meant that if you pass a domain name in ring0_addr, there are no errors in logs, corosync even seems to find nodes (based on its logs), And crm_node -l shows them, but in practice nothing really works. A verbose error message would be very helpful in such case.

This sounds weird. Are you sure that the DNS names really map to the correct IP addresses? In the logs there should be something like "adding new UDPU member {IP_ADDRESS}".

Regards,
Honza

> On Tuesday, December 30, 2014, Daniel Dehennin daniel.dehen...@baby-gnu.org wrote:
>
> > Dmitry Koterov dmitry.kote...@gmail.com writes:
> >
> > > Oh, seems I've found the solution! At least two mistakes was in my corosync.conf (BTW logs did not say about any errors, so my conclusion is based on my experiments only).
> > >
> > > 1. nodelist.node MUST contain only IP addresses. No hostnames! They simply do not work, crm status shows no nodes. And no warnings are in logs regarding this.
> >
> > You can add name like this:
> >
> > nodelist {
> >   node {
> >     ring0_addr: public-ip-address-of-the-first-machine
> >     name: node1
> >   }
> >   node {
> >     ring0_addr: public-ip-address-of-the-second-machine
> >     name: node2
> >   }
> > }
> >
> > I used it on Ubuntu Trusty with udpu.
> >
> > Regards.
> > --
> > Daniel Dehennin
> > Récupérer ma clef GPG: gpg --recv-keys 0xCC1E9E5B7A6FE2DF
> > Fingerprint: 3E69 014E 5C23 50E8 9ED6 2AAD CC1E 9E5B 7A6F E2DF
Re: [Pacemaker] CoroSync's UDPu transport for public IP addresses?
No, I meant that if you pass a domain name in ring0_addr, there are no errors in the logs, corosync even seems to find the nodes (based on its logs), and crm_node -l shows them, but in practice nothing really works. A verbose error message would be very helpful in such a case.

On Tuesday, December 30, 2014, Daniel Dehennin daniel.dehen...@baby-gnu.org wrote:

> Dmitry Koterov dmitry.kote...@gmail.com writes:
>
> > Oh, seems I've found the solution! At least two mistakes was in my corosync.conf (BTW logs did not say about any errors, so my conclusion is based on my experiments only).
> >
> > 1. nodelist.node MUST contain only IP addresses. No hostnames! They simply do not work, crm status shows no nodes. And no warnings are in logs regarding this.
>
> You can add name like this:
>
> nodelist {
>   node {
>     ring0_addr: public-ip-address-of-the-first-machine
>     name: node1
>   }
>   node {
>     ring0_addr: public-ip-address-of-the-second-machine
>     name: node2
>   }
> }
>
> I used it on Ubuntu Trusty with udpu.
>
> Regards.
> --
> Daniel Dehennin
> Récupérer ma clef GPG: gpg --recv-keys 0xCC1E9E5B7A6FE2DF
> Fingerprint: 3E69 014E 5C23 50E8 9ED6 2AAD CC1E 9E5B 7A6F E2DF
Re: [Pacemaker] CoroSync's UDPu transport for public IP addresses?
Dmitry Koterov dmitry.kote...@gmail.com writes:

> Oh, seems I've found the solution! At least two mistakes was in my corosync.conf (BTW logs did not say about any errors, so my conclusion is based on my experiments only).
>
> 1. nodelist.node MUST contain only IP addresses. No hostnames! They simply do not work, crm status shows no nodes. And no warnings are in logs regarding this.

You can add name like this:

nodelist {
  node {
    ring0_addr: public-ip-address-of-the-first-machine
    name: node1
  }
  node {
    ring0_addr: public-ip-address-of-the-second-machine
    name: node2
  }
}

I used it on Ubuntu Trusty with udpu.

Regards.
--
Daniel Dehennin
Récupérer ma clef GPG: gpg --recv-keys 0xCC1E9E5B7A6FE2DF
Fingerprint: 3E69 014E 5C23 50E8 9ED6 2AAD CC1E 9E5B 7A6F E2DF
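The layout suggested in this message (an IP literal in ring0_addr plus a separate name entry) is easy to generate. Here is a small Python sketch under that assumption; `render_nodelist` is a hypothetical helper written for this illustration, not part of corosync or pacemaker, and it only produces the nodelist block, not a complete corosync.conf.

```python
import ipaddress


def render_nodelist(nodes):
    """Render a nodelist block from (ip_address, name) pairs."""
    lines = ["nodelist {"]
    for address, name in nodes:
        # Raises ValueError for anything that is not an IP literal,
        # so a hostname cannot end up in ring0_addr by accident.
        ipaddress.ip_address(address)
        lines += [
            "    node {",
            f"        ring0_addr: {address}",
            f"        name: {name}",
            "    }",
        ]
    lines.append("}")
    return "\n".join(lines)


if __name__ == "__main__":
    # Addresses taken from the thread; the pairing with names is illustrative.
    print(render_nodelist([
        ("104.236.71.79", "node1"),
        ("188.166.54.190", "node2"),
        ("128.199.116.218", "node3"),
    ]))
```

Rejecting names outright is a deliberate simplification for the sketch; the later messages in the thread indicate that names can work when they resolve to the interface address.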
Re: [Pacemaker] CoroSync's UDPu transport for public IP addresses?
On Mon, Dec 29, 2014 at 1:50 PM, Dejan Muhamedagic deja...@fastmail.fm wrote:

> Hi,
>
> On Mon, Dec 29, 2014 at 06:11:49AM +0300, Dmitry Koterov wrote:
>
> > Hello.
> >
> > I have a geographically distributed cluster, all machines have public IP addresses. No virtual IP subnet exists, so no multicast is available.
> >
> > I thought that UDPu transport can work in such environment, doesn't it?
> >
> > To test everything in advance, I've set up a corosync+pacemaker on Ubuntu 14.04 with the following corosync.conf:
> >
> > totem {
> >   transport: udpu
> >   interface {
> >     ringnumber: 0
> >     bindnetaddr: ip-address-of-the-current-machine
> >     mcastport: 5405
> >   }
>
> You need to add the member directives too. See corosync.conf(5).

Aren't member directives for corosync 1.x, and nodelist directives for corosync 2.x?

Dmitry, which version do you have?

> Thanks,
>
> Dejan
>
> > }
> > nodelist {
> >   node {
> >     ring0_addr: node1
> >   }
> >   node {
> >     ring0_addr: node2
> >   }
> > }
> > ...
> >
> > (here node1 and node2 are hostnames from /etc/hosts on both machines). After running service corosync start; service pacemaker start logs show no problems, but actually both nodes are always offline:
> >
> > root@node1:/etc/corosync# crm status | grep node
> > OFFLINE: [ node1 node2 ]
> >
> > and crm node online (as all other attempts to make crm to do something) are timed out with communication error.
> >
> > No iptables, selinux, apparmor and other bullshit are active: just pure virtual machines with single public IP addresses on each. Also tcpdump shows that UDP packets on port 5405 are going in and out, and if I e.g. stop corosync at node1, the tcpdump output at node2 changes significantly. So they see each other definitely.
> >
> > And if I attach a gvpe adapter to these 2 machines with a private subnet and switch transport to the default one, corosync + pacemaker begin to work.
> >
> > So my question is: what am I doing wrong? Maybe UDPu is not suitable for communications among machines with public IP addresses only?
Re: [Pacemaker] CoroSync's UDPu transport for public IP addresses?
Hi,

On Mon, Dec 29, 2014 at 03:47:16PM +0300, Andrei Borzenkov wrote:

> On Mon, Dec 29, 2014 at 1:50 PM, Dejan Muhamedagic deja...@fastmail.fm wrote:
>
> > Hi,
> >
> > On Mon, Dec 29, 2014 at 06:11:49AM +0300, Dmitry Koterov wrote:
> >
> > > Hello.
> > >
> > > I have a geographically distributed cluster, all machines have public IP addresses. No virtual IP subnet exists, so no multicast is available.
> > >
> > > I thought that UDPu transport can work in such environment, doesn't it?
> > >
> > > To test everything in advance, I've set up a corosync+pacemaker on Ubuntu 14.04 with the following corosync.conf:
> > >
> > > totem {
> > >   transport: udpu
> > >   interface {
> > >     ringnumber: 0
> > >     bindnetaddr: ip-address-of-the-current-machine
> > >     mcastport: 5405
> > >   }
> >
> > You need to add the member directives too. See corosync.conf(5).
>
> Are not member directives for corosync 1.x and nodelist directives for corosync 2.x?

Yes, that's right. Looks like my memory's still on 1.x.

Thanks,

Dejan

> Dmitry, which version do you have?
>
> > Thanks,
> >
> > Dejan
> >
> > > }
> > > nodelist {
> > >   node {
> > >     ring0_addr: node1
> > >   }
> > >   node {
> > >     ring0_addr: node2
> > >   }
> > > }
> > > ...
> > >
> > > (here node1 and node2 are hostnames from /etc/hosts on both machines). After running service corosync start; service pacemaker start logs show no problems, but actually both nodes are always offline:
> > >
> > > root@node1:/etc/corosync# crm status | grep node
> > > OFFLINE: [ node1 node2 ]
> > >
> > > and crm node online (as all other attempts to make crm to do something) are timed out with communication error.
> > >
> > > No iptables, selinux, apparmor and other bullshit are active: just pure virtual machines with single public IP addresses on each. Also tcpdump shows that UDP packets on port 5405 are going in and out, and if I e.g. stop corosync at node1, the tcpdump output at node2 changes significantly. So they see each other definitely.
> > >
> > > And if I attach a gvpe adapter to these 2 machines with a private subnet and switch transport to the default one, corosync + pacemaker begin to work.
> > >
> > > So my question is: what am I doing wrong? Maybe UDPu is not suitable for communications among machines with public IP addresses only?