So, I set transport="udpu" in the cluster.conf file, and it now looks like this:
<cluster config_version="11" name="pgdb_cluster" transport="udpu">
  <fence_daemon/>
  <clusternodes>
    <clusternode name="csgha1" nodeid="1">
      <fence>
        <method name="pcmk-redirect">
          <device name="pcmk" port="csgha1"/>
        </method>
      </fence>
    </clusternode>
    <clusternode name="csgha2" nodeid="2">
      <fence>
        <method name="pcmk-redirect">
          <device name="pcmk" port="csgha2"/>
        </method>
      </fence>
    </clusternode>
    <clusternode name="csgha3" nodeid="3">
      <fence>
        <method name="pcmk-redirect">
          <device name="pcmk" port="csgha3"/>
        </method>
      </fence>
    </clusternode>
  </clusternodes>
  <cman/>
  <fencedevices>
    <fencedevice agent="fence_pcmk" name="pcmk"/>
  </fencedevices>
  <rm>
    <failoverdomains/>
    <resources/>
  </rm>
</cluster>

But, after restarting the cluster I don't see any difference. Did I do something wrong?
--
Jay

On Tue, Oct 21, 2014 at 12:25 PM, Digimer <li...@alteeve.ca> wrote:

> No, you don't need to specify anything in cluster.conf for unicast to
> work. Corosync will divine the IPs by resolving the node names to IPs. If
> you set multicast and don't want to use the auto-selected mcast IP, then
> you can specify the mcast IP group to use via <multicast... />.
>
> digimer
>
>
> On 21/10/14 12:22 PM, John Scalia wrote:
>
>> OK, looking at the cman man page on this system, I see the line saying
>> "the corosync.conf file is not used." So, I'm guessing I need to set a
>> unicast address somewhere in the cluster.conf file, but the man page
>> only mentions the <multicast addr="..."/> parameter. What can I use to
>> set this to a unicast address for ports 5404 and 5405? I'm assuming I
>> can't just put a unicast address for the multicast parameter, and the
>> man page for cluster.conf wasn't much help either.
>>
>> We're still working on having the security team permit these 3 systems
>> to use multicast.
>>
>> On 10/21/2014 11:51 AM, Digimer wrote:
>>
>>> Keep us posted.
:)
>>>
>>> On 21/10/14 08:40 AM, John Scalia wrote:
>>>
>>>> I've been checking hostname resolution this morning, and all the systems
>>>> are listed in each /etc/hosts file (no DNS in this environment), and
>>>> ping works on every system both to itself and all the other systems. At
>>>> least it's working on the 10.10.1.0/24 network.
>>>>
>>>> I ran tcpdump trying to see what traffic is on port 5405 on each system,
>>>> and I'm only seeing outbound on each, even though netstat shows each is
>>>> listening on the multicast address. My suspicion is that the router is
>>>> eating the multicast broadcasts, so I may try the unicast address
>>>> instead, but I'm waiting on one of our network engineers to see if my
>>>> suspicion is correct about the router. He volunteered to help late
>>>> yesterday.
>>>>
>>>> On 10/20/2014 4:34 PM, Digimer wrote:
>>>>
>>>>> It looks sane on the surface. The 'gethostip' tool comes from the
>>>>> 'syslinux' package, and it's really handy! The '-d' says to give the
>>>>> IP in dotted-decimal notation only.
>>>>>
>>>>> What I was trying to see was whether the 'uname -n' resolved to the IP
>>>>> on the same network card as the other nodes. This is how corosync
>>>>> decides which interface to send cluster traffic onto. I suspect you
>>>>> might have a general network issue, possibly related to multicast.
>>>>> (Some switches and some hypervisor virtual networks don't play nice
>>>>> with corosync).
>>>>>
>>>>> Have you tried unicast? If not, try setting the <cman ../> element to
>>>>> have the <cman transport="udpu" ... /> attribute. Do note that unicast
>>>>> isn't as efficient as multicast, so though it might work, I'd
>>>>> personally treat it as a debug tool to isolate the source of the
>>>>> problem.
>>>>>
>>>>> cheers
>>>>>
>>>>> digimer
>>>>>
>>>>> PS - Can you share your pacemaker configuration?
>>>>>
>>>>> On 20/10/14 03:40 PM, John Scalia wrote:
>>>>>
>>>>>> Sure, and thanks for helping.
>>>>>>
>>>>>> Here's the /etc/cluster/cluster.conf file and it is identical on all
>>>>>> three systems:
>>>>>>
>>>>>> <cluster config_version="11" name="pgdb_cluster">
>>>>>>   <fence_daemon/>
>>>>>>   <clusternodes>
>>>>>>     <clusternode name="csgha1" nodeid="1">
>>>>>>       <fence>
>>>>>>         <method name="pcmk-redirect">
>>>>>>           <device name="pcmk" port="csgha1"/>
>>>>>>         </method>
>>>>>>       </fence>
>>>>>>     </clusternode>
>>>>>>     <clusternode name="csgha2" nodeid="2">
>>>>>>       <fence>
>>>>>>         <method name="pcmk-redirect">
>>>>>>           <device name="pcmk" port="csgha2"/>
>>>>>>         </method>
>>>>>>       </fence>
>>>>>>     </clusternode>
>>>>>>     <clusternode name="csgha3" nodeid="3">
>>>>>>       <fence>
>>>>>>         <method name="pcmk-redirect">
>>>>>>           <device name="pcmk" port="csgha3"/>
>>>>>>         </method>
>>>>>>       </fence>
>>>>>>     </clusternode>
>>>>>>   </clusternodes>
>>>>>>   <cman/>
>>>>>>   <fencedevices>
>>>>>>     <fencedevice agent="fence_pcmk" name="pcmk"/>
>>>>>>   </fencedevices>
>>>>>>   <rm>
>>>>>>     <failoverdomains/>
>>>>>>     <resources/>
>>>>>>   </rm>
>>>>>> </cluster>
>>>>>>
>>>>>> uname -n reports "csgha1" on that system, "csgha2" on its system, and
>>>>>> "csgha3" on the last system.
>>>>>> I don't seem to have gethostip on any of these systems, so I don't
>>>>>> know if the next section helps or not.
>>>>>> "ifconfig -a" reports:
>>>>>>   csgha1: eth0 = 172.17.1.21
>>>>>>           eth1 = 10.10.1.128
>>>>>>   csgha2: eth0 = 10.10.1.129
>>>>>>           eth1 = 172.17.1.3
>>>>>>           (Yeah, I know this looks a little weird, but it was the way
>>>>>>           our automated VM control did the interfaces.)
>>>>>>   csgha3: eth0 = 172.17.1.23
>>>>>>           eth1 = 10.10.1.130
>>>>>> The /etc/hosts file on each system only has the 10.10.1.0/24
>>>>>> address for each system in it.
>>>>>> iptables is not running on these systems.
>>>>>>
>>>>>> Let me know if you need more information, and I very much appreciate
>>>>>> your assistance.
>>>>>> --
>>>>>> Jay
>>>>>>
>>>>>> On Mon, Oct 20, 2014 at 3:18 PM, Digimer <li...@alteeve.ca> wrote:
>>>>>>
>>>>>>> On 20/10/14 02:50 PM, John Scalia wrote:
>>>>>>>
>>>>>>>> Hi all,
>>>>>>>>
>>>>>>>> I'm trying to build my first ever HA cluster and I'm using 3 VMs
>>>>>>>> running CentOS 6.5. I followed the instructions to the letter at:
>>>>>>>>
>>>>>>>> http://clusterlabs.org/quickstart-redhat.html
>>>>>>>>
>>>>>>>> and everything appears to start normally, but if I run "cman_tool
>>>>>>>> nodes -a", I only see:
>>>>>>>>
>>>>>>>> Node  Sts   Inc   Joined               Name
>>>>>>>>    1   M     64   2014-10-20 14:00:00  csgha1
>>>>>>>>        Addresses: 10.10.1.128
>>>>>>>>    2   X      0                        csgha2
>>>>>>>>    3   X      0                        csgha3
>>>>>>>>
>>>>>>>> In the other systems, the output is the same except for which
>>>>>>>> system is shown as joined. Each shows just itself as belonging to
>>>>>>>> the cluster. Also, "pcs status" reflects similarly with non-self
>>>>>>>> systems showing offline. I've checked "netstat -an" and see each
>>>>>>>> machine listening on ports 5404 and 5405. And the logs are rather
>>>>>>>> involved, but I'm not seeing errors in them.
>>>>>>>>
>>>>>>>> Any ideas for where to look for what's causing them to not
>>>>>>>> communicate?
>>>>>>>> --
>>>>>>>> Jay
>>>>>>>>
>>>>>>>
>>>>>>> Can you share your cluster.conf file please? Also, for each node:
>>>>>>>
>>>>>>> * uname -n
>>>>>>> * gethostip -d $(uname -n)
>>>>>>> * ifconfig |grep -B 1 $(gethostip -d $(uname -n)) | grep HWaddr | awk '{ print $1 }'
>>>>>>> * iptables-save | grep -i multi
>>>>>>>
>>>>>>> --
>>>>>>> Digimer
>>>>>>> Papers and Projects: https://alteeve.ca/w/
>>>>>>> What if the cure for cancer is trapped in the mind of a person
>>>>>>> without access to education?
_______________________________________________
Linux-HA mailing list
Linux-HA@lists.linux-ha.org
http://lists.linux-ha.org/mailman/listinfo/linux-ha
See also: http://linux-ha.org/ReportingProblems
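
For comparison with the file at the top of this message, here is a rough sketch of what Digimer's earlier suggestion appears to describe: the transport attribute placed on the <cman/> element rather than on <cluster>. This is only an illustration and has not been tested on these nodes. The config_version value of 12 is an assumption (cman normally expects the version to be incremented whenever the file changes), and the comments stand in for sections that would stay exactly as they are.

```xml
<!-- Sketch only: transport moved from <cluster> to <cman>. -->
<!-- config_version="12" is an assumed next value, not taken from the thread. -->
<cluster config_version="12" name="pgdb_cluster">
  <!-- fence_daemon and clusternodes sections unchanged -->
  <cman transport="udpu"/>
  <!-- fencedevices and rm sections unchanged -->
</cluster>
```

The same file would need to be present on all three nodes before restarting the cluster.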