Re: [Linux-HA] New user can't get cman to recognize other systems

2014-10-21 Thread John Scalia
I've been checking hostname resolution this morning, and all the systems are listed in each /etc/hosts file (no DNS in this environment), and ping works on every system, both to itself 
and to all the other systems. At least it's working on the 10.10.1.0/24 network.


I ran tcpdump trying to see what traffic is on port 5405 on each system, and I'm only seeing outbound on each, even though netstat shows each is listening on the multicast address. 
My suspicion is that the router is eating the multicast broadcasts, so I may try the unicast address instead, but I'm waiting on one of our network engineers to see if my suspicion 
is correct about the router. He volunteered to help late yesterday.
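
A rough way to test multicast between two nodes independently of corosync is a small send/listen probe. The sketch below is only an illustration: the group address and probe port are made-up values (a port other than 5404/5405 is used so the probe does not collide with the running cluster stack), and the local IP is whatever address the node has on the 10.10.1.0/24 network.

```python
import socket
import struct

# Hypothetical values for illustration only; neither appears in the thread.
PROBE_GROUP = "239.192.0.1"
PROBE_PORT = 15405


def membership_request(group: str, local_ip: str) -> bytes:
    """Pack the 8-byte ip_mreq structure: group address, then local interface address."""
    return struct.pack("4s4s", socket.inet_aton(group), socket.inet_aton(local_ip))


def listen_once(local_ip: str, group: str = PROBE_GROUP, port: int = PROBE_PORT,
                timeout: float = 10.0):
    """Join the group on the given interface and return the sender IP of the
    first datagram received, or None if nothing arrives before the timeout."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM, socket.IPPROTO_UDP)
    try:
        sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
        sock.bind(("", port))
        sock.setsockopt(socket.IPPROTO_IP, socket.IP_ADD_MEMBERSHIP,
                        membership_request(group, local_ip))
        sock.settimeout(timeout)
        _data, sender = sock.recvfrom(1024)
        return sender[0]
    except socket.timeout:
        return None
    finally:
        sock.close()


def send_probe(local_ip: str, group: str = PROBE_GROUP, port: int = PROBE_PORT) -> None:
    """Send one datagram to the group out of the interface that owns local_ip."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM, socket.IPPROTO_UDP)
    try:
        sock.setsockopt(socket.IPPROTO_IP, socket.IP_MULTICAST_IF,
                        socket.inet_aton(local_ip))
        sock.setsockopt(socket.IPPROTO_IP, socket.IP_MULTICAST_TTL, 1)
        sock.sendto(b"mcast-probe", (group, port))
    finally:
        sock.close()
```

Run listen_once("10.10.1.129") on one node and send_probe("10.10.1.128") on another. If the listener times out while ordinary ping works, something between the nodes is probably dropping multicast.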


On 10/20/2014 4:34 PM, Digimer wrote:

It looks sane on the surface. The 'gethostip' tool comes from the 'syslinux' 
package, and it's really handy! The '-d' says to give the IP in dotted-decimal 
notation only.

What I was trying to see was whether the 'uname -n' resolved to the IP on the same network card as the other nodes. This is how corosync decides which interface to send cluster 
traffic onto. I suspect you might have a general network issue, possibly related to multicast. (Some switches and some hypervisor virtual networks don't play nice with corosync).
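
Where gethostip is not installed, the same check can be approximated with a few lines of Python. This is only a sketch: it resolves the node name through the normal resolver (so /etc/hosts is consulted) and tests whether the result falls inside the cluster network; the 10.10.1.0/24 default is taken from this thread and the function names are made up.

```python
import ipaddress
import socket

CLUSTER_NET = "10.10.1.0/24"  # cluster network mentioned in this thread


def on_cluster_network(ip: str, cidr: str = CLUSTER_NET) -> bool:
    """True if the address lies inside the given network."""
    return ipaddress.ip_address(ip) in ipaddress.ip_network(cidr)


def check_local_node(cidr: str = CLUSTER_NET):
    """Resolve this node's name and report whether it maps to the cluster network."""
    name = socket.gethostname()        # the same name `uname -n` prints
    ip = socket.gethostbyname(name)    # uses /etc/hosts when no DNS is configured
    return name, ip, on_cluster_network(ip, cidr)
```

On csgha1, check_local_node() should come back with an address on the 10.10.1.0/24 network and True; a 172.17.1.x address would mean the node name resolves to the wrong interface.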


Have you tried unicast? If not, try setting the <cman .../> element to have the <cman transport="udpu" ... /> attribute. Do note that unicast isn't as efficient as multicast, so 
though it might work, I'd personally treat it as a debug tool to isolate the source of the problem.
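
As an illustration of that change, the sketch below sets transport="udpu" on the cman element of a cluster.conf document and increments config_version. It is a hedged example, not the procedure used in this thread: the function name is made up, it works on a string instead of editing /etc/cluster/cluster.conf in place, and it assumes the version should be bumped whenever the file changes.

```python
import xml.etree.ElementTree as ET


def enable_udpu(conf_xml: str) -> str:
    """Return cluster.conf text with <cman transport="udpu"/> set and the
    config_version attribute incremented by one."""
    root = ET.fromstring(conf_xml)
    cman = root.find("cman")
    if cman is None:
        # No cman element yet, so create one directly under <cluster>.
        cman = ET.SubElement(root, "cman")
    cman.set("transport", "udpu")
    # Assumption: the version is bumped so the nodes treat this as a new config.
    root.set("config_version", str(int(root.get("config_version", "0")) + 1))
    return ET.tostring(root, encoding="unicode")
```

The resulting file still has to be copied to every node and the cluster restarted for the transport change to take effect.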


cheers

digimer

PS - Can you share your pacemaker configuration?

On 20/10/14 03:40 PM, John Scalia wrote:

Sure, and thanks for helping.

Here's the /etc/cluster/cluster.conf file and it is identical on all three
systems:

<cluster config_version="11" name="pgdb_cluster">
  <fence_daemon/>
  <clusternodes>
    <clusternode name="csgha1" nodeid="1">
      <fence>
        <method name="pcmk-redirect">
          <device name="pcmk" port="csgha1"/>
        </method>
      </fence>
    </clusternode>
    <clusternode name="csgha2" nodeid="2">
      <fence>
        <method name="pcmk-redirect">
          <device name="pcmk" port="csgha2"/>
        </method>
      </fence>
    </clusternode>
    <clusternode name="csgha3" nodeid="3">
      <fence>
        <method name="pcmk-redirect">
          <device name="pcmk" port="csgha3"/>
        </method>
      </fence>
    </clusternode>
  </clusternodes>
  <cman/>
  <fencedevices>
    <fencedevice agent="fence_pcmk" name="pcmk"/>
  </fencedevices>
  <rm>
    <failoverdomains/>
    <resources/>
  </rm>
</cluster>

uname -n reports csgha1 on that system, csgha2 on its system, and
csgha3 on the last system.
I don't seem to have gethostip on any of these systems, so I don't know if
the next section helps or not.
ifconfig -a reports:
  csgha1: eth0 = 172.17.1.21
          eth1 = 10.10.1.128
  csgha2: eth0 = 10.10.1.129
          eth1 = 172.17.1.3
  csgha3: eth0 = 172.17.1.23
          eth1 = 10.10.1.130
Yeah, I know csgha2 looks a little weird, but it was the way our automated VM
control did the interfaces.
The /etc/hosts file on each system only has the 10.10.1.0/24 address for
each system in it.
iptables is not running on these systems.

Let me know if you need more information, and I very much appreciate your
assistance.
--
Jay

On Mon, Oct 20, 2014 at 3:18 PM, Digimer li...@alteeve.ca wrote:


On 20/10/14 02:50 PM, John Scalia wrote:


Hi all,

I'm trying to build my first ever HA cluster and I'm using 3 VMs running
CentOS 6.5. I followed the instructions to the letter at:

http://clusterlabs.org/quickstart-redhat.html

and everything appears to start normally, but if I run cman_tool nodes -a,
I only see:

Node  Sts   Inc   Joined               Name
   1   M     64   2014-10-20 14:00:00  csgha1
       Addresses: 10.10.1.128
   2   X      0                        csgha2
   3   X      0                        csgha3

In the other systems, the output is the same except for which system is
shown as joined. Each shows just itself as belonging to the cluster.
Also, pcs status reflects similarly with non-self systems showing
offline. I've checked netstat -an and see each machine listening on
ports 5404 and 5405. And the logs are rather involved, but I'm not
seeing errors in them.
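
For anyone scripting this check, the membership table can be reduced to a list of nodes that have not joined. The sketch below is only an illustration: it assumes the unwrapped column layout of cman_tool nodes (node ID, status letter, incarnation, join time, name) and treats any status other than "M" as "not a member"; the function names are made up.

```python
SAMPLE = """Node  Sts   Inc   Joined               Name
   1   M     64   2014-10-20 14:00:00  csgha1
       Addresses: 10.10.1.128
   2   X      0                        csgha2
   3   X      0                        csgha3
"""


def parse_cman_nodes(output: str):
    """Extract node ID, status letter, and name from `cman_tool nodes` output."""
    nodes = []
    for line in output.splitlines():
        fields = line.split()
        # Node rows start with a numeric ID; the header and "Addresses:" rows do not.
        if len(fields) < 3 or not fields[0].isdigit():
            continue
        nodes.append({"id": int(fields[0]), "status": fields[1], "name": fields[-1]})
    return nodes


def missing_members(output: str):
    """Names of nodes whose status is anything other than 'M' (member)."""
    return [n["name"] for n in parse_cman_nodes(output) if n["status"] != "M"]
```

With the table above, missing_members(SAMPLE) lists csgha2 and csgha3, which is the symptom each node reports about the other two.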

Any ideas for where to look for what's causing them to not communicate?
--
Jay



Can you share your cluster.conf file please? Also, for each node:

* uname -n
* gethostip -d $(uname -n)
* ifconfig | grep -B 1 $(gethostip -d $(uname -n)) | grep HWaddr | awk '{ print $1 }'
* iptables-save | grep -i multi

--
Digimer
Papers and Projects: https://alteeve.ca/w/
What if the cure for cancer is trapped in the mind of a person without
access to education?
___
Linux-HA mailing list
Linux-HA@lists.linux-ha.org
http://lists.linux-ha.org/mailman/listinfo/linux-ha

Re: [Linux-HA] New user can't get cman to recognize other systems

2014-10-21 Thread Digimer

Keep us posted. :)


Re: [Linux-HA] New user can't get cman to recognize other systems

2014-10-21 Thread John Scalia
OK, looking at the cman man page on this system, I see the line saying the corosync.conf file is not used. So, I'm guessing I need to set a unicast address somewhere in the 
cluster.conf file, but the man page only mentions the <multicast addr="..."/> parameter. What can I use to set this to a unicast address for ports 5404 and 5405? I'm assuming I 
can't just put a unicast address in the multicast parameter, and the man page for cluster.conf wasn't much help either.


We're still working on having the security team permit these 3 systems to use 
multicast.


Re: [Linux-HA] New user can't get cman to recognize other systems

2014-10-21 Thread Digimer
No, you don't need to specify anything in cluster.conf for unicast to 
work. Corosync will divine the IPs by resolving the node names to IPs. 
If you use multicast and don't want the auto-selected mcast IP, 
then you can specify the mcast IP group to use via <multicast ... />.


digimer


Re: [Linux-HA] New user can't get cman to recognize other systems

2014-10-21 Thread John Scalia
So, I set transport="udpu" in the cluster.conf file, and it now looks
like this:

<cluster config_version="11" name="pgdb_cluster" transport="udpu">
  <fence_daemon/>
  <clusternodes>
    <clusternode name="csgha1" nodeid="1">
      <fence>
        <method name="pcmk-redirect">
          <device name="pcmk" port="csgha1"/>
        </method>
      </fence>
    </clusternode>
    <clusternode name="csgha2" nodeid="2">
      <fence>
        <method name="pcmk-redirect">
          <device name="pcmk" port="csgha2"/>
        </method>
      </fence>
    </clusternode>
    <clusternode name="csgha3" nodeid="3">
      <fence>
        <method name="pcmk-redirect">
          <device name="pcmk" port="csgha3"/>
        </method>
      </fence>
    </clusternode>
  </clusternodes>
  <cman/>
  <fencedevices>
    <fencedevice agent="fence_pcmk" name="pcmk"/>
  </fencedevices>
  <rm>
    <failoverdomains/>
    <resources/>
  </rm>
</cluster>

But, after restarting the cluster I don't see any difference. Did I do
something wrong?
--
Jay
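
One thing worth checking in a file like the one above is which element carries the transport attribute: the earlier suggestion in this thread was to put it on the cman element, while the file as posted has it on the cluster line. The thread does not say whether that was the cause here, so the sketch below is only a hypothetical checker that reports where the attribute sits; the function name is made up.

```python
import xml.etree.ElementTree as ET


def transport_placement(conf_xml: str) -> dict:
    """Report the transport attribute found on <cluster> and on <cman>, if any."""
    root = ET.fromstring(conf_xml)
    cman = root.find("cman")
    return {
        "cluster": root.get("transport"),
        "cman": cman.get("transport") if cman is not None else None,
    }
```

A result of {"cluster": "udpu", "cman": None} matches the layout posted above; {"cluster": None, "cman": "udpu"} matches the placement suggested earlier in the thread.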


Re: [Linux-HA] New user can't get cman to recognize other systems

2014-10-21 Thread John Scalia
Ok, got it working after a little more effort, and the cluster is now
properly reporting.


Re: [Linux-HA] New user can't get cman to recognize other systems

2014-10-21 Thread Digimer

Glad you sorted it out!

So then, it was almost certainly a multicast issue. I would still 
strongly recommend trying to source and fix the problem, and reverting 
to mcast if you can. More efficient. :)


digimer

On 21/10/14 02:59 PM, John Scalia wrote:

Ok, got it working after a little more effort, and the cluster is now
properly reporting.

On Tue, Oct 21, 2014 at 1:34 PM, John Scalia jayknowsu...@gmail.com wrote:


So, I set transport=udpi' in the cluster.conf file, and it now looks
like this:

cluster config_version=11 name=pgdb_cluster transport=udpu

   fence_daemon/
   clusternodes
 clusternode name=csgha1 nodeid=1
   fence
 method name=pcmk-redirect
   device name=pcmk port=csgha1/
 /method
   /fence
 /clusternode
 clusternode name=csgha2 nodeid=2
   fence
 method name=pcmk-redirect
   device name=pcmk port=csgha2/
 /method
   /fence
 /clusternode
 clusternode name=csgha3 nodeid=3
   fence
 method name=pcmk-redirect
   device name=pcmk port=csgha3/
 /method
   /fence
 /clusternode
   /clusternodes
   cman/
   fencedevices
 fencedevice agent=fence_pcmk name=pcmk/
   /fencedevices
   rm
 failoverdomains/
 resources/
   /rm
/cluster

But, after restarting the cluster I don't see any difference. Did I do
something wrong?
--
Jay

On Tue, Oct 21, 2014 at 12:25 PM, Digimer li...@alteeve.ca wrote:


No, you don't need to specify anything in cluster.conf for unicast to
work. Corosync will divine the IPs by resolving the node names to IPs. If
you set multicast and don't want to use the auto-selected mcast IP, then
you can specify the mcast IP group to use via multicast... /.

digimer


On 21/10/14 12:22 PM, John Scalia wrote:


OK, looking at the cman man page on this system, I see the line saying
the corosync.conf file is not used. So, I'm guessing I need to set a
unicast address somewhere in the cluster.conf file, but the man page
only mentions the multicast addr=.../ parameter. What can I use to
set this to a unicast address for ports 5404 and 5405? I'm assuming I
can't just put a unicast address for the multicast parameter, and the
man page for cluster.conf wasn't much help either.

We're still working on having the security team permit these 3 systems
to use multicast.

On 10/21/2014 11:51 AM, Digimer wrote:


Keep us posted. :)

On 21/10/14 08:40 AM, John Scalia wrote:


I've been check hostname resolution this morning, and all the systems
are listed in each /etc/hosts file (No DNS in this environment.) and
ping works on every system both to itself and all the other systems. At
least it's working on the 10.10.1.0/24 network.

I ran tcpdump trying to see what traffic is on port 5405 on each
system,
and I'm only seeing outbound on each, even though netstat shows each is
listening on the multicast address. My suspicion is that the router is
eating the multicast broadcasts, so I may try the unicast address
instead, but I'm waiting on one of our network engineers to see if my
suspicion is correct about the router. He volunteered to help late
yesterday.

On 10/20/2014 4:34 PM, Digimer wrote:


It looks sane on the surface. The 'gethostip' tool comes from the
'syslinux' package, and it's really handy! The '-d' says to give the
IP in dotted-decimanl notation only.

What I was trying to see was whether the 'uname -n' resolved to the IP
on the same network card as the other nodes. This is how corosync
decides which interface to send cluster traffic onto. I suspect you
might have a general network issue, possibly related to multicast.
(Some switches and some hypervisor virtual networks don't play nice
with corosync).

Have you tried unicast? If not, try setting the <cman .../> element to
have the transport attribute, as in <cman transport="udpu" ... />. Do note
that unicast isn't as efficient as multicast, so though it might work, I'd
personally treat it as a debug tool to isolate the source of the
problem.
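
A quick way to confirm the attribute ended up on the <cman> element and not on some other element is a grep over cluster.conf. This is only a sketch: cman_uses_udpu is a made-up name, and it assumes the <cman ...> opening tag sits on one line.

```shell
#!/bin/sh
# Hypothetical check: succeed only if the <cman> element of the given
# cluster.conf carries transport="udpu".
# Assumes the <cman ...> opening tag is written on a single line.
cman_uses_udpu() {
    grep -Eq '<cman[[:space:]][^>]*transport="udpu"' "$1"
}

# Usage:
#   cman_uses_udpu /etc/cluster/cluster.conf && echo "unicast (udpu) requested"
```

The pattern deliberately does not match transport="udpu" placed on any other element, such as <cluster>.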

cheers

digimer

PS - Can you share your pacemaker configuration?


Re: [Linux-HA] New user can't get cman to recognize other systems

2014-10-21 Thread jayknowsunix
Yep, my network engineer and I found that the multicast packets were being 
blocked by the underlying hypervisor for the VM systems. At first we thought it 
was just iptables on the servers, but I was certain I had actually turned that 
off. The issue has been bumped up to the operations team to fix, but 
since I've gotten it to work with unicast, there's no pressure.

Sent from my iPad

 On Oct 21, 2014, at 3:15 PM, Digimer li...@alteeve.ca wrote:
 
 Glad you sorted it out!
 
 So then, it was almost certainly a multicast issue. I would still strongly 
 recommend trying to source and fix the problem, and reverting to mcast if you 
 can. More efficient. :)
 
 digimer
 
 On 21/10/14 02:59 PM, John Scalia wrote:
 Ok, got it working after a little more effort, and the cluster is now
 properly reporting.
 
 On Tue, Oct 21, 2014 at 1:34 PM, John Scalia jayknowsu...@gmail.com wrote:
 
 So, I set transport="udpu" in the cluster.conf file, and it now looks
 like this:
 
 <cluster config_version="11" name="pgdb_cluster" transport="udpu">
   <fence_daemon/>
   <clusternodes>
     <clusternode name="csgha1" nodeid="1">
       <fence>
         <method name="pcmk-redirect">
           <device name="pcmk" port="csgha1"/>
         </method>
       </fence>
     </clusternode>
     <clusternode name="csgha2" nodeid="2">
       <fence>
         <method name="pcmk-redirect">
           <device name="pcmk" port="csgha2"/>
         </method>
       </fence>
     </clusternode>
     <clusternode name="csgha3" nodeid="3">
       <fence>
         <method name="pcmk-redirect">
           <device name="pcmk" port="csgha3"/>
         </method>
       </fence>
     </clusternode>
   </clusternodes>
   <cman/>
   <fencedevices>
     <fencedevice agent="fence_pcmk" name="pcmk"/>
   </fencedevices>
   <rm>
     <failoverdomains/>
     <resources/>
   </rm>
 </cluster>
 
 But, after restarting the cluster I don't see any difference. Did I do
 something wrong?
 --
 Jay
 
 On Tue, Oct 21, 2014 at 12:25 PM, Digimer li...@alteeve.ca wrote:
 
 No, you don't need to specify anything in cluster.conf for unicast to
 work. Corosync will divine the IPs by resolving the node names to IPs. If
 you set multicast and don't want to use the auto-selected mcast IP, then
 you can specify the mcast IP group to use via <multicast ... />.
 
 digimer
 
 
 On 21/10/14 12:22 PM, John Scalia wrote:
 
 OK, looking at the cman man page on this system, I see the line saying
 the corosync.conf file is not used. So, I'm guessing I need to set a
 unicast address somewhere in the cluster.conf file, but the man page
 only mentions the <multicast addr="..."/> parameter. What can I use to
 set this to a unicast address for ports 5404 and 5405? I'm assuming I
 can't just put a unicast address for the multicast parameter, and the
 man page for cluster.conf wasn't much help either.
 
 We're still working on having the security team permit these 3 systems
 to use multicast.
 

Re: [Linux-HA] New user can't get cman to recognize other systems

2014-10-21 Thread Andrew Beekhof

 On 22 Oct 2014, at 7:36 am, jayknowsu...@gmail.com wrote:
 
 Yep, my network engineer and I found that the multicast packets were being 
 blocked by the underlying hypervisor for the VM systems.

Yeah, that'll happen :-(
I believe it's fixed in newer kernels, but for a while there multicast would 
appear to work and then stop for no good reason.
Putting the device into promiscuous mode seemed to help IIRC.

This is the bug I knew it as: 
https://bugzilla.redhat.com/show_bug.cgi?id=1090670



 At first we thought it was just iptables on the servers, but I was certain I 
 had actually turned that off. The issue has been bumped up to the operations 
 team to fix, but since I've gotten it to work with unicast, 
 there's no pressure.
 

Re: [Linux-HA] New user can't get cman to recognize other systems

2014-10-21 Thread jayknowsunix
Sure! But I can't seem to get Red Hat to let me see the bug, even though I have 
an account.

Sent from my iPad


Re: [Linux-HA] New user can't get cman to recognize other systems

2014-10-21 Thread Digimer

Blocked for me, too. Possible to clone - client data?


Re: [Linux-HA] New user can't get cman to recognize other systems

2014-10-21 Thread Andrew Beekhof

 On 22 Oct 2014, at 9:16 am, Digimer li...@alteeve.ca wrote:
 
 Blocked for me, too. Possible to clone - client data?

Needless paranoia more likely.

This is the original Fedora bug (nothing marked private):
   https://bugzilla.redhat.com/show_bug.cgi?id=880035

and the kbase:
   https://access.redhat.com/solutions/784373


 

Re: [Linux-HA] New user can't get cman to recognize other systems

2014-10-20 Thread Digimer

On 20/10/14 02:50 PM, John Scalia wrote:

Hi all,

I'm trying to build my first ever HA cluster and I'm using 3 VMs running
CentOS 6.5. I followed the instructions to the letter at:

http://clusterlabs.org/quickstart-redhat.html

and everything appears to start normally, but if I run cman_tool nodes
-a, I only see:

Node  Sts   Inc   Joined               Name
   1   M     64   2014-10-20 14:00:00  csgha1
       Addresses: 10.10.1.128
   2   X      0                        csgha2
   3   X      0                        csgha3

In the other systems, the output is the same except for which system is
shown as joined. Each shows just itself as belonging to the cluster.
Also, pcs status reflects similarly with non-self systems showing
offline. I've checked netstat -an and see each machine listening on
ports 5404 and 5405. And the logs are rather involved, but I'm not
seeing errors in them.

Any ideas for where to look for what's causing them to not communicate?
--
Jay


Can you share your cluster.conf file please? Also, for each node:

* uname -n
* gethostip -d $(uname -n)
* ifconfig |grep -B 1 $(gethostip -d $(uname -n)) | grep HWaddr | awk '{ print $1 }'

* iptables-save | grep -i multi
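
Where gethostip isn't installed (it ships with syslinux), roughly the same check can be done with getent and ip. This is a sketch under assumptions: node_ip and iface_for_ip are made-up helper names, and the parsing relies on the one-line-per-address layout printed by `ip -o -4 addr show`.

```shell
#!/bin/sh
# Hypothetical helpers, not part of cman or corosync.

# Print the first address that the given host name resolves to.
node_ip() {
    getent hosts "$1" | awk '{ print $1; exit }'
}

# Read `ip -o -4 addr show` output on stdin and print the interface that
# carries the given IPv4 address (prints nothing if no interface has it).
iface_for_ip() {
    awk -v ip="$1" '{ split($4, a, "/"); if (a[1] == ip) print $2 }'
}

# Usage on a node:
#   ip -o -4 addr show | iface_for_ip "$(node_ip "$(uname -n)")"
```

The interface printed should be the one on the same network as the other nodes; empty output would mean the node name resolves to an address that no local interface carries.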

--
Digimer
Papers and Projects: https://alteeve.ca/w/
What if the cure for cancer is trapped in the mind of a person without 
access to education?

___
Linux-HA mailing list
Linux-HA@lists.linux-ha.org
http://lists.linux-ha.org/mailman/listinfo/linux-ha
See also: http://linux-ha.org/ReportingProblems


Re: [Linux-HA] New user can't get cman to recognize other systems

2014-10-20 Thread John Scalia
Sure, and thanks for helping.

Here's the /etc/cluster/cluster.conf file and it is identical on all three
systems:

<cluster config_version="11" name="pgdb_cluster">
  <fence_daemon/>
  <clusternodes>
    <clusternode name="csgha1" nodeid="1">
      <fence>
        <method name="pcmk-redirect">
          <device name="pcmk" port="csgha1"/>
        </method>
      </fence>
    </clusternode>
    <clusternode name="csgha2" nodeid="2">
      <fence>
        <method name="pcmk-redirect">
          <device name="pcmk" port="csgha2"/>
        </method>
      </fence>
    </clusternode>
    <clusternode name="csgha3" nodeid="3">
      <fence>
        <method name="pcmk-redirect">
          <device name="pcmk" port="csgha3"/>
        </method>
      </fence>
    </clusternode>
  </clusternodes>
  <cman/>
  <fencedevices>
    <fencedevice agent="fence_pcmk" name="pcmk"/>
  </fencedevices>
  <rm>
    <failoverdomains/>
    <resources/>
  </rm>
</cluster>

uname -n reports csgha1 on that system, csgha2 on its system, and
csgha3 on the last system.
I don't seem to have gethostip on any of these systems, so I don't know if
the next section helps or not.
ifconfig -a reports:
csgha1: eth0 = 172.17.1.21
        eth1 = 10.10.1.128
csgha2: eth0 = 10.10.1.129
        eth1 = 172.17.1.3
csgha3: eth0 = 172.17.1.23
        eth1 = 10.10.1.130
Yeah, I know csgha2 looks a little weird, but it was the way our automated VM
control did the interfaces.
The /etc/hosts file on each system only has the 10.10.1.0/24 address for
each system in it.
iptables is not running on these systems.

Let me know if you need more information, and I very much appreciate your
assistance.
--
Jay



Re: [Linux-HA] New user can't get cman to recognize other systems

2014-10-20 Thread Maciej Rostański
Hello,

In my experience such problems were the effect of my mistakes, such as not
having all hosts in /etc/hosts file. Check this, please, I know it sounds
simple.
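
One way to run that check on each node is a small loop over the node names. It's only a sketch, and missing_hosts is a made-up helper:

```shell
#!/bin/sh
# Hypothetical helper: print every node name given as an argument that has no
# entry in the hosts file named by $1 (comments and blank lines are ignored).
missing_hosts() {
    hosts_file=$1
    shift
    for node in "$@"; do
        awk -v n="$node" '
            { sub(/#.*/, "") }                                    # strip comments
            { for (i = 2; i <= NF; i++) if ($i == n) found = 1 }  # names follow the address
            END { exit !found }
        ' "$hosts_file" || echo "$node"
    done
}

# Usage with this thread's node names:
#   missing_hosts /etc/hosts csgha1 csgha2 csgha3
```

No output means every name has an entry. It does not check that the entry points at the right network.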

Also, commands:
pcs cluster setup --name clustername node1 node2 node3
pcs cluster enable
pcs cluster start

are much more pleasant to run than the ccs method you use, and they work on
CentOS 6.5.

Regards,
Maciej







-- 
Maciej Rostanski
mrostan...@gmail.com
http://mrdean.wordpress.com


Re: [Linux-HA] New user can't get cman to recognize other systems

2014-10-20 Thread John Scalia
Thanks, but on CentOS are you saying to use 'pcs cluster start' rather than 'service cman start' and 'service pacemaker start'?
I was just going by the tutorial, which doesn't mention this.



Re: [Linux-HA] New user can't get cman to recognize other systems

2014-10-20 Thread Maciej Rostański
Well, 6.4 and 6.5 (which I like a lot) are in a peculiar situation: crm is
gone, leaving only pcs and ccs, yet the stack still runs on cman (which is
now being replaced by corosync 2.0). So the documentation found on various
sites rarely applies...


Re: [Linux-HA] New user can't get cman to recognize other systems

2014-10-20 Thread Andrew Beekhof

 On 21 Oct 2014, at 7:17 am, John Scalia jayknowsu...@gmail.com wrote:
 
 Thanks, but on CentOS are you saying to use 'pcs cluster start' rather than
 'service cman start' and 'service pacemaker start'? I was just going by
 the tutorial, which doesn't mention this.

'service pacemaker start' and 'pcs cluster start' are pretty much equivalent.
Both will start cman if it's not already running.

 

Re: [Linux-HA] New user can't get cman to recognize other systems

2014-10-20 Thread jayknowsunix
OK, got it.
