[Linux-HA] Some questions about High Availability

2012-11-30 Thread Rodrigo Abrantes Antunes
Hi, I'm new to Heartbeat and only started reading about it a week ago. I'm
using Ubuntu, and I installed Heartbeat 2.1.4 on my two nodes. Here is the
ha.cf used on both:

  logfacility local0
  keepalive 2
  deadtime 5
  udpport 694
  bcast   eth5
  auto_failback on
  node    node1
  node    node2

  And here is haresources (I'm testing only Apache, so the IP address
doesn't matter):
  node1 apache2

  I stopped the heartbeat service on node1 and Apache started on node2.
Then I started the heartbeat service on node1 again, and Apache started on
node1 but stayed active on node2. Shouldn't the one on node2 be stopped?

  Another question: if I stop the Apache service on one node, it doesn't
start on the other node. Shouldn't it be started, since Apache is down?
How does Heartbeat monitor the process to see whether it is up or down?

  And another question: I'm used to VMware HA for high availability
between VMs. It keeps a copy of the VM and starts it if the main one goes
down, and the VMs' data is synchronized between them. My objective when I
started reading about Heartbeat was to achieve the same thing with physical
machines, but after some research I realised that Heartbeat doesn't do this
alone; I need other software to synchronize data between the nodes. Which
one is the best for synchronizing ALL data between the 2 nodes, including
data that may be in use and data related to hardware (my two nodes have the
same hardware), so that one node is a mirror of the other, just like in
VMware HA? Or maybe Heartbeat isn't the right solution for this goal? I've
heard about LVS for HA and load balancing, and about Linux HPC, and I was
also thinking about some hardware options.

  One more question: I've read somewhere that you can have a Heartbeat
server that manages multiple pairs of nodes. Is this right? Is there a GUI
for this, like a web interface?

  Thanks.
___
Linux-HA mailing list
Linux-HA@lists.linux-ha.org
http://lists.linux-ha.org/mailman/listinfo/linux-ha
See also: http://linux-ha.org/ReportingProblems

Re: [Linux-HA] Some questions about High Availability

2012-11-30 Thread Digimer
The heartbeat package is deprecated in favour of corosync + pacemaker.
There is no planned future development of heartbeat. I'd recommend
reading the Clusters From Scratch tutorials available on the pacemaker
website.


-- 
Digimer
Papers and Projects: https://alteeve.ca/w/
What if the cure for cancer is trapped in the mind of a person without
access to education?


Re: [Linux-HA] Some questions about High Availability

2012-11-30 Thread Dimitri Maziuk
On 11/30/2012 01:08 PM, Rodrigo Abrantes Antunes wrote:
 Hi, I'm new to Heartbeat and only started reading about it a week ago. I'm
 using Ubuntu, and I installed Heartbeat 2.1.4 on my two nodes. Here is the
 ha.cf used on both:
 
   logfacility local0
   keepalive 2
   deadtime 5
   udpport 694
   bcast   eth5
   auto_failback on
   node    node1
   node    node2

I think you may need 'crm no' in ha.cf as well.

What's your eth5 connected to?

   And here is haresources (I'm testing only Apache, so the IP address
 doesn't matter):
   node1 apache2
 
   I stopped the heartbeat service on node1 and Apache started on node2.
 Then I started the heartbeat service on node1 again, and Apache started on
 node1 but stayed active on node2. Shouldn't the one on node2 be stopped?

Yes it should be. Read the logs (tail -f on one console while stopping
heartbeat on another).

 Another question: if I stop the Apache service on one node, it doesn't
 start on the other node. Shouldn't it be started, since Apache is down?
 How does Heartbeat monitor the process to see whether it is up or down?

No, because in your configuration heartbeat doesn't monitor services.
Install mon and write a custom alert script that runs
 /usr/share/heartbeat/hb_standby all
when the process dies.
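
A minimal sketch of that idea in shell, not a ready-made mon alert: the
function name check_and_standby and the STANDBY_CMD variable are made up
for illustration, and how mon invokes its alert scripts is not covered
here. The function runs whatever health check it is given and, if the
check fails, calls the hb_standby command quoted above so that the peer
node takes over the resources.

```shell
#!/bin/sh
# Hypothetical helper, for illustration only.
# STANDBY_CMD defaults to the hb_standby call mentioned above; override it
# (for example with "echo") to try the logic without touching a cluster.
STANDBY_CMD="${STANDBY_CMD:-/usr/share/heartbeat/hb_standby all}"

check_and_standby() {
    # "$@" is the health check command, e.g.: pidof apache2
    if "$@" >/dev/null 2>&1; then
        echo "service ok"
    else
        echo "service down, requesting standby"
        # Left unquoted on purpose so the command and its argument split.
        $STANDBY_CMD
    fi
}
```

Something like 'check_and_standby pidof apache2' could then be called from
whatever does the periodic monitoring. The check command is only an
example; any command that exits non-zero when the service is down would do.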

 What I want to ask is which
 one is the best to synchronize ALL data between the 2 nodes, including data
 that may be in use

Simple setup: put it all on one filesystem and put that on DRBD. See
http://www.drbd.org/users-guide-8.3/s-heartbeat-r1.html

Also recommended: drbdlinks.

HTH
-- 
Dimitri Maziuk
Programmer/sysadmin
BioMagResBank, UW-Madison -- http://www.bmrb.wisc.edu




[Linux-HA] Some help on understanding how HA issues are addressed by pacemaker

2012-11-30 Thread Hermes Flying
Hi,
I am looking into using your facilities to get high availability on my system.
I am trying to figure out some things, and I hope you can help me.
I am interested in knowing how pacemaker migrates a VIP and how a split-brain
situation is addressed by your facilities.
To be specific, I am interested in the following setup:

2 Linux machines. Each machine runs a load balancer and a Tomcat instance.
If I understand correctly, pacemaker will be responsible for assigning the main
VIP to one of the nodes.

My questions are:
1) Will pacemaker monitor/restart the load balancers on each machine in case of
a crash?
2) How does pacemaker decide to migrate the VIP to the other node?
3) Do the pacemakers in each machine communicate? If yes how do you handle 
network failure? Could I end up with split-brain?
4) Generally how is split-brain addressed using pacemaker? 
5) Could pacemaker monitor Tomcat?

As you can see, I am interested in maintaining quorum in a two-node
configuration. If you can help me with this info to find a proper direction,
it would be much appreciated!

Thank you


Re: [Linux-HA] Some help on understanding how HA issues are addressed by pacemaker

2012-11-30 Thread David Vossel
Hey, 

You may or may not have looked at this already, but this is a good place to 
start, http://clusterlabs.org/doc/

Read chapter one of this document.
http://clusterlabs.org/doc/en-US/Pacemaker/1.1-pcs/html-single/Pacemaker_Explained/index.html

Running through this 2 node cluster exercise will likely answer many of your 
questions.
http://clusterlabs.org/doc/en-US/Pacemaker/1.1-pcs/html-single/Clusters_from_Scratch/index.html


-- Vossel


Re: [Linux-HA] Some help on understanding how HA issues are addressed by pacemaker

2012-11-30 Thread Digimer
On 11/30/2012 05:04 PM, Hermes Flying wrote:
 Hi,
 I am looking into using your facilities to get high availability on my
 system. I am trying to figure out some things, and I hope you can help me.
 I am interested in knowing how pacemaker migrates a VIP and how a split-brain
 situation is addressed by your facilities.
 To be specific, I am interested in the following setup:
 
 2 Linux machines. Each machine runs a load balancer and a Tomcat instance.
 If I understand correctly, pacemaker will be responsible for assigning the
 main VIP to one of the nodes.

Pacemaker can handle virtual IP addresses, but it's a fail-over system
rather than a load-balancing, round-robin system. For load balancing,
look at LVS (Linux Virtual Server), which Red Hat ships.

 My questions are:
 1) Will pacemaker monitor/restart the load balancers on each machine in case
 of a crash?

It can monitor/recover/relocate any service that uses init.d style
scripts. If a script/service responds properly to stop, status and
start, you're good to go.
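
One way to check whether a script "responds properly" is sketched below.
It assumes the usual LSB init script convention (status exits 0 while the
service runs and 3 while it is stopped; repeated start or stop still exits
0). The function name lsb_check is made up for this example. Note that it
really starts and stops the service, so it belongs on a test machine.

```shell
#!/bin/sh
# Hypothetical checker, for illustration only.
# Usage: lsb_check /etc/init.d/some-service
lsb_check() {
    script="$1"
    "$script" start >/dev/null 2>&1 || { echo "start failed"; return 1; }
    "$script" status >/dev/null 2>&1 || { echo "status should exit 0 while running"; return 1; }
    # Starting an already running service must still succeed.
    "$script" start >/dev/null 2>&1 || { echo "repeated start should exit 0"; return 1; }
    "$script" stop >/dev/null 2>&1 || { echo "stop failed"; return 1; }
    "$script" status >/dev/null 2>&1
    rc=$?
    [ "$rc" -eq 3 ] || { echo "status should exit 3 while stopped, got $rc"; return 1; }
    # Stopping an already stopped service must still succeed.
    "$script" stop >/dev/null 2>&1 || { echo "repeated stop should exit 0"; return 1; }
    echo "looks LSB-compatible"
}
```

A script that fails any of these steps is likely to confuse the cluster
about whether the service is actually running.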

 2) How does pacemaker decide to migrate the VIP to the other node?

At its simplest: when the machine hosting the VIP fails, the VIP is
relocated. You can control how, when and where the VIP fails back (look
at 'resource stickiness').
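
As a rough sketch, a VIP and a default stickiness could be set up with the
crm shell as follows. The function name add_vip, the CRM variable and the
stickiness value of 100 are assumptions for illustration; the resource
name and IP address are whatever you pass in.

```shell
#!/bin/sh
# Hypothetical wrapper, for illustration only.
# CRM can be overridden (e.g. CRM="echo crm") to print the commands
# instead of running them against a live cluster.
CRM="${CRM:-crm}"

add_vip() {
    name="$1"   # resource name, e.g. ClusterIP
    ip="$2"     # the virtual IP address
    # An IPaddr2 resource that is monitored every 30 seconds.
    $CRM configure primitive "$name" ocf:heartbeat:IPaddr2 \
        params ip="$ip" cidr_netmask=32 \
        op monitor interval=30s || return 1
    # A positive stickiness makes resources prefer to stay where they are,
    # so the VIP does not move back automatically after a failed node returns.
    $CRM configure rsc_defaults resource-stickiness=100
}
```

Calling 'add_vip ClusterIP 192.0.2.10' would then create the address and
set the default stickiness; both values here are placeholders.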

 3) Do the pacemakers in each machine communicate? If yes how do you handle 
 network failure? Could I end up with split-brain?

Pacemaker uses corosync for cluster membership, quorum and fencing. A
properly configured fence device (aka stonith) will prevent a split
brain. If you disable fencing or fail to set it up properly, split brains
are possible and even likely.

 4) Generally how is split-brain addressed using pacemaker? 

Fencing to prevent it.

 5) Could pacemaker monitor Tomcat?

If it supports stop, start and status, yes.

 As you can see, I am interested in maintaining quorum in a two-node
 configuration. If you can help me with this info to find a proper direction,
 it would be much appreciated!

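Quorum needs to be disabled in a two-node cluster: with only two nodes,
the survivor of a failure never holds a majority, so enforcing quorum
would stop its resources. This is fine with good fencing.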
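
A sketch of what that could look like with the crm shell, using the
Pacemaker cluster properties stonith-enabled and no-quorum-policy. The
function name configure_two_node and the CRM variable are made up for this
example, and the fence device itself still has to be configured
separately, which is not shown.

```shell
#!/bin/sh
# Hypothetical wrapper, for illustration only.
# CRM can be overridden (e.g. CRM="echo crm") to print the commands
# instead of running them against a live cluster.
CRM="${CRM:-crm}"

configure_two_node() {
    # Keep fencing enabled: it is what protects against split-brain.
    $CRM configure property stonith-enabled=true || return 1
    # Keep running resources even when quorum is lost.
    $CRM configure property no-quorum-policy=ignore
}
```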

To learn more, please see the documentation available here:

http://clusterlabs.org/doc/
-- 
Digimer
Papers and Projects: https://alteeve.ca/w/
What if the cure for cancer is trapped in the mind of a person without
access to education?


Re: [Linux-HA] master/slave drbd resource STILL will not failover

2012-11-30 Thread Vladislav Bogdanov
30.11.2012 00:14, Robinson, Eric wrote:
 Bump... does anyone have some insight on this? Google is not turning up 
 anything useful.
 
 Our newest cluster will not failover master/slave drbd resources. It works 
 fine manually using drbdadm from a shell prompt, but when we try it using 
 'crm node standby' and letting the cluster manage the resource, crm_mon just 
 keeps saying the resource FAILED.
 
 We see a lot of these messages in the corosync.log file:
 
 drbd(p_drbd1)[12814]:   2012/11/27_15:31:59 DEBUG: ha02_mysql: Calling drbdadm -c /etc/drbd.conf primary ha02_mysql
 drbd(p_drbd1)[12814]:   2012/11/27_15:31:59 ERROR: ha02_mysql: Called drbdadm -c /etc/drbd.conf primary ha02_mysql
 drbd(p_drbd1)[12814]:   2012/11/27_15:31:59 ERROR: ha02_mysql: Exit code 11
 
 There is no indication of what may be causing the 'Exit code 11'
 
 Here is a link to the corosync log, taken from the standby server (ha09a) 
 where we are trying to fail the resource to...
 
 http://www.psmnv.com/downloads/corosync1.log
 
 Here is what I have installed...
 
 corosync-1.4.1-7.el6_3.1.x86_64
 corosynclib-1.4.1-7.el6_3.1.x86_64
 pacemaker-1.1.8-4.el6.x86_64
 pacemaker-cli-1.1.8-4.el6.x86_64
 pacemaker-cluster-libs-1.1.8-4.el6.x86_64
 pacemaker-libs-1.1.8-4.el6.x86_64
 
 Following is my crm config. It's pretty basic.
 
 
 node ha09a \
         attributes standby=off
 node ha09b \
         attributes standby=off
 primitive p_drbd0 ocf:linbit:drbd \
         params drbd_resource=ha01_mysql \
         op monitor interval=60s
 primitive p_drbd1 ocf:linbit:drbd \
         params drbd_resource=ha02_mysql \
         op monitor interval=45s
 primitive p_vip_clust08 ocf:heartbeat:IPaddr2 \
         params ip=192.168.10.210 cidr_netmask=32 \
         op monitor interval=30s
 primitive p_vip_clust09 ocf:heartbeat:IPaddr2 \
         params ip=192.168.10.211 cidr_netmask=32 \
         op monitor interval=30s
 ms ms_drbd0 p_drbd0 \
         meta master-max=1 master-node-max=1 clone-max=2 clone-node-max=1 notify=true target-role=Master
 ms ms_drbd1 p_drbd1 \
         meta master-max=1 master-node-max=1 clone-max=2 clone-node-max=1 notify=true target-role=Master

Try to set 'target-role=Started' in both of them.
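
A sketch of one way to apply that change from the shell, assuming the crm
shell's 'resource meta <rsc> set <attr> <value>' form is available in the
installed version. The function name set_target_role_started and the CRM
variable are made up for this example.

```shell
#!/bin/sh
# Hypothetical wrapper, for illustration only.
# CRM can be overridden (e.g. CRM="echo crm") to print the commands
# instead of running them against a live cluster.
CRM="${CRM:-crm}"

set_target_role_started() {
    # Each argument is a resource name, e.g. ms_drbd0 ms_drbd1
    for rsc in "$@"; do
        $CRM resource meta "$rsc" set target-role Started || return 1
    done
}
```

For the configuration quoted above this would be called as
'set_target_role_started ms_drbd0 ms_drbd1'.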

Vladislav
