[Linux-HA] drbd heartbeat v2

2008-02-19 Thread Damon Estep
On this page: http://www.linux-ha.org/DRBD/HowTov2 is this comment: "drbd must not be started by init" I find that it only works correctly if drbd is started by init. I have tried both 0.7.25 and 8.0.10 for DRBD on heartbeat 2.1.3 with crm=yes and everything set up as outlined on the page.

RE: [Linux-HA] RHEL4 What RPM package should I use.

2008-02-19 Thread Stephan Berlet
> I use the packages from Centos4 on my RHEL AS 4.6 servers: > > http://mirror.centos.org/centos/4/extras/i386/RPMS/ > or > http://mirror.centos.org/centos/4/extras/x86_64/RPMS/ > > I'd prefer to use > > http://download.opensuse.org/repositories/server:/ha-clustering > > because that's where th

AW: [Linux-HA] drbd heartbeat v2

2008-02-19 Thread Schmidt, Florian
It is possible to start DRBD with Heartbeat, better with Pacemaker (which is now the new name of the heartbeat cluster-resource manager etc...) Then you have to use the resource agent "drbd". I think you use the resource agent drbddisk at the moment. This script doesn't start drbd, it only h

Re: [Linux-HA] drbd heartbeat v2

2008-02-19 Thread Dominik Klein
Damon Estep wrote: On this page: http://www.linux-ha.org/DRBD/HowTov2 is this comment: "drbd must not be started by init" Well, you do not have to start drbd by init. But it shouldn't harm if you do. This statement is false if you want to use the heartbeat Resource Agent drbddisk, but that'

Re: [Linux-HA] stonith on an apcmaster

2008-02-19 Thread Michael Brennen
On Sun, 17 Feb 2008, Michael Brennen wrote: Heartbeat 2.1.3, crm enabled I've built an initial drbd master/slave on two systems, lvc7 and lvc8, following http://www.linux-ha.org/DRBD/HowTov2. The drbd is coming alive in P/S mode, but it will not fail over when I kill the master; the slave s

Re: [Linux-HA] stonith on an apcmaster

2008-02-19 Thread Dominik Klein
The stonith daemons start successfully now, but with a monitor interval of 15s one of the two fails fairly quickly. The apc (9211 masterswitch) only allows a single login, and I wonder if the two daemons aren't colliding, and one is timing out and giving up. Did you try apcmastersnmp? Don't

Re: [Linux-HA] drbd heartbeat v2

2008-02-19 Thread Dominik Klein
I am using the OCF drbd resource. It is NOT starting drbd but will manage master/slave. If I turn off drbd startup (chkconfig drbd off) on both nodes and then pull the power on the active node I get a clean failover, but the slave drbd resource refuses to start on the inactive node when it come

RE: [Linux-HA] drbd heartbeat v2

2008-02-19 Thread Damon Estep
> > It is possible to start DRBD with Heartbeat, better with Pacemaker > (which is now the new name of the heartbeat cluster-resource manager > etc...) > > Then you have to use the resource agent "drbd". [Damon Estep] I am using the OCF drbd resource. It is NOT starting drbd but will manage ma

[Linux-HA] heartbeat issue with lots of error messages

2008-02-19 Thread Johan Huysmans
Hi all, Some days ago we had an issue with heartbeat. It started when the active node was rebooted. The cluster contains DRBD, an IP and some services. There is a bonded interface on both hosts and a serial cable between the 2 hosts. When the active machine was rebooted the resources are t

Re: [Linux-HA] stonith on an apcmaster

2008-02-19 Thread Michael Brennen
On Tue, 19 Feb 2008, Dominik Klein wrote: The stonith daemons start successfully now, but with a monitor interval of 15s one of the two fails fairly quickly. The apc (9211 masterswitch) only allows a single login, and I wonder if the two daemons aren't colliding, and one is timing out and

Re: [Linux-HA] stonith on an apcmaster

2008-02-19 Thread Dejan Muhamedagic
Hi, On Tue, Feb 19, 2008 at 03:38:46AM -0600, Michael Brennen wrote: > On Sun, 17 Feb 2008, Michael Brennen wrote: > >> Heartbeat 2.1.3, crm enabled >> >> I've built an initial drbd master/slave on two systems, lvc7 and lvc8, >> following http://www.linux-ha.org/DRBD/HowTov2. The drbd is coming

Re: [Linux-HA] OCFS2 on HB 2.1.3 v2

2008-02-19 Thread Raoul Bhatia [IPAX]
Michael Brennen wrote: Can someone give pointers to integrating ocfs2 with heartbeat? The idea is to run ocfs2 as the cluster file system on the real servers running on an iscsi failover backend cluster. Apparently some userspace patches are required to ocfs2 to let hb manage it, but I think

[Linux-HA] resources not in a group and colocation constraints

2008-02-19 Thread Abraham Iglesias
Hi all, I have configured a 2-node v2 HA cluster with heartbeat 2.0.8. So far, I included all resources in the same group. It is an easy way to offer colocation and ordering features. The problem is that I have 8 tomcat instances within the same group, so in a loaded environment it takes 3 m

Re: [Linux-HA] OCFS2 on HB 2.1.3 v2

2008-02-19 Thread Lars Marowsky-Bree
On 2008-02-19T12:33:04, "Raoul Bhatia [IPAX]" <[EMAIL PROTECTED]> wrote: > to my knowledge, the folks from suse made some heavy modification > to ocfs2 to remove this behavior. i once tried to incorporate their > patches into the then current vanilla kernel, but failed mainly because > of my lack

Re: AW: [Linux-HA] is ldirectord ready for Hearbeat v2 style cluster?

2008-02-19 Thread Abraham Iglesias
Amazing Thank you very much for the tutorial. I will try that configuration :) -Abraham Stephan Berlet escribió: -Ursprüngliche Nachricht- Von: Eddie C [mailto:[EMAIL PROTECTED] Gesendet: Freitag, 15. Februar 2008 21:21 An: General Linux-HA mailing list Betreff: Re: [Linux-HA] is

Re: [Linux-HA] resources not in a group and colocation constraints

2008-02-19 Thread Dejan Muhamedagic
Hi, On Tue, Feb 19, 2008 at 12:40:15PM +0100, Abraham Iglesias wrote: > Hi all, > I have configured a 2-node v2 HA cluster with heartbeat 2.0.8. So far, I > included all resources in the same group. It is an easy way to offer > colocation and ordering features. > > The problem is that I have 8

Re: [Linux-HA] drbd heartbeat v2 working (problem with fs0)

2008-02-19 Thread Marco Leone
Hi, I'm using drbd 8.2.4 and heartbeat v.2 too on two ubuntu 7.04 server nodes. I followed this link http://linux-ha.org/DRBD/HowTov2 and configured a VIP, apache2 and drbd resource on my cib.xml and everything is working fine; no problem in drbd switching master/slave with a stop to heartbeat

Re: [Linux-HA] drbd heartbeat v2 working (problem with fs0)

2008-02-19 Thread Dominik Klein
Marco Leone wrote: Hi, I'm using drbd 8.2.4 and heartbeat v.2 too on two ubuntu 7.04 server nodes. I guess you did not completely do that. I followed this link http://linux-ha.org/DRBD/HowTov2 id="prefered_location_group_1_expr" operation="eq" value="ub704h

Re: [Linux-HA] question regarding orderings in resource groups

2008-02-19 Thread Lars Marowsky-Bree
On 2008-02-19T12:11:26, Sebastian Reitenbach <[EMAIL PROTECTED]> wrote: > there ordered is set to false. I have the group running, and when I then > e.g. want to stop the resource D2, then D3 stops too. Only when I change > collocated to false, then D3 keeps running when I stop D2. > > Seems to

[Linux-HA] question regarding orderings in resource groups

2008-02-19 Thread Sebastian Reitenbach
Hi, as far as I understand groups, the parameter ordered means, when set to yes, that the resources in the group are started and stopped in the order that they appear in the CIB. The collocated parameter means, that when set to yes, all resources in a group run on the same cluster node. I just

Re: [Linux-HA] lrmd stuck and OCF agent turn defunct

2008-02-19 Thread Franck Ganachaud
well, "service heartbeat stop" will do the job actually. Dejan Muhamedagic a écrit : Hi, On Wed, Jan 09, 2008 at 03:04:05PM +0100, Franck Ganachaud wrote: I followed your advice and installed 2.1.2 from CentOS 4 as I need binary package. You'll be better off with 2.1.3. It fixes some

[Linux-HA] Collocation doesn't work [Config I guess]

2008-02-19 Thread Franck Ganachaud
Well, on a 2-node cluster using heartbeat 2.1.2, I have one cloned resource and one group that runs preferably on nodeA, but if the clone has failed or is off, the group should be migrated to nodeB. When I stop the service, the cloned resource mysql_orb fails on nodeA, but I don't know what to do to make the

Re: [Linux-HA] stonith on an apcmaster

2008-02-19 Thread Dave Blaschke
Dejan Muhamedagic wrote: Hi, On Tue, Feb 19, 2008 at 03:38:46AM -0600, Michael Brennen wrote: On Sun, 17 Feb 2008, Michael Brennen wrote: Heartbeat 2.1.3, crm enabled I've built an initial drbd master/slave on two systems, lvc7 and lvc8, following http://www.linux-ha.org/DRBD/HowTov

Re: [Linux-HA] question regarding orderings in resource groups

2008-02-19 Thread Sebastian Reitenbach
Lars Marowsky-Bree <[EMAIL PROTECTED]> wrote: > On 2008-02-19T12:11:26, Sebastian Reitenbach <[EMAIL PROTECTED]> wrote: > > > there ordered is set to false. I have the group running, and when I then > > e.g. want to stop the resource D2, then D3 stops too. Only when I change > > collocated to

Re: [Linux-HA] drbd heartbeat v2 working (problem with fs0)

2008-02-19 Thread Marco Leone
I guess you did not completely do that. [config file] Otherwise you'd have the master colocation constraint: from="fs0" score="infinity"/> Well I did cut and paste the constraints from the website and put them into a file but adding it with cibadmin returned me an error (as I explai

Re: [Linux-HA] question regarding orderings in resource groups

2008-02-19 Thread Sebastian Reitenbach
Lars Marowsky-Bree <[EMAIL PROTECTED]> wrote: > On 2008-02-19T15:49:28, Sebastian Reitenbach <[EMAIL PROTECTED]> wrote: > > > > Make rsc 'from' run on the same machine as rsc 'to' > > > > > > If rsc 'to' cannot run anywhere and 'score' is INFINITY, > > > then rsc 'from' wont be allowed to run any

Re: [Linux-HA] question regarding orderings in resource groups

2008-02-19 Thread Lars Marowsky-Bree
On 2008-02-19T15:49:28, Sebastian Reitenbach <[EMAIL PROTECTED]> wrote: > > Make rsc 'from' run on the same machine as rsc 'to' > > > > If rsc 'to' cannot run anywhere and 'score' is INFINITY, > > then rsc 'from' wont be allowed to run anywhere either > > If rsc 'from' cannot run anywhere, the
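The first two rules quoted above can be sketched as a small placement function. This is only a toy model, not the CRM's actual policy engine: the function name `place_from`, the node names, and the use of `float("inf")` for INFINITY are assumptions made for illustration, and treating a finite score as a mere preference is my reading of how advisory colocation behaves.

```python
INFINITY = float("inf")  # stand-in for the CIB's INFINITY score (assumption)


def place_from(to_node, from_candidates, score):
    """Choose a node for rsc 'from' given the placement of rsc 'to'.

    to_node: node where 'to' runs, or None if 'to' cannot run anywhere.
    from_candidates: nodes where 'from' could otherwise run, best first.
    score: colocation score; INFINITY makes the constraint mandatory.
    Returns the chosen node, or None if 'from' is not allowed to run.
    """
    fallback = from_candidates[0] if from_candidates else None
    if to_node is None:
        # 'to' cannot run anywhere: with INFINITY, 'from' may not run either.
        return None if score == INFINITY else fallback
    if to_node in from_candidates:
        # Normal case: run 'from' on the same machine as 'to'.
        return to_node
    # 'from' cannot follow 'to': mandatory means stop, advisory means fall back.
    return None if score == INFINITY else fallback
```

For example, `place_from(None, ["nodeA"], INFINITY)` returns `None`, while the same call with a finite score of 100 still returns `"nodeA"`.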

Re: [Linux-HA] resources not in a group and colocation constraints

2008-02-19 Thread Abraham Iglesias
I think there wouldn't be any problem in upgrading heartbeat. Would my 2.0.8 configuration be compatible? With an unordered group, would only 1 resource within the group be restarted? The problem is that I need to have a drbd partition mounted before tomcat starts. So... In some way, I ne

Re: [Linux-HA] RE: Re: About pgsql RA.

2008-02-19 Thread Serge Dubrouski
HIDEO - Please give a try to the attached patch and let me know what you think. I put back the old mechanism of checking process status without using pgrep and also cleaned a couple of other things. Thanks. Serge. On Feb 15, 2008 4:28 PM, Serge Dubrouski <[EMAIL PROTECTED]> wrote: > On Thu, Feb

[Linux-HA] Split Brain and not able to repair

2008-02-19 Thread Schmidt, Florian
Hi readers, I caused a split brain on my testing machine to see how it would react. On both machines I disabled the eth1 interface, which carried the heartbeat. So DRBD was still connected (over the eth0 interface), but heartbeat was split-brained. After I saw what I expected (heartbe

Re: [Linux-HA] drbd heartbeat v2 working (problem with fs0)

2008-02-19 Thread Doug Lochart
Check /var/log/messages for split brain. This is what happened to me. You will need to investigate setting up STONITH. regards, Doug On Feb 19, 2008 11:43 AM, Christian Rishøj <[EMAIL PROTECTED]> wrote: > On Feb 19, 2008 4:21 PM, Marco Leone <[EMAIL PROTECTED]> wrote: > > > > > I guess you did

Re: [Linux-HA] Split Brain and not able to repair

2008-02-19 Thread Doug Lochart
I feel your pain. I suffered through this as well, as I am just learning. I was following a few tutorials and followed them closely only to end up in SplitBrain (WTF??) so I plan on writing a tutorial that covers all of what is needed to avoid this situation. However I am still struggling to

Re: [Linux-HA] drbd heartbeat v2 working (problem with fs0)

2008-02-19 Thread Christian Rishøj
On Feb 19, 2008 4:21 PM, Marco Leone <[EMAIL PROTECTED]> wrote: > > > I guess you did not completely do that. > [config file] > > Otherwise you'd have the master colocation constraint: > > > > > from="fs0" score="infinity"/> > > Well I did cut and paste the constraints from the website and p

Re: [Linux-HA] resources not in a group and colocation constraints

2008-02-19 Thread Dejan Muhamedagic
Hi, On Tue, Feb 19, 2008 at 04:24:19PM +0100, Abraham Iglesias wrote: > I think there wouldn't be any problem in upgrading heartbeat. Would my > 2.0.8 configuration be compatible? It should be, with the exception, perhaps, of the crm_config, where underscores are replaced by dashes in all options.
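The underscore-to-dash renaming of crm_config options could be applied to an exported crm_config section along these lines. This is a sketch under assumptions: the helper name `dash_option_names` is made up, the options are assumed to be stored as `nvpair` elements with a `name` attribute, and `no_quorum_policy` is used only as a sample option. Check the result against the DTD shipped with your version before loading it.

```python
import xml.etree.ElementTree as ET


def dash_option_names(crm_config_xml):
    """Return crm_config XML with '_' replaced by '-' in nvpair names.

    Only the 'name' attribute of each nvpair is rewritten; ids, values
    and element names (such as crm_config itself) are left untouched.
    """
    root = ET.fromstring(crm_config_xml)
    for nvpair in root.iter("nvpair"):
        name = nvpair.get("name")
        if name is not None:
            nvpair.set("name", name.replace("_", "-"))
    return ET.tostring(root, encoding="unicode")
```

With this, an nvpair named `no_quorum_policy` comes back as `no-quorum-policy`, while its `id` and `value` stay as they were.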

Re: [Linux-HA] Re: OCFS2 on HB 2.1.3 v2

2008-02-19 Thread Christian Rishøj
On Feb 18, 2008 7:33 PM, Eddie C <[EMAIL PROTECTED]> wrote: > The reason I began researching this type of replication is that we had > several applications that could not be easily clustered because they > had some persistant data such as logs and state information that was > stored locally on disk

[Linux-HA] Apache running on multiple nodes

2008-02-19 Thread Jason Erickson
I am trying to set up Heartbeat with a virtual IP, NFS share and an Apache web server. The virtual IP and NFS share work but for some reason the Apache server wants to run on 2 nodes and will not start. It comes up as being unmanaged. I configured collocation and order so they would be on the

RE: [Linux-HA] drbd heartbeat v2

2008-02-19 Thread Damon Estep
Here is the output from 'service drbd status' on the node where the slave does not start. It appears that it is started but "unconfigured" # service drbd status drbd driver loaded OK; device status: version: 0.7.25 (api:79/proto:74) GIT-hash: 3a9c7c136a9af8df921b3628129dafbe212ace9f build by [EMAI

[Linux-HA] MySQL OCF master slave

2008-02-19 Thread Adrian Chapela
Hello, I am adapting the original script into a "homemade" Master/Slave script. I have a question. I have the script with some options: start, stop, status, monitor, promote, demote. After I start mysql and want to do a promote, once I have promoted the server must I return OCF_RUNNING_MASTER or must I

[Linux-HA] Moving a resource within a group using CLI

2008-02-19 Thread Prakash Velayutham
Hello All, I was wondering if anyone knows the command to use for moving a resource up or down within a group? The same can be accomplished using the "up"/"down" arrow in the hb_gui application (even though it did not work most of the time for me for some reason). Thanks, Prakash

Re: [Linux-HA] resources not in a group and colocation constraints

2008-02-19 Thread Abraham Iglesias
Hi Dejan, I updated to 2.1.3-3 and created a new configuration with an unordered group. Resources within a group are restarted when one of them fails! :( The group resources are collocated but unordered. I guess this does not work for me... any advice? -Abraham Dejan Muhamedagic escribió:

Re: [Linux-HA] resources not in a group and colocation constraints

2008-02-19 Thread Dejan Muhamedagic
Hi, On Tue, Feb 19, 2008 at 06:53:51PM +0100, Abraham Iglesias wrote: > Hi Dejan, > > I updated to 2.1.3-3 and created a new configuration with an unordered > group. Resources within a group are restarted when one of them fails! :( > The group resources are collocated but unordered. The resource

Re: [Linux-HA] Apache running on multiple nodes

2008-02-19 Thread Dejan Muhamedagic
Hi, On Tue, Feb 19, 2008 at 12:12:24PM -0500, Jason Erickson wrote: > I am trying to set up Heartbeat with a virtual IP, NFS share and an Apache > web server. The virtual IP and NFS share work but for some reason the > Apache server wants to run on 2 nodes and will not start. It comes up as >

Re: [Linux-HA] MySQL OCF master slave

2008-02-19 Thread Dejan Muhamedagic
Hi, On Tue, Feb 19, 2008 at 04:27:06PM +0100, Adrian Chapela wrote: > Hello, > > I am adapting original script to a Master/Slave "homemade" script. I have a > doubt. I have the script with some options: start, stop, status, monitor, > promote, demote. > > After I start mysql and I want to do a p
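As far as I understand the OCF resource agent conventions, a successful promote reports plain success, and it is the monitor action that reports OCF_RUNNING_MASTER while the instance runs as master. The sketch below models that in Python purely for illustration; a real agent is normally a shell script, and the helper names `promote_rc` and `monitor_rc` are made up here.

```python
# Exit codes as defined by the OCF resource agent conventions.
OCF_SUCCESS = 0
OCF_ERR_GENERIC = 1
OCF_NOT_RUNNING = 7
OCF_RUNNING_MASTER = 8


def promote_rc(promotion_succeeded):
    """Exit code for the promote action (hypothetical helper)."""
    return OCF_SUCCESS if promotion_succeeded else OCF_ERR_GENERIC


def monitor_rc(running, is_master):
    """Exit code for the monitor action (hypothetical helper)."""
    if not running:
        return OCF_NOT_RUNNING
    # A healthy master is reported with its own code, a healthy slave with 0.
    return OCF_RUNNING_MASTER if is_master else OCF_SUCCESS
```

So right after a successful promote, `promote_rc(True)` yields 0 and the next `monitor_rc(True, True)` yields 8.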

Re: [Linux-HA] Split Brain and not able to repair

2008-02-19 Thread Dejan Muhamedagic
Hi, On Tue, Feb 19, 2008 at 05:33:39PM +0100, Schmidt, Florian wrote: > Hi readers, > > i caused a split brain on my testing machine, to see how it would react. > I disabled on both machines the eth1-interface, over which the heartbeat > happened. > > So the DRBD still was connected (over the et

Re: [Linux-HA] Split Brain and not able to repair

2008-02-19 Thread Dejan Muhamedagic
Hi, On Tue, Feb 19, 2008 at 12:07:27PM -0500, Doug Lochart wrote: > I feel your pain. I suffered through this as well as I am just > learning. I was following a few tutorials and followed them closely > only to end up in SplitBrain (WTF??) so I plan on writing a tutorial > that covers what all o

Re: [Linux-HA] Moving a resource within a group using CLI

2008-02-19 Thread Dejan Muhamedagic
Hi, On Tue, Feb 19, 2008 at 01:31:34PM -0500, Prakash Velayutham wrote: > Hello All, > > I was wondering if anyone knows the command to use for moving a resource up > or down within a group? There's no such command. You'll have to use cibadmin to dump xml to a file (or crm_resource -x), edit it,
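The "edit it" step for moving a resource up or down within a group could be scripted instead of done by hand. Below is a minimal sketch, assuming the dumped XML contains a `group` element whose members are `primitive` elements identified by `id`; the function name `move_in_group` and its arguments are made up for illustration. Dumping the XML and loading it back are left to cibadmin and are not shown.

```python
import xml.etree.ElementTree as ET


def move_in_group(resources_xml, group_id, rsc_id, direction):
    """Swap a primitive with its neighbour inside a group.

    direction is "up" or "down". A resource already at the edge of the
    group is left where it is. Returns the modified XML as a string.
    """
    if direction not in ("up", "down"):
        raise ValueError("direction must be 'up' or 'down'")
    root = ET.fromstring(resources_xml)
    group = next((g for g in root.iter("group") if g.get("id") == group_id), None)
    if group is None:
        raise ValueError("no group with id %r" % group_id)
    children = list(group)
    # Only primitives take part in the ordering; other children stay put.
    positions = [i for i, c in enumerate(children) if c.tag == "primitive"]
    ids = [children[i].get("id") for i in positions]
    if rsc_id not in ids:
        raise ValueError("no primitive %r in group %r" % (rsc_id, group_id))
    k = ids.index(rsc_id)
    other = k - 1 if direction == "up" else k + 1
    if 0 <= other < len(positions):
        i, j = positions[k], positions[other]
        group[i], group[j] = children[j], children[i]
    return ET.tostring(root, encoding="unicode")
```

Calling `move_in_group(xml, "g", "c", "up")` on a group holding primitives a, b, c yields the order a, c, b. It is worth running crm_verify on the result before loading it into a live cluster.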

Re: [Linux-HA] Quorum server problem

2008-02-19 Thread Dejan Muhamedagic
Hi, On Mon, Feb 11, 2008 at 06:42:52PM +0200, Atanas Dyulgerov wrote: > Hi all, > > I have a 5-node cluster. I've set up the cluster to stop the > resources on each subcluster when the machine does not have quorum. > However I have 2 of my 5 nodes in a geographically separated > site. I need to find a way

Re: [Linux-HA] General hb_gui and v1 / v2 question

2008-02-19 Thread Dejan Muhamedagic
Hi, On Wed, Feb 13, 2008 at 11:10:19AM -0500, Doug Lochart wrote: > I am starting to see how things work but still have a few questions. > I am still waiting for my 1 terabyte resource to sync with DRBD so I > was going to address my heartbeat configuration. Now I have installed > the 2.1.3-3 Ce

Re: [Linux-HA] 'Shutdown delayed' preventing heartbeat from shutting down

2008-02-19 Thread Dejan Muhamedagic
Hi, On Wed, Feb 13, 2008 at 11:40:05AM -0500, Brian Reichert wrote: > We're running heartbeat 2.0.8 under RHEL4, and in our QA environment, > we're seeing this sequence of log messages: > > Feb 12 21:18:23 vdev-3230 heartbeat: [6941]: ERROR: Message hist queue is > filling up (200 messages in

Re: [Linux-HA] resources not in a group and colocation constraints

2008-02-19 Thread Abraham Iglesias
Hi Dejan, I will look into rsc_order more closely tomorrow. By the way, I also thought of a chain of colocation constraints, but I read that constraints are not bidirectional. That is, if resource1 must run on the same node as resource2, then if resource1 fails and moves to the other node

Re: [Linux-HA] 4node setup with constraints and clones but ...

2008-02-19 Thread www.tiri.li high availability
Hi Dejan, thanks for your answer. I need the following -- is this possible with heartbeat ? how? ON FIRST START the httpd on all nodes MAY ONLY BE STARTED if ALL MYSQL on ALL NODES are running (I need this for testing purposes and initialization) (how can be defined to start mysql on

Re: [Linux-HA] 'Shutdown delayed' preventing heartbeat from shutting down

2008-02-19 Thread Brian Reichert
On Tue, Feb 19, 2008 at 08:53:55PM +0100, Dejan Muhamedagic wrote: > Hard to say, but it is highly recommended to update to 2.1.3. > Those "hist queue filling up" show up from time to time and are > related to various communication problems. Sorry that I can't be > more specific. It's one of the gr

Re: [Linux-HA] Split Brain and not able to repair

2008-02-19 Thread Doug Lochart
On Feb 19, 2008 2:32 PM, Dejan Muhamedagic <[EMAIL PROTECTED]> wrote: > Hi, > > On Tue, Feb 19, 2008 at 12:07:27PM -0500, Doug Lochart wrote: > > I feel your pain. I suffered through this as well as I am just > > learning. I was following a few tutorials and followed them closely > > only to end

Re: [Linux-HA] stonith on an apcmaster

2008-02-19 Thread Michael Brennen
On Tuesday 19 February 2008, Dave Blaschke wrote: > Dejan Muhamedagic wrote: > > Hi, > > > > On Tue, Feb 19, 2008 at 03:38:46AM -0600, Michael Brennen wrote: > >> On Sun, 17 Feb 2008, Michael Brennen wrote: > >>> Heartbeat 2.1.3, crm enabled > >>> > >>> I've built an initial drbd master/slave on tw

[Linux-HA] Re: RE: Re: About pgsql RA.

2008-02-19 Thread HIDEO YAMAUCHI
Hi Serge, I confirmed your patch. There was not a problem with the next case. 1)Started in one node of the resource of one PostgreSQL 2)Started in one node of the resource of two PostgreSQL(pgdata and port are two different PostgreSQL.) But, there was a problem with the next case. 3)When ther

Re: [Linux-HA] drbd heartbeat v2

2008-02-19 Thread Dominik Klein
crm_verify[19814]: 2008/02/19_08:46:57 WARN: unpack_rsc_op: Processing failed op drbd0:1_start_0 on cn2-inverness-co: Error crm_verify[19814]: 2008/02/19_08:46:57 WARN: unpack_rsc_op: Compatability handling for failed op drbd0:1_start_0 on cn2-inverness-co crm_verify[19814]: 2008/02/19_08:46:57 WA