Re: [Linux-HA] Perpetual Newbie Question - gfs2 active/active

2010-09-15 Thread Andrew Beekhof
On Wed, Sep 15, 2010 at 8:12 AM, Peter Larsen wrote: > On Wed, 2010-09-15 at 00:09 +0200, Lars Ellenberg wrote: >> >> Use that clusters-from-scratch guide you talked about earlier, >> I'm referring to >> http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf >> >> Skip the drbd stuff. > > I did -

Re: [Linux-HA] Cloning for STONITH for 2 node configuration

2010-09-13 Thread Andrew Beekhof
On Thu, Sep 9, 2010 at 8:38 AM, harish tene wrote: > Hello, > > > > I have few doubts regarding HA Cloning and using STONITH and would be > grateful if you can help me. > > > > I have a cluster of two server nodes and use ILO device for power cycling > nodes. I went through few threads in mail-arc

Re: [Linux-HA] Help with resource location constraint

2010-09-13 Thread Andrew Beekhof
On Fri, Sep 10, 2010 at 12:48 AM, Phillips, William G (BPHILLIP) wrote: > Hi, > > I'm running a Pacemaker 0.6 two-node active/passive cluster (plan to migrate > to latest in the next couple of months when my management will allow it). I > have a group resource that runs on the active node and a cl

Re: [Linux-HA] Postfix one node, sendmail, on other nodes ?

2010-09-13 Thread Andrew Beekhof
On Mon, Sep 13, 2010 at 11:05 AM, Daniel Machado Grilo wrote: >  Hi, > > we have a cluster of several nodes, and one primitive is "POSTFIX" that can > run in any node.. > > Is there a way to configure something like: >   - If postfix runs on node X at this time, run sendmail on the other node; > >

Re: [Linux-HA] stonith resource

2010-09-05 Thread Andrew Beekhof
rps10 ssh suicide wti_mpcwti_nps > > Thanks a lot for your help, > Ivan > > * Andrew Beekhof [Thu, 2 Sep 2010 08:49:41 +0200]: >> crm_resource --list  should do it if I understand the question > correctly >> >> On Mon, Aug 30, 2010 at

Re: [Linux-HA] How to temporarily disable all HA mechanism without stopping Pacemaker ?

2010-09-05 Thread Andrew Beekhof
On Fri, Sep 3, 2010 at 4:28 PM, Alain.Moulle wrote: > And by the way, if maintenance-mode=false, does it also disable > stonith ? You mean "maintenance-mode=true"? No. That is controlled by the stonith-enabled option. > Thanks > Alain >> Hi again, >> I just found the parameter "maintenance-mode

Re: [Linux-HA] stonith resource

2010-09-01 Thread Andrew Beekhof
crm_resource --list should do it if I understand the question correctly On Mon, Aug 30, 2010 at 10:16 AM, Ivan Gromov wrote: > Dear all, > > Is it possible to get name of stonith (clone id) resource from some > command? For instance, I have cluster with stonith like presented below. > I want to

Re: [Linux-HA] Help with resource location constraint

2010-08-27 Thread Andrew Beekhof
On Mon, Aug 23, 2010 at 2:01 PM, Phillips, William G (BPHILLIP) wrote: > Hi, > > I'm running Pacemaker 0.6 two-node active/passive cluster (plan to migrate > to latest in the next couple of months when my management will allow it). I > have a group resource that runs on the active node and a clone

Re: [Linux-HA] Temporarily stop monitoring a resource

2010-08-26 Thread Andrew Beekhof
On Thu, Aug 12, 2010 at 12:18 AM, Bart Coninckx wrote: > Hi, > > We're using the Xen resource agents with an operation "monitor" that > repeats every 60 seconds. > For backing up the Xen machines, we use "xm save" and "xm restore" > which takes them offline for a short amount of time (and copies t

Re: [Linux-HA] Stopping colocation resources?

2010-08-26 Thread Andrew Beekhof
On Wed, Aug 18, 2010 at 5:09 PM, Aaron Cline wrote: > Hi: > > Hopefully these are a couple of easy questions, but I haven't really found a > good way to do what I'm looking for.  I'm using heartbeat 3.0.3 and > pacemaker 1.0.9.  Below is my config.  I have 4 IPaddr2 resources that are > grouped wi

Re: [Linux-HA] Pacemaker RA for ocfs2 history ?

2010-08-24 Thread Andrew Beekhof
On Wed, Aug 18, 2010 at 5:01 PM, Alain.Moulle wrote: > Hi Dejan > > just some last questions on this subject : > > 1/ who has the responsability of the pcmk Stack ? > >        is it someone from Pacemaker ? >        or Oracle ? I imagine its a joint effort. They provide an API for supplying ocfs2

Re: [Linux-HA] Question around ocf:pacemaker:oc2b

2010-08-24 Thread Andrew Beekhof
On Tue, Aug 17, 2010 at 3:46 PM, Alain.Moulle wrote: > Hi, > > OK I finally got clone-o2cb started on both nodes and I manually can mount > all my ocfs2 FS on both sides, but just to see, I launch a test which on > both > nodes in parallel which loops on the mount of all my ocfs2 FS , and then > u

Re: [Linux-HA] Starting over with Pacemaker / hoping to make somedocs

2010-08-24 Thread Andrew Beekhof
On Mon, Aug 16, 2010 at 4:02 PM, Peter Sylvester wrote: > Hey guys. > > Unfortunatly on Friday my client decided to scrap the idea of going with > pacemaker.  I do appreciate all of your help and have taken plenty of notes > on implementation and such and if you guys are interested I can still wri

Re: [Linux-HA] time to fork heartbeat?

2010-08-16 Thread Andrew Beekhof
On Mon, Aug 16, 2010 at 7:30 PM, Dimitri Maziuk wrote: > On Thursday 12 August 2010 19:32, Lars Ellenberg wrote: >> > On Wed, Aug 11, 2010 at 03:59:34PM -0700, David Lang wrote: > >> > > >> This is really starting to sound like we need to fork heartbeat back >> > > >> to the 2.x or thereabouts whe

Re: [Linux-HA] Starting with Pacemaker

2010-08-09 Thread Andrew Beekhof
On Fri, Aug 6, 2010 at 10:11 PM, Peter Sylvester wrote: > Hey guys.  I figured out what was wrong with my code yesterday, but the > bad/good (depending) is that the client is now leaning towards using > pacemaker in addition to heartbeat. > > I was wondering if anyone knew of any good documentatio

Re: [Linux-HA] Advice on how to configure two "basic" servers

2010-08-05 Thread Andrew Beekhof
On Thu, Aug 5, 2010 at 12:12 PM, Gary Sedgwick wrote: > Hi, > > I have two identical basic Dell Poweredge R300 servers - when I say > "basic", I mean no DRAC card, no redundant power supplies, etc.  Each has > a smaller disk (120GB) and larger disk (1TB), as well as 2 NICs.  I'm > planning to run

Re: [Linux-HA] Getting Heartbeat to start - connect to /usr/lib/heartbeat/stonithd

2010-08-04 Thread Andrew Beekhof
he office today but they are versions from the cluster labs repo >> >> [clusterlabs] >> name=High Availability/Clustering server technologies (fedora-13) >> baseurl=http://www.clusterlabs.org/rpm/fedora-13 >> type=rpm-md >> gpgcheck=0 >> enabled=1 >> >

Re: [Linux-HA] IPaddr2 in ClusterIP mode fail-back causes complete loss of connectivity to IP.

2010-08-04 Thread Andrew Beekhof
On Wed, Aug 4, 2010 at 12:56 AM, Brett Delle Grazie wrote: > Hi, > > I have two nodes (RHEL 5.5) configured in a cluster > (Corosync/Pacemaker). > > I have an IPaddr2 and Apache resources configured as clones on those > systems (configuration is shown below). > > The IPaddr2 is configured for load

Re: [Linux-HA] how to use mac hadware with ipaddr2

2010-08-04 Thread Andrew Beekhof
On Mon, Aug 2, 2010 at 6:09 PM, Dejan Muhamedagic wrote: > Hi, > > On Tue, Jul 27, 2010 at 12:21:06PM +0200, Trujillo Carmona, Antonio wrote: >> >> I'm try to use Ipaddr2 in order to have the same MAC ethernet in a two >> node corosync cluster but I can't make it work > > It doesn't work how? > >>

Re: [Linux-HA] Problems with 3rd cluster member

2010-08-04 Thread Andrew Beekhof
Nowhere near enough logs included to make any informed comment on what the problem might be. On Tue, Aug 3, 2010 at 11:08 PM, Aaron Cline wrote: > Hi all: > > I'm trying to setup a 3 member cluster to do HTTP load balancing.  The > cluster members are in a public cloud where I can't use multi-cas

Re: [Linux-HA] Getting Heartbeat to start - connect to /usr/lib/heartbeat/stonithd

2010-08-02 Thread Andrew Beekhof
On Sun, Aug 1, 2010 at 5:21 PM, Jason Fitzpatrick wrote: > Hi All > > I should clarify this a bit, > > I have just upgraded a test cluster from FC8 to FC13 and with it the > latest version of Heartbeat / Pacemaker / Corosync (cluster  is > heartbeat / pacemaker, corosync is installed but not runni

Re: [Linux-HA] How to get list of clones

2010-07-29 Thread Andrew Beekhof
On Wed, Jul 28, 2010 at 7:06 PM, Ivan Gromov wrote: > Hello, everyone > > In my perl script, which looks after the cluster, I use crm_failcount -G > -U node_id -r resource_id command to get failcount of the resource_id. > But I have a problem with clones because I don't know how to get list of > c

Re: [Linux-HA] Replacement for quorumd

2010-07-26 Thread Andrew Beekhof
On Wed, Jul 21, 2010 at 6:50 PM, Benjamin Lawetz wrote: > Hello all, > >     I've been using heartbeat for a couple of years now to manage a > couple of virtual IPs for replicating Mysql servers. I have to redo a > similar setup now and was looking to use the new heartbeat 3. > > I was looking for

Re: [Linux-HA] Setting up Linux-HA on Ubuntu

2010-07-26 Thread Andrew Beekhof
Try the Clusters from Scratch document: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf It's written for Fedora, but 99.9% will be unchanged for Ubuntu. On Wed, Jul 21, 2010 at 5:58 PM, Igor Chudov wrote: > I set up Linux-HA on Ubuntu Lucid. > > I followed instructions on this page: > ht

Re: [Linux-HA] What does "Failed Actions: Not Installed" Mean?

2010-07-21 Thread Andrew Beekhof
On Wed, Jul 21, 2010 at 1:25 PM, Robinson, Eric wrote: >> Think about what "not installed" might mean. > > Seriously, dude, I've been up through the night and maybe I'm not > thinking clearly. Have some mercy. On the surface "not installed" means > something wasn't installed. But what? Something

Re: [Linux-HA] What does "Failed Actions: Not Installed" Mean?

2010-07-21 Thread Andrew Beekhof
Think about what "not installed" might mean. On Wed, Jul 21, 2010 at 12:46 PM, Robinson, Eric wrote: > When I do... > >        crm resource start p_MySQL_173 > > crm_mon shows... > >        p_MySQL_173    (ocf::heartbeat:mysql_173):     Started > ha06.mydomain.com > >        Failed actions: >    

Re: [Linux-HA] info messages re drbd resources

2010-07-21 Thread Andrew Beekhof
On Wed, Jul 21, 2010 at 9:52 AM, Matt wrote: > On 21 July 2010 07:14, Andrew Beekhof wrote: >> On Tue, Jul 20, 2010 at 6:04 PM, Matt wrote: >>> I've set up a two node cluster with >>> drbd 8.3.8-1.el5.centos >>> pacemaker 1.0.5-4.1 >>> heartbeat

Re: [Linux-HA] info messages re drbd resources

2010-07-20 Thread Andrew Beekhof
On Tue, Jul 20, 2010 at 6:04 PM, Matt wrote: > I've set up a two node cluster with > drbd 8.3.8-1.el5.centos > pacemaker 1.0.5-4.1 > heartbeat 3.0.0-33.2 > > Everything seems to be working correctly, I can move resources about > with no problems.  I'm getting lots of info messages in > /var/log/me

Re: [Linux-HA] Order rule ignored when starting a clone

2010-07-15 Thread Andrew Beekhof
Fixed in http://hg.clusterlabs.org/pacemaker/1.1/rev/aea182c3c930 ready for backport On Thu, Jul 15, 2010 at 11:31 AM, Andrew Beekhof wrote: > On Wed, Jul 14, 2010 at 5:48 PM, Matthew Richardson > wrote: >> Andrew Beekhof wrote: >>> I need the xml from which the dot graph

Re: [Linux-HA] Order rule ignored when starting a clone

2010-07-15 Thread Andrew Beekhof
On Wed, Jul 14, 2010 at 5:48 PM, Matthew Richardson wrote: > Andrew Beekhof wrote: >> I need the xml from which the dot graph was generated. Sorry. >> > > The xml for the above configuration (crm configure show xml). Not quite, there's no status section. But I think I

Re: [Linux-HA] Order rule ignored when starting a clone

2010-07-14 Thread Andrew Beekhof
On Thu, Jul 8, 2010 at 4:16 PM, Matthew Richardson wrote: > It appears that when creating a clone as the second resource in an order > rule, the rule is ignored when deciding when to start the clone. > > I'm using Pacemaker 1.0.9.1 - I've cut it down to the following simple > example, but it seems

Re: [Linux-HA] Two curious problems with heartbeat and ldirector

2010-07-12 Thread Andrew Beekhof
On Fri, Jul 9, 2010 at 4:25 PM, Schaefer, Dirk Alexander wrote: > Hi, > > well, that's a good to know information ;) it's latest version offered by > gentoo's package manager. I believe they've made 3.0 and pacemaker since last week. The people involved hang out on #gentoo-cluster on freenode i

Re: [Linux-HA] configuration pb on RHEL6 :withstonithonexternal/ipmi

2010-07-07 Thread Andrew Beekhof
On Wed, Jul 7, 2010 at 9:22 AM, Alain.Moulle wrote: > Hi, > ok , but of course without both pcmk... parameters Why "of course"? There is no point testing without those. > as crm returns ERROR > for these parameters. You'll likely need to use -f or --force to tell the crm to accept the changes a

Re: [Linux-HA] two nodes insolate

2010-07-06 Thread Andrew Beekhof
On Tue, Jul 6, 2010 at 1:01 PM, Trujillo Carmona, Antonio wrote: > > I'm try to setup a 2 nodes cluster for HA, after configure it I began to test > it but fail. > I configured a ping node and it got offline always. > I try to configure a ping resource and neither it work. > always I got: > -

Re: [Linux-HA] Question about HA NFSv4 and Pacemaker

2010-07-06 Thread Andrew Beekhof
On Tue, Jul 6, 2010 at 10:32 AM, Alain.Moulle wrote: > Hi, > > a general question about NFS v4 and Pacemaker; it seems that NFSv4 has > a native HA functionnality but : > > Is it better to use this native functionnality ? > > Or to configure the HA of NFSv4 server under Pacemaker with nfsserver sc

Re: [Linux-HA] configuration pb on RHEL6 :withstonithonexternal/ipmi

2010-07-06 Thread Andrew Beekhof
On Tue, Jul 6, 2010 at 9:25 AM, Alain.Moulle wrote: > Hi, > what can I send to you to complete information ? > (nothing in syslog) > Tell me , and I'll send to you for sure, because it is > an important issue for me. I need the pacemaker/corosync logs from the time when the cluster tried to shoot

Re: [Linux-HA] configuration pb on RHEL6 :withstonithonexternal/ipmi

2010-07-05 Thread Andrew Beekhof
On Mon, Jul 5, 2010 at 3:54 PM, Alain.Moulle wrote: > Hi Andrew > > It seems not to work : > crm configure primitive restofencenode3 stonith:fence_ipmilan params > ipaddr='BMC ipaddr of node3' login='mylogin' passwd='mypasswd' > action='reboot'     pcmk_host_check=static-list > pcmk_host_list="nod

Re: [Linux-HA] configuration pb on RHEL6 :withstonithonexternal/ipmi

2010-07-05 Thread Andrew Beekhof
On Mon, Jul 5, 2010 at 2:55 PM, Alain.Moulle wrote: > Hi Andrew, > > sorry for the delay but I had a HW pb on my server. > > Now it is fixed and I'd like to try the new parameters you gave to me, > but I don't understand > the parameter : pcmk_host_list="whitespace list of hosts the device > contr

Re: [Linux-HA] Heartbeat Clone with vIPs constraints

2010-07-02 Thread Andrew Beekhof
On Fri, Jul 2, 2010 at 1:54 PM, Xeno1234 wrote: > > > Andrew Beekhof-3 wrote: >> >> >>> You should really think about upgrading to 3.0 + Pacemaker 1.0 (that >>> where the crm lives now). >>>   http://www.clusterlabs.org/rpm/   <--- EPEL =~ RHEL

Re: [Linux-HA] Heartbeat Clone with vIPs constraints

2010-07-01 Thread Andrew Beekhof
On Wed, Jun 30, 2010 at 3:10 PM, Xeno1234 wrote: > > Hi, > > I am currently trying to setup a cluster using heartbeat. Unfortunatly it > does not do what I want to do. I am using heartbeat 2.1.4-11 on a readhat > system. You should really think about upgrading to 3.0 + Pacemaker 1.0 (that where t

Re: [Linux-HA] Permission Issues when CRM tries to sign into heartbeat process

2010-07-01 Thread Andrew Beekhof
On Wed, Jun 30, 2010 at 5:18 PM, Harakiri wrote: > >> > apiauth default gid=haclient >> > >> > so basically i read another thread that there could be >> a gid-> name mapping problem. Im not certain which of the >> above lines are enough to fix it (i guess the last one?!). >> >> Yep, last one is th

Re: [Linux-HA] Permission Issues when CRM tries to sign into heartbeat process

2010-06-30 Thread Andrew Beekhof
On Tue, Jun 29, 2010 at 2:06 PM, Harakiri wrote: > HB Version 2.14 (yes sorry, cant upgrade) on Sparc Solaris10 > > Im having an issue that crm is respawning: > > heartbeat[10117]: 2010/06/29_13:03:17 ERROR: Respawning client > "/opt/heartbeat/lib/heartbeat/attrd": > heartbeat[10117]: 2010/06/29_1

Re: [Linux-HA] configuration pb on RHEL6 :withstonithonexternal/ipmi

2010-06-29 Thread Andrew Beekhof
On Tue, Jun 29, 2010 at 1:48 PM, Alain.Moulle wrote: > > /usr/sbin/fence_ipmilan -a -A MD5 -p mypass -l > mylogin -o list -v > N/A > /usr/sbin/fence_ipmilan -a -A NONE -p mypass -l > mylogin -o list -v > N/A > /usr/sbin/fence_ipmilan -a -A PASSWORD -p mypass > -l mylogin -o list -v > N/A I'm t

Re: [Linux-HA] configuration pb on RHEL6 :withstonithonexternal/ipmi

2010-06-29 Thread Andrew Beekhof
On Tue, Jun 29, 2010 at 11:09 AM, Alain.Moulle wrote: > Hi, > > Before tests with pacemaker and fence_ipmilan, I tried fence_ipmilan > with option reboot and it works fine, node2 has rebooted. > > And during test with pacemaker/fence_ipmilan, I don't see any trace in > syslog > of the call of fenc

Re: [Linux-HA] configuration pb on RHEL6 : withstonithonexternal/ipmi

2010-06-28 Thread Andrew Beekhof
On Thu, Jun 24, 2010 at 2:35 PM, Alain.Moulle wrote: > Hi again, > > no way to make fence working on RHEL6 : > > I have 3 nodes in cluster and the status before the test is (crm_mon) : > > Online: [ node1 node2 node3 ] > > restofencenode1        (stonith:fence_ipmilan):        Started node2 > rest

Re: [Linux-HA] configuration pb on RHEL6 : with stonithonexternal/ipmi

2010-06-23 Thread Andrew Beekhof
On Wed, Jun 23, 2010 at 1:52 PM, Dejan Muhamedagic wrote: > On Tue, Jun 22, 2010 at 09:56:42PM +0200, Andrew Beekhof wrote: >> On Tue, Jun 22, 2010 at 9:50 PM, Andrew Beekhof wrote: >> > On Tue, Jun 22, 2010 at 5:04 PM, Dejan Muhamedagic >> > wrote: >> >

Re: [Linux-HA] configuration pb on RHEL6 : with stonithonexternal/ipmi

2010-06-23 Thread Andrew Beekhof
On Wed, Jun 23, 2010 at 10:02 AM, Alain.Moulle wrote: > Hi, > I'm trying to configure : > crm configure primitive NEWrestofencedevha1 stonith:fence_ipmilan params > ipaddr=xxx.xxx.xxx.xxx login=mylogin passwd=mypasswd action=reboot  meta > target-role=Stopped > > and it returns some warnings : > W

Re: [Linux-HA] configuration pb on RHEL6 : with stonithonexternal/ipmi

2010-06-22 Thread Andrew Beekhof
On Tue, Jun 22, 2010 at 9:50 PM, Andrew Beekhof wrote: > On Tue, Jun 22, 2010 at 5:04 PM, Dejan Muhamedagic > wrote: >> Hi, >> >> On Tue, Jun 22, 2010 at 04:02:10PM +0200, Andrew Beekhof wrote: >>> On Tue, Jun 22, 2010 at 4:01 PM, Alain.Moulle wrote: >>

Re: [Linux-HA] configuration pb on RHEL6 : with stonithonexternal/ipmi

2010-06-22 Thread Andrew Beekhof
On Tue, Jun 22, 2010 at 5:04 PM, Dejan Muhamedagic wrote: > Hi, > > On Tue, Jun 22, 2010 at 04:02:10PM +0200, Andrew Beekhof wrote: >> On Tue, Jun 22, 2010 at 4:01 PM, Alain.Moulle wrote: >> > Ooops, sorry again, it seems that the rpm with RH fence methods was >>

Re: [Linux-HA] configuration pb on RHEL6 : with stonithonexternal/ipmi

2010-06-22 Thread Andrew Beekhof
On Tue, Jun 22, 2010 at 4:01 PM, Alain.Moulle wrote: > Ooops, sorry again, it seems that the rpm with RH fence methods was > not installed ... You'll also need different parameters. Try: stonith_admin --metadata --agent fence_ipmilan (Seems "crm ra info stonith:fence_ipmilan" isn't working f

Re: [Linux-HA] configuration pb on RHEL6 : with stonith onexternal/ipmi

2010-06-22 Thread Andrew Beekhof
On Tue, Jun 22, 2010 at 2:29 PM, Alain.Moulle wrote: > Hi Dejan and Andrew, > > many thanks, that's a quite unexpected change for me ... but you're right : > > crm ra info stonith:external/ipmi > /bin/sh: stonith: command not found > > rpm -qplv cluster-glue-1.0.5-1.el6.x86_64.rpm | grep stonith >

Re: [Linux-HA] configuration pb on RHEL6 : with stonith on external/ipmi

2010-06-22 Thread Andrew Beekhof
On Tue, Jun 22, 2010 at 2:09 PM, Dejan Muhamedagic wrote: > RHEL6 doesn't include the stonith plugins from glue? Correct. Pretty sure I mentioned this previously. > Support issue? No. Policy decision to reduce QE load. No reason to support two sets of things that do the same job.

Re: [Linux-HA] configuration pb on RHEL6 : with stonith on external/ipmi

2010-06-22 Thread Andrew Beekhof
On Tue, Jun 22, 2010 at 1:21 PM, Dejan Muhamedagic wrote: > Hi, > > On Tue, Jun 22, 2010 at 11:46:20AM +0200, Alain.Moulle wrote: >> Hi >> >> It seems that there is a pb on RHEL6 pacemaker release: >> pacemaker-1.1.2-2.el6.x86_64 >> >> crm configure primitive restofencenode1 stonith:external/ipmi

Re: [Linux-HA] configuration pb on RHEL6 : with stonith on external/ipmi

2010-06-22 Thread Andrew Beekhof
On Tue, Jun 22, 2010 at 11:46 AM, Alain.Moulle wrote: > Hi > > It seems that there is a pb on RHEL6 pacemaker release: > pacemaker-1.1.2-2.el6.x86_64 > > crm configure primitive restofencenode1 stonith:external/ipmi params RHEL6 does not include the linux-ha fence agents. You'll need to use the o

Re: [Linux-HA] IPaddr2 unique_clone_address

2010-06-14 Thread Andrew Beekhof
Yep. IIRC it used to work. Haven't used it in a long time though ___ Linux-HA mailing list Linux-HA@lists.linux-ha.org http://lists.linux-ha.org/mailman/listinfo/linux-ha See also: http://linux-ha.org/ReportingProblems

Re: [Linux-HA] upgrade problem

2010-06-10 Thread Andrew Beekhof
On Thu, Jun 10, 2010 at 10:23 AM, Dejan Muhamedagic wrote: > Hi, > > On Wed, Jun 09, 2010 at 10:09:33PM -0400, Miles Fidelman wrote: >> Hi Folks, >> >> I'm not sure if this belongs on the linux-ha or pacemaker list, so >> >> I just upgraded a Debian Lenny cluster from hearbeat2 to >> heartbeat

Re: [Linux-HA] Pacemaker/Corosync Question

2010-06-10 Thread Andrew Beekhof
On Thu, Jun 10, 2010 at 12:33 AM, David wrote: > I have pacemaker/corosync installed on a 2 server cluster of CentOS 5.5 > boxes.  Currently these boxes are setup with an iSCSI SAN volume, OCFS2 > file system and pacemaker is configured to manage Apache active/active. > > Many of the how to docume

Re: [Linux-HA] Colocation, location, auto-failback=off

2010-06-08 Thread Andrew Beekhof
On Fri, Jun 4, 2010 at 11:25 PM, Tony Hunter wrote: > On Thu, Jun 03, 2010 at 07:18:16PM -0300, Diego Woitasen wrote: >> On Wed, Jun 2, 2010 at 7:43 AM, Andrew Beekhof wrote: >> >> > On Sat, May 29, 2010 at 3:54 AM, Diego Woitasen >> > wrote: >> > >

Re: [Linux-HA] Colocation, location, auto-failback=off

2010-06-02 Thread Andrew Beekhof
On Sat, May 29, 2010 at 3:54 AM, Diego Woitasen wrote: > Hi, >  * I have three nodes: "ha1", "ha2" y "ha3". >  * Three resources: "sfex", "xfs_fs", "ip". >  * "sfex" and "xfs_fs" are members of a group called "xfs_grp". >  * "xfs_grp" can run on any node but "ip" resource can run on "ha1" or > "ha

Re: [Linux-HA] Active-Active nfs storage

2010-05-31 Thread Andrew Beekhof
On Mon, May 31, 2010 at 4:22 PM, RaSca wrote: > Hi all, > I have a cluster with two nodes configured to mount two drbd, with LVM > and filesystem. I need to put each drbd on a different node, for an > active-active setup, like a storage, so I have two groups like these: > > group share-a share-a-i

Re: [Linux-HA] Colocation, location, auto-failback=off

2010-05-30 Thread Andrew Beekhof
On Sat, May 29, 2010 at 3:54 AM, Diego Woitasen wrote: > Hi, >  * I have three nodes: "ha1", "ha2" y "ha3". >  * Three resources: "sfex", "xfs_fs", "ip". >  * "sfex" and "xfs_fs" are members of a group called "xfs_grp". >  * "xfs_grp" can run on any node but "ip" resource can run on "ha1" or > "ha

Re: [Linux-HA] RHEL5 Heartbeat v2 with DRBD/NFS active/standby failover

2010-05-27 Thread Andrew Beekhof
Try something more recent from http://www.clusterlabs.org/rpm It has a script called nfsserver by the looks of it On Wed, May 26, 2010 at 5:15 PM, Alasdair Gow wrote: > Hi All, > > I'm trying to setup Heartbeat v2 with DRBD and NFS. > > I have added the epel repo to get heartbeat and sourced DRB

Re: [Linux-HA] stderr reboot

2010-05-27 Thread Andrew Beekhof
On Wed, May 26, 2010 at 6:05 PM, Sam Reidland wrote: > I have been working on a simple 2 node 2 resource cluster using > Pacemaker 1.0.7 and heartbeat 3.0.2. The two resources are IPaddr and > our application. When our application was started, the box would reboot > (actually a clean restart). Aft

Re: [Linux-HA] Pb with last Pacemaker and corosync releases available for RHEL5 ?

2010-05-25 Thread Andrew Beekhof
udit.so.0...(no debugging symbols found)...done. > Loaded symbols for /lib/libaudit.so.0 > Reading symbols from /lib/libnss_files.so.2...(no debugging symbols > found)...done. > Loaded symbols for /lib/libnss_files.so.2 > Core was generated by `corosync'. > Program terminate

Re: [Linux-HA] Harmless log entries

2010-05-21 Thread Andrew Beekhof
On Thu, May 20, 2010 at 3:30 PM, mike wrote: > Gianluca Cecchi wrote: >> On Thu, May 20, 2010 at 2:45 PM, mike wrote: >> >> >>> ok, I actually went ahead and did a test on my cluster. The results did >>> not occur as I would have expected. >>> >>> I failed ldirectord twice on the main node. I wai

Re: [Linux-HA] Pb with last Pacemaker and corosync releases available for RHEL5 ?

2010-05-21 Thread Andrew Beekhof
is there a core file in /var/lib/corosync? On Fri, May 21, 2010 at 11:57 AM, Alain.Moulle wrote: > Hi, > > FYI , it was working fine with : > corosync-1.2.1-1.el5 > corosynclib-1.2.1-1.el5 > pacemaker-1.0.8-6.el5 > pacemaker-libs-1.0.8-6.el5 > > then I update to : > corosync-1.2.2-1.1.el5 > coros

Re: [Linux-HA] Harmless log entries

2010-05-19 Thread Andrew Beekhof
On Wed, May 19, 2010 at 2:49 PM, Vadym Chepkov wrote: > > On May 19, 2010, at 8:36 AM, mike wrote: > >> I assume Andrew means 15 minutes * 60 = 900 seconds * 1000 = 900000 >> milliseconds > > I gathered that much, I am just surprised, that's it. Do I have to always > specify time units to be cert
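
The unit conversion discussed in this message can be checked with a few lines of Python. This is only a minimal sketch of the arithmetic; the helper name `minutes_to_ms` is illustrative and is not part of Pacemaker or any of its tools.

```python
# Convert a duration given in minutes into milliseconds.
MS_PER_SECOND = 1000
SECONDS_PER_MINUTE = 60


def minutes_to_ms(minutes: int) -> int:
    """Return the number of milliseconds in the given number of minutes."""
    return minutes * SECONDS_PER_MINUTE * MS_PER_SECOND


# 15 minutes -> 900 seconds -> 900000 milliseconds
print(minutes_to_ms(15))
```

Writing the unit explicitly in the configuration avoids having to do this conversion by hand.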

Re: [Linux-HA] Harmless log entries

2010-05-19 Thread Andrew Beekhof
On Wed, May 19, 2010 at 11:30 AM, Gianluca Cecchi wrote: > On Wed, May 19, 2010 at 10:17 AM, Andrew Beekhof wrote: > >>  > Also, in monitor available fields for a resource there are: >> > >> > - interval, default 0 >> > Does it mean no monitor at all

Re: [Linux-HA] Harmless log entries

2010-05-19 Thread Andrew Beekhof
On Wed, May 19, 2010 at 5:22 PM, mike wrote: > Andrew Beekhof wrote: >>> which is what my DBA was looking for. He wants mysql to failover if >>> there are 3 successive failures of MySQL but only if those successive >>> failures occur within 15 minutes. >>>

Re: [Linux-HA] Harmless log entries

2010-05-19 Thread Andrew Beekhof
On Wed, May 19, 2010 at 9:25 AM, Gianluca Cecchi wrote: > On Wed, May 19, 2010 at 8:51 AM, Andrew Beekhof wrote: > >> On Tue, May 18, 2010 at 2:05 PM, mike wrote: >> > So now that I have a few clusters up and running after a few problems >> > I've st

Re: [Linux-HA] Harmless log entries

2010-05-18 Thread Andrew Beekhof
On Tue, May 18, 2010 at 2:05 PM, mike wrote: > So now that I have a few clusters up and running after a few problems > I've started looking at the logs with some regularity. I'm hoping > someone can confirm my thoughts on some entries in the ha-log. > > 1. PEngine Recheck Timer (I_PE_CALC) just po

Re: [Linux-HA] Initial resource location

2010-05-17 Thread Andrew Beekhof
On Mon, May 17, 2010 at 9:46 AM, RaSca wrote: > > What should else can i check? logs, cibadmin output. the usual stuff

Re: [Linux-HA] Initial resource location

2010-05-16 Thread Andrew Beekhof
On Sat, May 15, 2010 at 10:11 PM, RaSca wrote: > On Sat 15 May 2010 20:31:22 CET, Andrew Beekhof wrote: >> >> If the first node is up long enough to start resources, then >> default-resource-stickiness="INFINITY" is going to stop them from >>

Re: [Linux-HA] Resources's constraints and movements question

2010-05-15 Thread Andrew Beekhof
On Mon, May 10, 2010 at 12:09 PM, Gianluca Cecchi wrote: > Hello, > suppose I have group of resources named G1 and a resource named R2. > I define an order R2 after G1 and a colocation constraint of -inf so > that they run on different nodes (2 nodes overall). > At runtime I have G1 on node1 and R

Re: [Linux-HA] Initial resource location

2010-05-15 Thread Andrew Beekhof
If the first node is up long enough to start resources, then default-resource-stickiness="INFINITY" is going to stop them from being moved to satisfy your location constraints. On Fri, May 14, 2010 at 1:52 PM, RaSca wrote: > On Fri 14 May 2010 13:34:35 CET, Andrew Beekh
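
Why an INFINITY stickiness overrides a location preference of 200 can be illustrated with a deliberately simplified scoring model. This is a sketch under stated assumptions, not Pacemaker's actual placement code: it assumes each node's total is its location score plus the stickiness bonus on the node where the resource already runs, and that the highest total wins. The node names `node-a` and `node-b` and both function names are hypothetical.

```python
# Simplified model of stickiness versus location preference.
# Assumption: Pacemaker represents INFINITY as a score of 1,000,000.
INFINITY = 1_000_000


def add_scores(a: int, b: int) -> int:
    """Add two scores, capping the result at +/-INFINITY."""
    return max(-INFINITY, min(INFINITY, a + b))


def preferred_node(location_scores: dict, current_node: str, stickiness: int) -> str:
    """Return the node with the highest total score in this toy model."""
    totals = {}
    for node, score in location_scores.items():
        # Only the node currently running the resource gets the stickiness bonus.
        bonus = stickiness if node == current_node else 0
        totals[node] = add_scores(score, bonus)
    return max(totals, key=totals.get)


# The resource started on node-a; a location rule prefers node-b with score 200.
scores = {"node-a": 0, "node-b": 200}
print(preferred_node(scores, "node-a", INFINITY))  # stays on node-a
print(preferred_node(scores, "node-a", 100))       # moves to node-b
```

In this model a finite stickiness lower than the location score lets the resource move to the preferred node, while INFINITY pins it wherever it first started.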

Re: [Linux-HA] Initial resource location

2010-05-14 Thread Andrew Beekhof
On Fri, May 14, 2010 at 12:55 PM, RaSca wrote: > Hi all, > why even if I've declared these location directives: > > location cli-prefer-share-a share-a \ >        rule $id="cli-prefer-rule-share-a" 200: #uname eq ubuntu-nodo1 > location cli-prefer-share-b share-b \ >        rule $id="cli-prefer-ru

Re: [Linux-HA] Difference between move and migrate a resource

2010-05-13 Thread Andrew Beekhof
On Thu, May 13, 2010 at 10:20 AM, RaSca wrote: > Hi all, > is there a difference between these two commands: > > crm resource migrate share-a node1 > > crm resource move share-a node1 > > both of them put the resource on node1 and both of them automatically > add a location constraint in the clust

Re: [Linux-HA] crm_failcount , set failcount for resource on the node

2010-05-12 Thread Andrew Beekhof
On Tue, May 11, 2010 at 1:20 PM, Ivan Gromov wrote: > Hello, > > I have strange behaviour of crm_failcount. > For instance, I have two nodes and resource res1 started on node1. I > want to change (for some reason) failcount on another node2. So, I carry > out command on node1: crm_failcount -N nod

Re: [Linux-HA] How to tune the timer before fencing ?

2010-05-12 Thread Andrew Beekhof
There is no timer in pacemaker, sorry. We are event driven. At best you'd configure the underlying messaging layer to allow a generous delay if the node is not responding. Like deadtime on heartbeat. On Wed, May 12, 2010 at 11:45 AM, Alain.Moulle wrote: > Hi, > > Just to be sure, I would like to

Re: [Linux-HA] Issues with Heartbeat/DRBD over Internet connection

2010-05-11 Thread Andrew Beekhof
On Tue, May 11, 2010 at 10:35 PM, Mike Sweetser wrote: > Hello, > > I've set up a DRBD and Heartbeat configuration communicating over an > Internet connection, rather than internal.  The servers are running CentOS > 5.4, with DRBD 8.3.2 and Heartbeat 3.0.3, out of the CentOS repository. > > I star

Re: [Linux-HA] iptables?

2010-05-10 Thread Andrew Beekhof
That's because you're posting on the heartbeat list. But IIRC, you need $port and $port + 1 open for UDP, where $port is from corosync.conf. No, I don't have the commands you can paste into a terminal. On Mon, May 10, 2010 at 5:39 PM, Brodie, Kent wrote: > Hi-- I posted a query a week ago, didn't h

Re: [Linux-HA] Heartbeat vs OpenAIS

2010-05-06 Thread Andrew Beekhof
On Thu, May 6, 2010 at 9:29 AM, Florian Haas wrote: > On 05/06/2010 08:59 AM, Andrew Beekhof wrote: >> About the only time I start heartbeat is for a few days before a release. >> And even then only for 1.0 releases, 1.1 is only tested against corosync. >> >>> Proba

Re: [Linux-HA] Heartbeat vs OpenAIS

2010-05-06 Thread Andrew Beekhof
On Tue, May 4, 2010 at 9:30 PM, Florian Haas wrote: > On 05/04/2010 07:30 PM, Lars Marowsky-Bree wrote: >> On 2010-04-25T11:39:10, Smaïne Kahlouch wrote: >> >>> Do we have to move from Heartbeat to OpenAIS ? Now or in the future ? >>> What are the differences between these two project ? >>> Will

Re: [Linux-HA] A good cibadmin guide

2010-05-05 Thread Andrew Beekhof
Try something up at: http://www.clusterlabs.org/doc On Wed, May 5, 2010 at 10:41 PM, mike wrote: > Hi guys, > I wonder if someone might be able to point me to a good cibadmin guide. > Maybe its something someone wrote on their own, I really am not picky > here. I would like to get my hands on a

Re: [Linux-HA] o2cb pacemaker agent and

2010-05-03 Thread Andrew Beekhof
On Mon, May 3, 2010 at 10:57 AM, Gianluca Cecchi wrote: > On Mon, May 3, 2010 at 10:51 AM, Andrew Beekhof wrote: > >> On Mon, May 3, 2010 at 10:23 AM, Gianluca Cecchi >> wrote: >> > On Mon, May 3, 2010 at 9:22 AM, Andrew Beekhof >> wrote: >> > >>

Re: [Linux-HA] o2cb pacemaker agent and

2010-05-03 Thread Andrew Beekhof
On Mon, May 3, 2010 at 10:23 AM, Gianluca Cecchi wrote: > On Mon, May 3, 2010 at 9:22 AM, Andrew Beekhof wrote: > >> [snip] >> You would need to rebuild ocfs2-tools with pacemaker support turned on. >> >> Hmm, thanks for answering. > I did have the same idea/im

Re: [Linux-HA] MySQL and 4 instances

2010-05-03 Thread Andrew Beekhof
On Thu, Apr 29, 2010 at 7:37 PM, mike wrote: > Hello all, > > We had a simple 2 node MySQL cluster - nothing special. One instance > that worked perfectly. We recently added 3 instances and now we're > having some issues. The problem is that Heartbeat issues a MySQL Status > immediately after the

Re: [Linux-HA] o2cb pacemaker agent and

2010-05-03 Thread Andrew Beekhof
On Fri, Apr 30, 2010 at 4:43 PM, Gianluca Cecchi wrote: > Hello, > on rh el 5.5 trying to configure ocfs2 1.4 with pacemaker 1.0.8. > It seems I have some problems with programs/kernel modules missing. > > I downloaded rpm for pacemaker from clusterlabs repo and rpm for ocfs2 from > Oracle repo: >

Re: [Linux-HA] More than one drbd resource possible in pacemaker?

2010-05-03 Thread Andrew Beekhof
On Fri, Apr 30, 2010 at 9:46 AM, Gianluca Cecchi wrote: > Hello, > I have configured a drbd0 resource (nfsdata) in pacemaker, acting as > active/passive, using the linbit resource agent with master/slave config. > It works ok in different operations I tried with pacemaker. > > Then on both two nod

Re: [Linux-HA] Corosync shutdown hangs server

2010-04-30 Thread Andrew Beekhof
On Thu, Apr 29, 2010 at 8:21 PM, Brodie, Kent wrote: > Hi-- I'm playing with corosync/pacemaker in a 2-node setup (using > virtual machines..).   For the most part, I'm very impressed and it's > all very cool.  A big leap from 'heartbeat', that's for sure :-) > > I have cluster-ip addressing and a

Re: [Linux-HA] Pingd failed

2010-04-29 Thread Andrew Beekhof
On Mon, Apr 26, 2010 at 8:56 AM, Scheffler Heinz wrote: > Hello > > I configured pingd as a clone resource. Now pingd crashed and the > cluster did a failover. Can you show us the configuration of your pingd resource? > Real pingd messages with a failed network > resource look different Our

Re: [Linux-HA] Setup cluster

2010-04-27 Thread Andrew Beekhof
On Tue, Apr 27, 2010 at 5:37 PM, Gianluca Cecchi wrote: > On Tue, Apr 27, 2010 at 1:14 PM, Dejan Muhamedagic wrote: > >> [snip] >> > >> No, the advised values come from the resource agent's metadata. >> Those are the _minimums_ (at least so judged by the author of the >> resource agent) and they m

Re: [Linux-HA] Problem with LRM?

2010-04-26 Thread Andrew Beekhof
On Mon, Apr 26, 2010 at 10:07 AM, RaSca wrote: > On Mon 26 Apr 2010 09:19:43 CET, Alessandra Giovanardi wrote: >> Hi, >> I have a cluster with 2 nodes (with SUSE SLES 10 SP2 OS). > [...] >> Why does my resource go up only after this operation? >> I attach my cib.xml. >> Thank you >> Aless

Re: [Linux-HA] Element instance_attributes content does not follow the DTD, expecting (rule* , attributes), got (nvpair)

2010-04-20 Thread Andrew Beekhof
On Tue, Apr 20, 2010 at 5:43 PM, Alessandra Giovanardi wrote: > Anyway, I'm not so sure of the evolution under SUSE of this software: >  SUSE will include into SUSE Linux Enterprise Server 10 SP3 (x86_64) or > 11 (further releases) also pacemaker (to replace heartbeat) or not? Starting with SLES1

Re: [Linux-HA] [Linux-ha-dev] Deprecated resource agents

2010-04-20 Thread Andrew Beekhof
On Tue, Apr 20, 2010 at 3:23 PM, Lars Marowsky-Bree wrote: > On 2010-04-19T23:04:42, Lars Ellenberg wrote: >> > Switching the ra type is, after all, another of those changes that >> > require a full restart of the resource (and thus service down-time). >> maintenance-mode=on >> s/ocf:heartbeat:d

Re: [Linux-HA] Element instance_attributes content does not follow the DTD, expecting (rule* , attributes), got (nvpair)

2010-04-19 Thread Andrew Beekhof
On Mon, Apr 19, 2010 at 12:15 PM, Alessandra Giovanardi wrote: > Andrew Beekhof wrote: >> On Fri, Apr 16, 2010 at 5:51 PM, Alessandra Giovanardi >> wrote: >> >>> Some time ago I performed the same operation via GUI on the same >>> cluster without pro

Re: [Linux-HA] oracle restart

2010-04-19 Thread Andrew Beekhof
On Thu, Apr 15, 2010 at 2:38 PM, Edi BELIC wrote: > Hi - We have two nodes - sles5 and sles6 . My oracle LSB resource is > running on node "sles5". There are constraint configured INFINITY on > node sles5  for  group_ora1. > > When I reboot, or restart heartbeat service on node sles6 also the > or

Re: [Linux-HA] [Linux-ha-dev] Deprecated resource agents

2010-04-19 Thread Andrew Beekhof
On Mon, Apr 19, 2010 at 2:05 PM, Florian Haas wrote: > Hello, > > in case you haven't yet noticed: as of resource-agents 1.0.2, several > Linux-HA resource agents are marked as deprecated: > > - EvmsSCC and > - Evmsd (both apply to EVMS, which is no longer maintained); > > - LinuxSCSI (superseded

Re: [Linux-HA] Element instance_attributes content does not follow the DTD, expecting (rule* , attributes), got (nvpair)

2010-04-16 Thread Andrew Beekhof
On Fri, Apr 16, 2010 at 5:51 PM, Alessandra Giovanardi wrote: > Andrew Beekhof wrote: >> On Fri, Apr 16, 2010 at 9:49 AM, Alessandra Giovanardi >> wrote: >> >>> Hi, >>> I'm using  heartbeat-2.1.4-0.16.2 on a SUSE Linux Enterprise Server 10 SP3 >>

Re: [Linux-HA] Element instance_attributes content does not follow the DTD, expecting (rule* , attributes), got (nvpair)

2010-04-16 Thread Andrew Beekhof
On Fri, Apr 16, 2010 at 9:49 AM, Alessandra Giovanardi wrote: > Hi, > I'm using  heartbeat-2.1.4-0.16.2 on a SUSE Linux Enterprise Server 10 SP3 > (x86_64). > > My cluster is composed of two nodes: mdm01-mdm02, with two resource groups: > > mdm01:~ # crm_mon -1 > > Last updated: Fri Ap
