Re: [Linux-HA] Heartbeat vs OpenAIS

2010-05-06 Thread Andrew Beekhof
On Thu, May 6, 2010 at 9:29 AM, Florian Haas florian.h...@linbit.com wrote: On 05/06/2010 08:59 AM, Andrew Beekhof wrote: About the only time I start heartbeat is for a few days before a release. And even then only for 1.0 releases, 1.1 is only tested against corosync. Probably true, though

Re: [Linux-HA] More than one drbd resource possible in pacemaker?

2010-05-03 Thread Andrew Beekhof
On Fri, Apr 30, 2010 at 9:46 AM, Gianluca Cecchi gianluca.cec...@gmail.com wrote: Hello, I have configured a drbd0 resource (nfsdata) in pacemaker, acting as active/passive, using the linbit resource agent with master/slave config. It works ok in different operations I tried with pacemaker.

Re: [Linux-HA] o2cb pacemaker agent and

2010-05-03 Thread Andrew Beekhof
On Fri, Apr 30, 2010 at 4:43 PM, Gianluca Cecchi gianluca.cec...@gmail.com wrote: Hello, on rh el 5.5 trying to configure ocfs2 1.4 with pacemaker 1.0.8. It seems I have some problems with programs/kernel modules missing. I downloaded rpm for pacemaker from clusterlabs repo and rpm for ocfs2

Re: [Linux-HA] MySQL and 4 instances

2010-05-03 Thread Andrew Beekhof
On Thu, Apr 29, 2010 at 7:37 PM, mike mgbut...@nbnet.nb.ca wrote: Hello all, We had a simple 2 node MySQL cluster - nothing special. One instance that worked perfectly. We recently added 3 instances and now we're having some issues. The problem is that Heartbeat issues a MySQL Status

Re: [Linux-HA] o2cb pacemaker agent and

2010-05-03 Thread Andrew Beekhof
On Mon, May 3, 2010 at 10:23 AM, Gianluca Cecchi gianluca.cec...@gmail.com wrote: On Mon, May 3, 2010 at 9:22 AM, Andrew Beekhof and...@beekhof.net wrote: [snip] You would need to rebuild ocfs2-tools with pacemaker support turned on. Hmm, thanks for answering. I did have the same idea

Re: [Linux-HA] o2cb pacemaker agent and

2010-05-03 Thread Andrew Beekhof
On Mon, May 3, 2010 at 10:57 AM, Gianluca Cecchi gianluca.cec...@gmail.com wrote: On Mon, May 3, 2010 at 10:51 AM, Andrew Beekhof and...@beekhof.net wrote: On Mon, May 3, 2010 at 10:23 AM, Gianluca Cecchi gianluca.cec...@gmail.com wrote: On Mon, May 3, 2010 at 9:22 AM, Andrew Beekhof

Re: [Linux-HA] Corosync shutdown hangs server

2010-04-30 Thread Andrew Beekhof
On Thu, Apr 29, 2010 at 8:21 PM, Brodie, Kent bro...@mcw.edu wrote: Hi-- I'm playing with corosync/pacemaker in a 2-node setup (using virtual machines..).   For the most part, I'm very impressed and it's all very cool.  A big leap from 'heartbeat', that's for sure :-) I have cluster-ip

Re: [Linux-HA] Pingd failed

2010-04-29 Thread Andrew Beekhof
On Mon, Apr 26, 2010 at 8:56 AM, Scheffler Heinz heinz.scheff...@psi.ch wrote: Hello I configured pingd as a clone resource. Now pingd crashed and the cluster did a failover. Can you show us the configuration of your pingd resource? Real pingd messages with a failed network resource look

Re: [Linux-HA] Setup cluster

2010-04-27 Thread Andrew Beekhof
On Tue, Apr 27, 2010 at 5:37 PM, Gianluca Cecchi gianluca.cec...@gmail.com wrote: On Tue, Apr 27, 2010 at 1:14 PM, Dejan Muhamedagic deja...@fastmail.fm wrote: [snip] No, the advised values come from the resource agent's metadata. Those are the _minimums_ (at least so judged by the author of

Re: [Linux-HA] Problem with LRM?

2010-04-26 Thread Andrew Beekhof
On Mon, Apr 26, 2010 at 10:07 AM, RaSca ra...@miamammausalinux.org wrote: On Mon 26 Apr 2010 09:19:43 CET, Alessandra Giovanardi wrote: Hi, I have a cluster with 2 nodes (with SUSE SLES 10 SP2 OS). [...] Why does my resource come up only after this operation? I attach my cib.xml.

Re: [Linux-HA] [Linux-ha-dev] Deprecated resource agents

2010-04-21 Thread Andrew Beekhof
On Tue, Apr 20, 2010 at 3:23 PM, Lars Marowsky-Bree l...@novell.com wrote: On 2010-04-19T23:04:42, Lars Ellenberg lars.ellenb...@linbit.com wrote: Switching the ra type is, after all, another of those changes that require a full restart of the resource (and thus service down-time).

Re: [Linux-HA] Element instance_attributes content does not follow the DTD, expecting (rule* , attributes), got (nvpair)

2010-04-21 Thread Andrew Beekhof
On Tue, Apr 20, 2010 at 5:43 PM, Alessandra Giovanardi a.giovana...@cineca.it wrote: Anyway, I'm not so sure about the evolution of this software under SUSE: will SUSE also include pacemaker (to replace heartbeat) in SUSE Linux Enterprise Server 10 SP3 (x86_64) or 11 (further releases), or not?

Re: [Linux-ha-dev] Deprecated resource agents

2010-04-20 Thread Andrew Beekhof
On Tue, Apr 20, 2010 at 8:15 AM, Florian Haas florian.h...@linbit.com wrote: On 04/20/2010 07:03 AM, Tim Serong wrote: On 4/20/2010 at 06:48 AM, Lars Marowsky-Bree l...@novell.com wrote: In general, I think the ability to deprecate functionality is needed, but shouldn't be slip-streamed into

Re: [Linux-ha-dev] Deprecated resource agents

2010-04-19 Thread Andrew Beekhof
On Mon, Apr 19, 2010 at 2:05 PM, Florian Haas florian.h...@linbit.com wrote: Hello, in case you haven't yet noticed: as of resource-agents 1.0.2, several Linux-HA resource agents are marked as deprecated: - EvmsSCC and - Evmsd (both apply to EVMS, which is no longer maintained); -

Re: [Linux-HA] [Linux-ha-dev] Deprecated resource agents

2010-04-19 Thread Andrew Beekhof
On Mon, Apr 19, 2010 at 2:05 PM, Florian Haas florian.h...@linbit.com wrote: Hello, in case you haven't yet noticed: as of resource-agents 1.0.2, several Linux-HA resource agents are marked as deprecated: - EvmsSCC and - Evmsd (both apply to EVMS, which is no longer maintained); -

Re: [Linux-HA] oracle restart

2010-04-19 Thread Andrew Beekhof
On Thu, Apr 15, 2010 at 2:38 PM, Edi BELIC e...@office.velenje.add.si wrote: Hi - We have two nodes - sles5 and sles6. My oracle LSB resource is running on node sles5. There is a constraint configured with INFINITY on node sles5 for group_ora1. When I reboot, or restart heartbeat service on node

Re: [Linux-HA] Element instance_attributes content does not follow the DTD, expecting (rule* , attributes), got (nvpair)

2010-04-19 Thread Andrew Beekhof
On Mon, Apr 19, 2010 at 12:15 PM, Alessandra Giovanardi a.giovana...@cineca.it wrote: Andrew Beekhof wrote: On Fri, Apr 16, 2010 at 5:51 PM, Alessandra Giovanardi a.giovana...@cineca.it wrote: Some time ago I performed the same operation via GUI on the same cluster without problem. I don't

Re: [Linux-ha-dev] proposed fix for the ABI extension of cluster-glue

2010-04-17 Thread Andrew Beekhof
On Sat, Apr 17, 2010 at 11:56 AM, Lars Ellenberg lars.ellenb...@linbit.com wrote: On Sat, Apr 17, 2010 at 11:40:36AM +0200, Lars Marowsky-Bree wrote: Lars, I have no other way of saying this, but I still think you're completely misguided in this desire to preserve binary compatibility. What's

Re: [Linux-ha-dev] proposed fix for the ABI extension of cluster-glue

2010-04-17 Thread Andrew Beekhof
On Sat, Apr 17, 2010 at 7:58 PM, Andrew Beekhof and...@beekhof.net wrote: On Sat, Apr 17, 2010 at 11:56 AM, Lars Ellenberg lars.ellenb...@linbit.com wrote: On Sat, Apr 17, 2010 at 11:40:36AM +0200, Lars Marowsky-Bree wrote: Lars, I have no other way of saying this, but I still think you're

Re: [Linux-ha-dev] proposed fix for the ABI extension of cluster-glue

2010-04-17 Thread Andrew Beekhof
On Sat, Apr 17, 2010 at 8:13 PM, Lars Ellenberg lars.ellenb...@linbit.com wrote: On Sat, Apr 17, 2010 at 07:58:38PM +0200, Andrew Beekhof wrote: I vote for reapplying the patch, bumping the SO name and forgetting about the whole thing. The only thing I do is move the two new members

Re: [Linux-ha-dev] proposed fix for the ABI extension of cluster-glue

2010-04-17 Thread Andrew Beekhof
Sorry, pressed send too quickly... On Sat, Apr 17, 2010 at 8:13 PM, Lars Ellenberg lars.ellenb...@linbit.com wrote: On Sat, Apr 17, 2010 at 07:58:38PM +0200, Andrew Beekhof wrote: I vote for reapplying the patch, bumping the SO name and forgetting about the whole thing. The only thing I do

Re: [Linux-HA] On RHEL5 / new rpms pacemaker-1.0.8-4.el5 +corosync-1.2.1-1.el5 fail

2010-04-16 Thread Andrew Beekhof
At a guess, I think this might be related to the auto-nodeid code. If you set a fixed value in corosync.conf, this possibly wouldn't happen. On Fri, Apr 16, 2010 at 10:21 AM, Alain.Moulle alain.mou...@bull.net wrote: Hi, Sorry but this was not due to new releases, but only to the fact that

Re: [Linux-HA] Element instance_attributes content does not follow the DTD, expecting (rule* , attributes), got (nvpair)

2010-04-16 Thread Andrew Beekhof
On Fri, Apr 16, 2010 at 9:49 AM, Alessandra Giovanardi a.giovana...@cineca.it wrote: Hi, I'm using heartbeat-2.1.4-0.16.2 on a SUSE Linux Enterprise Server 10 SP3 (x86_64). My cluster is composed of two nodes: mdm01-mdm02, with two Resource Groups: mdm01:~ # crm_mon -1 Last

Re: [Linux-HA] Element instance_attributes content does not follow the DTD, expecting (rule* , attributes), got (nvpair)

2010-04-16 Thread Andrew Beekhof
On Fri, Apr 16, 2010 at 5:51 PM, Alessandra Giovanardi a.giovana...@cineca.it wrote: Andrew Beekhof wrote: On Fri, Apr 16, 2010 at 9:49 AM, Alessandra Giovanardi a.giovana...@cineca.it wrote: Hi, I'm using heartbeat-2.1.4-0.16.2 on a SUSE Linux Enterprise Server 10 SP3 (x86_64). My

Re: [Linux-ha-dev] [Pacemaker] Announcement: new releases for cluster-glue (1.0.4), resource-agents (1.0.3), and heartbeat (3.0.3)

2010-04-15 Thread Andrew Beekhof
On Wed, Apr 14, 2010 at 4:01 PM, Dejan Muhamedagic deja...@fastmail.fm wrote: Hello, The new releases of cluster glue (1.0.4), resource agents (1.0.3), and Heartbeat (3.0.3) are finally ready. Nice. The repos up on clusterlabs.org are being rebuilt with all three now and should be ready in an

Re: [Linux-HA] NFS cluster based on Centos 5.4

2010-04-15 Thread Andrew Beekhof
On Thu, Apr 15, 2010 at 9:56 AM, Davide D'Amico davide.dam...@contactlab.com wrote: Could my problems be related to centos init.d scripts (I could try using crm node offline nfs01.local in S00cpuspeed /etc/rc3.d/init.d script)? I modified /etc/init.d/openais adding: /usr/sbin/crm node standby

Re: [Linux-HA] [Pacemaker] Announcement: new releases for cluster-glue (1.0.4), resource-agents (1.0.3), and heartbeat (3.0.3)

2010-04-15 Thread Andrew Beekhof
On Wed, Apr 14, 2010 at 4:01 PM, Dejan Muhamedagic deja...@fastmail.fm wrote: Hello, The new releases of cluster glue (1.0.4), resource agents (1.0.3), and Heartbeat (3.0.3) are finally ready. Nice. The repos up on clusterlabs.org are being rebuilt with all three now and should be ready in an

Re: [Linux-HA] How to get all defects values

2010-04-13 Thread Andrew Beekhof
On Mon, Apr 12, 2010 at 12:41 PM, Alain.Moulle alain.mou...@bull.net wrote: Because I need that, the very first time I do /etc/init.d/corosync start, the time to be available on only one node of the HA cluster is reduced to a minimum, and to get this, I need to have values of 5s for both

Re: [Linux-HA] How to get all defects values

2010-04-12 Thread Andrew Beekhof
On Mon, Apr 12, 2010 at 10:37 AM, Alain.Moulle alain.mou...@bull.net wrote: Hi, Thanks for your response Andrew. Further questions : I would like to change cluster-delay and dc-deadtime default values in sources, why? I found cluster-delay default value in common.c in pacemaker

Re: [Linux-HA] mysql monitor giving up after some time

2010-04-09 Thread Andrew Beekhof
Configuration? On Fri, Apr 9, 2010 at 9:18 AM, Zausel zau...@deltaknoten.de wrote: Hi, I updated my pacemaker on my SLES 11 system. After that my op monitor for the ocf:heartbeat:mysql resource works only the first time. After some minutes the monitor doesn't check the daemon anymore.

Re: [Linux-HA] How to get all defects values

2010-04-09 Thread Andrew Beekhof
'pengine metadata' and 'crmd metadata' or, in 1.1, 'man pengine' and 'man crmd'. On Fri, Apr 9, 2010 at 11:26 AM, Alain.Moulle alain.mou...@bull.net wrote: Hi, I can't find it but I wonder if there is a command to get all default values for attributes, properties etc. because I think that with crm or

Re: [Linux-HA] mysql monitor giving up after some time

2010-04-09 Thread Andrew Beekhof
-enabled=false \ cluster-infrastructure=openais \ last-lrm-refresh=1270796478 On 09.04.2010 at 10:22, Andrew Beekhof wrote: Configuration? On Fri, Apr 9, 2010 at 9:18 AM, Zausel zau...@deltaknoten.de wrote: Hi, I updated my pacemaker on my SLES 11 system. after that my op

Re: [Linux-HA] Clarify Apache failover please?

2010-04-09 Thread Andrew Beekhof
On Thu, Apr 8, 2010 at 9:43 PM, Simpson, John R john_simp...@reyrey.com wrote: I believe what you're looking for is migration-threshold. In the following Pacemaker snippet if Apache is stopped, if the website http://localhost/index.html doesn't respond, or if the HTML body doesn't contain

Re: [Linux-HA] trouble with CRM/XEN

2010-04-08 Thread Andrew Beekhof
On Wed, Apr 7, 2010 at 6:17 PM, Greg Woods wo...@ucar.edu wrote: On Wed, 2010-04-07 at 15:39 +0200, Andrew Beekhof wrote:  I increased the timeout even further (to 120s instead of the minimum recommended 60) and it seems to be working. Curious though, because when it does work, the logs

Re: [Linux-HA] Pb with rpms for epel-5/x86_64

2010-04-08 Thread Andrew Beekhof
On Thu, Apr 8, 2010 at 3:17 PM, Alain.Moulle alain.mou...@bull.net wrote: Hi Dejan, I've done the patch manually in : /usr/lib/python2.4/site-packages/crm/cibconfig.py : ...        if self.obj_type == property:            l = get_pe_property_list() + get_crmd_property_list()            l

Re: [Linux-HA] Why does mysld start run again?

2010-04-07 Thread Andrew Beekhof
On Tue, Mar 30, 2010 at 8:43 PM, mike mgbut...@nbnet.nb.ca wrote: I can see where I have a class of lsb mysql in my cib.xml file. How would I change this to ocf? The easiest way is to just delete the lsb resource and add it back as an ocf one. Sorry but I'm new to this and while I have

Re: [Linux-HA] Pb with rpms for epel-5/x86_64

2010-04-07 Thread Andrew Beekhof
On Tue, Mar 30, 2010 at 2:59 PM, Dejan Muhamedagic deja...@fastmail.fm wrote: There are still previous releases 1.0.8-1 at http://clusterlabs.org/rpm/epel-5 But that should have the same hg version as 1.0.8-2. Did I screw something up? ___ Linux-HA

Re: [Linux-HA] corosync status : something weird

2010-04-07 Thread Andrew Beekhof
You're asking on the wrong list (try the openais list) but yes, the init script in 1.2.0 needed some work. clusterlabs will be updated with 1.2.1 in the coming days. On Wed, Apr 7, 2010 at 1:53 PM, Alain.Moulle alain.mou...@bull.net wrote: Hi I've had some trouble with the reliability of the

Re: [Linux-HA] Pacemake GUI compile problems

2010-04-07 Thread Andrew Beekhof
The latest GUI code only builds for pacemaker 1.1 and higher. Either try that or grab an older version of the GUI. On Mon, Apr 5, 2010 at 11:00 PM, mike mgbut...@nbnet.nb.ca wrote: After 3 or 4 runs with different errors, I was able to install a few things that the ConfigureMe script required.

Re: [Linux-HA] trouble with CRM/XEN

2010-04-07 Thread Andrew Beekhof
On Mon, Apr 5, 2010 at 6:30 PM, Greg Woods wo...@ucar.edu wrote: On Sat, 2010-04-03 at 22:45 +0200, Dejan Muhamedagic wrote: I spoke too soon; now I am getting failures when stopping the Xen resources manually as well. I can't get both nodes online at the same time unless I disable

Re: [Linux-HA] Pb with rpms for epel-5/x86_64

2010-04-07 Thread Andrew Beekhof
On Wed, Apr 7, 2010 at 5:26 PM, Dejan Muhamedagic deja...@fastmail.fm wrote: On Wed, Apr 07, 2010 at 01:43:42PM +0200, Andrew Beekhof wrote: On Tue, Mar 30, 2010 at 2:59 PM, Dejan Muhamedagic deja...@fastmail.fm wrote: There are still previous releases 1.0.8-1 at http://clusterlabs.org

Re: [Linux-HA] Debian, system reboot problem

2010-03-29 Thread Andrew Beekhof
If you're using debian on i386, you'll have to rebuild your own debian packages. http://www.clusterlabs.org/wiki/Install#Building_from_Source On Fri, Mar 26, 2010 at 11:48 AM, artur.k a.kamin...@o2.pl wrote: I have a problem with the pacemaker Version:

Re: [Linux-HA] Problem with partition WITHOUT quorum

2010-03-29 Thread Andrew Beekhof
On Thu, Mar 25, 2010 at 8:00 PM, Michael Schwartzkopff mi...@multinet.de wrote: On Thursday, 25 March 2010 15:19:55, RaSca wrote: Hi all, I've got a very simple setup of two virtual machines (with Virtualbox) configured with Debian Lenny, Corosync 1.2.0-1 and Pacemaker 1.0.7+hg20100203-1.

Re: [Linux-HA] Problem with partition WITHOUT quorum

2010-03-29 Thread Andrew Beekhof
On Mon, Mar 29, 2010 at 1:56 PM, Michael Schwartzkopff mi...@multinet.de wrote: On Monday, 29 March 2010 13:50:57, Andrew Beekhof wrote: On Thu, Mar 25, 2010 at 8:00 PM, Michael Schwartzkopff mi...@multinet.de wrote: On Thursday, 25 March 2010 15:19:55, RaSca wrote: Hi all, I've got

Re: [Linux-HA] Drbd/Pacemaker with ocfs2 drive

2010-03-25 Thread Andrew Beekhof
On Thu, Mar 25, 2010 at 3:33 AM, Tim Serong tser...@novell.com wrote: On 3/24/2010 at 09:45 PM, Frank Lazzarini flazzar...@gmail.com wrote: Hi there, I am trying to set up a little 2 node cluster with DRBD pacemaker-heartbeat which will be using the ocfs2 Filesystem. I want to use ocfs as a

Re: [Linux-HA] pingd - rules - problem

2010-03-24 Thread Andrew Beekhof
Expressions like these: <expression attribute="dbma04_gateway_reachable" id="xping-resource_dbma04s_ip-03-normal-state-rule-condition1" operation="lt" value="100"/> Should include type="integer" so that the cluster does the correct type of comparison. Try that and let us know if it improves things. On Wed,
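
The remark about "the correct type of comparison" can be illustrated outside the cluster. The sketch below is not Pacemaker code: compare_lt is a hypothetical helper, and the values are made up. It only shows why a less-than test on strings can disagree with the same test on integers.

```python
# Illustrative only: this is NOT Pacemaker's rule evaluator.
# compare_lt and its arguments are hypothetical names for this sketch.
def compare_lt(attr_value: str, rule_value: str, value_type: str = "string") -> bool:
    """Return True when attr_value < rule_value under the given comparison type."""
    if value_type == "integer":
        # Numeric comparison: 9 < 100
        return int(attr_value) < int(rule_value)
    # String comparison is lexicographic: "9" sorts after "100"
    return attr_value < rule_value


print(compare_lt("9", "100", "string"))   # False
print(compare_lt("9", "100", "integer"))  # True
```

With a string comparison, an attribute value of 9 would not count as less than 100, so a rule written with operation="lt" could silently fail to match.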

Re: [Linux-HA] resources should remain running on active node, when ping nodes unaccessible

2010-03-24 Thread Andrew Beekhof
Example 9.5: http://www.clusterlabs.org/doc/en-US/Pacemaker/1.0/html/Pacemaker_Explained/ch09s03s03s02.html#id1893002 On Wed, Mar 24, 2010 at 11:05 AM, Muhammad Sharfuddin m.sharfud...@nds.com.pk wrote: Hi, two ping nodes two Cluster nodes(Active/Passive) resources stops when both ping

Re: [Linux-HA] pingd - rules - problem

2010-03-24 Thread Andrew Beekhof
On Wed, Mar 24, 2010 at 2:11 PM, Scheffler Heinz heinz.scheff...@psi.ch wrote: My DTD requires type=number. Anyway, it is a cosmetic correction - the behavior of the cluster is the same. Number seems to be the default. The rules are working correctly, 90% works well. There is a timing

Re: [Linux-HA] pingd - rules - problem

2010-03-24 Thread Andrew Beekhof
Of Andrew Beekhof Sent: Wednesday, 24 March 2010 11:54 To: General Linux-HA mailing list Subject: Re: [Linux-HA] pingd - rules - problem Expressions like these: expression attribute=dbma04_gateway_reachable id=xping-resource_dbma04s_ip-03-normal-state-rule-condition1 operation=lt value=100

Re: [Linux-HA] Link to CIB User guide

2010-03-23 Thread Andrew Beekhof
You want configuration explained http://www.clusterlabs.org/wiki/Documentation#Reference_Material On Tue, Mar 23, 2010 at 3:34 PM, mike mgbut...@nbnet.nb.ca wrote: Hello all, I'm new to the LinuxHA world so be patient with me :--) I'm trying to find a document that will help me understand

Re: [Linux-HA] Efficient Resource Colocation Constraints

2010-03-23 Thread Andrew Beekhof
On Tue, Mar 23, 2010 at 6:01 PM, Eric Blau ebl...@gmail.com wrote: Hi everyone, I'm working with a test configuration containing 128 resources using the Stateful example resource agent supplied with Linux HA.  I'm trying to figure out how to get resource colocation constraints working

Re: [Linux-HA] Efficient Resource Colocation Constraints

2010-03-23 Thread Andrew Beekhof
On Tue, Mar 23, 2010 at 7:11 PM, Eric Blau ebl...@gmail.com wrote: On Tue, Mar 23, 2010 at 13:17, Andrew Beekhof and...@beekhof.net wrote: On Tue, Mar 23, 2010 at 6:01 PM, Eric Blau ebl...@gmail.com wrote: Hi everyone, I'm working with a test configuration containing 128 resources using

Re: [Linux-HA] Usage of Cluster-Testsuite

2010-03-22 Thread Andrew Beekhof
/SLE_11/x86_64/ Greetings Jochen Andrew Beekhof wrote: On Thu, Mar 18, 2010 at 4:40 PM, Andreas Mock andreas.m...@web.de wrote: -Original Message- From: Andrew Beekhof and...@beekhof.net Sent: 18.03.2010 15:32:44 To: General Linux-HA mailing list linux-ha@lists.linux-ha.org

Re: [Linux-HA] Master/Slave OCF script

2010-03-22 Thread Andrew Beekhof
On Mon, Mar 22, 2010 at 12:50 PM, Maciej Lotkowski maciej.lotkow...@gmail.com wrote: On Fri, Mar 19, 2010 at 5:16 PM, Lars Ellenberg lars.ellenb...@linbit.com wrote: On Fri, Mar 19, 2010 at 03:33:09PM +0100, Maciej Lotkowski wrote: Hi, I'm trying to write OCF script for Redis

Re: [Linux-HA] configuring monitor action in OCF script needed??

2010-03-19 Thread Andrew Beekhof
On Fri, Mar 19, 2010 at 7:20 AM, lakshmipadmaja maddali lakshmipadmaj...@gmail.com wrote: Hi All, Can we have an OCF script without configuring a monitor action? no If yes, then which action will heartbeat call first? Please help. Thanks, lakshmi

Re: [Linux-HA] Usage of Cluster-Testsuite

2010-03-19 Thread Andrew Beekhof
On Thu, Mar 18, 2010 at 4:40 PM, Andreas Mock andreas.m...@web.de wrote: -Original Message- From: Andrew Beekhof and...@beekhof.net Sent: 18.03.2010 15:32:44 To: General Linux-HA mailing list linux-ha@lists.linux-ha.org Subject: Re: [Linux-HA] Usage of Cluster-Testsuite

Re: [Linux-HA] Usage of Cluster-Testsuite

2010-03-18 Thread Andrew Beekhof
On Wed, Mar 17, 2010 at 12:02 PM, Andreas Mock andreas.m...@web.de wrote: Hi all, I've now pacemaker/corosync running on two machines with openSuSE 11.2, stonith agents are configured. So the base for a cluster is up and running. There were some postings regarding this combination having

Re: [Linux-HA] OCF-RA and shell functions

2010-03-18 Thread Andrew Beekhof
On Wed, Mar 17, 2010 at 11:35 AM, Marian Marinov m...@yuhu.biz wrote: On Wednesday 17 March 2010 12:16:58 Andreas Mock wrote: Hi all, here some questions regarding programming ocf-ra: a) Am I right that programming a RA as portable shell script (no bashisms)  is preferred? Or are other

Re: [Linux-HA] Virtual Name and Samba clarification.

2010-03-18 Thread Andrew Beekhof
On Thu, Mar 18, 2010 at 11:33 AM, Tim Serong tser...@novell.com wrote:  Also, someone really needs to publish some documentation on effective use of Samba with Linux-HA/Pacemaker clusters (having written this email, I have a sinking feeling I will be volunteered for the task). What an

Re: [Linux-HA] Usage of Cluster-Testsuite

2010-03-18 Thread Andrew Beekhof
On Thu, Mar 18, 2010 at 1:16 PM, Andreas Mock andreas.m...@web.de wrote: -Original Message- From: Andrew Beekhof and...@beekhof.net Sent: 18.03.2010 09:48:23 To: General Linux-HA mailing list linux-ha@lists.linux-ha.org Subject: Re: [Linux-HA] Usage of Cluster-Testsuite Its

Re: [Linux-HA] Usage of Cluster-Testsuite

2010-03-18 Thread Andrew Beekhof
On Thu, Mar 18, 2010 at 3:03 PM, Andreas Mock andreas.m...@web.de wrote: -Original Message- From: Andrew Beekhof and...@beekhof.net Sent: 18.03.2010 09:48:23 To: General Linux-HA mailing list linux-ha@lists.linux-ha.org Subject: Re: [Linux-HA] Usage of Cluster-Testsuite Its

Re: [Linux-HA] Question about VirtualDomain

2010-03-18 Thread Andrew Beekhof
On Thu, Mar 18, 2010 at 3:08 PM, Alain.Moulle alain.mou...@bull.net wrote: Hi I think you're right Florian. So I've worked around the problem because I can't see where to fix it as long as virsh start vmname returns immediately whereas the vm is only starting but not yet started (not yet

Re: [Linux-HA] node2 wont stay up

2010-03-17 Thread Andrew Beekhof
I wonder if this might be related: Mar 17 21:46:50 node2 heartbeat: [5289]: ERROR: glib: Error binding socket (Address already in use). Retrying. On Wed, Mar 17, 2010 at 9:44 PM, Cameron Smith velvetpi...@gmail.com wrote: Here is more info: In checking /var/log/messages: Mar 17 21:46:50

Re: [Linux-HA] Resource colocation with a clone.

2010-03-17 Thread Andrew Beekhof
On Wed, Mar 17, 2010 at 6:44 PM, Michele Codutti michele.codu...@uniud.it wrote: Hi all, Is it possible to constrain a resource to run only on nodes where an instance of a particular clone also runs? Example: I've a database-like application that I've set up as a clone to run one instance for

[Linux-ha-dev] Purpose of HA_LOGD in .ocf-shellfuncs?

2010-03-16 Thread Andrew Beekhof
Can anyone explain the purpose of this block?

    if [ x${HA_LOGD} = xyes ] ; then
        ha_logger -t ${HA_LOGTAG} $@
        if [ $? -eq 0 ] ; then
            return 0
        fi
    fi

I ask because I can't find anything that actually sets HA_LOGD

Re: [Linux-HA] Question about VirtualDomain

2010-03-16 Thread Andrew Beekhof
On Tue, Mar 16, 2010 at 11:14 AM, Alain.Moulle alain.mou...@bull.net wrote: Hi I've a question about the start of a VM with VirtualDomain , knowing that I have configured this : crm configure primitive vm15  ocf:heartbeat:VirtualDomain  params config=/root/vms/vm15.xml \  

Re: [Linux-HA] Question about VirtualDomain

2010-03-16 Thread Andrew Beekhof
On Tue, Mar 16, 2010 at 11:25 AM, Michael Schwartzkopff mi...@multinet.de wrote: On Tuesday, 16 March 2010 11:14:57, Alain.Moulle wrote: Hi I've a question about the start of a VM with VirtualDomain , knowing that I have configured this : crm configure primitive vm15

Re: [Linux-HA] Changing CIB Group Properties from the Command Line

2010-03-16 Thread Andrew Beekhof
On Tue, Mar 16, 2010 at 5:27 PM, Robinson, Eric eric.robin...@psmnv.com wrote: You could do something like this: # echo group new-members-list | crm configure load update - When you change or reorder group memberships like that (using either crm configure or cibadmin) does it interrupt

Re: [Linux-HA] A Shadow Instance Already Exists

2010-03-15 Thread Andrew Beekhof
On Sun, Mar 14, 2010 at 1:15 PM, Robinson, Eric eric.robin...@psmnv.com wrote: probably you wanted to create a resource. You can do this with the crm configure command. The crm new command creates new shadow configuration, which is an environment to play with besides the running

Re: [Linux-HA] Better Getting Started Document?

2010-03-15 Thread Andrew Beekhof
On Sat, Mar 13, 2010 at 6:46 PM, Robinson, Eric eric.robin...@psmnv.com wrote: Is there a better document for getting a noob started with Pacemaker+Corosync? I've been going through the Cluster from Scratch Check out the new version which is for openais. document, and starting on page xv,

Re: [Linux-HA] Changing CIB Group Properties from the Command Line

2010-03-15 Thread Andrew Beekhof
On Mon, Mar 15, 2010 at 5:09 PM, Robinson, Eric eric.robin...@psmnv.com wrote: I currently have a group that looks like this: group Group1 FileSystem ClusterIP MySQL_001 MySQL_002 This would probably work: crm configure edit Group1 I can add a new primitive to the CIB from the command

Re: [Linux-HA] Giant Heartbeat Packets?

2010-03-12 Thread Andrew Beekhof
On Thu, Mar 11, 2010 at 9:18 PM, Robinson, Eric eric.robin...@psmnv.com wrote: I have four heartbeat 2-node clusters on the same VLAN. Three of them are configured to broadcast heartbeat information (they'll be changed to unicast soon). Two of the clusters are communicating with each other

Re: [Linux-HA] Corosync conflicts with Openais?

2010-03-10 Thread Andrew Beekhof
On Wed, Mar 10, 2010 at 8:49 AM, Tim Serong tser...@novell.com wrote: On 3/10/2010 at 04:56 PM, Robinson, Eric eric.robin...@psmnv.com wrote: Error: corosync conflicts with openais This'll be due to the openais/corosync split[1].  You can either just not install openais at all if you don't

Re: [Linux-HA] Heartbeat 2 and start process timeout

2010-03-10 Thread Andrew Beekhof
http://www.clusterlabs.org/doc/en-US/Pacemaker/1.0/html/Pacemaker_Explained/s-operation-defaults.html#s-operation-timeouts On Tue, Mar 9, 2010 at 4:35 PM, Carlos Eduardo Chiriboga Calderon cchirib...@palosanto.com wrote: Hi everybody, I have a serious problem with my cluster: Sometimes, the

Re: [Linux-HA] Corosync conflicts with Openais?

2010-03-10 Thread Andrew Beekhof
On Wed, Mar 10, 2010 at 3:04 PM, Robinson, Eric eric.robin...@psmnv.com wrote: (guessing) the latest pacemaker package specifies corosync as a dependency so effectively this command tries to install both? Right. You can drop the openais = 0.80.6 part I totally believe you, but when I do 'yum

Re: [Linux-HA] Corosync conflicts with Openais?

2010-03-10 Thread Andrew Beekhof
On Wed, Mar 10, 2010 at 5:34 PM, Robinson, Eric eric.robin...@psmnv.com wrote: That is pretty confusing to me because (1) people seem to say that OpenAIS 0.80.6 (whitetank) is the way to go because the alternative (wilson+flatiron) is not ready for prime time and also contains features that

Re: [Linux-HA] Corosync conflicts with Openais?

2010-03-10 Thread Andrew Beekhof
On Thu, Mar 11, 2010 at 12:51 AM, Robinson, Eric eric.robin...@psmnv.com wrote:       yum install -y openais = 0.80.6 pacemaker You can drop the openais = 0.80.6 part Then should I also ignore pages xv-xvi of the Cluster from Scratch document, where it talks about configuring OpenAIS, and

Re: [Linux-HA] on fail problem

2010-03-10 Thread Andrew Beekhof
Your stop start and, in particular, status actions are completely broken. The script is not LSB compliant and cannot be used by Pacemaker. On Wed, Mar 10, 2010 at 7:37 PM, Artur a.kamin...@o2.pl wrote: I have a service 'something' that is running on the server first, and suddenly stops working

Re: [Linux-HA] Active/Active Cluster Master/Slave question

2010-03-08 Thread Andrew Beekhof
On Sun, Mar 7, 2010 at 6:14 PM, Marc-Christian Petersen m@gmx.de wrote: Hello all, first: I'm completely new to pacemaker. I want to setup a scenario like this: Machine A)        - has IP address 10.0.0.251        - DRBD resources                drbd1 (for apache2)                

Re: [Linux-HA] on fail problem

2010-03-05 Thread Andrew Beekhof
Make sure your script is lsb compliant: http://www.clusterlabs.org/doc/en-US/Pacemaker/1.0/html/Pacemaker_Explained/ap-lsb.html A status action is definitely going to cause problems. On Fri, Mar 5, 2010 at 2:58 PM, artur.k a.kamin...@o2.pl wrote: I have the services 'redis-master'
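
As a rough illustration of what "LSB compliant" means for the status action: the LSB specification defines exit code 0 for a running service and 3 for a service that is not running. The sketch below is a hypothetical checker, not part of Pacemaker or of any init script; the function names are assumptions made for this example.

```python
# Exit codes for the "status" action as defined by the LSB init script spec.
LSB_STATUS = {
    0: "running",
    1: "dead, but pid file exists",
    2: "dead, but lock file exists",
    3: "not running",
    4: "status unknown",
}


def describe_lsb_status(rc: int) -> str:
    """Translate a status exit code into its LSB meaning (hypothetical helper)."""
    return LSB_STATUS.get(rc, "reserved or application-specific")


def status_is_compliant(rc_while_running: int, rc_while_stopped: int) -> bool:
    """Check the two status results a cluster manager relies on most:
    0 while the service runs, 3 after it has been stopped."""
    return rc_while_running == 0 and rc_while_stopped == 3


# A script that always exits 0 from "status" would fail this check.
print(status_is_compliant(0, 3))  # True
print(status_is_compliant(0, 0))  # False
```

A script whose status action reports 0 while the service is down gives the cluster no way to notice a failure, which is one way a status action can cause the kind of problem described here.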

Re: [Linux-HA] Command line option to fail back a master/slave resource

2010-03-01 Thread Andrew Beekhof
On Thu, Feb 25, 2010 at 4:40 PM, Dejan Muhamedagic deja...@fastmail.fm wrote: Hi, On Wed, Feb 24, 2010 at 10:45:16AM -0800, Bob Schatz wrote: QUESTION #2:  Why would these operations abort?  They are serial commands? We have to ask Andrew about that. What kind of abort do you mean? Do you

Re: [Linux-HA] ptest graph generation

2010-02-27 Thread Andrew Beekhof
On Fri, Feb 26, 2010 at 3:25 PM, bi...@antworte.me wrote: Hi, which tasks are to be performed to get a graph or xml with ptest? On which node does it have to be performed? I run pacemaker 1.0.7. ptest -L -D cl.dot -G cl.xml creates cl.dot +++ digraph g { }

Re: [Linux-HA] Question about 'group' in Pacemaker

2010-02-24 Thread Andrew Beekhof
On Wed, Feb 24, 2010 at 9:33 AM, Alain.Moulle alain.mou...@bull.net wrote: Hi Andrew, I meant stopped by target-role : crm resource respingal2 stop Then the behavior is expected, since there is no cluster node it is allowed to run on. And if it can't run anywhere, why bother moving the rest of

Re: [Linux-HA] Resource location constraints question

2010-02-23 Thread Andrew Beekhof
On Mon, Feb 15, 2010 at 8:49 PM, Eric Blau ebl...@gmail.com wrote: Hello all, I have some questions about resource location and colocation constraints. I'm trying to set up a proof of concept configuration with some multistate resources using the Stateful RA.  I'm currently using Linux HA

Re: [Linux-HA] Resource location constraints question

2010-02-23 Thread Andrew Beekhof
On Tue, Feb 23, 2010 at 3:43 PM, Eric Blau ebl...@gmail.com wrote: 3.       If a node goes offline and comes back, the CRM does not redistribute resources to that server, despite setting resource_stickiness to 0 on all of the resources. Where is back? And why would it be better? If each

Re: [Linux-HA] Question about group in Pacemaker

2010-02-23 Thread Andrew Beekhof
On Tue, Feb 23, 2010 at 3:32 PM, Alain.Moulle alain.mou...@bull.net wrote: Hi Andrew, OK, it was not a good choice to take smartd as the test init script to demonstrate the problem with groups ... sorry, so I wrote some simple scripts pingalain1 (resource respingal1) and pingalain2 (resource

Re: [Linux-HA] Setup cluster

2010-02-22 Thread Andrew Beekhof
...@lists.linux-ha.org] On Behalf Of Andrew Beekhof Sent: Friday, February 19, 2010 8:18 AM To: General Linux-HA mailing list Subject: Re: [Linux-HA] Setup cluster On Thu, Feb 18, 2010 at 11:51 PM, Ruiyuan Jiang ruiyuan_ji...@liz.com wrote: Thanks, Andreas. That is what I suspected too. Once stonith

Re: [Linux-HA] Question about group in Pacemaker

2010-02-22 Thread Andrew Beekhof
smartd2 can't run on node1 because it returned rc=5 for the monitor op. EXECRA_NOT_INSTALLED = 5, On Mon, Feb 22, 2010 at 3:34 PM, Alain.Moulle alain.mou...@bull.net wrote: Hi Andrew, sorry for the delay, but about my problem on groups I've reproduced  it with only two resources
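The rc=5 mentioned above corresponds to the standard OCF exit code OCF_ERR_INSTALLED. As a rough sketch only, the lookup below maps the standard OCF exit codes to their names; the helper names `describe_rc` and `can_run_here` are hypothetical and not part of Pacemaker or any cluster tool, and the "cannot run here" rule simply restates the reply above rather than the policy engine's actual logic.

```python
# Standard OCF resource agent exit codes and their symbolic names.
OCF_EXIT_CODES = {
    0: "OCF_SUCCESS",
    1: "OCF_ERR_GENERIC",
    2: "OCF_ERR_ARGS",
    3: "OCF_ERR_UNIMPLEMENTED",
    4: "OCF_ERR_PERM",
    5: "OCF_ERR_INSTALLED",
    6: "OCF_ERR_CONFIGURED",
    7: "OCF_NOT_RUNNING",
    8: "OCF_RUNNING_MASTER",
    9: "OCF_FAILED_MASTER",
}


def describe_rc(rc: int) -> str:
    """Return the symbolic name for an OCF exit code (hypothetical helper)."""
    return OCF_EXIT_CODES.get(rc, "UNKNOWN")


def can_run_here(monitor_rc: int) -> bool:
    """Illustrative rule only: a monitor result of 'not installed' (rc=5)
    means the resource cannot run on the node that reported it."""
    return monitor_rc != 5


if __name__ == "__main__":
    rc = 5  # the value reported for smartd2 on node1 in the message above
    print(describe_rc(rc), "runnable on this node:", can_run_here(rc))
```

For an init-script (lsb) resource, exit code 5 usually indicates that the script or the program it manages is missing on that node.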

Re: [Linux-HA] Setup cluster

2010-02-19 Thread Andrew Beekhof
On Thu, Feb 18, 2010 at 11:51 PM, Ruiyuan Jiang ruiyuan_ji...@liz.com wrote: Thanks, Andreas. That is what I suspected too. Once stonith disabled, the cluster starts. I have not tried to set quorum yet. I will try next. Now I have another problem. Apache does not start but virtual IP address

Re: [Linux-HA] Question on Pacemaker/openais behavior after network split

2010-02-19 Thread Andrew Beekhof
On Thu, Feb 18, 2010 at 12:43 PM, virgil chereches virgil.cherec...@orange.ro wrote: I would like to check with you what is the expected behavior of a two-nodes Pacemaker cluster with fencing/stonith enabled and stonith-action=reboot in the following scenario: 1. network communication

Re: [Linux-HA] Question about group in Pacemaker

2010-02-17 Thread Andrew Beekhof
On Wed, Feb 17, 2010 at 8:02 AM, Alain.Moulle alain.mou...@bull.net wrote: Hi Andrew, the releases are those officially delivered with fc12 : pacemaker-1.0.5-4.fc12 and : cluster-glue-1.0-0.11.b79635605337.hg.fc12 corosync-1.1.2-1.fc12 heartbeat-3.0.0-0.5.0daab7da36a8.hg.fc12

Re: [Linux-HA] Simple 2 nodes Linux-HA scenario

2010-02-16 Thread Andrew Beekhof
On Tue, Feb 16, 2010 at 4:46 PM, fabio.anton...@kaskonetworks.it fabio.anton...@kaskonetworks.it wrote: Hi Andrew thanks a lot for your time. I have read the document you wrote and I have understood many things not so clear before. I have added a resource section within the cib.xml. The

Re: [Linux-HA] About hb_gui

2010-02-16 Thread Andrew Beekhof
On Tue, Feb 16, 2010 at 2:54 PM, Alain.Moulle alain.mou...@bull.net wrote: Hi I can't find anymore on whatever linux distribution on www.clusterlabs.org/rpm the pacemaker-mgmt which gave us the hb_gui ... Is it definitively removed from any distribution ? Or where is it hidden ? I've

Re: [Linux-HA] Restart a resourse if other is migrated

2010-02-16 Thread Andrew Beekhof
On Fri, Feb 5, 2010 at 11:04 AM, Marian Marinov m...@yuhu.biz wrote: Hello, sorry for the stupid question, but I can't seem to find a solution. I have an lsb resource which is a clone in my cluster. I want this lsb resource to be restarted each time heartbeat is moving certain resource (for

Re: [Linux-HA] Simple 2 nodes Linux-HA scenario

2010-02-15 Thread Andrew Beekhof
On Mon, Feb 15, 2010 at 1:06 PM, fabio.anton...@kaskonetworks.it fabio.anton...@kaskonetworks.it wrote: Hi all I'm a newbie of Linux-HA. My final target is to setup a simple 2 nodes cluster with only one virtual IP address. I have a couple of PCs running heartbeat 2.1.4-2 (Ubuntu 9.0.4).

Re: [Linux-HA] pingd constraint

2010-02-15 Thread Andrew Beekhof
On Sun, Feb 14, 2010 at 10:24 PM, tomtom t...@tiri.li wrote: Hi all, if I use pingd and a host_list, how could a configuration ensure that a resource, e.g. a VIP, runs on the host which reaches the most host_list hosts?

Re: [Linux-HA] resource-agents??

2010-02-10 Thread Andrew Beekhof
On Tue, Feb 9, 2010 at 8:27 PM, Ilo Lorusso sneak...@gmail.com wrote:  hi ,  ive got a resource-agent that works 100% on one machine. If i run it through off-test I get the following output: /usr/sbin/ocf-tester -n post1 /usr/lib/ocf/resource.d/heartbeat/postfix; echo $?  Beginning tests

Re: [Linux-HA] How to start all resources in one shot ?

2010-02-10 Thread Andrew Beekhof
Version? 1.0 can do deletions with xpath, eg. cibadmin --delete-all --xpath '//@target-role=stopped' On Wed, Feb 10, 2010 at 11:55 AM, Alain.Moulle alain.mou...@bull.net wrote: Hi Suppose we have all resources with target-role=stopped, is there a way (via crm or otherwise) to set in one

Re: [Linux-HA] monitor multiple nodes using single node

2010-02-09 Thread Andrew Beekhof
On Tue, Feb 9, 2010 at 5:21 AM, Qwerty-1 umakantgoy...@gmail.com wrote: Hi, Thanks. Can all the nodes send status to a single node? I am working on HA N+1 architecture where LoadBalancer will look for the status of all running N nodes. If any node fails then LoadBalancer should know it and

Re: [Linux-HA] Recovering from an unmanaged resource

2010-02-08 Thread Andrew Beekhof
On Thu, Feb 4, 2010 at 2:08 AM, Daryl Lang daryl.l...@gmail.com wrote: We have a resource that goes to unmanaged due to a stop timeout.  We plan to increase the stop timeout from 20 seconds to 60 seconds.  However, we would like to understand the standard process to move an unmanaged resource back
