Re: [Pacemaker] Odd problem with crm/Filesystem RA

2011-06-20 Thread Florian Haas
On 2011-06-20 07:45, Rob Thomas wrote: Executive overview: When bringing a node back from standby, to test failover, the Filesystem RA on the _slave_ node, which has just relinquished the resource, tries to mount the filesystem after it's handed it back to the master, fails, and leaves the

Re: [Pacemaker] CMAN Config

2011-06-20 Thread Andreas Kurz
On 2011-06-18 14:59, imnotpc wrote: How do you set corosync options when you start your cluster with CMAN, things like what port to use and logging options? It seems to ignore any settings in /etc/corosync/corosync.conf and I don't see any options in cluster.conf for these things. man

Re: [Pacemaker] preventing cluster doing any action when load reaches some threshold

2011-06-20 Thread Dejan Muhamedagic
Hi, On Sun, Jun 19, 2011 at 04:34:24PM +0200, Nikola Ciprich wrote: Hello Andrew et al, few times, it happened to me, that cluster node got loaded too much (especially I/O load), so cluster actions (monitor) started to timeout. So cluster manager decided to restart services etc, thus

Re: [Pacemaker] preventing cluster doing any action when load reaches some threshold

2011-06-20 Thread Nikola Ciprich
What do your monitor timeouts look like? You should probably adjust them. Take a look here for some notes about the problem: In the case I'm concerned now, the timeout of main resource upon which others did rely (DLM) was set to 120s this seems quite a lot to me, but now I'm thinking about it,

Re: [Pacemaker] Resource starting problem

2011-06-20 Thread Dejan Muhamedagic
Hi, On Wed, Jun 15, 2011 at 12:46:57PM +0200, Christian Roessner wrote: Hi, this is my first post on this list. I hope I put my question to the correct mailing-list. I have installed Pacemaker/Corosync on two Ubuntu-Lucid Servers building a two node cluster. This cluster shall become a

Re: [Pacemaker] CMAN Config

2011-06-20 Thread imnotpc
On Monday, June 20, 2011 03:45:54 Andreas Kurz wrote: On 2011-06-18 14:59, imnotpc wrote: How do you set corosync options when you start your cluster with CMAN, things like what port to use and logging options? It seems to ignore any settings in /etc/corosync/corosync.conf and I don't see

Re: [Pacemaker] CMAN Config

2011-06-20 Thread Andreas Kurz
On 2011-06-20 12:54, imnotpc wrote: On Monday, June 20, 2011 03:45:54 Andreas Kurz wrote: On 2011-06-18 14:59, imnotpc wrote: How do you set corosync options when you start your cluster with CMAN, things like what port to use and logging options? It seems to ignore any settings in

[Pacemaker] Resource Agent timeout

2011-06-20 Thread Kulovits Christian - OS ITSC
Hello List, When a resource agent times out, a SIGTERM is issued once the timeout value has been exceeded. If the resource agent does not terminate within the next 5 seconds, a SIGKILL is issued. Is there a way to set this limit? Maybe to 30 secs or so? 5 seconds may often be insufficient for a
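
The escalation described in this post (SIGTERM when the operation timeout expires, SIGKILL if the process is still alive a few seconds later) can be illustrated with a minimal Python sketch. This is not Pacemaker's own code: the function name run_with_timeout and the kill_grace parameter are hypothetical, and the 5-second default is taken only from the figure quoted in the post. It assumes a POSIX system.

```python
import subprocess
import sys


def run_with_timeout(cmd, timeout, kill_grace=5.0):
    """Run cmd and escalate SIGTERM -> SIGKILL if it overruns.

    Hypothetical helper for illustration only. Returns a tuple
    (returncode, escalation), where escalation is None if the command
    finished in time, "TERM" if it exited after SIGTERM, and "KILL"
    if it had to be killed after the grace period.
    """
    proc = subprocess.Popen(cmd)
    try:
        # Normal case: the command finishes within its timeout.
        return proc.wait(timeout=timeout), None
    except subprocess.TimeoutExpired:
        proc.terminate()  # sends SIGTERM on POSIX
    try:
        # Give the process kill_grace seconds to shut down cleanly.
        return proc.wait(timeout=kill_grace), "TERM"
    except subprocess.TimeoutExpired:
        proc.kill()  # sends SIGKILL on POSIX
        return proc.wait(), "KILL"


if __name__ == "__main__":
    # A child that would sleep for 30 s but is only allowed 1 s.
    slow = [sys.executable, "-c", "import time; time.sleep(30)"]
    print(run_with_timeout(slow, timeout=1, kill_grace=1))
```

Making the grace period a parameter, as kill_grace is here, is the kind of knob the poster is asking for; whether the cluster exposes one is exactly the question raised in this thread.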

Re: [Pacemaker] Resource Agent timeout

2011-06-20 Thread Andreas Kurz
On 2011-06-20 14:28, Kulovits Christian - OS ITSC wrote: Hello List, When a resource agent times out a SIGTERM is issued when the timeout value has exceeded. When the resource agent will not terminate within the next 5 seconds a SIGKILL is issued. Is there a way to set this limit? May

Re: [Pacemaker] Resource Agent timeout

2011-06-20 Thread Kulovits Christian - OS ITSC
Andreas, you mean the cluster wide default timeout? I wonder if there is a possibility to set the fixed timeout of 5 secs when SIGKILL is issued after the SIGTERM when the resource timeout is exceeded. Regards, Christian

Re: [Pacemaker] Resource Agent timeout

2011-06-20 Thread Andreas Kurz
On 2011-06-20 15:15, Kulovits Christian - OS ITSC wrote: Andreas, you mean the cluster wide default timeout? I wonder if there is a possibility to set the fixed timeout of 5 secs when SIGKILL is issued after the SIGTERM when the resource timeout is exceeded. Ah ... sorry, misinterpreted

Re: [Pacemaker] Resource Agent timeout

2011-06-20 Thread Dejan Muhamedagic
Hi, On Mon, Jun 20, 2011 at 03:15:23PM +0200, Kulovits Christian - OS ITSC wrote: Andreas, you mean the cluster wide default timeout? I wonder if there is a possibility to set the fixed timeout of 5 secs when SIGKILL is issued after the SIGTERM when the resource timeout is exceeded. No,

[Pacemaker] Groups

2011-06-20 Thread Proskurin Kirill
Hello all! I'm new to Pacemaker and have a small question. I want my resource to run on all nodes except some. For example we have 10 nodes: node1-10 I want it running on node1-5 but not on node5-10. I can make 5 locations with -INFINITY: node5 ; -INFINITY: node6 and so on. But

[Pacemaker] How to tell pacemaker to start exportfs after filesystem resource

2011-06-20 Thread Александр Малаев
Hello, I have configured a pacemaker+ocfs2 cluster with shared storage connected by FC. Now I need to set up NFS export in Active/Active mode and I added all needed resources and wrote the order of starting. But when a node is starting after reboot I get a race condition between Filesystem resource and

Re: [Pacemaker] Location issue

2011-06-20 Thread Andrew Beekhof
On Fri, Jun 17, 2011 at 9:31 AM, ruslan usifov ruslan.usi...@gmail.com wrote: Andrew is there any chance to fix this behaviour??? Now this constraint doesn't work: Define doesn't work? Not accepted by the shell? Allows it to be started elsewhere? In the latter case, please include a crm_report

Re: [Pacemaker] RHEL 6.1 STONITH configuration

2011-06-20 Thread Andrew Beekhof
On Fri, Jun 17, 2011 at 7:17 PM, Andreas Kurz andreas.k...@linbit.com wrote: On 2011-06-17 10:38, Pieter Baele wrote: I've had exactly the same problem with pacemaker on rhel 6 as described in RHEL 6.0 STONITH configuration (Jun 09, warp) I thought an upgrade to 6.1 would help (because *some*

Re: [Pacemaker] which version of pacemaker prefer to use

2011-06-20 Thread Andrew Beekhof
1.1.x On Thu, Jun 2, 2011 at 8:40 PM, ruslan usifov ruslan.usi...@gmail.com wrote: Hello I have one question: which pacemaker version is preferred, 1.1 or 1.0? 1.0 is marked as stable, but all documentation resources refer to version 1.1. I'm a little bit confused

Re: [Pacemaker] How to ensure that a resource is only running at one place?

2011-06-20 Thread Andrew Beekhof
On Wed, May 25, 2011 at 5:27 PM, Kevin Stevenard ksteven...@gmail.com wrote: Hi Mark, I totally agree with that, I was looking for a quick and simple solution to this problem. But indeed it makes no sense to check somewhere if a resource that should not run is running. lmb has been

Re: [Pacemaker] corosync-quorumtool configuration

2011-06-20 Thread Andrew Beekhof
I don't think this is legal: service { name: corosync_quorum ver: 0 name: pacemaker use_mgmtd: yes use_logd: yes } and even if it were, corosync's native quorum implementation (or our use of it) was a bit buggy last time I

Re: [Pacemaker] Mail notification for fencing action

2011-06-20 Thread Andrew Beekhof
On Tue, Jun 14, 2011 at 5:30 AM, imnotpc imno...@rock3d.net wrote: I've created a group containing the primary RA and MailTo as the second resource. This works as expected and sends an e-mail when the primary resource stops or starts. I'd like to configure pacemaker to send an e-mail any time a

Re: [Pacemaker] Bug in compiling pacemaker-pygui: CRM_DAEMON_DIR not defined

2011-06-20 Thread Andrew Beekhof
On Tue, May 31, 2011 at 6:58 PM, Gao,Yan y...@novell.com wrote: On 05/31/11 04:13, Andrew Beekhof wrote: On Mon, May 30, 2011 at 2:23 PM, Gao,Yan y...@novell.com wrote: On 05/30/11 17:31, Andrew Beekhof wrote: It used to be in crm_config.h but I had to remove it because it interfered with

Re: [Pacemaker] crm_simulate examples...

2011-06-20 Thread Andrew Beekhof
On Fri, May 27, 2011 at 2:43 AM, Dejan Muhamedagic deja...@fastmail.fm wrote: Hi, On Thu, May 26, 2011 at 10:22:25AM -0400, Rick Beldin wrote: Hi... I was wondering if anyone can share some examples of using crm_simulate. The documentation on this tool is quite skimpy, and seems limited to

Re: [Pacemaker] Resource Agent timeout

2011-06-20 Thread Kulovits Christian - OS ITSC
Hi Dejan, We have Sybase at our shop, and the start of the Sybase server may last from 5 minutes up to 45 minutes. I found a resource agent on the web that needs 3 timeout parameters passed to it, one for start, one for stop and one for monitor. And the cluster config itself has similar

Re: [Pacemaker] Bug in compiling pacemaker-pygui: CRM_DAEMON_DIR not defined

2011-06-20 Thread Gao,Yan
On 06/21/11 13:07, Andrew Beekhof wrote: On Tue, May 31, 2011 at 6:58 PM, Gao,Yan y...@novell.com wrote: On 05/31/11 04:13, Andrew Beekhof wrote: On Mon, May 30, 2011 at 2:23 PM, Gao,Yan y...@novell.com wrote: On 05/30/11 17:31, Andrew Beekhof wrote: It used to be in crm_config.h but I had to