Re: [Pacemaker] clear failcount when monitor is successful?

2013-04-23 Thread Michael Schwartzkopff
On Wednesday, 24 April 2013, 08:35:29, Johan Huysmans wrote: > I tried the failure-timeout. > But I noticed that when the failure-timeout resets the failcount the > resource becomes OK in the crm_mon view. > However the resource is still failing. > > This shouldn't happen. Can this behaviour be c

Re: [Pacemaker] clear failcount when monitor is successful?

2013-04-23 Thread Johan Huysmans
I tried the failure-timeout. But I noticed that when the failure-timeout resets the failcount the resource becomes OK in the crm_mon view. However the resource is still failing. This shouldn't happen. Can this behaviour be changed with some setting? gr. Johan On 24-04-13 07:23, Andrew Beekho
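
For reference, failure-timeout is an ordinary resource meta attribute. A minimal crmsh sketch, assuming a hypothetical primitive named p_app; note that the timeout merely expires recorded failures after the given interval, it does not verify that the resource has actually recovered:

    # crmsh: expire recorded failures after 60s (p_app is a placeholder name)
    crm configure primitive p_app ocf:heartbeat:Dummy \
        op monitor interval=10s \
        meta failure-timeout=60s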

Re: [Pacemaker] clear failcount when monitor is successful?

2013-04-23 Thread Andrew Beekhof
On 23/04/2013, at 11:24 PM, Johan Huysmans wrote: > Hi All, > > I have a cloned resource, running on both my nodes, my on-fail is set to > block. > So if the resource fails on a node the failcount increases, but whenever the > resource automatically recovers the failcount isn't reset. > > Is
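
The setup being described would look roughly like this in crmsh, with placeholder names (p_svc, cl_svc); on-fail=block leaves a failed resource untouched instead of restarting it:

    # Monitor failures block the resource rather than triggering recovery
    crm configure primitive p_svc ocf:heartbeat:Dummy \
        op monitor interval=10s on-fail=block
    crm configure clone cl_svc p_svc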

Re: [Pacemaker] why so long to stonith?

2013-04-23 Thread Andrew Beekhof
On 24/04/2013, at 5:34 AM, Brian J. Murrell wrote: > Using pacemaker 1.1.8 on RHEL 6.4, I did a test where I just killed > (-KILL) corosync on a peer node. Pacemaker seemed to take a long time > to transition to stonithing it, though, after noticing it was AWOL: [snip] > As you can see, 3 minut

Re: [Pacemaker] why so long to stonith?

2013-04-23 Thread Digimer
As I understand it, this is a known issue with the 1.1.8 release. I believe that 1.1.9 is now available from the pacemaker repos and it should fix the problem. digimer On 04/23/2013 03:34 PM, Brian J. Murrell wrote: Using pacemaker 1.1.8 on RHEL 6.4, I did a test where I just killed (-KILL) c

[Pacemaker] Pacemaker core dumps

2013-04-23 Thread Xavier Lashmar
Hello everyone, Below is a message that my colleague tried to post to the mailing list, but somehow it hasn't made it yet, so I'm re-posting just in case it got lost somewhere: Hi All, I currently have a pacemaker + drbd cluster. I am getting core dumps every 15 minutes. I can't seem to figur

[Pacemaker] why so long to stonith?

2013-04-23 Thread Brian J. Murrell
Using pacemaker 1.1.8 on RHEL 6.4, I did a test where I just killed (-KILL) corosync on a peer node. Pacemaker seemed to take a long time to transition to stonithing it, though, after noticing it was AWOL: Apr 23 19:05:20 node2 corosync[1324]: [TOTEM ] A processor failed, forming new configurati
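
The test can be reproduced roughly as follows (node roles and the log path are assumptions; logs may live elsewhere on other distributions):

    # On the peer node: kill corosync ungracefully, as in the test above
    killall -KILL corosync

    # On the surviving node: watch when fencing is actually scheduled
    crm_mon -1
    grep -i stonith /var/log/messages | tail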

Re: [Pacemaker] fail over just failed

2013-04-23 Thread Daniel Black
> >crit: get_timet_now: Defaulting to 'now' > > What does this message mean? > > It means some idiot (me) forgot to downgrade a debug message before > committing. > You can safely ignore it. critical important debug message ignored. Thanks for owning up to it. > Were there any other issues

[Pacemaker] clear failcount when monitor is successful?

2013-04-23 Thread Johan Huysmans
Hi All, I have a cloned resource, running on both my nodes, my on-fail is set to block. So if the resource fails on a node the failcount increases, but whenever the resource automatically recovers the failcount isn't reset. Is there a way to reset the failcount to 0, when the monitor is succe
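
Until expiry or recovery handling clears it, the failcount can also be reset by hand. A sketch, assuming a resource p_svc on node node1 (both placeholders):

    # Clear the failcount and failed-operation history for one resource
    crm_resource --cleanup --resource p_svc --node node1
    # or, equivalently, via crmsh
    crm resource cleanup p_svc node1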

Re: [Pacemaker] pacemaker monitoring user permision denied

2013-04-23 Thread Andrew Beekhof
On 23/04/2013, at 2:56 PM, Andreas Mock wrote: > Hi Andrew, > > is 1.1.10-rc1 a working title or can the package be found somewhere? It's currently just a tag. Grabbing the source tree and running "make TAG=Pacemaker-1.1.10-rc1 rpm" will give you packages. > > I saw that on http://clusterlab
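
Spelled out, the build Andrew describes is roughly the following (the GitHub URL is an assumption; the make target and tag are as given in the message):

    git clone https://github.com/ClusterLabs/pacemaker.git
    cd pacemaker
    make TAG=Pacemaker-1.1.10-rc1 rpm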

Re: [Pacemaker] Enabling debugging with cman, corosync, pacemaker at runtime

2013-04-23 Thread Andrew Beekhof
On 23/04/2013, at 2:51 PM, Andreas Mock wrote: > Hi Andrew, > > thank you for that hint. No problem. The basic idea is to be quiet by default but allow access to extreme levels of verbosity should the need arise :) > > Best regards > Andreas Mock > > > -Original Message- >
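
Those verbosity levels are usually reached through a couple of well-known knobs; a hedged sketch, assuming a sysconfig-style installation:

    # /etc/sysconfig/pacemaker: debug logging for the pacemaker daemons
    PCMK_debug=yes

    # corosync.conf: raise corosync's own log level
    logging {
        debug: on
        to_syslog: yes
    }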

Re: [Pacemaker] iscsi target mounting readonly on client

2013-04-23 Thread Joseph-Andre Guaragna
Hi Felix, Sorry for the late answer. I gave your suggestions a try; it worked immediately with blockio and the increased block timeout. I'll stick with IET for now, as my supervisor wants to stick to standard packages. You really saved my day. Best regards, Joseph-André GUARAGNA 2013/4
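
For reference, the IET side of those suggestions would look roughly like this in ietd.conf, with the target name and device path as placeholders, plus the initiator-side timeout in iscsid.conf (value is illustrative):

    # ietd.conf: serve the LUN with blockio instead of fileio
    Target iqn.2013-04.com.example:storage.lun0
        Lun 0 Path=/dev/drbd0,Type=blockio

    # iscsid.conf (open-iscsi initiator): tolerate longer path outages
    node.session.timeo.replacement_timeout = 180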

Re: [Pacemaker] pcs equivalent of crm configure erase

2013-04-23 Thread Lars Marowsky-Bree
On 2013-04-21T09:57:02, Andrew Beekhof wrote: > Because the author wanted it removed from Pacemaker, at which point someone > would have needed to navigate the red tape to get it back into the > distribution. It is easier to include a new package in Fedora than to manage a package split? I had

Re: [Pacemaker] Routing-Ressources on a 2-Node-Cluster

2013-04-23 Thread T.
Hi Devin, thank you very much for your answer. > If you insist on trying to do this with just the Linux-HA cluster, > I don't have any suggestions as to how you should proceed. I know that the "construct" we are building is quite complicated. The problem is that the active network (10.20.10.x)
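
If the routing itself ends up cluster-managed, one building block worth noting is the ocf:heartbeat:Route agent; a sketch with placeholder addresses on the 10.20.10.x network mentioned above:

    # crmsh: manage the default route as a cluster resource
    crm configure primitive p_route ocf:heartbeat:Route \
        params destination=0.0.0.0/0 gateway=10.20.10.1 \
        op monitor interval=10s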