Re: [Pacemaker] Fixed! - Re: Problem with dual-PDU fencing node with redundant PSUs

2013-06-28 Thread Lars Marowsky-Bree
On 2013-06-27T12:53:01, Digimer wrote: > primitive fence_n01_psu1_off stonith:fence_apc_snmp \ > params ipaddr="an-p01" pcmk_reboot_action="off" port="1" > pcmk_host_list="an-c03n01.alteeve.ca" > primitive fence_n01_psu1_on stonith:fence_apc_snmp \ > params ipaddr="an-p01" pcmk_re

Re: [Pacemaker] Reminder: Pacemaker-1.1.10-rc5 is out there

2013-06-28 Thread Lars Marowsky-Bree
On 2013-06-28T11:11:00, Andrew Beekhof wrote: > >> Maybe you're right, maybe I should stop fighting it and go with the > >> firefox approach. > >> That certainly seemed to piss a lot of people off though... > > If there's one message I've learned in 13 years of work on Linux HA, > > then it is th

Re: [Pacemaker] Node name problems after upgrading to 1.1.9

2013-06-28 Thread Bernardo Cabezas Serra
Hello Andrew, El 27/06/13 14:44, Andrew Beekhof escribió: > You should see additional logs sent to /var/log/pacemaker.log Finally yesterday issue happened again. This time, node "selavi" was DC, and node "turifel" joined the cluster. Cluster was in status unmanaged. Unfortunately, I have no pace

[Pacemaker] Release model

2013-06-28 Thread Andrew Beekhof
On 28/06/2013, at 5:30 PM, Lars Marowsky-Bree wrote: > On 2013-06-28T11:11:00, Andrew Beekhof wrote: > Maybe you're right, maybe I should stop fighting it and go with the firefox approach. That certainly seemed to piss a lot of people off though... >>> If there's one message I've

Re: [Pacemaker] Node name problems after upgrading to 1.1.9

2013-06-28 Thread Andrew Beekhof
On 28/06/2013, at 6:42 PM, Bernardo Cabezas Serra wrote: > Hello Andrew, > > El 27/06/13 14:44, Andrew Beekhof escribió: >> You should see additional logs sent to /var/log/pacemaker.log > > Finally yesterday issue happened again. This time, node "selavi" was DC, > and node "turifel" joined the

Re: [Pacemaker] Fixed! - Re: Problem with dual-PDU fencing node with redundant PSUs

2013-06-28 Thread Andrew Beekhof
On 28/06/2013, at 5:22 PM, Lars Marowsky-Bree wrote: > On 2013-06-27T12:53:01, Digimer wrote: > >> primitive fence_n01_psu1_off stonith:fence_apc_snmp \ >>params ipaddr="an-p01" pcmk_reboot_action="off" port="1" >> pcmk_host_list="an-c03n01.alteeve.ca" >> primitive fence_n01_psu1_on st

Re: [Pacemaker] some pacemaker questions

2013-06-28 Thread Andrew Beekhof
On 28/06/2013, at 5:19 AM, Lorenzo Sartoratti wrote: > Hi, > we have been using pacemaker for two years and we are quite satisfied: thanks! > We have 30 virtual machines running in the cluster and maintained by > pacemaker. > When we stop the machines with crm, they are not stopped in parallel but

Re: [Pacemaker] Fixed! - Re: Problem with dual-PDU fencing node with redundant PSUs

2013-06-28 Thread Lars Marowsky-Bree
On 2013-06-28T20:21:22, Andrew Beekhof wrote: > > It looks correct, but not quite sane. ;-) That seems not to be > > something you can address, though. I'm thinking that fencing topology > > should be smart enough to, if multiple fencing devices are specified, to > > know how to expand them to "f

Re: [Pacemaker] Release model

2013-06-28 Thread Lars Marowsky-Bree
On 2013-06-28T18:41:35, Andrew Beekhof wrote: > > There's an exception: dropping commonly used external interfaces (say, > > "ptest") needs to be announced a few releases in advance before enacted > > upstream. (And if Enterprise distributions want to keep something, they > > have time to prepare

Re: [Pacemaker] Fixed! - Re: Problem with dual-PDU fencing node with redundant PSUs

2013-06-28 Thread Andrew Beekhof
On 28/06/2013, at 8:46 PM, Lars Marowsky-Bree wrote: > On 2013-06-28T20:21:22, Andrew Beekhof wrote: > >>> It looks correct, but not quite sane. ;-) That seems not to be >>> something you can address, though. I'm thinking that fencing topology >>> should be smart enough to, if multiple fencing

Re: [Pacemaker] Node name problems after upgrading to 1.1.9

2013-06-28 Thread Andrew Beekhof
On 28/06/2013, at 8:10 PM, Andrew Beekhof wrote: > > On 28/06/2013, at 6:42 PM, Bernardo Cabezas Serra wrote: > >> Hello Andrew, >> >> El 27/06/13 14:44, Andrew Beekhof escribió: >>> You should see additional logs sent to /var/log/pacemaker.log >> >> Finally yesterday issue happened again.

Re: [Pacemaker] WARNINGS and ERRORS on syslog after update to 1.1.7

2013-06-28 Thread Andrew Beekhof
On 27/06/2013, at 10:46 PM, Andrew Beekhof wrote: > > On 25/06/2013, at 9:44 PM, Francesco Namuri wrote: > >>> Can you attach /var/lib/pengine/pe-input-64.bz2 from SERVERNAME1 please? >>> >>> I'll be able to see if it's something we've already fixed. > > Nope still there. I will attempt to f

Re: [Pacemaker] Release model

2013-06-28 Thread Andrew Beekhof
On 28/06/2013, at 8:59 PM, Lars Marowsky-Bree wrote: > On 2013-06-28T18:41:35, Andrew Beekhof wrote: > >>> There's an exception: dropping commonly used external interfaces (say, >>> "ptest") needs to be announced a few releases in advance before enacted >>> upstream. (And if Enterprise distrib

Re: [Pacemaker] Release model

2013-06-28 Thread Dejan Muhamedagic
Hi Lars, On Fri, Jun 28, 2013 at 12:59:22PM +0200, Lars Marowsky-Bree wrote: [...] > If > cluster-glue's LRM had had such a suite, it'd certainly have helped > tons.) It did have a regression suite. Thanks, Dejan

Re: [Pacemaker] Fixed! - Re: Problem with dual-PDU fencing node with redundant PSUs

2013-06-28 Thread Lars Marowsky-Bree
On 2013-06-28T21:01:55, Andrew Beekhof wrote: > > I'd agree, but it's not multiple ports on the same device, it's multiple > > ports on *different* devices. I don't think a single fencing agent can > > handle that - it really looks like something only the higher level can > > cope with. > True, i

Re: [Pacemaker] Release model

2013-06-28 Thread Lars Marowsky-Bree
On 2013-06-28T14:49:06, Dejan Muhamedagic wrote: > > If cluster-glue's LRM had had such a suite, it'd certainly have > > helped tons.) > It did have a regression suite. Yes, well, but it didn't test for LRM_MAX_CHILDREN or the secret support, for example. So it didn't really document the interfa

Re: [Pacemaker] Release model

2013-06-28 Thread Lars Marowsky-Bree
On 2013-06-28T22:04:48, Andrew Beekhof wrote: > I think he did actually. Well, yes, but the hg history or reading the existing code would probably have been quite helpful. I'll take "not well documented", but it's hard to say the rewrite was handled very well. But I don't want to get drawn into

Re: [Pacemaker] Release model

2013-06-28 Thread Digimer
On 06/28/2013 08:04 AM, Andrew Beekhof wrote: > Under this model, not only do I have to find the time to write and test the > new addition, but I also have to: > * keep maintaining the old code until... when? > * probably write and maintain a compatibility layer > * make it possible to choose whic

Re: [Pacemaker] Fixed! - Re: Problem with dual-PDU fencing node with redundant PSUs

2013-06-28 Thread Digimer
On 06/28/2013 03:22 AM, Lars Marowsky-Bree wrote: > On 2013-06-27T12:53:01, Digimer wrote: > >> primitive fence_n01_psu1_off stonith:fence_apc_snmp \ >> params ipaddr="an-p01" pcmk_reboot_action="off" port="1" >> pcmk_host_list="an-c03n01.alteeve.ca" >> primitive fence_n01_psu1_on stonith

Re: [Pacemaker] Fixed! - Re: Problem with dual-PDU fencing node with redundant PSUs

2013-06-28 Thread Digimer
On 06/28/2013 06:21 AM, Andrew Beekhof wrote: > > On 28/06/2013, at 5:22 PM, Lars Marowsky-Bree wrote: > >> On 2013-06-27T12:53:01, Digimer wrote: >> >>> primitive fence_n01_psu1_off stonith:fence_apc_snmp \ >>>params ipaddr="an-p01" pcmk_reboot_action="off" port="1" >>> pcmk_host_list=

Re: [Pacemaker] Fixed! - Re: Problem with dual-PDU fencing node with redundant PSUs

2013-06-28 Thread Digimer
On 06/28/2013 07:01 AM, Andrew Beekhof wrote: > > On 28/06/2013, at 8:46 PM, Lars Marowsky-Bree wrote: > >> On 2013-06-28T20:21:22, Andrew Beekhof wrote: >> It looks correct, but not quite sane. ;-) That seems not to be something you can address, though. I'm thinking that fencing topo

Re: [Pacemaker] Fixed! - Re: Problem with dual-PDU fencing node with redundant PSUs

2013-06-28 Thread Digimer
On 06/28/2013 09:28 AM, Lars Marowsky-Bree wrote: > On 2013-06-28T21:01:55, Andrew Beekhof wrote: > >>> I'd agree, but it's not multiple ports on the same device, it's multiple >>> ports on *different* devices. I don't think a single fencing agent can >>> handle that - it really looks like someth

Re: [Pacemaker] Fixed! - Re: Problem with dual-PDU fencing node with redundant PSUs

2013-06-28 Thread Lars Marowsky-Bree
On 2013-06-28T10:20:56, Digimer wrote: > >> primitive fence_n01_psu1_off stonith:fence_apc_snmp \ > >> params ipaddr="an-p01" pcmk_reboot_action="off" port="1" > >> pcmk_host_list="an-c03n01.alteeve.ca" > >> primitive fence_n01_psu1_on stonith:fence_apc_snmp \ > >> params ipaddr="

Re: [Pacemaker] Fixed! - Re: Problem with dual-PDU fencing node with redundant PSUs

2013-06-28 Thread Lars Marowsky-Bree
On 2013-06-28T10:27:54, Digimer wrote: > > Basically, unless we can do this better, having multiple devices per > > fence topology level needs to be considered broken and might be better > > removed. > NO NO NO NO. > > Please do not remove this. I can not use pacemaker unless I can keep the > po

Re: [Pacemaker] Release model

2013-06-28 Thread Dejan Muhamedagic
On Fri, Jun 28, 2013 at 03:32:05PM +0200, Lars Marowsky-Bree wrote: > On 2013-06-28T14:49:06, Dejan Muhamedagic wrote: > > > > If cluster-glue's LRM had had such a suite, it'd certainly have > > > helped tons.) > > It did have a regression suite. > > Yes, well, but it didn't test for LRM_MAX_CHI

Re: [Pacemaker] Fixed! - Re: Problem with dual-PDU fencing node with redundant PSUs

2013-06-28 Thread Digimer
On 06/28/2013 10:39 AM, Lars Marowsky-Bree wrote: > On 2013-06-28T10:27:54, Digimer wrote: > >>> Basically, unless we can do this better, having multiple devices per >>> fence topology level needs to be considered broken and might be better >>> removed. >> NO NO NO NO. >> >> Please do not remove

Re: [Pacemaker] Fixed! - Re: Problem with dual-PDU fencing node with redundant PSUs

2013-06-28 Thread Digimer
On 06/28/2013 10:36 AM, Lars Marowsky-Bree wrote: primitive fence_n01_psu1_off stonith:fence_apc_snmp \ params ipaddr="an-p01" pcmk_reboot_action="off" port="1" pcmk_host_list="an-c03n01.alteeve.ca" primitive fence_n01_psu1_on stonith:fence_apc_snmp \ params

Re: [Pacemaker] Fixed! - Re: Problem with dual-PDU fencing node with redundant PSUs

2013-06-28 Thread Lars Marowsky-Bree
On 2013-06-28T11:29:35, Digimer wrote: > In rhcs, you can control the fence device's action using 'action="..."' > attribute in the element. So for us rhcs migrants, we > expect that action="..." in the fence primitive will have the same > effect. As of now, as you know, this is ignored in favou

Re: [Pacemaker] Fixed! - Re: Problem with dual-PDU fencing node with redundant PSUs

2013-06-28 Thread Lars Marowsky-Bree
On 2013-06-28T11:20:32, Digimer wrote: > Yes, a failed "on" action would then fail the method. This is > sub-optimal as FenceAgentAPI says that only the "off" portion of > "reboot" needs to succeed. However, I don't consider this a show stopper > because "on" action of PDUs simply means "re-energ

Re: [Pacemaker] Fixed! - Re: Problem with dual-PDU fencing node with redundant PSUs

2013-06-28 Thread Digimer
On 06/28/2013 11:45 AM, Lars Marowsky-Bree wrote: > On 2013-06-28T11:20:32, Digimer wrote: > >> Yes, a failed "on" action would then fail the method. This is >> sub-optimal as FenceAgentAPI says that only the "off" portion of >> "reboot" needs to succeed. However, I don't consider this a show sto

Re: [Pacemaker] Fixed! - Re: Problem with dual-PDU fencing node with redundant PSUs

2013-06-28 Thread Digimer
On 06/28/2013 11:34 AM, Lars Marowsky-Bree wrote: > On 2013-06-28T11:29:35, Digimer wrote: > >> In rhcs, you can control the fence device's action using 'action="..."' >> attribute in the element. So for us rhcs migrants, we >> expect that action="..." in the fence primitive will have the same >

Re: [Pacemaker] Node name problems after upgrading to 1.1.9

2013-06-28 Thread Andrew Beekhof
On 28/06/2013, at 9:16 PM, Andrew Beekhof wrote: > > On 28/06/2013, at 8:10 PM, Andrew Beekhof wrote: > >> >> On 28/06/2013, at 6:42 PM, Bernardo Cabezas Serra wrote: >> >>> Hello Andrew, >>> >>> El 27/06/13 14:44, Andrew Beekhof escribió: You should see additional logs sent to /var/

Re: [Pacemaker] Release model

2013-06-28 Thread Andrew Beekhof
On 28/06/2013, at 11:37 PM, Lars Marowsky-Bree wrote: >>> >>> I'm not sure there's a huge downside in it for you? >> Ok, let's take attrd for example - which I've been wanting to rewrite to be >> truly atomic for half a decade or more. > > If it's rewritten in a way that doesn't affect external

Re: [Pacemaker] Release model

2013-06-28 Thread Andrew Beekhof
On 29/06/2013, at 12:15 AM, Digimer wrote: > On 06/28/2013 08:04 AM, Andrew Beekhof wrote: >> Under this model, not only do I have to find the time to write and test the >> new addition, but I also have to: >> * keep maintaining the old code until... when? >> * probably write and maintain a com

Re: [Pacemaker] Fixed! - Re: Problem with dual-PDU fencing node with redundant PSUs

2013-06-28 Thread Andrew Beekhof
On 29/06/2013, at 12:22 AM, Digimer wrote: > On 06/28/2013 06:21 AM, Andrew Beekhof wrote: >> >> On 28/06/2013, at 5:22 PM, Lars Marowsky-Bree wrote: >> >>> On 2013-06-27T12:53:01, Digimer wrote: >>> primitive fence_n01_psu1_off stonith:fence_apc_snmp \ params ipaddr="an-p01

Re: [Pacemaker] Fixed! - Re: Problem with dual-PDU fencing node with redundant PSUs

2013-06-28 Thread Andrew Beekhof
On 29/06/2013, at 12:36 AM, Lars Marowsky-Bree wrote: > On 2013-06-28T10:20:56, Digimer wrote: > primitive fence_n01_psu1_off stonith:fence_apc_snmp \ params ipaddr="an-p01" pcmk_reboot_action="off" port="1" pcmk_host_list="an-c03n01.alteeve.ca" primitive fence_n01_p

Re: [Pacemaker] Fixed! - Re: Problem with dual-PDU fencing node with redundant PSUs

2013-06-28 Thread Digimer
On 06/28/2013 07:22 PM, Andrew Beekhof wrote: > > On 29/06/2013, at 12:22 AM, Digimer wrote: > >> On 06/28/2013 06:21 AM, Andrew Beekhof wrote: >>> >>> On 28/06/2013, at 5:22 PM, Lars Marowsky-Bree wrote: >>> On 2013-06-27T12:53:01, Digimer wrote: > primitive fence_n01_psu1_off s