Re: [Pacemaker] RHEL 6.3 + fence_vmware_soap + esx 5.1

2013-07-21 Thread Andrew Beekhof
[1498] pcmk1 crmd: info: process_lrm_event: LRM operation drbd_pg:0_notify_0 (call=28, rc=0, cib-update=0, confirmed=true) ok Regards, Michal Mistina -Original Message- From: Andrew Beekhof [mailto:and...@beekhof.net] Sent: Tuesday, July 16, 2013 5:23 AM To: The Pacemaker

Re: [Pacemaker] cib_process_diff: Failed application of an update diff

2013-07-20 Thread Andrew Beekhof
Fixed! https://github.com/beekhof/pacemaker/commit/28fa4c6 On 16/07/2013, at 11:11 PM, Andrew Beekhof and...@beekhof.net wrote: On 16/07/2013, at 5:37 PM, Johan Huysmans johan.huysm...@inuits.be wrote: Hi, Attached is some more logging of the failed diff: I hope this is sufficient

Re: [Pacemaker] Node sends shutdown request to other node:error: handle_request: We didn't ask to be shut down, yet our DC is telling us too

2013-07-18 Thread Andrew Beekhof
at 4:31 AM, Andrew Beekhof and...@beekhof.net wrote: On 16/07/2013, at 11:03 PM, K Mehta kiranmehta1...@gmail.com wrote: I have a two node test cluster running with CMAN plugin. Fencing is not configured. That's problem 1 I see that vsanqa7 sends a message to vsanqa8 to shutdown

Re: [Pacemaker] Question about the behavior when a pacemaker's process crashed

2013-07-18 Thread Andrew Beekhof
On 17/07/2013, at 6:53 PM, Kazunori INOUE inouek...@intellilink.co.jp wrote: (13.07.16 21:18), Andrew Beekhof wrote: On 16/07/2013, at 7:04 PM, Kazunori INOUE inouek...@intellilink.co.jp wrote: (13.07.15 11:00), Andrew Beekhof wrote: On 12/07/2013, at 6:28 PM, Kazunori INOUE inouek

Re: [Pacemaker] libqb installed in non-standard dir causes configure failures

2013-07-18 Thread Andrew Beekhof
On 19/07/2013, at 4:37 AM, Matthew O'Connor m...@ecsorl.com wrote: Hi! I'm trying to build a full pacemaker/corosync/libqb tree into a directory other than /usr. Unfortunately the Pacemaker configure script fails to properly validate the existence of libqb: checking pkg-config is at least

Re: [Pacemaker] DRBD Stacked resource timeout

2013-07-18 Thread Andrew Beekhof
On 19/07/2013, at 7:08 AM, Miles Lott ml...@gie.com wrote: This is the part of our pacemaker config which sets up our primary/secondary drbd resource as well as the stacked resource. Due to occasional load on the server, the resource check will timeout, forcing a reboot of the node using

Re: [Pacemaker] cib_process_diff: Failed application of an update diff

2013-07-16 Thread Andrew Beekhof
be very useful. Also the output of cibadmin -Ql Thx. Johan On 15-07-13 03:02, Andrew Beekhof wrote: On 11/07/2013, at 5:09 PM, Johan Huysmans johan.huysm...@inuits.be wrote: Hi, Sorry about the missing info, here it is: OS: CentOS 6.4 CoroSync: 1.4.1-15 Pacemaker: 1.1.10-1.el6

Re: [Pacemaker] Issue with an isolated node overriding CIB after rejoining main cluster

2013-07-16 Thread Andrew Beekhof
, is there some configuration I can apply to change behaviour to ifdown? My major fear is that some network failure could trigger the code path that leads to the isolated node updating CIB, etc. Thanks again, Tom -Original Message- From: Andrew Beekhof [mailto:and...@beekhof.net

Re: [Pacemaker] Question about the behavior when a pacemaker's process crashed

2013-07-16 Thread Andrew Beekhof
On 16/07/2013, at 7:04 PM, Kazunori INOUE inouek...@intellilink.co.jp wrote: (13.07.15 11:00), Andrew Beekhof wrote: On 12/07/2013, at 6:28 PM, Kazunori INOUE inouek...@intellilink.co.jp wrote: Hi, I'm using pacemaker-1.1.10. When a pacemaker's process crashed, the node is sometimes

Re: [Pacemaker] Dhcp network interface as resource

2013-07-16 Thread Andrew Beekhof
On 17/07/2013, at 2:33 AM, Tero M term...@gmail.com wrote: Hi, is it possible to set up network interface as resource so that Pacemaker enables and disables it as needed? I have need to set up network interface that has specific MAC address but IP address is determined with dhcp. If you

Re: [Pacemaker] Node sends shutdown request to other node:error: handle_request: We didn't ask to be shut down, yet our DC is telling us too

2013-07-16 Thread Andrew Beekhof
On 16/07/2013, at 11:03 PM, K Mehta kiranmehta1...@gmail.com wrote: I have a two node test cluster running with CMAN plugin. Fencing is not configured. That's problem 1 I see that vsanqa7 sends a message to vsanqa8 to shutdown. However, it is not clear why vsanqa7 takes this decision. It

Re: [Pacemaker] version compatility

2013-07-16 Thread Andrew Beekhof
On 16/07/2013, at 11:59 PM, K Mehta kiranmehta1...@gmail.com wrote: Hi, Where can I find information about which versions of pacemaker, cman and corosync are compatible with each other ? I don't think such a document exists. What distro are you using? The best option is usually to use

Re: [Pacemaker] Issue with an isolated node overriding CIB after rejoining main cluster

2013-07-14 Thread Andrew Beekhof
On 12/07/2013, at 10:49 PM, Howley, Tom tom.how...@hp.com wrote: Hi, pacemaker:1.1.6-2ubuntu3, ouch corosync:1.4.2-2, drbd8-utils 2:8.3.11-0ubuntu1 I have a three node setup, with two nodes running DRBD, resource-level fencing enabled (‘resource-and-stonith’) and obviously stonith

Re: [Pacemaker] why pacemaker stops cman?

2013-07-14 Thread Andrew Beekhof
When Pacemaker is used with cman, Pacemaker provides the fencing capabilities. So at the point Pacemaker is stopped, fencing is impossible. This makes it unsafe for cman (which initiates fencing for its own reasons) to continue running. Also, CMAN's fenced is not very good at shutting down

Re: [Pacemaker] cib_process_diff: Failed application of an update diff

2013-07-14 Thread Andrew Beekhof
On 11/07/2013, at 5:09 PM, Johan Huysmans johan.huysm...@inuits.be wrote: Hi, Sorry about the missing info, here it is: OS: CentOS 6.4 CoroSync: 1.4.1-15 Pacemaker: 1.1.10-1.el6-3463b39 (rc6) Any more info needed to investigate this error? Could you set

Re: [Pacemaker] RHEL 6.3 + fence_vmware_soap + esx 5.1

2013-07-14 Thread Andrew Beekhof
On 13/07/2013, at 10:05 PM, Mistina Michal michal.mist...@virte.sk wrote: Hi, Does somebody know how to set up fence_vmware_soap correctly so that it will start fencing vmware machine in the esx 5.1? My problem is the fence_vmware_soap resource agent for stonith timed out. Don’t know

Re: [Pacemaker] Question about the behavior when a pacemaker's process crashed

2013-07-14 Thread Andrew Beekhof
On 12/07/2013, at 6:28 PM, Kazunori INOUE inouek...@intellilink.co.jp wrote: Hi, I'm using pacemaker-1.1.10. When a pacemaker's process crashed, the node is sometimes fenced or is not sometimes fenced. Is this the assumed behavior? Yes. Sometimes the dev1 respawns the processes fast

Re: [Pacemaker] Different behaviour for cloned resource on 1 and 2 nodes

2013-07-14 Thread Andrew Beekhof
On 11/07/2013, at 12:21 AM, Johan Huysmans johan.huysm...@inuits.be wrote: Hi All, I have a setup with a cloned resource and a resource group. I also configured some colocation and order rules in such way that the group can only run where the cloned resource is running. On a 2 node

Re: [Pacemaker] Using avoids location constraint

2013-07-11 Thread Andrew Beekhof
On 10/07/2013, at 11:32 PM, Andrew Morgan andrewjamesmor...@gmail.com wrote: When I look at the log files, I see that there's an attempt to fence drbd1 even though I have <nvpair id="cib-bootstrap-options-stonith-enabled" name="stonith-enabled" value="false"/> in the CIB. Why would the cluster

Re: [Pacemaker] Using avoids location constraint

2013-07-10 Thread Andrew Beekhof
On 09/07/2013, at 3:59 PM, Andrew Morgan andrewjamesmor...@gmail.com wrote: On 9 July 2013 04:11, Andrew Beekhof and...@beekhof.net wrote: On 08/07/2013, at 11:35 PM, Andrew Morgan andrewjamesmor...@gmail.com wrote: Thanks Florian. The problem I have is that I'd like to define

Re: [Pacemaker] crmsh dosn't respect the acl read permissions

2013-07-09 Thread Andrew Beekhof
On 09/07/2013, at 3:29 PM, emmanuel segura emi2f...@gmail.com wrote: Hi I compiled pacemaker using the following commands git clone git://github.com/ClusterLabs/pacemaker.git cd pacemaker make rpm-dep make rpm But the acls are not enabled by default? no Thanks 2013/7/9

Re: [Pacemaker] crmsh dosn't respect the acl read permissions

2013-07-09 Thread Andrew Beekhof
On 09/07/2013, at 4:58 PM, emmanuel segura emi2f...@gmail.com wrote: Hello Andrew please, can you tell me why? Because it's easy to turn on for anyone that wants it Thanks 2013/7/9 Andrew Beekhof and...@beekhof.net On 09/07/2013, at 3:29 PM, emmanuel segura emi2f...@gmail.com

Re: [Pacemaker] Java application failover problem

2013-07-09 Thread Andrew Beekhof
On 09/07/2013, at 10:29 PM, Martin Gazak martin.ga...@microstep-mis.sk wrote: On 7/9/2013 12:56 PM, Andrew Beekhof wrote: On 09/07/2013, at 8:49 PM, Martin Gazak martin.ga...@microstep-mis.sk wrote: On 7/9/2013 12:42 PM, Andrew Beekhof wrote: On 09/07/2013

Re: [Pacemaker] Question concerning pacemaker-1-1-10-rc6

2013-07-08 Thread Andrew Beekhof
On 08/07/2013, at 5:57 PM, Andreas Mock andreas.m...@web.de wrote: Hi Andrew, I'm taking the builds from http://clusterlabs.org/rpm-test-next/rhel-6/x86_64/ to avoid compiling on my own. Do these build relate to the release candidates you're announcing? Not at all, those are whatever I

Re: [Pacemaker] Full API description for Fence Agent

2013-07-08 Thread Andrew Beekhof
programmers. Best regards Andreas Mock -Original Message- From: Andrew Beekhof [mailto:and...@beekhof.net] Sent: Monday, 8 July 2013 05:27 To: The Pacemaker cluster resource manager Subject: Re: [Pacemaker] Full API description for Fence Agent On 04/07/2013, at 9:52

Re: [Pacemaker] Best way to notify stonith action

2013-07-08 Thread Andrew Beekhof
On 09/07/2013, at 12:50 AM, Andreas Mock andreas.m...@web.de wrote: Hi all, thank you for your recommendations. I just hoped that there is something pacemaker internal, e.g. like sending traps via snmp or something like that. This is something crm_mon can now send traps, emails or call

Re: [Pacemaker] Using avoids location constraint

2013-07-08 Thread Andrew Beekhof
On 08/07/2013, at 11:35 PM, Andrew Morgan andrewjamesmor...@gmail.com wrote: Thanks Florian. The problem I have is that I'd like to define a HA configuration that isn't dependent on a specific set of fencing hardware (or any fencing hardware at all for that matter) and as the stack has

Re: [Pacemaker] Pacemaker 1.1.10 rc 5 rc 6

2013-07-08 Thread Andrew Beekhof
On 06/07/2013, at 1:28 AM, Andrii Moiseiev amoise...@gmail.com wrote: Any clues? Not with only fragments of the logs Do you have a log file (not syslog) configured? I'll need that complete file If you could also set PCMK_trace_functions=apply_xml_diff (see

Re: [Pacemaker] Java application failover problem

2013-07-08 Thread Andrew Beekhof
Can you include a crm_report for your test scenario? a) I need the pe files, but also b) parsing line wrapped logs is seriously painful On 05/07/2013, at 7:09 PM, Martin Gazak martin.ga...@microstep-mis.sk wrote: Hello, we are facing the problem with the simple (I hope) cluster configuration

Re: [Pacemaker] Wrong resource failcount displayed for clone resource

2013-07-08 Thread Andrew Beekhof
On 03/07/2013, at 5:47 PM, Cédric VERKLEEREN cedric.verklee...@crelan.be wrote: Hi, I am using Pacemaker 1.1.7 with Corosync 1.4.1. After simulating an apache resource failure, the failcount increases. Using the command crm resource failcount rsc show node for this clone resource shows me

Re: [Pacemaker] after `corosync stop` on master, drbd:master moves to peer, but all other resource stopped

2013-07-08 Thread Andrew Beekhof
On 19/06/2013, at 3:03 AM, andreas graeper agrae...@googlemail.com wrote: hi, i stopped n1 (drbd:master + all managed resources) the n2 became drbd:master but all resources stopped. i started n1.corosync again and stopped once more and then n2 took over everything as expected. in

Re: [Pacemaker] Node addition policy

2013-07-08 Thread Andrew Beekhof
On 04/07/2013, at 9:55 PM, Vladislav Bogdanov bub...@hoster-ok.com wrote: 04.07.2013 14:50, Andrew Beekhof wrote: On 04/07/2013, at 7:24 PM, Vladislav Bogdanov bub...@hoster-ok.com wrote: Hi, I am thinking about the safest way to expand the cluster, and my observations show that new nodes

Re: [Pacemaker] Full API description for Fence Agent

2013-07-07 Thread Andrew Beekhof
filtering is done? There is no filtering. Best regards Andreas -Original Message- From: Andrew Beekhof [mailto:and...@beekhof.net] Sent: Thursday, 4 July 2013 13:41 To: The Pacemaker cluster resource manager Subject: Re: [Pacemaker] Full API description

Re: [Pacemaker] Another question about fencing/stonithing

2013-07-07 Thread Andrew Beekhof
On 06/07/2013, at 1:22 AM, Digimer li...@alteeve.ca wrote: Andrew might know the trick. In theory, putting your agent into the /usr/sbin or /sbin directory (where ever the other agents are) Yep. As long as it's there, executable and takes arguments via stdin... should just work. You're sure
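
A minimal sketch of the "takes arguments via stdin" part, assuming the name=value-per-line convention described on the FenceAgentAPI wiki page cited elsewhere in this thread. The helper name parse_fence_args and the sample options are hypothetical, chosen only for illustration; they are not part of Pacemaker or the fence-agents package.

```python
def parse_fence_args(lines):
    """Collect name=value options from an iterable of text lines.

    Assumption: one option per line, blank lines and lines starting
    with '#' are skipped, and lines without '=' are ignored.
    """
    options = {}
    for raw in lines:
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        name, sep, value = line.partition("=")
        if not sep:
            continue  # not a name=value pair
        options[name.strip()] = value.strip()
    return options


# Illustrative input only; a real agent would iterate over sys.stdin instead.
sample = ["# options passed by the cluster\n", "action=off\n", "port=1\n"]
print(parse_fence_args(sample))
```

In a real agent the same function could be fed sys.stdin directly, since a file object yields its lines when iterated.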

Re: [Pacemaker] Another question about fencing/stonithing

2013-07-07 Thread Andrew Beekhof
On 05/07/2013, at 5:34 PM, Andreas Mock andreas.m...@web.de wrote: Hi all, I just wrote a stonith agent which IMHO implements the API spec found at https://fedorahosted.org/cluster/wiki/FenceAgentAPI. But it seems it has a problem when used as pacemaker stonith device. What has to be

Re: [Pacemaker] changing cluster-ip

2013-07-04 Thread Andrew Beekhof
On 04/07/2013, at 8:37 PM, Leon Fauster leonfaus...@googlemail.com wrote: On 04.07.2013 at 12:02, andreas graeper agrae...@googlemail.com wrote: i tried to change the IPaddr2 parameter ip 1) crm resource edit rsc 2) pcs resource update rsc attr=value in both cases the cib is

Re: [Pacemaker] Node addition policy

2013-07-04 Thread Andrew Beekhof
On 04/07/2013, at 7:24 PM, Vladislav Bogdanov bub...@hoster-ok.com wrote: Hi, I am thinking about the safest way to expand the cluster, and my observations show that new nodes are always added in the online state (standby=off). I would like nodes to appear in standby=on state unless they can be

Re: [Pacemaker] some pacemaker questions

2013-07-02 Thread Andrew Beekhof
On 02/07/2013, at 7:32 PM, Lars Marowsky-Bree l...@suse.com wrote: On 2013-07-02T10:46:09, Andrew Beekhof and...@beekhof.net wrote: Our problem is that if i give crm resource stop vm1 and immediatly after crm resource stop vm2 it happens that pacemaker begins to stop vm2 only after vm1

Re: [Pacemaker] Disconnected from CIB?

2013-07-02 Thread Andrew Beekhof
On 02/07/2013, at 7:54 PM, Lars Marowsky-Bree l...@suse.com wrote: On 2013-07-02T08:25:18, Andrew Beekhof and...@beekhof.net wrote: if (cli_config_update(cib_copy, NULL, FALSE) == FALSE) { Also, change FALSE -> TRUE here so that you see the validation errors. OK. What could cause

Re: [Pacemaker] Disconnected from CIB?

2013-07-02 Thread Andrew Beekhof
On 02/07/2013, at 8:20 PM, Lars Marowsky-Bree l...@suse.com wrote: On 2013-07-02T20:12:08, Andrew Beekhof and...@beekhof.net wrote: It seems related to the number of times I poll the CIB, too; I seem to hit a transient window there, maybe. Since I dropped the number of polls (instead

Re: [Pacemaker] setup advice

2013-07-02 Thread Andrew Beekhof
I wouldn't be doing anything without corosync2 and its option that requires all nodes to be online before quorum is granted. Otherwise I can imagine ways that the old master might try to promote itself. On 02/07/2013, at 7:18 PM, Michael Schwartzkopff mi...@clusterbau.com wrote: On Tuesday,

Re: [Pacemaker] drbd on passive node not started

2013-07-02 Thread Andrew Beekhof
On 21/06/2013, at 11:36 PM, andreas graeper agrae...@googlemail.com wrote: hi, n1 active node is started and everything works fine, but after reboot n2 drbd is not started by pacemaker. Are you sure? I see a couple of start operations: Jun 21 15:10:29 [5093] n2 lrmd:debug:

Re: [Pacemaker] Disconnected from CIB?

2013-07-01 Thread Andrew Beekhof
On 30/06/2013, at 10:09 PM, Lars Marowsky-Bree l...@suse.com wrote: Hi, sbd connects to the CIB and watches updates come in to see if pacemaker considers the node healthy still, and if the cluster partition is quorate according to the CIB. That's all working fine. But I've noticed that

Re: [Pacemaker] Fixed! - Re: Problem with dual-PDU fencing node with redundant PSUs

2013-07-01 Thread Andrew Beekhof
On 01/07/2013, at 5:32 PM, Vladislav Bogdanov bub...@hoster-ok.com wrote: 29.06.2013 02:22, Andrew Beekhof wrote: On 29/06/2013, at 12:22 AM, Digimer li...@alteeve.ca wrote: On 06/28/2013 06:21 AM, Andrew Beekhof wrote: On 28/06/2013, at 5:22 PM, Lars Marowsky-Bree l...@suse.com wrote

Re: [Pacemaker] Fixed! - Re: Problem with dual-PDU fencing node with redundant PSUs

2013-07-01 Thread Andrew Beekhof
On 01/07/2013, at 5:17 PM, Florian Crouzat gen...@floriancrouzat.net wrote: On 29/06/2013 01:22, Andrew Beekhof wrote: On 29/06/2013, at 12:22 AM, Digimer li...@alteeve.ca wrote: On 06/28/2013 06:21 AM, Andrew Beekhof wrote: On 28/06/2013, at 5:22 PM, Lars Marowsky-Bree l

Re: [Pacemaker] Fixed! - Re: Problem with dual-PDU fencing node with redundant PSUs

2013-07-01 Thread Andrew Beekhof
On 30/06/2013, at 4:48 AM, Lars Marowsky-Bree l...@suse.com wrote: On 2013-06-29T09:22:20, Andrew Beekhof and...@beekhof.net wrote: This doesn't help people who have dual power rails/PDUs for power redundancy. I'm yet to be convinced that having two PDUs is helping those people

Re: [Pacemaker] some pacemaker questions

2013-07-01 Thread Andrew Beekhof
On 28/06/2013, at 10:48 PM, Sartoratti Lorenzo lorenzo.sartora...@dei.unipd.it wrote: Hi On 06/28/2013 12:30 PM, Andrew Beekhof wrote: On 28/06/2013, at 5:19 AM, Lorenzo Sartoratti lorenzo.sartora...@dei.unipd.it wrote: Hi, we have been using pacemaker for two years and we are quite

Re: [Pacemaker] Fixed! - Re: Problem with dual-PDU fencing node with redundant PSUs

2013-07-01 Thread Andrew Beekhof
On 01/07/2013, at 9:53 PM, Lars Marowsky-Bree l...@suse.com wrote: On 2013-07-01T21:37:38, Andrew Beekhof and...@beekhof.net wrote: And apparently, this is one of the scenarios for which fence topology was created and supports multiple devices per level. I'd venture the opinion

Re: [Pacemaker] Fixed! - Re: Problem with dual-PDU fencing node with redundant PSUs

2013-07-01 Thread Andrew Beekhof
On 01/07/2013, at 10:06 PM, Vladislav Bogdanov bub...@hoster-ok.com wrote: 01.07.2013 14:53, Andrew Beekhof wrote: On 01/07/2013, at 9:45 PM, Vladislav Bogdanov bub...@hoster-ok.com wrote: 01.07.2013 14:14, Andrew Beekhof wrote: ... I'm yet to be convinced that having two PDUs

Re: [Pacemaker] Disconnected from CIB?

2013-07-01 Thread Andrew Beekhof
On 02/07/2013, at 12:12 AM, Lars Marowsky-Bree l...@suse.com wrote: On 2013-07-01T14:15:01, Lars Marowsky-Bree l...@suse.com wrote: Reproducible on the non-DC node during full start-up of a cluster, yes. And it turns out to be a CIB problem after all. Or I'm doing something else wrong:

Re: [Pacemaker] Fixed! - Re: Problem with dual-PDU fencing node with redundant PSUs

2013-07-01 Thread Andrew Beekhof
On 02/07/2013, at 2:13 AM, Digimer li...@alteeve.ca wrote: Yes, but people around here also tend to be quite vocal when they think something is missing. More so if its something critical. digimer whistles innocently... I mean more than you, Jake and Vladislav. That's not quite a party

Re: [Pacemaker] Fixed! - Re: Problem with dual-PDU fencing node with redundant PSUs

2013-07-01 Thread Andrew Beekhof
On 01/07/2013, at 10:19 PM, Vladislav Bogdanov bub...@hoster-ok.com wrote: 01.07.2013 15:10, Andrew Beekhof wrote: And if people start using it, then we might look at simplifying it. Maybe it's worth having an anonymous poll at clusterlabs.org for that? I'll try and put one up today

Re: [Pacemaker] Question to fencing/stonithing

2013-07-01 Thread Andrew Beekhof
On 01/07/2013, at 10:28 PM, Andreas Mock andreas.m...@web.de wrote: Hi all, just want to get clear about startup fencing. Scenario: RHEL 6.4, cman, 2-node-cluster, pacemaker, fence via pcmk-redirect. pacemaker stonith enabled, no-quorum-policy=ignore, CMAN_QUORUM_TIMEOUT=0 When

Re: [Pacemaker] Fixed! - Re: Problem with dual-PDU fencing node with redundant PSUs

2013-07-01 Thread Andrew Beekhof
On 02/07/2013, at 8:51 AM, Andrew Beekhof and...@beekhof.net wrote: On 01/07/2013, at 10:19 PM, Vladislav Bogdanov bub...@hoster-ok.com wrote: 01.07.2013 15:10, Andrew Beekhof wrote: And if people start using it, then we might look at simplifying it. Maybe it's worth having

Re: [Pacemaker] some pacemaker questions

2013-07-01 Thread Andrew Beekhof
Apparently CC'ing the list on my replies was too subtle... can you please sign up to and reply to the mailing list? I don't do private support. On 28/06/2013, at 10:48 PM, Sartoratti Lorenzo lorenzo.sartora...@dei.unipd.it wrote: Our problem is that if i give crm resource stop vm1 and

Re: [Pacemaker] Disconnected from CIB?

2013-07-01 Thread Andrew Beekhof
On 02/07/2013, at 12:12 AM, Lars Marowsky-Bree l...@suse.com wrote: On 2013-07-01T14:15:01, Lars Marowsky-Bree l...@suse.com wrote: Reproducible on the non-DC node during full start-up of a cluster, yes. And it turns out to be a CIB problem after all. Or I'm doing something else wrong:

Re: [Pacemaker] Fixed! - Re: Problem with dual-PDU fencing node with redundant PSUs

2013-07-01 Thread Andrew Beekhof
Is How important is the ability to use redundant PDUs for fencing? better? On 02/07/2013, at 3:30 PM, Vladislav Bogdanov bub...@hoster-ok.com wrote: 02.07.2013 03:10, Andrew Beekhof wrote: On 02/07/2013, at 8:51 AM, Andrew Beekhof and...@beekhof.net wrote: On 01/07/2013, at 10:19 PM

[Pacemaker] Release model

2013-06-28 Thread Andrew Beekhof
On 28/06/2013, at 5:30 PM, Lars Marowsky-Bree l...@suse.com wrote: On 2013-06-28T11:11:00, Andrew Beekhof and...@beekhof.net wrote: Maybe you're right, maybe I should stop fighting it and go with the firefox approach. That certainly seemed to piss a lot of people off though... If there's

Re: [Pacemaker] Fixed! - Re: Problem with dual-PDU fencing node with redundant PSUs

2013-06-28 Thread Andrew Beekhof
On 28/06/2013, at 5:22 PM, Lars Marowsky-Bree l...@suse.com wrote: On 2013-06-27T12:53:01, Digimer li...@alteeve.ca wrote: primitive fence_n01_psu1_off stonith:fence_apc_snmp \ params ipaddr=an-p01 pcmk_reboot_action=off port=1 pcmk_host_list=an-c03n01.alteeve.ca primitive

Re: [Pacemaker] some pacemaker questions

2013-06-28 Thread Andrew Beekhof
On 28/06/2013, at 5:19 AM, Lorenzo Sartoratti lorenzo.sartora...@dei.unipd.it wrote: Hi, we have been using pacemaker for two years and we are quite satisfied: thanks! We have 30 virtual machines running in the cluster and maintained by pacemaker. When we stop the machines with crm, they are

Re: [Pacemaker] Fixed! - Re: Problem with dual-PDU fencing node with redundant PSUs

2013-06-28 Thread Andrew Beekhof
On 28/06/2013, at 8:46 PM, Lars Marowsky-Bree l...@suse.com wrote: On 2013-06-28T20:21:22, Andrew Beekhof and...@beekhof.net wrote: It looks correct, but not quite sane. ;-) That seems not to be something you can address, though. I'm thinking that fencing topology should be smart enough

Re: [Pacemaker] Node name problems after upgrading to 1.1.9

2013-06-28 Thread Andrew Beekhof
On 28/06/2013, at 8:10 PM, Andrew Beekhof and...@beekhof.net wrote: On 28/06/2013, at 6:42 PM, Bernardo Cabezas Serra bcabe...@apsl.net wrote: Hello Andrew, On 27/06/13 14:44, Andrew Beekhof wrote: You should see additional logs sent to /var/log/pacemaker.log Finally yesterday

Re: [Pacemaker] WARNINGS and ERRORS on syslog after update to 1.1.7

2013-06-28 Thread Andrew Beekhof
On 27/06/2013, at 10:46 PM, Andrew Beekhof and...@beekhof.net wrote: On 25/06/2013, at 9:44 PM, Francesco Namuri f.nam...@credires.it wrote: Can you attach /var/lib/pengine/pe-input-64.bz2 from SERVERNAME1 please? I'll be able to see if it's something we've already fixed. Nope still

Re: [Pacemaker] Release model

2013-06-28 Thread Andrew Beekhof
On 28/06/2013, at 8:59 PM, Lars Marowsky-Bree l...@suse.com wrote: On 2013-06-28T18:41:35, Andrew Beekhof and...@beekhof.net wrote: There's an exception: dropping commonly used external interfaces (say, ptest) needs to be announced a few releases in advance before enacted upstream

Re: [Pacemaker] Node name problems after upgrading to 1.1.9

2013-06-28 Thread Andrew Beekhof
On 28/06/2013, at 9:16 PM, Andrew Beekhof and...@beekhof.net wrote: On 28/06/2013, at 8:10 PM, Andrew Beekhof and...@beekhof.net wrote: On 28/06/2013, at 6:42 PM, Bernardo Cabezas Serra bcabe...@apsl.net wrote: Hello Andrew, On 27/06/13 14:44, Andrew Beekhof wrote: You should

Re: [Pacemaker] Release model

2013-06-28 Thread Andrew Beekhof
On 28/06/2013, at 11:37 PM, Lars Marowsky-Bree l...@suse.com wrote: I'm not sure there's a huge downside in it for you? Ok, let's take attrd for example - which I've been wanting to rewrite to be truly atomic for half a decade or more. If it's rewritten in a way that doesn't affect

Re: [Pacemaker] Fixed! - Re: Problem with dual-PDU fencing node with redundant PSUs

2013-06-28 Thread Andrew Beekhof
On 29/06/2013, at 12:22 AM, Digimer li...@alteeve.ca wrote: On 06/28/2013 06:21 AM, Andrew Beekhof wrote: On 28/06/2013, at 5:22 PM, Lars Marowsky-Bree l...@suse.com wrote: On 2013-06-27T12:53:01, Digimer li...@alteeve.ca wrote: primitive fence_n01_psu1_off stonith:fence_apc_snmp

Re: [Pacemaker] Fixed! - Re: Problem with dual-PDU fencing node with redundant PSUs

2013-06-28 Thread Andrew Beekhof
On 29/06/2013, at 12:36 AM, Lars Marowsky-Bree l...@suse.com wrote: On 2013-06-28T10:20:56, Digimer li...@alteeve.ca wrote: primitive fence_n01_psu1_off stonith:fence_apc_snmp \ params ipaddr=an-p01 pcmk_reboot_action=off port=1 pcmk_host_list=an-c03n01.alteeve.ca primitive

Re: [Pacemaker] Node name problems after upgrading to 1.1.9

2013-06-27 Thread Andrew Beekhof
On 27/06/2013, at 8:20 PM, Bernardo Cabezas Serra bcabe...@apsl.net wrote: Do you think it's a configuration problem? No, more likely a bug. Which is concerning since I thought I had this particular kind ironed out. Could you set PCMK_trace_functions=crm_get_peer on selavi and repeat the

Re: [Pacemaker] Reminder: Pacemaker-1.1.10-rc5 is out there

2013-06-27 Thread Andrew Beekhof
On 27/06/2013, at 5:40 PM, Lars Marowsky-Bree l...@suse.com wrote: On 2013-06-27T14:28:19, Andrew Beekhof and...@beekhof.net wrote: I wouldn't say the 6 months between 1.1.7 and 1.1.8 was a particularly aggressive release cycle. For the amount of changes in there, I think yes

Re: [Pacemaker] [OT] MySQL Replication

2013-06-27 Thread Andrew Beekhof
On 27/06/2013, at 1:53 AM, Denis Witt denis.w...@concepts-and-training.de wrote: On Wed, 26 Jun 2013 21:33:30 +1000 Andrew Beekhof and...@beekhof.net wrote: When you run ./autogen.sh it tries to start an rpm command, this failed because I didn't have rpm installed. How did it fail

Re: [Pacemaker] Node name problems after upgrading to 1.1.9

2013-06-27 Thread Andrew Beekhof
On 27/06/2013, at 10:29 PM, Bernardo Cabezas Serra bcabe...@apsl.net wrote: Hello, Ohhh, sorry, but I have deleted node selavi and restarted, and now works OK and I can't reproduce the bug :( That is unfortunate On 27/06/13 12:32, Andrew Beekhof wrote: No, more likely a bug. Which

Re: [Pacemaker] WARNINGS and ERRORS on syslog after update to 1.1.7

2013-06-27 Thread Andrew Beekhof
On 25/06/2013, at 9:44 PM, Francesco Namuri f.nam...@credires.it wrote: Can you attach /var/lib/pengine/pe-input-64.bz2 from SERVERNAME1 please? I'll be able to see if it's something we've already fixed. Nope still there. I will attempt to fix this tomorrow.

Re: [Pacemaker] Reminder: Pacemaker-1.1.10-rc5 is out there

2013-06-27 Thread Andrew Beekhof
On 28/06/2013, at 12:52 AM, Lars Marowsky-Bree l...@suse.com wrote: Maybe you're right, maybe I should stop fighting it and go with the firefox approach. That certainly seemed to piss a lot of people off though... If there's one message I've learned in 13 years of work on Linux HA, then

Re: [Pacemaker] corosync stop and consequences

2013-06-27 Thread Andrew Beekhof
On 27/06/2013, at 3:40 AM, andreas graeper agrae...@googlemail.com wrote: thanks for your answer. but still question open. when i switch off the active node: though this is done reliably for me, the still passive node wants to know for sure and will kill the (already dead) former

Re: [Pacemaker] weird drbd/cluster behaviour

2013-06-27 Thread Andrew Beekhof
On 27/06/2013, at 2:21 AM, Саша Александров shurr...@gmail.com wrote: Hi! Fencing is disabled for now, the issue is not with fencing: the question is - why only one out of three DRBD master-slave sets is recognized by pacemaker, Pacemaker knows nothing of drbd or any other kind of

Re: [Pacemaker] pacemaker/corosync: error: qb_sys_mmap_file_open: couldn't open file

2013-06-26 Thread Andrew Beekhof
Sent from a mobile device On 26/06/2013, at 5:44 PM, Jacek Konieczny jaj...@jajcus.net wrote: On Wed, 26 Jun 2013 14:35:03 +1000 Andrew Beekhof and...@beekhof.net wrote: Urgh: info Jun 25 13:40:10 lrmd_ipc_connect(913):0: Connecting to lrmd trace Jun 25 13:40:10 pick_ipc_buffer(670

Re: [Pacemaker] corosync stop and consequences

2013-06-26 Thread Andrew Beekhof
On 26/06/2013, at 12:24 AM, Digimer li...@alteeve.ca wrote: On 06/25/2013 07:29 AM, andreas graeper wrote: hi, maybe again and again the same question, please excuse. two nodes (n1 active / n2 passive) and `service corosync stop` on active. does the node, that is going down, tells the

Re: [Pacemaker] [OT] MySQL Replication

2013-06-26 Thread Andrew Beekhof
On 26/06/2013, at 6:51 PM, Denis Witt denis.w...@concepts-and-training.de wrote: On Wed, 26 Jun 2013 12:35:33 +1000 Andrew Beekhof and...@beekhof.net wrote: System is Debian Wheezy which means version 0.11.1-2 for libqb-dev. rpm errors on debian? I'm confused. When you run

Re: [Pacemaker] Reminder: Pacemaker-1.1.10-rc5 is out there

2013-06-26 Thread Andrew Beekhof
On 26/06/2013, at 7:30 PM, Lars Marowsky-Bree l...@suse.com wrote: On 2013-06-25T20:28:29, Andrew Beekhof and...@beekhof.net wrote: Perhaps a numbering scheme like the Linux kernel would fit better than a stable/unstable branch distinction. Changes that deserve the unstable term are really

Re: [Pacemaker] Reminder: Pacemaker-1.1.10-rc5 is out there

2013-06-26 Thread Andrew Beekhof
On 26/06/2013, at 10:37 PM, Lars Marowsky-Bree l...@suse.com wrote: On 2013-06-26T21:31:14, Andrew Beekhof and...@beekhof.net wrote: Distributions can take care of them when they integrate them; basically they'll trickle through until the whole stack the distributions ship builds again

Re: [Pacemaker] [corosync] pacemaker/corosync: error: qb_sys_mmap_file_open: couldn't open file

2013-06-25 Thread Andrew Beekhof
On 25/06/2013, at 4:34 PM, Jacek Konieczny jaj...@jajcus.net wrote: On Tue, 25 Jun 2013 10:10:13 +1000 Andrew Beekhof and...@beekhof.net wrote: On 24/06/2013, at 9:31 PM, Jacek Konieczny jaj...@jajcus.net wrote: After I have upgraded Pacemaker from 1.1.8 to 1.1.9 on a node I get

Re: [Pacemaker] [corosync] pacemaker/corosync: error: qb_sys_mmap_file_open: couldn't open file

2013-06-25 Thread Andrew Beekhof
On 25/06/2013, at 5:56 PM, Jacek Konieczny jaj...@jajcus.net wrote: On Tue, 25 Jun 2013 10:50:14 +0300 Vladislav Bogdanov bub...@hoster-ok.com wrote: I would recommend qb 1.4.4. 1.4.3 had at least one nasty bug which affects pacemaker. Just tried that. It didn't help. Can you turn on the

Re: [Pacemaker] Reminder: Pacemaker-1.1.10-rc5 is out there

2013-06-25 Thread Andrew Beekhof
On 25/06/2013, at 6:32 PM, Lars Marowsky-Bree l...@suse.com wrote: On 2013-06-25T10:16:58, Andrey Groshev gre...@yandex.ru wrote: Ok, I recently became engaged in the PCMK, so for me it is a surprise. The more so in all the major linux distributions version 1.1.x. Pacemaker has very

Re: [Pacemaker] WARNINGS and ERRORS on syslog after update to 1.1.7

2013-06-25 Thread Andrew Beekhof
On 25/06/2013, at 5:37 PM, Francesco Namuri f.nam...@credires.it wrote: Hi, after an update to the new debian stable, from pacemaker 1.0.9.1 to 1.1.7 I'm getting some strange errors on syslog: That's a hell of a jump there. Can you attach /var/lib/pengine/pe-input-64.bz2 from SERVERNAME1

Re: [Pacemaker] Additional help with ClusterMon

2013-06-24 Thread Andrew Beekhof
On 24/06/2013, at 6:29 PM, Michael Furman michael_fur...@hotmail.com wrote: Andrew, I have sent the core files attached to the mail from Date: Thu, 20 Jun 2013 13:18:04 +0300. Unfortunately the mail server did not send it. Can you look on the mail server? 100s of spam messages but nothing

Re: [Pacemaker] [OT] MySQL Replication

2013-06-24 Thread Andrew Beekhof
On 22/06/2013, at 5:31 AM, Denis Witt denis.w...@cantbuyit.com wrote: Hi List, this might be off-topic, but I'm sure there are many people on this list who have answered this question for themselves. I have a MySQL Master/Master/Slave setup which is rather unreliable, so I'm asking myself if

Re: [Pacemaker] Reminder: Pacemaker-1.1.10-rc5 is out there

2013-06-24 Thread Andrew Beekhof
On 24/06/2013, at 3:44 PM, Vladislav Bogdanov bub...@hoster-ok.com wrote: 24.06.2013 04:17, Andrew Beekhof wrote: Either people have given up on testing, or rc5[1] is looking good for the final release. Is it going to be 1.1.10 or 1.2.0 (2.0.0)? First it's going to be 1.1.10

Re: [Pacemaker] error when build pacemaker 1.1.10-rc5 and corosync-2.3.0

2013-06-24 Thread Andrew Beekhof
On 24/06/2013, at 3:03 PM, Takatoshi MATSUO matsuo@gmail.com wrote: Hi Andrew 2013/6/24 Andrew Beekhof and...@beekhof.net: On 24/06/2013, at 12:46 PM, Takatoshi MATSUO matsuo@gmail.com wrote: Hi Andrew I received similar error using 6ea4b7e(HEAD) under RHEL6

Re: [Pacemaker] Can resource agents know the cause of stop action?

2013-06-24 Thread Andrew Beekhof
On 24/06/2013, at 8:55 PM, Munehiro SATO satoh-...@necst.nec.co.jp wrote: Hi all, Can resource agents know the cause of a stop action? Not really, no. I want to know the following situations in the RA for my application (it's a Master/Slave resource). * stop by crm resource stop In this case,

Re: [Pacemaker] output crm_mon

2013-06-24 Thread Andrew Beekhof
On 24/06/2013, at 8:37 PM, andreas graeper agrae...@googlemail.com wrote: hi, crm_mon -rA1 shows : ClusterIP(ocf::heartbeat:IPaddr2):Started lisel1 Master/Slave Set: ms_drbd_r0 [p_drbd_r0] Masters: [ lisel1 ] Slaves: [ lisel2 ] p_lvm_r0(ocf::heartbeat:LVM):

Re: [Pacemaker] error when build pacemaker 1.1.10-rc5 and corosync-2.3.0

2013-06-24 Thread Andrew Beekhof
On 25/06/2013, at 12:12 PM, Takatoshi MATSUO matsuo@gmail.com wrote: 2013/6/25 Andrew Beekhof and...@beekhof.net: On 24/06/2013, at 3:03 PM, Takatoshi MATSUO matsuo@gmail.com wrote: Hi Andrew 2013/6/24 Andrew Beekhof and...@beekhof.net: On 24/06/2013, at 12:46 PM, Takatoshi

Re: [Pacemaker] Additional help with ClusterMon

2013-06-24 Thread Andrew Beekhof
On 24/06/2013, at 7:45 PM, Michael Furman michael_fur...@hotmail.com wrote: What kind of status? We want to run one command that will return the status of the node in one value: online or offline (or standby). That could reasonably be added to crm_node.

Re: [Pacemaker] Additional help with ClusterMon

2013-06-24 Thread Andrew Beekhof
On 24/06/2013, at 7:45 PM, Michael Furman michael_fur...@hotmail.com wrote: What kind of status? We want to run one command that will return the status of the node in one value: online or offline (or standby). Please find attached 3 core files. I can't read the core files. They're

Re: [Pacemaker] error when build pacemaker 1.1.10-rc5 and corosync-2.3.0

2013-06-24 Thread Andrew Beekhof
On 25/06/2013, at 1:20 PM, Takatoshi MATSUO matsuo@gmail.com wrote: 2013/6/25 Andrew Beekhof and...@beekhof.net: On 25/06/2013, at 12:12 PM, Takatoshi MATSUO matsuo@gmail.com wrote: 2013/6/25 Andrew Beekhof and...@beekhof.net: On 24/06/2013, at 3:03 PM, Takatoshi MATSUO matsuo

Re: [Pacemaker] Reminder: Pacemaker-1.1.10-rc5 is out there

2013-06-24 Thread Andrew Beekhof
On 25/06/2013, at 2:33 PM, Andrey Groshev gre...@yandex.ru wrote: 25.06.2013, 04:46, Andrew Beekhof and...@beekhof.net: On 24/06/2013, at 3:44 PM, Vladislav Bogdanov bub...@hoster-ok.com wrote: 24.06.2013 04:17, Andrew Beekhof wrote: Either people have given up on testing, or rc5[1

Re: [Pacemaker] known problem with corosync 1.4.1 on centos64 ?

2013-06-23 Thread Andrew Beekhof
On 22/06/2013, at 5:13 AM, Andreas Mock andreas.m...@web.de wrote: Hi Andreas, my two cents to your questions: a) If you want to learn most, take any distro and compile the components from source and afterwards use them. = Most learned. Well, yes, but not always about clustering and

Re: [Pacemaker] Pacemaker fails to switch on or off PDU sockets with fence_wti

2013-06-23 Thread Andrew Beekhof
On 21/06/2013, at 5:38 PM, Thibaut Pouzet thibaut.pou...@lyra-network.com wrote: Le 20/06/2013 12:23, Andrew Beekhof a écrit : On 20/06/2013, at 6:51 PM, Thibaut Pouzet thibaut.pou...@lyra-network.com wrote: Le 19/06/2013 23:57, Andrew Beekhof a écrit : On 20/06/2013, at 1:57 AM

[Pacemaker] Reminder: Pacemaker-1.1.10-rc5 is out there

2013-06-23 Thread Andrew Beekhof
Either people have given up on testing, or rc5[1] is looking good for the final release. So just a reminder, we're particularly looking for feedback in the following areas: | plugin-based clusters, ACLs, the new --ban and --clear commands, and admin actions | (such as moving and stopping

Re: [Pacemaker] error when build pacemaker 1.1.10-rc5 and corosync-2.3.0

2013-06-23 Thread Andrew Beekhof
(but unpackaged) file(s) found: /usr/lib64/heartbeat/attrd /usr/lib64/heartbeat/cib /usr/lib64/heartbeat/crmd /usr/lib64/heartbeat/pengine /usr/lib64/heartbeat/stonithd /usr/sbin/crm_uuid make: *** [rpm] Error 1 - Regards, Takatoshi MATSUO 2013/6/21 Andrew Beekhof
