Re: [Pacemaker] barring IPMI, what would you recommend for stonith?

2013-11-07 Thread mark - pacemaker list
Hello, On Thu, Nov 7, 2013 at 1:38 PM, Jean-Francois Malouin jean-francois.malo...@bic.mni.mcgill.ca wrote: ... the hardware that they dropped on my lap doesn't have IPMI and I will definitely require stonith. What would you recommend? A switchable PDU/power fencing? Do you have shared

Re: [Pacemaker] why does pacemaker migrate a vm by stopping and starting instead of migrating action?

2012-12-19 Thread mark - pacemaker list
Oops, I haven't had my coffee yet this morning... I see you've written your own RA rather than using the existing ones, my apologies for the noise on the list. Mark On Wed, Dec 19, 2012 at 9:08 AM, mark - pacemaker list m+pacema...@nerdish.us wrote: Hi Cherish, On Wed, Dec 19, 2012 at 1:11

Re: [Pacemaker] Why was SBD removed from the RHEL/CentOS 6 cluster-glue/corosync/pacemaker packages?

2012-08-15 Thread mark - pacemaker list
Hi Lars, On Wed, Aug 15, 2012 at 3:33 AM, Lars Marowsky-Bree l...@suse.com wrote: On 2012-07-28T17:30:34, mark - pacemaker list m+pacema...@nerdish.us wrote: ... Note that sbd is being removed from cluster-glue and split into a separate package upstream nowadays, so RHT's decision merely

[Pacemaker] 'crm configure verify' says my node(s) don't exist, everything else says they do

2012-08-14 Thread mark - pacemaker list
This output kind of shows it all... I can configure the cluster, put nodes in standby and back online, move resources from one to the other, etc., but if any rule references a node in the configuration, then 'crm configure verify' fails saying the node doesn't exist. This is a freshly-started

Re: [Pacemaker] Did I miss a dependency or is this a 1.1.7 bug?

2012-08-01 Thread mark - pacemaker list
Hi Andreas, On Wed, Aug 1, 2012 at 8:07 AM, Andreas Kurz andr...@hastexo.com wrote: On 08/01/2012 06:51 AM, mark - pacemaker list wrote: Hello, ... Pacemaker 1.1.7 is included in CentOS 6.3 no need to build it for yourself... or do you try to build latest git version? Regards

Re: [Pacemaker] Did I miss a dependency or is this a 1.1.7 bug?

2012-08-01 Thread mark - pacemaker list
replicate the issue any longer this morning. I guess it went away as soon as we rolled to August 1st. Thank you, Mark On Wed, Aug 1, 2012 at 2:51 PM, mark - pacemaker list m+pacema...@nerdish.us wrote: Hello, I suspect I've missed a dependency somewhere in the build process, and I'm

[Pacemaker] Quick question regarding wiki vs 'Clusters from Scratch v2'

2012-08-01 Thread mark - pacemaker list
Good afternoon, Looking at the corosync configuration examples on the wiki ( http://www.clusterlabs.org/wiki/Initial_Configuration ) and in the Clusters from Scratch document ( http://www.clusterlabs.org/doc/en-US/Pacemaker/1.1/html/Clusters_from_Scratch/_sample_corosync_configuration.html),

[Pacemaker] Did I miss a dependency or is this a 1.1.7 bug?

2012-07-31 Thread mark - pacemaker list
Hello, I suspect I've missed a dependency somewhere in the build process, and I'm hoping someone recognizes this as an easy fix. I've basically followed the build guide on ClusterLabs in 'Clusters from Scratch v2', the build from source section. The hosts are CentOS 6.3 x86_64. My only changes

[Pacemaker] Why was SBD removed from the RHEL/CentOS 6 cluster-glue/corosync/pacemaker packages?

2012-07-28 Thread mark - pacemaker list
Hello list, If you're building cluster-glue from source, it builds sbd. However, if you install cluster-glue, corosync, and pacemaker from official repos, there is no sbd binary. The deb for cluster-glue in Debian is version 1.0.6 rather than 1.0.5 and it has the sbd binary, so has it been

Re: [Pacemaker] VM live migration with pacemaker-1.1.6

2012-06-21 Thread mark - pacemaker list
Hi Luca, On Wed, Jun 20, 2012 at 6:38 AM, Luca Lesinigo l...@lm-net.it wrote: ... Andrew gave you the right answers for all of the above, I just wanted to add something for this question: - what could happen if all SAS links between a single node and the storage stop working? (ie,

Re: [Pacemaker] stonith/SBD question in the event of a lost node

2012-04-03 Thread mark - pacemaker list
On Tue, Apr 3, 2012 at 5:09 AM, Lars Marowsky-Bree l...@suse.com wrote: On 2012-04-02T11:01:31, mark - pacemaker list m+pacema...@nerdish.us wrote: Debian's corosync/pacemaker scripts don't include a way to start SBD, you have to work up something on your own to get it started prior

[Pacemaker] stonith/SBD question in the event of a lost node

2012-04-02 Thread mark - pacemaker list
Hello, I'm just looking to verify that I'm understanding/configuring SBD correctly. It works great in the controlled cases where you unplug a node from the network (it gets fenced via SBD) or remove its access to the shared disk (the node suicides). However, in the event of a hardware failure

Re: [Pacemaker] stonith/SBD question in the event of a lost node

2012-04-02 Thread mark - pacemaker list
Hi Lars, On Mon, Apr 2, 2012 at 10:35 AM, Lars Marowsky-Bree l...@suse.com wrote: On 2012-04-02T09:33:22, mark - pacemaker list m+pacema...@nerdish.us wrote: Hello, I'm just looking to verify that I'm understanding/configuring SBD correctly. It works great in the controlled cases

Re: [Pacemaker] Nodes unable to connect / find each other

2012-03-14 Thread mark - pacemaker list
Hi, On Wed, Mar 14, 2012 at 1:43 PM, Regendoerp, Achim achim.regendo...@galacoral.com wrote: Hi, Below is a cut out from the tcpdump run on both boxes. The tcpdump is the same on both boxes. The traffic only appears if I set the bindnetaddr in /etc/corosync/corosync.conf

[Pacemaker] Crash this afternoon, want to verify I'm understanding this configuration correctly

2012-02-23 Thread mark - pacemaker list
Hello, I have a pretty simple cluster running with three nodes, xen1, xen2, and qnode (which runs in standby at all times and only exists for quorum). This afternoon xen1 reset out of the blue. There is nothing in its logs, in fact there's a gap from 15:37 to 15:47: Feb 23 15:36:18 xen1 lrmd:

Re: [Pacemaker] 2 node cluster questions

2011-11-25 Thread mark - pacemaker list
Hi Dirk, On Fri, Nov 25, 2011 at 6:05 AM, Hellemans Dirk D dirk.hellem...@hpcds.com wrote: Hello everyone, I’ve been reading a lot lately about using Corosync/Openais in combination with Pacemaker: SuSe Linux documentation, Pacemaker Linux-ha website, interesting blogs,

Re: [Pacemaker] Cluster goes to (unmanaged) Failed state when both nodes are rebooted together

2011-10-24 Thread mark - pacemaker list
Hi, On Mon, Oct 24, 2011 at 9:52 AM, Alan Robertson al...@unix.sh wrote: Setting no-quorum-policy to ignore and disabling stonith is not a good idea. You're sort of inviting the cluster to do screwed up things. Isn't no-quorum-policy ignore sort of required for a two-node cluster?

[Pacemaker] Debian Squeeze with ocfs2?

2011-10-04 Thread mark - pacemaker list
Hi, I know this is probably a simple request, but I'm coming up with nothing as far as workable documentation for this. The only writeup I can find is a guy doing ocfs2 on DRBD, and he skipped ocfs2 in pacemaker, instead using cluster.conf. For my small setup, there's no DRBD in the equation,

Re: [Pacemaker] Debian Unstable (sid) Problem with Pacemaker/Corosync Apache HA-Load Balanced cluster

2011-10-01 Thread mark - pacemaker list
On Sat, Oct 1, 2011 at 5:32 AM, Miltiadis Koutsokeras m.koutsoke...@biovista.com wrote: From the messages it seems like the manager is getting unexpected exit codes from the Apache resource. The server-status URL is accessible from 127.0.0.1 in both nodes. Am I understanding correctly

Re: [Pacemaker] How to prevent a node that joins the cluster after reboot from starting the resources.

2011-08-25 Thread mark - pacemaker list
Hello, On Mon, Aug 22, 2011 at 2:55 AM, ihjaz Mohamed ihjazmoha...@yahoo.co.in wrote: Hi, Has anyone here come across this issue? Sorry for the delay, but I wanted to respond and let you know that I'm also having this issue. I can pretty reliably kill a pretty simple cluster setup by

Re: [Pacemaker] How to prevent a node that joins the cluster after reboot from starting the resources.

2011-08-25 Thread mark - pacemaker list
Hello again, Replying to my own message with a "for the archives" post: my issue with services being started concurrently after a node reboot came down to the fact that I'm using the VirtualDomain RA, but by default CentOS 6.0 and Scientific Linux 6.1 (and presumably RHEL6 as well) start libvirtd

[Pacemaker] RHEL6 / Scientific Linux 6: cluster-glue no longer includes stonith agents?

2011-08-23 Thread mark - pacemaker list
Hello, I'm trying to replicate a cluster I initially built for testing on CentOS 5.6, but with the fresher packages that come along with a 6.x release. CentOS is still playing catch-up, so their 6.0 pacemaker packages are a bit older. Based on that, I figured I'd try Scientific Linux 6.1 since

Re: [Pacemaker] Live demo of Pacemaker Cloud on Fedora: Friday August 5th at 8am PST

2011-08-07 Thread mark - pacemaker list
Hi Steve, On Thu, Aug 4, 2011 at 11:22 AM, Steven Dake sd...@redhat.com wrote: ... Yes I will record if I can beat elluminate into submission. Regards -steve Did you get to record this talk? I'd also love to see it, but it wasn't possible for me to catch it live on Friday. Thanks,

Re: [Pacemaker] Mail notification for fencing action

2011-06-15 Thread mark - pacemaker list
On Wed, Jun 15, 2011 at 12:24 PM, imnotpc imno...@rock3d.net wrote: What I was thinking is that the DC is never fenced Is this actually the case? It would sure explain the one gotcha I've never been able to work around in a three node cluster with stonith/SBD. If you unplug the network

Re: [Pacemaker] How to ensure that a resource is only running at one place?

2011-05-24 Thread mark - pacemaker list
Hi Kevin, On Tue, May 24, 2011 at 9:12 AM, Kevin Stevenard ksteven...@gmail.com wrote: Because by default on my asymmetric cluster I saw that the op monitor action is only executed on the node where the resource is currently running, and when a user starts manually (not through the crm) the

Re: [Pacemaker] crm_mon doesn't seem to run daemonized upon interval

2011-04-27 Thread mark - pacemaker list
Hi Phil, On Wed, Apr 27, 2011 at 10:18 AM, Phil Hunt phil.h...@orionhealth.com wrote: Using ocf:heartbeat:clustermon starts up a daemonized crm_mon with the following command: /usr/sbin/crm_mon -p /tmp/ClusterMon_ClusterMon.pid -d -i 15 -h /data/apache/www/html/crm_mon.html And it does,

Re: [Pacemaker] Resources won't start

2011-04-19 Thread mark - pacemaker list
Hi Phil, On Tue, Apr 19, 2011 at 3:36 PM, Phil Hunt phil.h...@orionhealth.com wrote: Hi I have iscsid running, no iscsi. Good. You don't want the system to auto-connect the iSCSI disks on boot, pacemaker will do that for you. Here is the crm status: Last updated: Tue Apr

Re: [Pacemaker] Using Pacemaker/Corosync to manage 2 node SHARED-DISK Cluster

2011-04-08 Thread mark - pacemaker list
Hi Phil, On Fri, Apr 8, 2011 at 11:13 AM, Phil Hunt phil.h...@orionhealth.com wrote: Hi I have been playing with DRBD, that's cool. But I have 2 VM RHEL linux boxes. They each have a boot device (20g) and a shared iSCSI 200G volume. I've played with ucarp and have the commands to make