Re: [Pacemaker] cold-start to standby?

2014-03-01 Thread Matthew O'Connor
Perfect! This will work for me! Thank you!! -- Matthew On 03/01/2014 06:20 AM, Lars Marowsky-Bree wrote: > On 2014-03-01T00:14:25, Matthew O'Connor wrote: > >> I have had a few instances recently where circumstances conspired to >> bring my cluster down completely and m

[Pacemaker] cold-start to standby?

2014-02-28 Thread Matthew O'Connor
Hi, I have had a few instances recently where circumstances conspired to bring my cluster down completely and most non-gracefully (and this was in spite of a relatively new 10kVA UPS). When bringing the nodes back online, it would be enormously useful to me if they would go automatically into sta

Re: [Pacemaker] Where the heck is Beekhof?

2013-12-02 Thread Matthew O'Connor
-----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 A very wonderful congratulations to you!! Enjoy every minute! They grow so very fast... - -- Matthew O'Connor On 11/27/2013 08:04 PM, Andrew Beekhof wrote: > If you find yourself asking $subject at some point in the next couple of >

Re: [Pacemaker] staggered resource startup

2013-09-03 Thread Matthew O'Connor
et what I want, I will need to use the batch-limit in conjunction with a modified resource agent that will delay reporting success until some timeout has been reached. Ick...but better than nothing! :) Thanks for the info! -- Matthew -- Thank you! Matthew O'Connor (GPG Key ID: 55F981

[Pacemaker] staggered resource startup

2013-08-27 Thread Matthew O'Connor
Hi! I have a server that operates about 30 virtual machines. Normally it handles this load very well, but restart can be a bit dicey. I have found that by staggering the vm startups - currently done manually - the system handles the growing load much more gracefully. The sequence goes something

Re: [Pacemaker] Probably a regression of the linbit drbd agent between pacemaker 1.1.8 and 1.1.10

2013-08-26 Thread Matthew O'Connor
bad install paths, which were my own doing at the time. -- Matthew -- Thank you! Matthew O'Connor (GPG Key ID: 55F981C4) CONFIDENTIAL NOTICE: The information contained in this electronic message is legally privileged, confidential and exempt from disclosure under applicable law. It

Re: [Pacemaker] Dual primary drbd + ocfs2: problems starting o2cb

2013-08-21 Thread Matthew O'Connor
then taking it back out, or better yet node-standby -> stop all cluster resources on node -> reboot -> start cluster resources -> node-online? This was where I ran into problems with OCFS2's various mechanisms not being properly configured/used. -- Thank you! Matthew O'Con

Re: [Pacemaker] Node level fencing advice

2013-08-21 Thread Matthew O'Connor
ick. > > > ___ > Pacemaker mailing list: Pacemaker@oss.clusterlabs.org > http://oss.clusterlabs.org/mailman/listinfo/pacemaker > > Project Home: http://www.clusterlabs.org > Getting started: http://www.clusterlabs.org/doc/C

Re: [Pacemaker] resource status question

2013-07-31 Thread Matthew O'Connor
On 07/31/2013 01:23 PM, David Vossel wrote: > > > > ----- Original Message ----- >> From: "Matthew O'Connor" >> To: "The Pacemaker cluster resource manager" >> Sent: Wednesday, July 31, 2013 12:08:11 PM >> Subject: [Pacemaker] resourc

[Pacemaker] resource status question

2013-07-31 Thread Matthew O'Connor
n the resource agent needs to be fixed. (I'm actually working on that fix now - where would I submit a resource agent patch?) Thanks! -- Thank you! Matthew O'Connor (GPG Key ID: 55F981C4) CONFIDENTIAL NOTICE: The information contained in this electronic message is legally privileged

Re: [Pacemaker] libqb installed in non-standard dir causes configure failures

2013-07-29 Thread Matthew O'Connor
On 07/28/2013 08:12 PM, Andrew Beekhof wrote: > On 28/07/2013, at 2:12 PM, Matthew O'Connor wrote: > >> On 07/23/2013 07:04 PM, Andrew Beekhof wrote: >>> On 23/07/2013, at 12:32 AM, Matthew O'Connor wrote: >>> >>>> Hi Andrew, >>>>

Re: [Pacemaker] libqb installed in non-standard dir causes configure failures

2013-07-27 Thread Matthew O'Connor
On 07/23/2013 07:04 PM, Andrew Beekhof wrote: > On 23/07/2013, at 12:32 AM, Matthew O'Connor wrote: > >> Hi Andrew, >> >> On 07/19/2013 12:22 AM, Andrew Beekhof wrote: >>>> I've added the PKG_CONFIG_PATH and the two libqb_ lines in an attempt to

Re: [Pacemaker] libqb installed in non-standard dir causes configure failures

2013-07-22 Thread Matthew O'Connor
ither have to build cluster-glue from scratch or take what I have as a compromise and go. Thanks! -- Thank you! Matthew O'Connor (GPG Key ID: 55F981C4) CONFIDENTIAL NOTICE: The information contained in this electronic message is legally privileged, confidential and exempt from disclo

[Pacemaker] libqb installed in non-standard dir causes configure failures

2013-07-18 Thread Matthew O'Connor
_ lines in an attempt to make things work, as recommended by the configure help. So far, no dice. Is this something that needs to be fixed in the autoconf/autogen stuff? Something I can submit a patch for? (sadly, not versed at all in autoconf/autogen, but willing to learn!) -- Thank yo

Re: [Pacemaker] strange drbd migration fail

2013-07-16 Thread Matthew O'Connor
ummy resource back and forth using "resource migrate" and "resource unmigrate" -- Matt On 07/16/2013 12:29 PM, Matthew O'Connor wrote: > Hi, > > Probably safe to disregard this issue... I found I was somehow not > building the latest 1.1.9. After building and

Re: [Pacemaker] strange drbd migration fail

2013-07-16 Thread Matthew O'Connor
Hi, Probably safe to disregard this issue... I found I was somehow not building the latest 1.1.9. After building and installing 1.1.9-cad5efc the problem appears to have gone away. On 07/15/2013 05:25 PM, Matthew O'Connor wrote: > I have run into a strange problem with a DRBD

[Pacemaker] strange drbd migration fail

2013-07-15 Thread Matthew O'Connor
meout="20" \ op monitor interval="10" role="Master" timeout="20" primitive p_dummy1 ocf:heartbeat:Dummy ms ms_drbd-aoe1 p_drbd-aoe1 \ meta master-max="1" notify="true" clone-max="2" master-node-max="1" clone-no

Re: [Pacemaker] accidentally starting more than one pacemakerd

2013-04-08 Thread Matthew O'Connor
On 04/08/2013 07:24 PM, Andrew Beekhof wrote: > On 09/04/2013, at 3:47 AM, Matthew O'Connor wrote: > >> On version 1.1.5... I happened to (accidentally) start a second >> instance of pacemakerd while fumbling about to get its feature list. >> After killing the

[Pacemaker] accidentally starting more than one pacemakerd

2013-04-08 Thread Matthew O'Connor
just hose my cluster? Thanks! -- Thank you! Matthew O'Connor (GPG Key ID: 55F981C4) CONFIDENTIAL NOTICE: The information contained in this electronic message is legally privileged, confidential and exempt from disclosure under applicable law. It is intended only for the use of the ind

Re: [Pacemaker] pacemaker + cLVM on Ubuntu Precise (12.04)?

2013-03-17 Thread Matthew O'Connor
Hi Sven, Actually I have a little experience. Overall, once I got it working, it worked well. I have a two node cluster running 12.04 with OCFS2 over DRBD, and have connected it previously to iSCSI targets. Before it took on OCFS2, it was running GFS2 over DRBD, for which CLVM is necessary. T

Re: [Pacemaker] Migrate vm on drbd in correct order?

2013-03-16 Thread Matthew O'Connor
Hi, I hope this doesn't lead you astray, but I was under the impression that live-migrating (if that's what you're attempting) requires the disk image to be available on both nodes? Perhaps try using drbd in dual-primary mode? I assume your VM is accessing the drbd targets listed in your config

Re: [Pacemaker] resource modification and resource agent update

2013-03-16 Thread Matthew O'Connor
Thanks, Dejan! I'll give it a try. On 03/15/2013 05:49 AM, Dejan Muhamedagic wrote: > Hi, > > On Thu, Mar 14, 2013 at 11:48:16PM -0400, Matthew O'Connor wrote: >> Hi!! Two quick questions. >> >> 1. I have a resource that many other resources depend on. I n

[Pacemaker] resource modification and resource agent update

2013-03-14 Thread Matthew O'Connor
Hi!! Two quick questions. 1. I have a resource that many other resources depend on. I need to modify this "base" resource, but if I modify this resource while it is online, will pacemaker restart it to effect the changes? I would expect that if it does, it will necessitate restarting all depend

Re: [Pacemaker] stonith on node-add

2013-01-30 Thread Matthew O'Connor
Ah, very good - thank you so much!! On 01/30/2013 05:36 PM, Andreas Kurz wrote: > On 2013-01-30 20:51, Matthew O'Connor wrote: >> Hi! I must be doing something stupidly wrong... every time I add a new >> node to my live cluster, the first thing the cluster decides to do is

[Pacemaker] stonith on node-add

2013-01-30 Thread Matthew O'Connor
ently running (sadly) Pacemaker 1.1.5. It's not a big deal, just inconvenient, though it disturbs me regarding the stability of the other cluster nodes - not that they go down, but I want to know that what I'm doing isn't putting them at risk, either. Thanks!! -- Thank you! Mat

Re: [Pacemaker] killproc not found? o2cb shutdown via resource agent

2012-11-09 Thread Matthew O'Connor
On 11/09/2012 04:26 PM, Andrew Beekhof wrote: > On Fri, Nov 9, 2012 at 4:43 PM, Matthew O'Connor wrote: >> On 11/08/2012 08:15 PM, Andrew Beekhof wrote: >>> You're not starting it as a pacemaker resource are you? >>> CMAN should be doing that as part of th

Re: [Pacemaker] killproc not found? o2cb shutdown via resource agent

2012-11-08 Thread Matthew O'Connor
fs2_controld if the ocfs2 modules were still loaded and configfs was still alive and well...though technically it failed every time it tried. > > On Fri, Nov 9, 2012 at 11:14 AM, Matthew O'Connor wrote: >> I'm honestly beginning to wonder what exactly that killproc does for th

Re: [Pacemaker] killproc not found? o2cb shutdown via resource agent

2012-11-08 Thread Matthew O'Connor
gic wrote: > Hi, > > On Thu, Nov 08, 2012 at 08:23:53PM +1100, Tim Serong wrote: >> On 11/08/2012 07:56 PM, Andrew Beekhof wrote: >>> On Thu, Nov 8, 2012 at 5:16 PM, Tim Serong wrote: >>>> On 11/08/2012 12:11 PM, Andrew Beekhof wrote: >>>>

Re: [Pacemaker] killproc not found? o2cb shutdown via resource agent

2012-11-07 Thread Matthew O'Connor
art of the /usr/lib/ocf/resource.d/heartbeat/.ocf-shellfuncs file makes the process-kill work, but I suspect this is not the most desirable solution. Any thoughts on the consequences of killproc not functioning? Any suggestions on a cleaner way to fix this? Thanks again! On 11/07/2012 05:43 P

[Pacemaker] killproc not found? o2cb shutdown via resource agent

2012-11-07 Thread Matthew O'Connor
is is that ocfs2_controld.cman is still running. Thanks! -- Thank you! Matthew O'Connor (GPG Key ID: 55F981C4) CONFIDENTIAL NOTICE: The information contained in this electronic message is legally privileged, confidential and exempt from disclosure under applicable law. It is intended only

Re: [Pacemaker] Build dlm_controld for pacemaker stack (dlm_controld.pcmk)

2012-11-05 Thread Matthew O'Connor
Thank you for your help!! On 11/05/2012 12:43 AM, Andrew Beekhof wrote: > On Sat, Nov 3, 2012 at 2:43 PM, Matthew O'Connor wrote: >> On 10/29/2012 08:14 PM, Andrew Beekhof wrote: >>> Option 3 is where things are headed, however the only distro that ships it >>>

Re: [Pacemaker] Build dlm_controld for pacemaker stack (dlm_controld.pcmk)

2012-11-02 Thread Matthew O'Connor
On 10/29/2012 08:14 PM, Andrew Beekhof wrote: > Option 3 is where things are headed, however the only distro that ships it > today is Fedora-17 (and shortly 18). > In this scenario, all components obtain membership and quorum directly from > corosync. > So far OCFS2 is the only component that ha

Re: [Pacemaker] Configuring a cluster for asymmetric operation

2012-06-08 Thread Matthew O'Connor
labs.org/mailman/listinfo/pacemaker > > Project Home: http://www.clusterlabs.org > Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf > Bugs: http://bugs.clusterlabs.org -- Sincerely, Matthew O'Connor ---

Re: [Pacemaker] Announce: pcs / pcs-gui (Pacemaker/Corosync Configuration System)

2012-06-01 Thread Matthew O'Connor
ea. > > Just my two cents, of course, and if people speak up and say they hate > the existing shell and this effort solves their problems, then I'm all > for choice. But I can't recall hearing that from users. > > Cheers, > Florian > > _

Re: [Pacemaker] Seems to be working but fails to transition to other node.

2012-05-30 Thread Matthew O'Connor
ww.clusterlabs.org Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf Bugs: http://bugs.clusterlabs.org -- Sincerely, Matthew O'Connor - Sr. Software Engineer PGP/GPG Key: 0x55F981C4 Fingerprint: E5DC A0F8 5

Re: [Pacemaker] benefits of cman?

2012-05-23 Thread Matthew O'Connor
On 5/22/2012 9:19 PM, Andrew Beekhof wrote: > On Wed, May 23, 2012 at 10:54 AM, Matthew O'Connor wrote: >> Hi! >> >> On 05/22/2012 06:30 PM, Andrew Beekhof wrote: >>>>> but pacemaker dies a horrible >>>>> death when I put nodes int

Re: [Pacemaker] benefits of cman?

2012-05-22 Thread Matthew O'Connor
Hi! On 05/22/2012 06:30 PM, Andrew Beekhof wrote: >> > but pacemaker dies a horrible >> > death when I put nodes into standby (not necessarily cman-related, I >> > realize). > Um, that shouldn't happen. Did you file a bug for that? > Not yet. I will do so after I have some more data on the exact

Re: [Pacemaker] benefits of cman?

2012-05-22 Thread Matthew O'Connor
On 05/22/2012 03:30 AM, Andrew Beekhof wrote: > We were talking about GFS2 and Pacemaker but the same applies to OCFS2. > If you're just using ocfs2 there is no need for cman. But if you want > ocfs2 /and/ a cluster manager - you want them all using the same > membership and quorum data. > Yes, I

Re: [Pacemaker] question about stonith:external/libvirt

2012-05-21 Thread Matthew O'Connor
On 05/21/2012 02:26 PM, Florian Haas wrote: > On Mon, May 21, 2012 at 8:14 PM, Matthew O'Connor wrote: >> On 05/21/2012 05:43 AM, Florian Haas wrote: >>> Does it have "fencing resource-and-stonith" in the DRBD configuration, >>> and stonith_admin-fence-

Re: [Pacemaker] question about stonith:external/libvirt

2012-05-21 Thread Matthew O'Connor
On 05/21/2012 05:43 AM, Florian Haas wrote: > Does it have "fencing resource-and-stonith" in the DRBD configuration, > and stonith_admin-fence-peer.sh as its fence-peer handler? That was the problem. Totally forgot to update my DRBD configuration. For sake of testing, I used the "crm-fence-peer.s

[Pacemaker] question about stonith:external/libvirt

2012-05-19 Thread Matthew O'Connor
After using the tutorial on the Hastexo site for setting up stonith via libvirt, I believe I have it working correctly...but...some strange things are happening. I have two nodes, with shared storage provided by a dual-primary DRBD resource and OCFS2. Here is one of my stonith primitives: p

Re: [Pacemaker] benefits of cman?

2012-05-18 Thread Matthew O'Connor
OK, I answered my own question below...for the most part. On 05/18/2012 02:26 PM, Matthew O'Connor wrote: By the way, will Pacemaker or Corosync log something to the syslog if it decides to fence a member? Will it attempt to fence one that has flat disappeared, or only one that it has b

Re: [Pacemaker] benefits of cman?

2012-05-18 Thread Matthew O'Connor
apply the new cluster.conf? > I don't believe so. I think you need the 'cman_tool join' command. Awesome - I was hoping it would be that easy! Thanks for the help!! -- Sincerely, Matthew O'Connor - Sr. S

[Pacemaker] benefits of cman?

2012-05-17 Thread Matthew O'Connor
Are there any detriments to using cman? If I want to add additional nodes to the cluster, will I need to bring down the whole cluster or restart services on existing nodes to apply the new cluster.conf? Thanks!! -- Sincerely, Matthew O'

Re: [Pacemaker] ocfs2_controld.pcmk process issue

2012-05-16 Thread Matthew O'Connor
Great! Thanks for the info+links!! On 5/16/2012 12:20 AM, Tim Serong wrote: > On 05/16/2012 12:42 PM, Matthew O'Connor wrote: >> I'm sorry, no. It's on Ubuntu 11.10... I was looking into grabbing a >> copy of the SUSE community dvd iso the other night - w

Re: [Pacemaker] ocfs2_controld.pcmk process issue

2012-05-15 Thread Matthew O'Connor
ue consistently, and among at least two distributions. On 5/15/2012 8:34 PM, Andrew Beekhof wrote: > Is this on SLES by any chance? > SUSE are about the only ones with knowledge in this area I'm afraid. > > On Tue, May 15, 2012 at 6:01 AM, Matthew O'Connor wrote:

[Pacemaker] ocfs2_controld.pcmk process issue

2012-05-14 Thread Matthew O'Connor
ontrold RA. Could that have possibly triggered this issue? -- Sincerely, Matthew O'Connor - Sr. Software Engineer PGP/GPG Key: 0x55F981C4 Fingerprint: E5DC A0F8 5A40 E4DA 2CE6 B5A2 014C 2CBF 55F9 81C4 Engineering and Computer Simula

Re: [Pacemaker] new node causes spurious evil

2012-05-14 Thread Matthew O'Connor
Hi! Thanks for your reply! That makes perfect sense. Thanks again!! -- Matt On 5/14/2012 10:44 AM, David Vossel wrote: > ----- Original Message ----- >> From: "Matthew O'Connor" >> To: "The Pacemaker cluster resource manager" >> Sent: Friday, M

[Pacemaker] new node causes spurious evil

2012-05-11 Thread Matthew O'Connor
My question: Why will a node that is not allowed to start a resource attempt to start a monitor on that resource? Is there a way to change this behavior? (Specific monitors in question: ocf:heartbeat:iSCSITarget and ocf:heartbeat:iSCSILogicalUnit) The details: I have two nodes, ds01 and ds0

Re: [Pacemaker] corosync, ocfs2_controld.pcmk insane?

2012-05-11 Thread Matthew O'Connor
On 05/11/2012 12:54 PM, Lars Marowsky-Bree wrote: Which of course you can't do if you have actually any OCFS2 file systems mounted; that'd result in an immediate suicide of the node. Indeed. :) Happily, I've not created any on this cluster yet. But yes, that is certainly no solution. I had

Re: [Pacemaker] corosync, ocfs2_controld.pcmk insane?

2012-05-11 Thread Matthew O'Connor
Just a quick update - kill -9 on the ocfs2_controld.pcmk process silenced the issue. Corosync is quiet again. Logs reported that syslogd was rate-limiting incoming log messages from ocfs2_controld. On 05/11/2012 10:19 AM, Matthew O'Connor wrote: This might be the wrong place to ask

[Pacemaker] corosync, ocfs2_controld.pcmk insane?

2012-05-11 Thread Matthew O'Connor
This might be the wrong place to ask this question, but between corosync, OCFS2 and Pacemaker, under what circumstances would corosync start consuming mass quantities of CPU? It started last night around 30%, and this morning is up around 110%. 40% is user, 47% system. ocfs2_controld.pcmk is