Re: [Pacemaker] Upstart resources

2012-03-12 Thread Jake Smith
- Original Message -
> From: "Andreas Ntaflos" 
> To: "The Pacemaker cluster resource manager" 
> Sent: Wednesday, February 29, 2012 1:38:08 PM
> Subject: Re: [Pacemaker] Upstart resources
> 
> On 27/02/12 13:09, Ante Karamatić wrote:
> > On 27.02.2012 12:27, Florian Haas wrote:
> > 
> >> Alas, to the best of my knowledge the only way to change a
> >> specific
> >> job's respawn policy is by modifying its job definition. Likewise,
> >> that's the only way to enable or disable starting on system boot.
> >> So
> >> there is a way to overrule the package maintainer's default --
> >> hacking
> >> the job definition.
> > 
> > I've explained '(no)respawn' in the other mail. Manual
> > starting/stopping
> > is done by:
> > 
> > echo 'manual' >> /etc/init/${service}.override
> > 
> > That's all you need to forbid automatic starting or stopping the
> > service.
> 
> Does this work in Ubuntu 10.04? As far as I remember the discussion
> on
> this problem in Launchpad, the consensus was something like "too late
> for Lucid".

From what I read in the Upstart cookbook (upstart.ubuntu.com/cookbook) when I
needed to disable Upstart jobs in 10.04:

With version 0.6.7: rename the job config so that it does not end with ".conf",
or comment out the "start on" line with #.
^^^ This is a little misleading, since 10.04 (Lucid) has v0.6.5-8, but the
above seems to work fine there.

The ".override" file option described above is listed as available since v1.3
(this option seems much clearer and cleaner). According to packages.ubuntu.com,
version 1.3 wasn't released in Ubuntu until 11.10 (Oneiric).
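
A minimal sketch of the version check implied above, assuming GNU `sort -V` is available. The function names `version_ge` and `disable_method` are hypothetical, and the 1.3 threshold is simply the figure quoted from the cookbook; the version string itself could be read from `initctl --version`.

```shell
#!/bin/sh
# Sketch: decide how to keep an Upstart job from starting automatically,
# based on the Upstart version. Function names are illustrative only.

# version_ge A B: succeeds if version A >= version B (relies on GNU sort -V).
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# disable_method VERSION: prints "override" when .override files should be
# usable (1.3 or later, per the thread), otherwise "edit-conf".
disable_method() {
    if version_ge "$1" "1.3"; then
        echo "override"
    else
        echo "edit-conf"
    fi
}

disable_method "0.6.5"   # Lucid's version, per the message above
disable_method "1.3"     # Oneiric's version, per the message above
```

On a version below 1.3 this points back to the rename or comment-out approach; it does not perform either change itself.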

Unless I'm missing something...

Jake

___
Pacemaker mailing list: Pacemaker@oss.clusterlabs.org
http://oss.clusterlabs.org/mailman/listinfo/pacemaker

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org


Re: [Pacemaker] Upstart resources

2012-02-29 Thread Andreas Ntaflos
On 27/02/12 13:09, Ante Karamatić wrote:
> On 27.02.2012 12:27, Florian Haas wrote:
> 
>> Alas, to the best of my knowledge the only way to change a specific
>> job's respawn policy is by modifying its job definition. Likewise,
>> that's the only way to enable or disable starting on system boot. So
>> there is a way to overrule the package maintainer's default -- hacking
>> the job definition.
> 
> I've explained '(no)respawn' in the other mail. Manual starting/stopping
> is done by:
> 
> echo 'manual' >> /etc/init/${service}.override
> 
> That's all you need to forbid automatic starting or stopping the service.

Does this work in Ubuntu 10.04? As far as I remember the discussion on
this problem in Launchpad, the consensus was something like "too late
for Lucid".

Andreas





Re: [Pacemaker] Upstart resources

2012-02-28 Thread Andrew Beekhof
On Wed, Feb 29, 2012 at 6:38 PM, Florian Haas  wrote:
> On Wed, Feb 29, 2012 at 8:21 AM, Andrew Beekhof  wrote:
>> 2012/2/27 Ante Karamatić :
>>> On 27.02.2012 12:27, Florian Haas wrote:
>>>
>>>> Alas, to the best of my knowledge the only way to change a specific
>>>> job's respawn policy is by modifying its job definition. Likewise,
>>>> that's the only way to enable or disable starting on system boot. So
>>>> there is a way to overrule the package maintainer's default -- hacking
>>>> the job definition.
>>>
>>> I've explained '(no)respawn' in the other mail. Manual starting/stopping
>>> is done by:
>>>
>>> echo 'manual' >> /etc/init/${service}.override
>>>
>>> That's all you need to forbid automatic starting or stopping the service.
>>>
>>
>> Not really appropriate for a cluster daemon to be doing though IMHO
>
> Of course it wouldn't be the cluster daemon doing this but the admin,

I think the number of people that would think to do this after adding
an upstart service to the cluster approaches zero.

> but how is that so fundamentally worse
> compared to doing, say "chkconfig mysql off"?
>
> Having to keep a service from starting a daemon on boot is something
> that is fairly standard in Pacemaker environments these days.

But this isn't "start at boot", this is preventing respawn after
pacemaker starts it.

>
> Florian
>
> --
> Need help with High Availability?
> http://www.hastexo.com/now
>


Re: [Pacemaker] Upstart resources

2012-02-28 Thread Florian Haas
On Wed, Feb 29, 2012 at 8:21 AM, Andrew Beekhof  wrote:
> 2012/2/27 Ante Karamatić :
>> On 27.02.2012 12:27, Florian Haas wrote:
>>
>>> Alas, to the best of my knowledge the only way to change a specific
>>> job's respawn policy is by modifying its job definition. Likewise,
>>> that's the only way to enable or disable starting on system boot. So
>>> there is a way to overrule the package maintainer's default -- hacking
>>> the job definition.
>>
>> I've explained '(no)respawn' in the other mail. Manual starting/stopping
>> is done by:
>>
>> echo 'manual' >> /etc/init/${service}.override
>>
>> That's all you need to forbid automatic starting or stopping the service.
>>
>
> Not really appropriate for a cluster daemon to be doing though IMHO

Of course it wouldn't be the cluster daemon doing this but the admin,
but how is that so fundamentally worse
compared to doing, say "chkconfig mysql off"?

Having to keep a service from starting a daemon on boot is something
that is fairly standard in Pacemaker environments these days.

Florian

-- 
Need help with High Availability?
http://www.hastexo.com/now



Re: [Pacemaker] Upstart resources

2012-02-28 Thread Florian Haas
2012/2/27 Ante Karamatić :
> On 27.02.2012 12:27, Florian Haas wrote:
>
>> Alas, to the best of my knowledge the only way to change a specific
>> job's respawn policy is by modifying its job definition. Likewise,
>> that's the only way to enable or disable starting on system boot. So
>> there is a way to overrule the package maintainer's default -- hacking
>> the job definition.
>
> I've explained '(no)respawn' in the other mail. Manual starting/stopping
> is done by:
>
> echo 'manual' >> /etc/init/${service}.override
>
> That's all you need to forbid automatic starting or stopping the service.

Oh thanks! I didn't know that, much to my dismay.

Cheers,
Florian

-- 
Need help with High Availability?
http://www.hastexo.com/now



Re: [Pacemaker] Upstart resources

2012-02-28 Thread Andrew Beekhof
2012/2/27 Ante Karamatić :
> On 27.02.2012 12:27, Florian Haas wrote:
>
>> Alas, to the best of my knowledge the only way to change a specific
>> job's respawn policy is by modifying its job definition. Likewise,
>> that's the only way to enable or disable starting on system boot. So
>> there is a way to overrule the package maintainer's default -- hacking
>> the job definition.
>
> I've explained '(no)respawn' in the other mail. Manual starting/stopping
> is done by:
>
> echo 'manual' >> /etc/init/${service}.override
>
> That's all you need to forbid automatic starting or stopping the service.
>

Not really appropriate for a cluster daemon to be doing though IMHO



Re: [Pacemaker] Upstart resources

2012-02-28 Thread Ante Karamatić
On 27.02.2012 12:27, Florian Haas wrote:

> Alas, to the best of my knowledge the only way to change a specific
> job's respawn policy is by modifying its job definition. Likewise,
> that's the only way to enable or disable starting on system boot. So
> there is a way to overrule the package maintainer's default -- hacking
> the job definition.

I've explained '(no)respawn' in the other mail. Manual starting/stopping
is done by:

echo 'manual' >> /etc/init/${service}.override

That's all you need to forbid automatic starting or stopping the service.
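
As a sketch only: the one-liner above can be wrapped so that repeated runs do not append duplicate 'manual' lines. The function name `set_manual` and its optional directory argument are assumptions made for illustration (the directory defaults to /etc/init and can be pointed at a scratch directory for a dry run). Override files are reported elsewhere in this thread as needing Upstart 1.3 or later.

```shell
#!/bin/sh
# set_manual SERVICE [INIT_DIR]
# Hypothetical helper: writes the 'manual' stanza into SERVICE.override,
# but only if the file does not already contain it.
set_manual() {
    service=$1
    init_dir=${2:-/etc/init}
    override="$init_dir/$service.override"

    # grep -qx matches a whole line equal to 'manual'; a missing file
    # simply counts as "not present".
    if ! grep -qx 'manual' "$override" 2>/dev/null; then
        echo 'manual' >> "$override"
    fi
}

# Example (as root): set_manual mysql
```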

-- 
Ante Karamatic 
Professional and Engineering Services
Canonical Ltd



Re: [Pacemaker] Upstart resources

2012-02-27 Thread Andrew Beekhof
On Mon, Feb 27, 2012 at 11:07 PM, Ante Karamatic  wrote:
> On 27.02.2012 11:37, Andrew Beekhof wrote:
>
>> I know, but whatever the admin specifies should over-rule the package
>> maintainer's defaults.
>
> I agree and that's how upstart works.
>
>> From what you're saying, this is not possible with Upstart.  Which is bad.
>
> I said it's an option, not that it's not possible :) /etc/init/ssh.conf
> defines ssh service/job. /etc/init/ssh.override is an override file that
> overrides everything from .conf. There is a catch, and that's that
> there's no 'norespawn' option.
>
> So, if a package maintainer defined 'respawn' in .conf, you can't really
> disable it without removing 'respawn' from job file. On the other hand,
> you can override default respawn behavior (ssh service):
>
> echo "respawn limit 1 1" >> /etc/init/ssh.override

Yep, basically what systemd wanted us to do.
The problem is that if you take that service away from the cluster
(permanently or otherwise), the override to make pacemaker happy
shouldn't be applied.

So apart from it being horribly ugly, it doesn't give the right behaviour.
Only when some higher level daemon starts the service and knows
otherwise should norespawn be active.

>
> That would stop respawning if service fails more than once within the
> second.
>
> I'd say that non-existing 'norespawn' option is a bug.
>
> Best regards
>


Re: [Pacemaker] Upstart resources

2012-02-27 Thread Ante Karamatic
On 27.02.2012 11:37, Andrew Beekhof wrote:

> I know, but whatever the admin specifies should over-rule the package
> maintainer's defaults.

I agree and that's how upstart works.

> From what you're saying, this is not possible with Upstart.  Which is bad.

I said it's an option, not that it's not possible :) /etc/init/ssh.conf
defines ssh service/job. /etc/init/ssh.override is an override file that
overrides everything from .conf. There is a catch, and that's that
there's no 'norespawn' option.

So, if a package maintainer defined 'respawn' in .conf, you can't really
disable it without removing 'respawn' from job file. On the other hand,
you can override default respawn behavior (ssh service):

echo "respawn limit 1 1" >> /etc/init/ssh.override

That would stop respawning if the service fails more than once within one
second.
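
For illustration, the stanza has the form `respawn limit COUNT INTERVAL`, with the interval in seconds. The sketch below keeps a single such line in the override file instead of appending a new one each time. The helper name `set_respawn_limit` and the overridable directory argument are hypothetical, added so it can be tried outside /etc/init.

```shell
#!/bin/sh
# set_respawn_limit SERVICE COUNT INTERVAL [INIT_DIR]
# Hypothetical helper: ensures SERVICE.override contains exactly one
# "respawn limit COUNT INTERVAL" line, replacing any earlier one.
set_respawn_limit() {
    service=$1
    count=$2
    interval=$3
    init_dir=${4:-/etc/init}
    override="$init_dir/$service.override"

    touch "$override"
    # Keep every line except a previous "respawn limit" setting.
    # grep exits non-zero when nothing is left, which is fine here.
    grep -v '^respawn limit ' "$override" > "$override.tmp" || true
    echo "respawn limit $count $interval" >> "$override.tmp"
    mv "$override.tmp" "$override"
}

# Example (as root): set_respawn_limit ssh 1 1
```

This only tightens the limit; as noted above, it does not remove 'respawn' itself.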

I'd say that non-existing 'norespawn' option is a bug.

Best regards



Re: [Pacemaker] Upstart resources

2012-02-27 Thread Andrew Beekhof
On Mon, Feb 27, 2012 at 10:27 PM, Florian Haas  wrote:
> On 02/27/12 11:37, Andrew Beekhof wrote:
>> On Sun, Feb 26, 2012 at 8:54 AM, Ante Karamatic  wrote:
>>> On 23.02.2012 23:52, Andrew Beekhof wrote:
>>>
>>>> On Thu, Feb 23, 2012 at 6:43 PM, Ante Karamatic  wrote:
>>>>> Well... Upstart actually does notice if the job failed and respawns it -
>>>>> depending on job's configuration.
>>>>
>>>> Actually this is /really/ bad as it subverts our recovery policies.
>>>> Restarting on the local machine is not the only option.
>>>
>>> It's an option. If you add 'respawn' to upstart job, it will respawn on
>>> failure.
>>
>> I know, but whatever the admin specifies should over-rule the package
>> maintainer's defaults.
>> From what you're saying, this is not possible with Upstart.  Which is bad.
>
> Alas, to the best of my knowledge the only way to change a specific
> job's respawn policy is by modifying its job definition. Likewise,
> that's the only way to enable or disable starting on system boot. So
> there is a way to overrule the package maintainer's default -- hacking
> the job definition.

This was the path the systemd guys tried to send us down too.
I was able to bring them around in the end.

>
> All of which isn't exactly pretty. What you could say in the Upstart
> folks' defense is that the job definitions themselves are at least
> always defined as config files in the .deb packages, so they won't get
> clobbered on upgrades.
>
> Cheers,
> Florian
>
> --
> Need help with High Availability?
> http://www.hastexo.com/now
>


Re: [Pacemaker] Upstart resources

2012-02-27 Thread Florian Haas
On 02/27/12 11:37, Andrew Beekhof wrote:
> On Sun, Feb 26, 2012 at 8:54 AM, Ante Karamatic  wrote:
>> On 23.02.2012 23:52, Andrew Beekhof wrote:
>>
>>> On Thu, Feb 23, 2012 at 6:43 PM, Ante Karamatic  wrote:
>>>> Well... Upstart actually does notice if the job failed and respawns it -
>>>> depending on job's configuration.
>>>
>>> Actually this is /really/ bad as it subverts our recovery policies.
>>> Restarting on the local machine is not the only option.
>>
>> It's an option. If you add 'respawn' to upstart job, it will respawn on
>> failure.
> 
> I know, but whatever the admin specifies should over-rule the package
> maintainer's defaults.
> From what you're saying, this is not possible with Upstart.  Which is bad.

Alas, to the best of my knowledge the only way to change a specific
job's respawn policy is by modifying its job definition. Likewise,
that's the only way to enable or disable starting on system boot. So
there is a way to overrule the package maintainer's default -- hacking
the job definition.

All of which isn't exactly pretty. What you could say in the Upstart
folks' defense is that the job definitions themselves are at least
always defined as config files in the .deb packages, so they won't get
clobbered on upgrades.

Cheers,
Florian

-- 
Need help with High Availability?
http://www.hastexo.com/now



Re: [Pacemaker] Upstart resources

2012-02-27 Thread Andrew Beekhof
On Sun, Feb 26, 2012 at 8:54 AM, Ante Karamatic  wrote:
> On 23.02.2012 23:52, Andrew Beekhof wrote:
>
>> On Thu, Feb 23, 2012 at 6:43 PM, Ante Karamatic  wrote:
>>> Well... Upstart actually does notice if the job failed and respawns it -
>>> depending on job's configuration.
>>
>> Actually this is /really/ bad as it subverts our recovery policies.
>> Restarting on the local machine is not the only option.
>
> It's an option. If you add 'respawn' to upstart job, it will respawn on
> failure.

I know, but whatever the admin specifies should over-rule the package
maintainer's defaults.
From what you're saying, this is not possible with Upstart.  Which is bad.

> There are other options too.
>
> http://upstart.ubuntu.com/cookbook/#respawn
>
> Take care
>


Re: [Pacemaker] Upstart resources

2012-02-25 Thread Ante Karamatic
On 23.02.2012 23:52, Andrew Beekhof wrote:

> On Thu, Feb 23, 2012 at 6:43 PM, Ante Karamatic  wrote:
>> Well... Upstart actually does notice if the job failed and respawns it -
>> depending on job's configuration.
> 
> Actually this is /really/ bad as it subverts our recovery policies.
> Restarting on the local machine is not the only option.

It's an option. If you add 'respawn' to upstart job, it will respawn on
failure. There are other options too.

http://upstart.ubuntu.com/cookbook/#respawn

Take care



Re: [Pacemaker] Upstart resources

2012-02-23 Thread Andrew Beekhof
On Thu, Feb 23, 2012 at 6:43 PM, Ante Karamatic  wrote:
> On 23.02.2012 07:57, Vladislav Bogdanov wrote:
>
>> Thanks for clarification, that wasn't clear at the moment I looked at
>> it. If I knew that, I wouldn't write that RA. One remark, my RA has
>> possibility to check service aliveness on monitor operation and repair
>> that service if it hangs.
>
> Well... Upstart actually does notice if the job failed and respawns it -
> depending on job's configuration.

Actually this is /really/ bad as it subverts our recovery policies.
Restarting on the local machine is not the only option.

For systemd we have asked for a way to programmatically disable the
automated respawning.

Although I can't stop it, individual agents shouldn't be doing this either.

> Monitoring cluster resource, in this
> case, should just return 'running' or 'not running'. It's up to the lrmd
> to restart the resource if it's not running. Restarting the resource
> within the 'monitor' doesn't look like the best way to do it? It somehow
> doesn't fit into the 'monitor' function and you lose some of the
> functionality when you don't report the problem to the lrmd (allowed
> number of restarts; what to do if monitor fails, etc...).
>
>> I use it for libvirtd which sometimes becomes
>> unresponsive so I need to restart it before all other libvirt-related
>> resources begin to fail. Fortunately, modern libvirtd can be restarted
>> without affecting guests. Of course, that is just a hack, and that
>> should be fixed in libvirtd, but we live in a real world...
>
> You can prevent other resources from restarting by adjusting
> constraints. But this really depends on your setup. For some time
> running libvirtd is not a requirement for running a VM. I don't recall
> VMs ever failing if libvirt restarted.
>
> Best regards
>


Re: [Pacemaker] Upstart resources

2012-02-23 Thread Andrew Beekhof
On Thu, Feb 23, 2012 at 6:31 PM, Ante Karamatic  wrote:
> On 23.02.2012 00:10, Andrew Beekhof wrote:
>
>> Do you still have LSB scripts on a machine that's using upstart?
>
> Yes, some LSB scripts can't be easily converted to upstart jobs. Or,
> let's rephrase that - can't be converted to upstart jobs without losing
> some of the functionality.
>
>> On fedora they purged them all.
>
> All? Even the stuff like drbd? I have to take a look at that.

I think any package that doesn't have a unit file is going to be
blacklisted from F-17.
That was the threat at least.

>
>> At any rate, I'm inclined to think that we're making an unnecessary
>> differentiation.
>> When we write LSB, we really mean "system services" and don't much
>> care if that means a SYS-V script, Upstart job or Systemd unit.
>
> I agree.
>
>> It also makes running mixed clusters needlessly painful.
>
> But then again, distributions have different names for the same service.
> For example, apache2 on debian/ubuntu, httpd on (all?) RPM distributions.

I think those are the exception.  For the most part they have the same name.
At least we wouldn't be adding additional pointless incompatibilities :-)

>> Should we not treat LSB (or some new name) as an alias for whatever
>> flavour of init is currently in vogue for that distro?
>> Doesn't seem like it would be terribly hard to work out on the fly.
>
> Therefore the question about initctl few days ago? :)

'service' yeah :-)

> I do like the
> idea. Some distributions provide multiple different init systems;
> switching between those would be much easier in that case.
>
> Best regards
>


Re: [Pacemaker] Upstart resources

2012-02-23 Thread Vladislav Bogdanov
23.02.2012 10:43, Ante Karamatic wrote:
> On 23.02.2012 07:57, Vladislav Bogdanov wrote:
> 
>> Thanks for clarification, that wasn't clear at the moment I looked at
>> it. If I knew that, I wouldn't write that RA. One remark, my RA has
>> possibility to check service aliveness on monitor operation and repair
>> that service if it hangs.
> 
> Well... Upstart actually does notice if the job failed and respawns it -
> depending on job's configuration. Monitoring cluster resource, in this
> case, should just return 'running' or 'not running'. It's up to the lrmd
> to restart the resource if it's not running. Restarting the resource
> within the 'monitor' doesn't look like the best way to do it? It somehow
> doesn't fit into the 'monitor' function and you lose some of the
> functionality when you don't report the problem to the lrmd (allowed
> number of restarts; what to do if monitor fails, etc...).

Well, a monitor failure will cause all dependent resources to be restarted
by pacemaker, which is not always desired.
As some resources (like libvirtd, iscsid or ietd) support restarts
without affecting functionality at all, I prefer them to be restarted
automatically by upstart, not by pacemaker. That's why I use 'respawn'
there. Of course not all resources support that.

What I said above is not about a resource NOT_RUNNING failure, but about
a HANG failure. Imagine a daemon which still runs (has a process) but does
not answer requests. That is not noticed by upstart. But in the case
of libvirtd it will be noticed by the VirtualDomain RA and will cause
a monitor ERR_GENERIC (if I recall correctly) failure. The VM will then be
scheduled to restart. Then it fails on stop because libvirtd still
doesn't answer, and then the node is fenced.

I was hit by this once, and it was a simple growth problem - libvirtd
has a limit on the number of connections. The more resources (VMs) you have,
the bigger the chance that you consume all connection slots with monitor
operations.

And I think that having libvirtd killed with -9 by its RA on monitor (and
respawned by upstart) is far less evil than having the whole cluster
forcibly restarted. Yes, this is a hack. But it works and allows me to
sleep.

Of course that does not replace the need for a proper configuration; it is
just one more safety layer...
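
A rough sketch of the hang check described above, not the actual RA: it treats a daemon as hung when a probe command fails or does not finish within a time limit. It assumes the coreutils `timeout` command is available. `probe_alive` is a hypothetical name, and the virsh call in the comment is only one example of a probe.

```shell
#!/bin/sh
# probe_alive SECONDS COMMAND [ARGS...]
# Hypothetical helper: returns 0 if COMMAND completes successfully within
# SECONDS, non-zero if it fails or is still running when the limit expires.
probe_alive() {
    limit=$1
    shift
    timeout "$limit" "$@" >/dev/null 2>&1
}

# Example probe for libvirtd (assumes virsh is installed):
#   probe_alive 10 virsh -c qemu:///system list || echo "libvirtd looks hung"
```

What to do on a failed probe (kill and let upstart respawn, or report the failure to the cluster) is the policy question debated in this thread; the sketch deliberately leaves it out.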

> 
>> I use it for libvirtd which sometimes becomes
>> unresponsive so I need to restart it before all other libvirt-related
>> resources begin to fail. Fortunately, modern libvirtd can be restarted
>> without affecting guests. Of course, that is just a hack, and that
>> should be fixed in libvirtd, but we live in a real world...
> 
> You can prevent other resources from restarting by adjusting
> constraints. But this really depends on your setup. For some time
> running libvirtd is not a requirement for running a VM. I don't recall
> VMs ever failing if libvirt restarted.

I know.

But libvirtd is required to start/stop a libvirt-managed VM. That's why
one needs a constraint to colocate the VM with a libvirtd instance.
It is currently impossible to specify that something is needed to
start/stop a resource but is not needed while it runs (btw in the case of
libvirtd it *is* needed to obtain resource status).
So constraints must be there.
But then, if pacemaker notices that a resource (libvirtd) is not running,
it will stop all dependent resources (VMs) and then restart the failed one.
And it will fail to stop those resources (because libvirtd is still not
running) and the node will be fenced.

Vladislav



Re: [Pacemaker] Upstart resources

2012-02-22 Thread Ante Karamatic
On 23.02.2012 07:57, Vladislav Bogdanov wrote:

> Thanks for clarification, that wasn't clear at the moment I looked at
> it. If I knew that, I wouldn't write that RA. One remark, my RA has
> possibility to check service aliveness on monitor operation and repair
> that service if it hangs.

Well... Upstart actually does notice if the job failed and respawns it -
depending on job's configuration. Monitoring cluster resource, in this
case, should just return 'running' or 'not running'. It's up to the lrmd
to restart the resource if it's not running. Restarting the resource
within the 'monitor' doesn't look like the best way to do it? It somehow
doesn't fit into the 'monitor' function and you lose some of the
functionality when you don't report the problem to the lrmd (allowed
number of restarts; what to do if monitor fails, etc...).
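
A minimal sketch of a monitor that only reports state, as described above. It assumes `initctl status` prints a line such as "ssh start/running, process 1234" for a running job. `status_to_ocf` is a hypothetical helper, and 0 and 7 are the OCF return codes for "running" and "not running".

```shell
#!/bin/sh
# OCF return codes used by resource agents.
OCF_SUCCESS=0
OCF_NOT_RUNNING=7

# status_to_ocf "STATUS LINE"
# Hypothetical helper: maps one line of `initctl status` output to an OCF
# return code, without attempting any repair.
status_to_ocf() {
    case "$1" in
        *start/running*) return $OCF_SUCCESS ;;
        *)               return $OCF_NOT_RUNNING ;;
    esac
}

# Example: status_to_ocf "$(initctl status ssh)"
```

Leaving the restart decision to the caller keeps the cluster's restart counters and failure policies in play.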

> I use it for libvirtd which sometimes becomes
> unresponsive so I need to restart it before all other libvirt-related
> resources begin to fail. Fortunately, modern libvirtd can be restarted
> without affecting guests. Of course, that is just a hack, and that
> should be fixed in libvirtd, but we live in a real world...

You can prevent other resources from restarting by adjusting
constraints. But this really depends on your setup. For some time
running libvirtd is not a requirement for running a VM. I don't recall
VMs ever failing if libvirt restarted.

Best regards



Re: [Pacemaker] Upstart resources

2012-02-22 Thread Ante Karamatic
On 23.02.2012 00:10, Andrew Beekhof wrote:

> Do you still have LSB scripts on a machine that's using upstart?

Yes, some LSB scripts can't be easily converted to upstart jobs. Or,
let's rephrase that - can't be converted to upstart jobs without losing
some of the functionality.

> On fedora they purged them all.

All? Even the stuff like drbd? I have to take a look at that.

> At any rate, I'm inclined to think that we're making an unnecessary
> differentiation.
> When we write LSB, we really mean "system services" and don't much
> care if that means a SYS-V script, Upstart job or Systemd unit.

I agree.

> It also makes running mixed clusters needlessly painful.

But then again, distributions have different names for the same service.
For example, apache2 on debian/ubuntu, httpd on (all?) RPM distributions.

> Should we not treat LSB (or some new name) as an alias for whatever
> flavour of init is currently in vogue for that distro?
> Doesn't seem like it would be terribly hard to work out on the fly.

Therefore the question about initctl few days ago? :) I do like the
idea. Some distributions provide multiple different init systems;
switching between those would be much easier in that case.

Best regards



Re: [Pacemaker] Upstart resources

2012-02-22 Thread Vladislav Bogdanov
22.02.2012 22:44, Ante Karamatic wrote:
> On 22.02.2012 19:45, Vladislav Bogdanov wrote:
> 
>> I looked at that RAexec very early, just after it was committed, and I
>> understand that it requires running dbus daemon to operate. I prefer to
>> simplify operation chains, so that is really not an option for me to
>> rely on one more service to do simple job.
> 
> It never required dbus running. It uses the same API as initctl -
> libdbus. dbus daemon has nothing to do with IPC; it's mostly used for
> 'broadcasting' hardware changes.

Thanks for clarification, that wasn't clear at the moment I looked at
it. If I knew that, I wouldn't write that RA. One remark, my RA has
possibility to check service aliveness on monitor operation and repair
that service if it hangs. I use it for libvirtd which sometimes becomes
unresponsive so I need to restart it before all other libvirt-related
resources begin to fail. Fortunately, modern libvirtd can be restarted
without affecting guests. Of course, that is just a hack, and that
should be fixed in libvirtd, but we live in a real world...

Best,
Vladislav



Re: [Pacemaker] Upstart resources

2012-02-22 Thread Andrew Beekhof
On Thu, Feb 23, 2012 at 5:01 AM, Ante Karamatic  wrote:
> On 21.02.2012 17:56, Jake Smith wrote:
>
>> Np thanks for adding - I was asking because the exit codes on the
>> upstart jobs I'm using didn't align (I believe) with the LSB spec and
>> I wasn't sure if they were supposed to.  Haven't really seen the
>> level of documentation (as LSB resource) for upstart resources.  I
>> would assume since they are Ubuntu specific.  Mine seems to be
>> working OK so far but haven't stressed it yet.
>
> Upstart always returns exit code 0. That's the reason why we couldn't
> just use LSB RA for upstart jobs. As for documentation, it's really like
> using LSB, just type in 'upstart' instead of 'lsb.
>
> Example (ssh):
>
> ssh upstart job is defined by /etc/init/ssh.conf. So, your primitive in
> cluster would start as:
>
> primitive ssh upstart:ssh \
>        op ... \
>        meta ... \
>
> Note that you might have LSB scripts still, like apache
> (/etc/init.d/apache2). Those are not upstart jobs, so you just define
> those as LSB:
>
> primitive apache2 lsb:apache2 \
>        op ... \
>        meta ... \
>
> Let me know if you have any other questions.

Do you still have LSB scripts on a machine that's using upstart?
On Fedora they purged them all.

At any rate, I'm inclined to think that we're making an unnecessary
differentiation.
When we write LSB, we really mean "system services" and don't much
care whether that means a SysV script, an Upstart job or a systemd unit.

It also makes running mixed clusters needlessly painful.
Should we not treat LSB (or some new name) as an alias for whatever
flavour of init is currently in vogue for that distro?
It doesn't seem like it would be terribly hard to work out on the fly.
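
As a rough illustration of working out the init flavour on the fly, here is a sketch under stated assumptions, not anything Pacemaker actually does. The function name detect_init and both of its optional arguments are hypothetical. It assumes that a /run/systemd/system directory indicates a systemd-booted host and that `initctl version` mentions "upstart" on Upstart hosts; anything else is treated as SysV.

```shell
#!/bin/bash
# Hypothetical helper: guess which init system manages this host.
#   $1  optional alternate root directory (useful for testing)
#   $2  optional pre-captured `initctl version` output (useful for testing)
detect_init() {
    local root=${1:-}
    # Only call initctl when the caller did not pass its output in
    local initctl_out=${2-$(initctl version 2>/dev/null)}

    if [ -d "${root}/run/systemd/system" ]; then
        echo systemd
    elif echo "${initctl_out}" | grep -q upstart; then
        echo upstart
    else
        echo sysv
    fi
}
```

Called without arguments, detect_init inspects the running system and prints one of systemd, upstart or sysv. A resource class alias could then dispatch on that value.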

>
> Best regards
>


Re: [Pacemaker] Upstart resources

2012-02-22 Thread Jake Smith

- Original Message -
> From: "Ante Karamatic" 
> To: pacemaker@oss.clusterlabs.org
> Sent: Wednesday, February 22, 2012 1:26:56 PM
> Subject: Re: [Pacemaker] Upstart resources
> 
> On 21.02.2012 16:30, Florian Haas wrote:
> 
> > [1] "Why do I need to use a PPA if this release is ostensibly on
> > long-term support?" Don't ask me, ask someone from Canonical. :)
> 
> Pacemaker isn't really supported cluster stack in Ubuntu 10.04 (it's
> in
> universe). RHCS is in main and supported. There are multiple reasons
> why
> pacemaker isn't in main for 10.04, but we did our best effort to
> provide
> another channel to get usable pacemaker for 10.04 - and that's PPA.

All I can say is (though it would be nice to have Pacemaker in main on 10.04) I 
am glad that the PPA exists and *is* well maintained.  It might not be main, but 
it's much easier to use one PPA than to build everything from source on my own!  
Plus it's easier to get some important package updates that way than via an SRU 
against main or by waiting for the next LTS.

So Thanks!

Jake

> 
> Other than that, PPA provides backported packages from newer Ubuntu
> releases. In addition, 'upstart' RA was developed ~6 months after
> 10.04
> was already released.
> 
> Best regards
> 


Re: [Pacemaker] Upstart resources

2012-02-22 Thread Ante Karamatic
On 22.02.2012 19:45, Vladislav Bogdanov wrote:

> I looked at that RAexec very early, just after it was commited, and I
> understand that it requires running dbus daemon to operate. I prefer to
> simplify operation chains, so that is really not an option for me to
> rely on one more service to do simple job.

It never required a running dbus daemon. It uses the same API as initctl -
libdbus. The dbus daemon has nothing to do with IPC; it's mostly used for
'broadcasting' hardware changes.

Best regards



Re: [Pacemaker] Upstart resources

2012-02-22 Thread Jake Smith

- Original Message -
> From: "Ante Karamatic" 
> To: pacemaker@oss.clusterlabs.org
> Sent: Wednesday, February 22, 2012 1:01:35 PM
> Subject: Re: [Pacemaker] Upstart resources
> 
> On 21.02.2012 17:56, Jake Smith wrote:
> 
> > Np thanks for adding - I was asking because the exit codes on the
> > upstart jobs I'm using didn't align (I believe) with the LSB spec
> > and
> > I wasn't sure if they were supposed to.  Haven't really seen the
> > level of documentation (as LSB resource) for upstart resources.  I
> > would assume since they are Ubuntu specific.  Mine seems to be
> > working OK so far but haven't stressed it yet.
> 
> Upstart always returns exit code 0. That's the reason why we couldn't
> just use LSB RA for upstart jobs. As for documentation, it's really
> like
> using LSB, just type in 'upstart' instead of 'lsb.
> 
> Example (ssh):
> 
> ssh upstart job is defined by /etc/init/ssh.conf. So, your primitive
> in
> cluster would start as:
> 
> primitive ssh upstart:ssh \
>   op ... \
>   meta ... \
> 
> Note that you might have LSB scripts still, like apache
> (/etc/init.d/apache2). Those are not upstart jobs, so you just define
> those as LSB:
> 
> primitive apache2 lsb:apache2 \
>   op ... \
>   meta ... \
> 

I'm using both right now and they work fine (Ubuntu 10.04 LTS). I only asked 
the question initially because I have encountered a few LSB resources whose 
init scripts I had to fix to make them LSB compliant and behave nicely with 
Pacemaker.  I wanted to verify that the upstart resources were compliant 
and that I wouldn't have issues there.

> Let me know if you have any other questions.

Nope - Thanks!

Jake

> 
> Best regards
> 


Re: [Pacemaker] Upstart resources

2012-02-22 Thread Vladislav Bogdanov
22.02.2012 20:55, Ante Karamatic wrote:
> On 16.02.2012 05:23, Vladislav Bogdanov wrote:
> 
>> Newer versions of pacemaker and lrmd are able to deal with upstart
>> resources via dbus.
>> However I do not like this way, so please find resource-agent attached,
>> which is able to manage arbitrary upstart job (just like Anything but
>> for upstart resources). It already saved me much time and nerves
>> managing libvirtd (with my own upstart job) which you probably already
>> know always wants to SIGABRT (btw I even know the main reason for that
>> and now testing patch which I will hopefully send to libvirt ml).
> 
> But, you use 'initctl' in your OCF which also relies on dbus (libdbus,
> to be correct; in both cases running dbus is not a requirement). Reason
> why we didn't use initctl, grep, awk and company, when we developed the
> upstart RAexec, was cause doing it that way was slower. And ugly :)
> 
> I guess I'm just interested what do you find wrong or bad with how
> current upstart RAexec is done?

I looked at that RAexec very early, just after it was committed, and I
understood that it requires a running dbus daemon to operate. I prefer to
keep operation chains simple, so relying on one more service to do a
simple job is really not an option for me.
Did I miss something?

Anyway, upstart is gone from Fedora, and will go away in EL7, so none
of this actually matters.

Best,
Vladislav



Re: [Pacemaker] Upstart resources

2012-02-22 Thread Ante Karamatic
On 21.02.2012 16:30, Florian Haas wrote:

> [1] "Why do I need to use a PPA if this release is ostensibly on
> long-term support?" Don't ask me, ask someone from Canonical. :)

Pacemaker isn't really a supported cluster stack in Ubuntu 10.04 (it's in
universe). RHCS is in main and supported. There are multiple reasons why
pacemaker isn't in main for 10.04, but we made our best effort to provide
another channel to get a usable pacemaker for 10.04 - and that's the PPA.

Other than that, the PPA provides backported packages from newer Ubuntu
releases. In addition, the 'upstart' RA was developed ~6 months after 10.04
was released.

Best regards



Re: [Pacemaker] Upstart resources

2012-02-22 Thread Ante Karamatic
On 21.02.2012 17:56, Jake Smith wrote:

> Np thanks for adding - I was asking because the exit codes on the
> upstart jobs I'm using didn't align (I believe) with the LSB spec and
> I wasn't sure if they were supposed to.  Haven't really seen the
> level of documentation (as LSB resource) for upstart resources.  I
> would assume since they are Ubuntu specific.  Mine seems to be
> working OK so far but haven't stressed it yet.

Upstart always returns exit code 0. That's the reason why we couldn't
just use the LSB RA for upstart jobs. As for documentation, it's really like
using LSB; just type in 'upstart' instead of 'lsb'.

Example (ssh):

The ssh upstart job is defined by /etc/init/ssh.conf. So, your primitive in
the cluster would start as:

primitive ssh upstart:ssh \
        op ... \
        meta ...

Note that you might still have LSB scripts, like apache
(/etc/init.d/apache2). Those are not upstart jobs, so you just define
those as LSB:

primitive apache2 lsb:apache2 \
        op ... \
        meta ...
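
Since the exit code carries no state information, a wrapper has to derive the state from the status text instead. The sketch below shows one way to do that, assuming status lines of the form "job start/running, process PID" and "job stop/waiting". The function name upstart_status_to_lsb is hypothetical, and this is not how the upstart RA itself is implemented.

```shell
#!/bin/bash
# Hypothetical helper: translate one line of `initctl status` output into
# an LSB-style status exit code:
#   0 = running, 3 = not running, 4 = status unknown
upstart_status_to_lsb() {
    local line=$1
    case "${line}" in
        *" start/running"*) return 0 ;;
        *" stop/waiting"*)  return 3 ;;
        *)                  return 4 ;;
    esac
}
```

It could be used as `upstart_status_to_lsb "$(initctl status ssh 2>&1)"`. Transitional states such as start/pre-start fall through to "unknown" in this sketch, so a real wrapper would have to decide how to treat them.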

Let me know if you have any other questions.

Best regards



Re: [Pacemaker] Upstart resources

2012-02-22 Thread Ante Karamatic
On 16.02.2012 05:23, Vladislav Bogdanov wrote:

> Newer versions of pacemaker and lrmd are able to deal with upstart
> resources via dbus.
> However I do not like this way, so please find resource-agent attached,
> which is able to manage arbitrary upstart job (just like Anything but
> for upstart resources). It already saved me much time and nerves
> managing libvirtd (with my own upstart job) which you probably already
> know always wants to SIGABRT (btw I even know the main reason for that
> and now testing patch which I will hopefully send to libvirt ml).

But you use 'initctl' in your OCF agent, which also relies on dbus (libdbus,
to be precise; in both cases a running dbus daemon is not a requirement). The
reason we didn't use initctl, grep, awk and company when we developed the
upstart RAexec is that doing it that way was slower. And ugly :)

I guess I'm just interested in what you find wrong or bad with how the
current upstart RAexec is done?

Thanks



Re: [Pacemaker] Upstart resources

2012-02-21 Thread Jake Smith
Florian,

Np thanks for adding - I was asking because the exit codes on the upstart jobs 
I'm using didn't align (I believe) with the LSB spec and I wasn't sure if they 
were supposed to.  I haven't really seen the same level of documentation for 
upstart resources as for LSB resources; I assume that's because they are Ubuntu 
specific.  Mine seem to be working OK so far, but I haven't stressed them yet. 

- Original Message -
> From: "Florian Haas" 
> To: "The Pacemaker cluster resource manager" 
> Sent: Tuesday, February 21, 2012 10:30:27 AM
> Subject: Re: [Pacemaker] Upstart resources
> 
> Jake,
> 
> sorry, I missed your original post due to travel; let me toss in one
> more thing here:
> 
> On Tue, Feb 21, 2012 at 3:32 PM, Jake Smith 
> wrote:
> >> > Are upstart jobs expected to conform to the LSB spec with
> >> > regards
> >> > to exit codes, etc?
> >> > Is there any reference documentation using upstart resources in
> >> > Pacemaker?
> >> > Or any good advice :-)
> >>
> >> Newer versions of pacemaker and lrmd are able to deal with upstart
> >> resources via dbus.
> 
> Only if the LRM is compiled with --enable-upstart, of course. Which,
> to the best of my knowledge, is only set on the Ubuntu builds (and
> Ubuntu builds are currently the only ones for which this makes sense
> to set, obviously).
> 
> This, however, requires that you run with an updated libglib2 package

Don't I know it - this one caused other issues for me too when it wasn't 
properly updated!

> (again, only on Ubuntu). All of that should be available either in
> the
> upstream Ubuntu repos or, for the current LTS, in the
> ubuntu-ha-maintainers PPA.[1]

LTS - yes, PPA - yes, ask why - only if it's missing on the next LTS! :-)

Jake 

> 
> Hope this helps.
> 
> Cheers,
> Florian
> 
> [1] "Why do I need to use a PPA if this release is ostensibly on
> long-term support?" Don't ask me, ask someone from Canonical. :)
> 
> --
> Need help with High Availability?
> http://www.hastexo.com/now
> 


Re: [Pacemaker] Upstart resources

2012-02-21 Thread Florian Haas
Jake,

sorry, I missed your original post due to travel; let me toss in one
more thing here:

On Tue, Feb 21, 2012 at 3:32 PM, Jake Smith  wrote:
>> > Are upstart jobs expected to conform to the LSB spec with regards
>> > to exit codes, etc?
>> > Is there any reference documentation using upstart resources in
>> > Pacemaker?
>> > Or any good advice :-)
>>
>> Newer versions of pacemaker and lrmd are able to deal with upstart
>> resources via dbus.

Only if the LRM is compiled with --enable-upstart, of course. Which,
to the best of my knowledge, is only set on the Ubuntu builds (and
Ubuntu builds are currently the only ones for which this makes sense
to set, obviously).

This, however, requires that you run with an updated libglib2 package
(again, only on Ubuntu). All of that should be available either in the
upstream Ubuntu repos or, for the current LTS, in the
ubuntu-ha-maintainers PPA.[1]
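
One way to check whether a given build offers the upstart class is to look for it in the list of resource classes the cluster shell reports. The sketch below is only an illustration: the helper name has_ra_class is hypothetical, and it assumes the crm shell's `crm ra classes` command prints the class names as separate words.

```shell
#!/bin/bash
# Hypothetical helper: succeed if the class named in $1 appears as a whole
# word in the class list read from standard input.
has_ra_class() {
    grep -qw -- "$1"
}
```

It would be used as `crm ra classes | has_ra_class upstart && echo "upstart class available"`.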

Hope this helps.

Cheers,
Florian

[1] "Why do I need to use a PPA if this release is ostensibly on
long-term support?" Don't ask me, ask someone from Canonical. :)

-- 
Need help with High Availability?
http://www.hastexo.com/now



Re: [Pacemaker] Upstart resources

2012-02-21 Thread Jake Smith
Thanks!

Jake 

- Original Message -
> From: "Vladislav Bogdanov" 
> To: pacemaker@oss.clusterlabs.org
> Sent: Wednesday, February 15, 2012 11:23:01 PM
> Subject: Re: [Pacemaker] Upstart resources
> 
> Hi,
> 
> 16.02.2012 02:02, Jake Smith wrote:
> > When using upstart jobs in Pacemaker I haven't been able to find
> > much of anything for documentation. After reading a post a few
> > minutes ago by
> > Andreas I wanted to verify...
> > 
> > Are upstart jobs expected to conform to the LSB spec with regards
> > to exit codes, etc?
> > Is there any reference documentation using upstart resources in
> > Pacemaker?
> > Or any good advice :-)
> 
> Newer versions of pacemaker and lrmd are able to deal with upstart
> resources via dbus.
> However I do not like this way, so please find resource-agent
> attached,
> which is able to manage arbitrary upstart job (just like Anything but
> for upstart resources). It already saved me much time and nerves
> managing libvirtd (with my own upstart job) which you probably
> already
> know always wants to SIGABRT (btw I even know the main reason for
> that
> and now testing patch which I will hopefully send to libvirt ml).
> 
> Best,
> Vladislav
> 


Re: [Pacemaker] Upstart resources

2012-02-15 Thread Vladislav Bogdanov
Hi,

16.02.2012 02:02, Jake Smith wrote:
> When using upstart jobs in Pacemaker I haven't been able to find
> much of anything for documentation. After reading a post a few minutes ago by
> Andreas I wanted to verify...
> 
> Are upstart jobs expected to conform to the LSB spec with regards to exit 
> codes, etc?
> Is there any reference documentation using upstart resources in Pacemaker?
> Or any good advice :-)

Newer versions of pacemaker and lrmd are able to deal with upstart
resources via dbus.
However, I do not like this approach, so please find a resource agent attached
that can manage an arbitrary upstart job (just like Anything, but
for upstart resources). It has already saved me much time and nerves
managing libvirtd (with my own upstart job), which, as you probably already
know, always wants to SIGABRT (btw I even know the main reason for that
and am now testing a patch which I will hopefully send to the libvirt ml).

Best,
Vladislav
#!/bin/bash
#
# OCF resource agent which manages upstart jobs.
#
# Copyright (c) 2011 Vladislav Bogdanov 
#
# OCF instance parameters:
#OCF_RESKEY_job_name: name of upstart job
#OCF_RESKEY_process_name: name of process
#
# Initialization:

: ${OCF_FUNCTIONS_DIR=${OCF_ROOT}/lib/heartbeat}
. ${OCF_FUNCTIONS_DIR}/ocf-shellfuncs

# Defaults

meta_data() {
cat <<END


1.0


This RA manages upstart jobs as HA resources.

Manage upstart job





The name of the upstart job.
Can also contain job instance appended after space.
Example: job_name="my_job INSTANCE=1"

Job name





The name of the process which is to be launched by upstart job.

Process name





Additional command to run on monitor.

Additional monitor command





How many seconds to wait for check command to finish.

Monitor command timeout





What to run if monitor command fails or times out.

Monitor failure action














END
}

usage() {
cat <&1 )
    monitor "${status}"
    case $? in
        $OCF_SUCCESS)
            ocf_log info "Upstart job ${OCF_RESKEY_job_name} started successfully."
            ret=$OCF_SUCCESS
            ;;
        *)
            ocf_log err "Failed to start upstart job ${OCF_RESKEY_job_name}."
            ret=$OCF_ERR_GENERIC
            ;;
    esac
    return ${ret}
}

stop() {
    local status=$1

    monitor "${status}"
    if [ $? -eq $OCF_NOT_RUNNING ]; then
        return $OCF_SUCCESS
    fi
    status=$( initctl stop ${OCF_RESKEY_job_name} 2>&1 )
    monitor "${status}"
    case $? in
        $OCF_NOT_RUNNING)
            ocf_log info "Upstart job ${OCF_RESKEY_job_name} stopped successfully."
            ret=$OCF_SUCCESS
            ;;
        *)
            ocf_log err "Failed to stop upstart job ${OCF_RESKEY_job_name}."
            ret=$OCF_ERR_GENERIC
            ;;
    esac
    return ${ret}
}

get_status() {
    local _output

    _output=$( initctl status ${OCF_RESKEY_job_name} 2>&1 )
    if echo "${_output}" | grep -q "Unknown job" ; then
        ocf_log err "Unknown upstart job ${OCF_RESKEY_job_name}"
        exit $OCF_ERR_INSTALLED
    fi
    # Leave only first line (main process)
    _output=$( echo "${_output}" | awk '{print $0; exit}' )

    # Store job status for later consumption
    eval $1=\${_output}
}

monitor() {
    local status=$1
    local pid
    local ret=$OCF_NOT_RUNNING
    local process
    # Operation timeout minus 5 seconds
    local attempts=$((($OCF_RESKEY_CRM_meta_timeout/1000) - 5))
    local i=0

    if ocf_is_decimal ${OCF_RESKEY_check_timeout} ; then
        attempts=$(( attempts - OCF_RESKEY_check_timeout ))
    fi

    if [ ${attempts} -le 0 ] ; then
        attempts=0
    fi

    # We first receive output from outside, then re-poll for it
    while [ ${ret} -eq $OCF_NOT_RUNNING ] ; do
        # upstart can report:
        #  (instance) start/[running|pre-start], process (item0) pid
        if [[ "${status}" =~ (^${OCF_RESKEY_job_name}( \(.+\)){0,1} start/([a-z-]+), process (\(.+\) ){0,1}([0-9]+)) ]] ; then
            state=${BASH_REMATCH[3]}
            case ${state} in
                running)
                    pid=${BASH_REMATCH[5]}
                    if [ -n "${pid}" ] ; then
                        kill -0 ${pid}
                        if [ $? -eq 0 ] ; then
                            process=$( awk '/^Name:/ {print $2}' < /proc/${pid}/status )
                            if [ "${process}" != "${OCF_RESKEY_process_name}" ] ; then
                                # job is started, but it has not yet launched the process itself
                                (( i == 0 )) && ocf_log info "pid ${pid} corresponds to process ${process} instead of ${OCF_RESKEY_process_name}, waiting."
                                ret=$OCF_NOT_RUNNING
                            else
                                ret=$OCF_SUCCESS
                            fi
                        else
                            # This will cause resource to be marked as 'Started FAILED'
# with subsequent stop and sta