Ken Gaillot <[email protected]> wrote:
On 11/03/2016 02:37 PM, Adam Spiers wrote:
Hi again Ken,
Sorry for the delayed reply, caused by Barcelona amongst other things ...
Hmm, seems I have to apologise for yet another delayed reply :-( I'm
deliberately not trimming the context, so everyone can refresh their
memory of this old thread!
Ken Gaillot <[email protected]> wrote:
On 10/21/2016 07:40 PM, Adam Spiers wrote:
Ken Gaillot <[email protected]> wrote:
On 09/26/2016 09:15 AM, Adam Spiers wrote:
For example, could Pacemaker be extended to allow hybrid resources,
where some actions (such as start, stop, status) are handled by (say)
the systemd backend, and other actions (such as monitor) are handled
by (say) the OCF backend? Then we could cleanly rely on dbus for
collaborating with systemd, whilst adding arbitrarily complex
monitoring via OCF RAs. That would have several advantages:
1. Get rid of grotesque layering violations and maintenance boundaries
where the OCF RA duplicates knowledge of all kinds of things which
are distribution-specific, e.g.:
https://github.com/ClusterLabs/resource-agents/blob/master/heartbeat/apache#L56
A simplified agent will likely still need distro-specific intelligence
to do even a limited subset of actions, so I'm not sure there's a gain
there.
What distro-specific intelligence would it need? If the OCF RA was
only responsible for monitoring, it wouldn't need to know a lot of the
things which are only required for starting / stopping the service and
checking whether it's running, e.g.:
- Name of the daemon executable
- uid/gid it should be started as
- Daemon CLI arguments
- Location of pid file
In contrast, an OCF RA responsible only for monitoring would just need
to know how to talk to the service, which is not typically
distro-specific; in the REST API case, it needs nothing more than the
endpoint URL, which would be configured via Pacemaker resource
parameters anyway.
If you're only talking about monitors, that does simplify things. As you
mention, you'd still need to configure resource parameters that would
only be relevant to the enhanced monitor action -- parameters that other
actions might also need, and get elsewhere, so there's the minor admin
complication of setting the same value in multiple places.
Which same value(s)?
Nothing particular in mind, just app-specific information that's
required in both the app's own configuration and in the resource
configuration. Even stuff like an IP address/port.
That could potentially be reduced by simply pointing the RA at the
same config file(s) which the systemd service definition uses, and
then it could use something like crudini (in the OpenStack case) to
extract values like the port the service is listening on. (Assuming
the service is listening on localhost, you could simply use that
instead of the external IP address.) So yes, you'd have to set the
location of the config file(s) in two places, but at least then you
wouldn't have to duplicate anything else.
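To make that concrete, the extraction could be as small as this
(untested sketch; the config path, section and key names are invented
for illustration, and OCF_RESKEY_config is a hypothetical resource
parameter):

    # Reuse the service's own config file so the port is only
    # defined in one place; crudini understands the INI format
    # that OpenStack services use.
    conf="${OCF_RESKEY_config:-/etc/keystone/keystone.conf}"
    port=$(crudini --get "$conf" DEFAULT public_port)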
In the OpenStack case (which is the only use case I have), I don't
think this will happen, because the "monitor" action only needs to
know the endpoint URL and associated credentials, which doesn't
overlap with what the other actions need to know. This separation of
concerns feels right to me: the start/stop/status actions are
responsible for managing the state of the service, and the monitor
action is responsible for monitoring whether it's delivering what it
should be. It's just like the separation between admins and end
users.
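To illustrate, the RA's monitor could boil down to something like this
(untested sketch; the monitor_url parameter name is invented, not an
agreed interface):

    # Pull in the standard OCF helper definitions (OCF_SUCCESS etc.)
    : "${OCF_FUNCTIONS_DIR:=${OCF_ROOT}/lib/heartbeat}"
    . "${OCF_FUNCTIONS_DIR}/ocf-shellfuncs"

    monitor() {
        # Ask the service itself whether it's healthy, instead of
        # checking pids; any HTTP-level failure triggers recovery.
        if curl -fsS --max-time 10 "$OCF_RESKEY_monitor_url" >/dev/null; then
            return $OCF_SUCCESS
        fi
        return $OCF_ERR_GENERIC
    }

Credentials handling would need more thought, but the point stands:
nothing in there is distro-specific.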
2. Drastically simplify OCF RAs by delegating start/stop/status etc.
to systemd, thereby increasing readability and reducing maintenance
burden.
3. OCF RAs are more likely to work out of the box with any distro,
or at least require less work to get working.
4. Services behave more similarly regardless of whether managed by
Pacemaker or the standard pid 1 service manager. For example, they
will always use the same pidfile, run as the same user, in the
right cgroup, be invoked with the same arguments etc.
5. Pacemaker can still monitor services accurately at the
application-level, rather than just relying on naive pid-level
monitoring.
Or is this a terrible idea? ;-)
I considered this, too. I don't think it's a terrible idea, but it does
pose its own questions.
* What hybrid actions should be allowed? It seems dangerous to allow
starting from one code base and stopping from another, or vice versa,
and really dangerous to allow something like migrate_to/migrate_from to
be reimplemented. At one extreme, we allow anything and leave that
responsibility on the user; at the other, we only allow higher-level
monitors (i.e. using OCF_CHECK_LEVEL) to be hybridized.
Just monitors would be good enough for me.
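(For the record, I'd picture the check-level dispatch looking roughly
like this; untested sketch, where "$service" and app_level_monitor are
placeholders, not real interfaces:)

    monitor() {
        case "${OCF_CHECK_LEVEL:-0}" in
            0)
                # Shallow check: trust systemd's view of the unit
                if systemctl is-active --quiet "$service"; then
                    return $OCF_SUCCESS
                fi
                return $OCF_NOT_RUNNING
                ;;
            *)
                # Deeper check: application-level probe (hypothetical)
                app_level_monitor
                ;;
        esac
    }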
The tomcat RA (which could also benefit from something like this) would
extend start and stop as well, e.g. start = systemctl start plus some
bookkeeping.
Ahh OK, interesting. What kind of bookkeeping?
I don't remember ... something like node attributes or a pid/status
file. That could potentially be handled by separate OCF resources for
those bits, grouped with the systemd+monitor resource, but that would be
hacky and maybe insufficient in some use cases.
I have no idea if the proposal makes sense in the case of the tomcat
RA. But if it doesn't, I hope that alone doesn't rule out the
proposal being implemented to satisfy other use cases :-)
* Should the wrapper's actions be done instead of, or in addition to,
the main resource's actions? Or maybe even allow the user to choose? I
could see some wrappers intended to replace the native handling, and
others to supplement it.
For my use case, in addition, because the only motivation is to
delegate start/stop/status to systemd (as happens currently with
systemd:* RAs) whilst retaining the ability to do service-level
testing of the resource via the OCF RA. So it wouldn't really be a
wrapper, but rather an extension.
In contrast, with the wrapper approach, it sounds like the delegation
would have to happen via systemctl rather than via Pacemaker's dbus
code. And if systemctl start/stop really are asynchronous
(non-blocking), the wrapper would need to surround those start/stop
calls with a polling loop, as previously mentioned, in order to make
them synchronous (blocking), which is the behaviour I think most
people would expect.
Someone else suggested that systemctl is already blocking, which would
simplify the wrapper approach.
I'm not sure, but IIRC Andrew said that it is not reliably blocking.
Maybe this is partly because service startup after daemonization
unavoidably happens asynchronously.
I suppose that would be service-specific.
Most likely, yes.
At least systemd's own
handling can be synchronous (I see systemctl requires --no-block to be
asynchronous, so it does seem like this is the case). I think this makes
the wrapper approach much more likely to work. Some polling may be
necessary for specific services that take some time before a monitor
would be OK, but I don't think that's a major issue.
Yeah, that sounds OK to me.
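So the wrapper's start could look something like this (untested
sketch; it assumes the same ocf-shellfuncs context and monitor
function as the earlier sketch, and "$service" is a placeholder):

    start() {
        # Delegate the actual startup to systemd
        systemctl start "$service" || return $OCF_ERR_GENERIC

        # Even if systemctl blocks until the unit is "active", the
        # daemon might not be serving yet; poll our own monitor until
        # it passes. Pacemaker's start timeout bounds this loop.
        while ! monitor; do
            sleep 2
        done
        return $OCF_SUCCESS
    }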
* The answers to the above will help decide whether the wrapper is a
separate resource (with its own parameters, operations, timeouts, etc.),
or just a property of the main resource.
* If we allow anything other than monitors to be hybridized, I think we
get into a pacemaker-specific implementation. I don't think it's
feasible to include this in the OCF standard -- it would essentially
mandate pacemaker's "resource class" mechanism on all OCF users (which
is beyond OCF's scope), and would likely break manual/scripted use
altogether. We could possibly modify OCF so that no
actions are mandatory, and it's up to the OCF-using software to verify
that any actions it requires are supported. Or maybe wrappers just
implement some actions as no-ops, and it's up to the user to know the
limitations.
Sure. Hopefully you will be in Barcelona so we can discuss more?
Sadly, no :)
If systemctl blocks, the wrapper approach would be easiest -- the
wrapper can conform to the OCF standard, and requires no special
handling in pacemaker.
Well, it would still need a common RA library which provides a
function for writing the systemd service override, right?
That's trivial enough to put in the RA for now. Once we demonstrate this
approach works, we can maybe include it with pacemaker somehow.
Makes sense.
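Something like this, presumably (hypothetical sketch; I'm guessing at
what the override would actually need to contain):

    write_systemd_override() {
        # $1 is the full unit name, e.g. "openstack-keystone.service"
        local unit="$1"
        local dir="/etc/systemd/system/${unit}.d"
        mkdir -p "$dir"
        # Guess: stop systemd from competing with the cluster over
        # service recovery. The real override content is TBD.
        printf '[Service]\nRestart=no\n' > "${dir}/50-pacemaker.conf"
        systemctl daemon-reload
    }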
The hybrid approach would require some sort of "almost OCF" standard for
the extender, new syntax in pacemaker to configure it, and a good bit of
intelligence in pacemaker (e.g. recovery if one part of the hybrid fails
or times out).
Yeah. I don't think I have a preference between the two approaches; I'd be
happy to go with whichever you think is best. I guess there is an
argument in favour of trying out a spike on the wrapper approach
first, since presumably it should be much easier to implement than the
hybrid approach. Thoughts?
Agreed
OK, so I guess this is blocking on me to do the spike. I'll be at the
Atlanta PTG in a couple of weeks, so if anyone else reading this
thread happens to be involved in OpenStack and attending the PTG, drop
me a line!