> On 23 Sep 2016, at 12:49 AM, Ken Gaillot <[email protected]> wrote:
>
> On 09/22/2016 08:49 AM, Adam Spiers wrote:
>> Ken Gaillot <[email protected]> wrote:
>>> On 09/21/2016 03:25 PM, Adam Spiers wrote:
>>>> Jan Pokorný <[email protected]> wrote:
>>>>> Just thinking aloud before the can is open.
>>>>
>>>> Thanks for sharing - I'm very interested to hear your ideas on this,
>>>> because I was thinking along somewhat similar lines for the
>>>> openstack-resource-agents repository which I maintain.
>>>>
>>>> Currently the OpenStack RAs duplicate much of the logic and config of
>>>> corresponding systemd / LSB init scripts for starting / stopping
>>>> OpenStack services and checking their status. The main difference is
>>>> that RAs also have a "monitor" action which can check the health of
>>>> the service at application level, e.g. via HTTP rather than a naive
>>>> "is this pid running" kind of check.
>>>>
>>>> This duplication causes issues with portability between Linux
>>>> distributions, since each distribution has a slightly different way of
>>>> starting and stopping the services. It also results in subtly
>>>> different behaviour for OpenStack clouds depending on whether or not
>>>> they are deployed in HA mode using Pacemaker.
>>>>
>>>> As a result I have been thinking about the idea of changing the
>>>> start/stop/status actions of these RAs so that they wrap around
>>>> service(8) (which would be even more portable across distros than
>>>> systemctl).
>>>>
>>>> The primary difference with your approach is that we probably wouldn't
>>>> need to make the RAs dynamically create any systemd configuration, since
>>>> that would already be provided by the packages which install the OpenStack
>>>> services. But then AFAIK none of the OpenStack services use the
>>>> multi-instance feature of systemd (foo@{one,two,three,etc}.service).
>>>
>>> The main complication I see is that pacemaker expects OCF agents to
>>> return success only after an action is complete. For example, start
>>> should not return until the service is fully active. I believe systemctl
>>> does not behave this way; rather, it initiates the action and returns
>>> immediately.
>>
>> But that's trivial to work around: polling via "service foo status"
>> after "service foo start" converts it back from an asynchronous
>> operation to a synchronous one.
>
> Yes, that's exactly what pacemaker does now: start/stop, then every two
> seconds, poll the status.
>
> However, I'm currently working on a project to change that, so that we
> use DBus signalling to be notified when the job completes, rather than
> (or in addition to) polling.
>
> The reason is twofold: the two-second wait can be an unnecessary
> recovery delay in some cases; and (at least from the DBus API, not sure
> about systemctl status) there's no reliable way to distinguish "service
> is inactive because the start didn't work properly" from "service is
> inactive because systemd has some slow-starting dependencies of its own
> to start first".
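
To make that concrete, the poll-based wrapper being discussed would look
roughly like the following. This is a sketch only: "myservice" and the
helper name are placeholders, and a real RA would return the proper
OCF_* codes from ocf-shellfuncs rather than bare 0/1.

wrap_start() {
    # service(8) start may return before the daemon is fully up...
    service myservice start || return 1
    # ...so poll status every two seconds, as Pacemaker does today,
    # until the service reports running. Pacemaker enforces the
    # overall action timeout and escalates if this never returns.
    while ! service myservice status >/dev/null 2>&1; do
        sleep 2
    done
    return 0
}
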
The systemd folks are telling us that the only reliable way to
synchronously start a service is by watching DBus, which suggests that
a shell-based approach is doomed to fail.

>
>>> Pacemaker's native systemd integration has a lot of workarounds for
>>> quirks in systemd behavior (and more every release). I'm not sure
>>> moving/duplicating that logic to the RA is a good approach.
>>
>> What other quirks are there?
>
> When pacemaker starts a systemd service, it creates a unit override in
> /run/systemd/system/<agent>.service.d/50-pacemaker.conf, with these
> overrides (and removes the file when stopping the resource):
>
> * It prefixes the description with "Cluster Controlled" (e.g. "Postfix
> Mail Transport Agent" -> "Cluster Controlled Postfix Mail Transport
> Agent"). This gives a clear indicator in systemd messages in the syslog
> that it's a cluster resource.
>
> * "Before=pacemaker.service": This ensures that when someone shuts down
> the system via systemd, systemd doesn't stop pacemaker before pacemaker
> can stop the resource.
>
> * "Restart=no": This ensures that pacemaker stays in control of
> responding to service failures.
>
> Additionally:
>
> * Pacemaker uses intelligent timeout values (based on cluster
> configuration) when making systemd calls.
>
> * Pacemaker interprets/remaps systemd return status as needed. For
> example, a stop followed by a status poll that returns "OK" means the
> service is still running. Fairly obvious, but there are a lot of cases
> that need to be handled.
>
> All of these were added gradually over the past few years, so I'd expect
> the list to grow over the next few years.
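
Piecing together Ken's description above, the generated drop-in would
presumably be created along these lines. This is a reconstruction for a
hypothetical postfix resource, not copied from Pacemaker's source; only
the path and the three settings come from his mail.

# Write the override where Pacemaker would, then have systemd
# re-read its units so the drop-in takes effect (run as root):
d=/run/systemd/system/postfix.service.d
mkdir -p "$d"
cat > "$d/50-pacemaker.conf" <<'EOF'
[Unit]
Description=Cluster Controlled Postfix Mail Transport Agent
Before=pacemaker.service

[Service]
Restart=no
EOF
systemctl daemon-reload
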
_______________________________________________
Developers mailing list
[email protected]
http://clusterlabs.org/mailman/listinfo/developers
