On Wed, Aug 17, 2022 at 1:01 PM Ken Gaillot <kgail...@redhat.com> wrote:
>
> Hi all,
>
> OCF 1.1 hasn't been out that long but I'm already looking ahead to OCF
> 1.2 (which would remain backward-compatible).
>
> One big addition I'm contemplating is defining OCF resource agent
> types, to address these problems:
>
> * Fence agents have a completely different standard from OCF resource
> agents, and lack some of the features available to OCF agents (such as
> meaningful error statuses and exit reasons for failures).
>
> * Pacemaker's node health feature uses OCF agents to monitor node
> conditions, but there are some user pain points involved since they are
> indistinguishable from regular OCF agents.
>
> * In the past there has been discussion of implementing "storage
> agents" to help manage replication of external storage devices,
> primarily for disaster recovery purposes.
>
> Visually, the agent type would be another field in
> the agent specification, for example ocf:fence:heartbeat:iscsi or
> ocf:health:pacemaker:cpu.
>
> "Regular" OCF agents would be (for example)
> ocf:service:heartbeat:apache in full, but for backward compatibility
> "service" would be the default, and ocf:heartbeat:apache would continue
> to work.
>
> Alternatively, if we want to keep it to three fields, we could do
> something like ocf-fence:heartbeat:iscsi and ocf-health:pacemaker:cpu.
>
> The OCF standard would have a shared section that all agent types would
> be required to support. This could include things like exit status
> codes, environment variables, and the meta-data action. Each agent type
> would then have its own section with anything specific to that type --
> for example, service agents need to support start and stop actions,
> while fence agents need to support off and optionally reboot.
>
> The benefits would include:
>
> * Agent writers would have fewer differences to worry about and
> libraries to learn.
>
> * Pacemaker and higher-level tools could easily distinguish agent types
> and respond intelligently. For example, higher-level shells could list
> all health agents and clone them automatically when used, and Pacemaker
> could automatically exempt health agents from health restrictions so
> that the agent can automatically detect when the node becomes healthy
> again.
>
> * We would have a framework for adding new types if the need arises.
>
> Thoughts?
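
To make the two naming options above concrete, here is a rough Python
sketch of how a tool might parse them. It is only an illustration under
stated assumptions, not anything Pacemaker implements: the names
AgentSpec, parse_agent_spec, KNOWN_TYPES and DEFAULT_TYPE are
hypothetical, and the set of accepted types is limited to the three that
appear in the examples ("service", "fence", "health").

```python
from typing import NamedTuple

# Assumption: only the agent types used in the proposal's examples.
KNOWN_TYPES = {"service", "fence", "health"}
# Proposed default, so that ocf:heartbeat:apache keeps working.
DEFAULT_TYPE = "service"


class AgentSpec(NamedTuple):
    """Hypothetical parsed form of an OCF agent specification."""
    agent_type: str
    provider: str
    agent: str


def parse_agent_spec(spec: str) -> AgentSpec:
    """Parse either proposed syntax into (agent_type, provider, agent)."""
    fields = spec.split(":")
    standard, dash, suffix = fields[0].partition("-")
    if standard != "ocf":
        raise ValueError(f"not an OCF agent specification: {spec!r}")

    if dash:
        # Three-field alternative, e.g. ocf-fence:heartbeat:iscsi
        if len(fields) != 3:
            raise ValueError(f"expected 3 fields in {spec!r}")
        agent_type = suffix
        provider, agent = fields[1], fields[2]
    elif len(fields) == 4:
        # Four-field form, e.g. ocf:fence:heartbeat:iscsi
        agent_type, provider, agent = fields[1], fields[2], fields[3]
    elif len(fields) == 3:
        # Existing form, e.g. ocf:heartbeat:apache (type defaults)
        agent_type = DEFAULT_TYPE
        provider, agent = fields[1], fields[2]
    else:
        raise ValueError(f"expected 3 or 4 fields in {spec!r}")

    if agent_type not in KNOWN_TYPES:
        raise ValueError(f"unknown agent type {agent_type!r} in {spec!r}")
    return AgentSpec(agent_type, provider, agent)


if __name__ == "__main__":
    for example in ("ocf:heartbeat:apache",
                    "ocf:fence:heartbeat:iscsi",
                    "ocf-health:pacemaker:cpu"):
        print(example, "->", parse_agent_spec(example))
```

In this sketch a three-field "ocf:" specification always gets the
default type, so ocf:heartbeat:apache and ocf:service:heartbeat:apache
parse to the same result.
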
It sounds like a good idea. With regard to "service" as the default OCF
resource agent type, this may be confusing since we already have a
"service" standard.

> --
> Ken Gaillot <kgail...@redhat.com>
>
> _______________________________________________
> Manage your subscription:
> https://lists.clusterlabs.org/mailman/listinfo/developers
>
> ClusterLabs home: https://www.clusterlabs.org/
>

--
Regards,

Reid Wahl (He/Him)
Senior Software Engineer, Red Hat
RHEL High Availability - Pacemaker