Hi all,

OCF 1.1 hasn't been out that long, but I'm already looking ahead to OCF 1.2 (which would remain backward-compatible).
One big addition I'm contemplating is defining OCF resource agent types, to address these problems:

* Fence agents have a completely different standard from OCF resource agents, and lack some of the features available to OCF agents (such as meaningful error statuses and exit reasons for failures).

* Pacemaker's node health feature uses OCF agents to monitor node conditions, but there are some user pain points involved since they are indistinguishable from regular OCF agents.

* In the past there has been discussion of implementing "storage agents" to help manage replication of external storage devices, primarily for disaster recovery purposes.

Visually, the agent type would be another field in the agent specification, for example ocf:fence:heartbeat:iscsi or ocf:health:pacemaker:cpu. "Regular" OCF agents would be (for example) ocf:service:heartbeat:apache in full, but for backward compatibility "service" would be the default, and ocf:heartbeat:apache would continue to work.

Alternatively, if we want to keep it to three fields, we could do something like ocf-fence:heartbeat:iscsi and ocf-health:pacemaker:cpu.

The OCF standard would have a shared section that all agent types would be required to support. This could include things like exit status codes, environment variables, and the meta-data action. Each agent type would then have its own section with anything specific to that type -- for example, service agents need to support start and stop actions, while fence agents need to support off and optionally reboot.

The benefits would include:

* Agent writers would have fewer differences to worry about and libraries to learn.

* Pacemaker and higher-level tools could easily distinguish agent types and respond intelligently. For example, higher-level shells could list all health agents and clone them automatically when used, and Pacemaker could automatically exempt health agents from health restrictions so that the agent can automatically detect when the node becomes healthy again.

* We would have a framework for adding new types if the need arises.

Thoughts?
--
Ken Gaillot <kgail...@redhat.com>
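
For illustration, here is a rough Python sketch of how a tool might parse the agent specifications proposed in this message. Nothing here is existing Pacemaker or OCF code: parse_agent_spec, AgentSpec, KNOWN_TYPES, and DEFAULT_TYPE are hypothetical names, and the type list only reflects the examples mentioned in this proposal. The sketch accepts both the four-field form and the three-field "ocf-<type>" alternative, and falls back to "service" when no type is given.

```python
from typing import NamedTuple

# Assumption: the set of types is just the examples named in the proposal.
KNOWN_TYPES = {"service", "fence", "health"}
# "service" is the default so that ocf:heartbeat:apache keeps working.
DEFAULT_TYPE = "service"


class AgentSpec(NamedTuple):
    """Hypothetical parsed form of an agent specification."""
    standard: str
    agent_type: str
    provider: str
    name: str


def parse_agent_spec(spec: str) -> AgentSpec:
    """Parse a spec such as ocf:fence:heartbeat:iscsi into its fields."""
    fields = spec.split(":")
    if not all(fields):
        raise ValueError(f"empty field in agent spec: {spec!r}")

    if len(fields) == 4:
        # Four-field form: standard:type:provider:name
        standard, agent_type, provider, name = fields
    elif len(fields) == 3:
        # Three-field form: either the legacy standard:provider:name,
        # or the alternative where the type rides on the standard,
        # e.g. ocf-fence:heartbeat:iscsi
        standard, provider, name = fields
        standard, sep, suffix = standard.partition("-")
        agent_type = suffix if sep else DEFAULT_TYPE
    else:
        raise ValueError(f"expected 3 or 4 fields in agent spec: {spec!r}")

    if standard != "ocf":
        raise ValueError(f"not an OCF agent spec: {spec!r}")
    if agent_type not in KNOWN_TYPES:
        raise ValueError(f"unknown agent type in spec: {spec!r}")
    return AgentSpec(standard, agent_type, provider, name)


if __name__ == "__main__":
    for example in ("ocf:heartbeat:apache",
                    "ocf:fence:heartbeat:iscsi",
                    "ocf-health:pacemaker:cpu"):
        print(example, "->", parse_agent_spec(example))
```

With this approach a higher-level tool could, for instance, filter on agent_type == "health" to list health agents, without guessing from the provider or agent name.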