On 10/07/23 12:00 +0300, Or Raz wrote:
Hi all,
My team has been working on a new operator that uses some of your Fence
Agents (FA, e.g. fence_aws, fence_ipmilan, etc.) to remediate an
unhealthy Kubernetes node (see fence-agents-remediation
<https://github.com/medik8s/fence-agents-remediation>).
After looking at the recommended attributes for new FAs
<https://github.com/ClusterLabs/fence-agents/blob/main/doc/FenceAgentAPI.md#attribute-specifications>
and the valid *action* attribute values
<https://github.com/ClusterLabs/fence-agents/blob/main/doc/FenceAgentAPI.md#agent-operations-and-return-values>,
I have some questions about the structure/format of the FAs' command attributes
and their responses:

  1. Will running a fence agent without specifying the action field
  always choose the *reboot* option (e.g. will the following call reboot
  the node: "fence_aws --access-key ACCESS_KEY --secret-key SECRET_KEY --plug
  i-INSTANCE_ID --region AWS_REGION")?
That's the default, yeah. It's probably due to some earlier Pacemaker
versions not specifying action, so we avoid breaking the agents on
those versions by defaulting to the reboot-action.
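
For a caller that would rather not depend on that default, here is a
minimal sketch of building the command line with the action always spelled
out. The helper name build_fence_command and the dict-of-options shape are
made up for illustration; the long option names are the ones from the example
call in the question, and --action is the long form of -o.

```python
def build_fence_command(agent, options, action="reboot"):
    """Return an argv list for a fence agent with an explicit --action.

    `agent` is the executable name (e.g. "fence_aws"); `options` maps long
    option names, without the leading dashes, to their values.
    (Hypothetical helper, not part of fence-agents.)
    """
    argv = [agent]
    for name, value in options.items():
        argv += ["--" + name, str(value)]
    # Spell the action out instead of relying on the agent's default.
    argv += ["--action", action]
    return argv


cmd = build_fence_command(
    "fence_aws",
    {
        "access-key": "ACCESS_KEY",
        "secret-key": "SECRET_KEY",
        "plug": "i-INSTANCE_ID",
        "region": "AWS_REGION",
    },
)
print(" ".join(cmd))
```

Returning a list rather than a single string means the command can be handed
to a process runner as-is, without any shell quoting.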
  2. Are there any must-have fields that are shared between all the FAs
  that you support? I assume the answer is no, since I didn't see any
  required fields common to *fence_aws* and *fence_ipmilan*, for
  instance. A must-have field is a field that is required for running the FA
  (e.g. *access-key* for *fence_aws*).
That depends on the kind of agent, so e.g. http(s) agents or similar
will require a URL plus whatever other parameters they need.

You can find the default parameters, and the other groups of parameters
that apply depending on what an agent adds via device_opt = [], here:
https://github.com/ClusterLabs/fence-agents/blob/main/lib/fencing.py.py#L492

and full list of common parameters:
https://github.com/ClusterLabs/fence-agents/blob/main/lib/fencing.py.py#L36
  3. Are the result responses of the FAs identical per action? E.g. for
  the *reboot* action, I have seen that on success I always receive
  `Success: Rebooted` from fence_aws and fence_ipmilan. I am
  uncertain whether that holds for all the FAs.
They should be, but you should use the return code to check the result
rather than the output text:
https://github.com/ClusterLabs/fence-agents/blob/main/lib/fencing.py.py#L20-L32

After running e.g. fence_aws -o reboot manually, you can run "echo $?"
to see which rc it returned, and then use incorrect credentials or
similar to see how the rc differs when it fails.
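
As a rough sketch of doing the same check from code rather than the shell:
run the agent, ignore the printed text when deciding the outcome, and look
only at the exit status. The helper name run_and_check is hypothetical, and
the sketch assumes only that 0 means success and anything else means failure;
the specific non-zero values are the constants at the fencing.py link above.
Stand-in commands are used so it runs without a real fence device.

```python
import subprocess
import sys


def run_and_check(argv, timeout=60):
    """Run a command and return (succeeded, returncode, output).

    Success is decided by the exit status alone; the output is captured
    only so that it can be logged.
    """
    proc = subprocess.run(argv, capture_output=True, text=True, timeout=timeout)
    return proc.returncode == 0, proc.returncode, proc.stdout + proc.stderr


# Stand-ins for a fence agent call: one exits 0, the other exits 1.
ok_cmd = [sys.executable, "-c", "print('Success: Rebooted')"]
fail_cmd = [sys.executable, "-c", "import sys; sys.exit(1)"]

for cmd in (ok_cmd, fail_cmd):
    succeeded, rc, output = run_and_check(cmd)
    print(succeeded, rc, output.strip())
```

With a real agent, the argv list would instead start with the agent name,
e.g. "fence_aws", followed by its options and "-o", "reboot".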


Oyvind

Best regards,
*OR*

_______________________________________________
Manage your subscription:
https://lists.clusterlabs.org/mailman/listinfo/users

ClusterLabs home: https://www.clusterlabs.org/
