David Teigland wrote:
fencing: New option '--missing-as-off' to return OFF if machine is missing
If a blade is not present (e.g. removed for maintenance), fence_bladecenter
cannot check its state because the slot is reported as empty.
Resolves: bz#248006
--- a/fence/agents/bladecenter/fence_bladecenter.py
+++ b/fence/agents/bladecenter/fence_bladecenter.py
@@ -30,7 +30,10 @@ def get_power_status(conn, options):
     i = conn.log_expect(options, [ node_cmd, "system>" ] , int(options["-Y"]))
     if i == 1:
         ## Given blade number does not exist
-        fail(EC_STATUS)
+        if options.has_key("-M"):
+            return "off"
+        else:
+            fail(EC_STATUS)
I've never used bladecenter, so I don't know when a blade number doesn't
exist. Does it reliably indicate that the blade is off? If so, then
should we default to that without a new option? If not, then this option
sounds bad, because it's effectively an automation of manual override, no?
Yes, it is reliable. We can turn it on by default, but imho it would make
fence agents behave inconsistently, as all the others refuse to work with
a port that does not exist. In general we can divide fence agents into
several categories:
* iLO-like -> 1 machine we have to connect to --> if we can't connect
we don't know if the problem is in iLO or in the connection to the machine
* APC-like -> N machines connected to a central device --> we can switch
an outlet ON/OFF and we don't have to care if there is something connected
* virtual machines -> we usually don't need to remove them for maintenance
* Blade-like -> unlike APC we can work only with machines that are
connected - so maintenance can be a problem
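
To make the Blade-like case concrete, here is a minimal standalone sketch of
the status decision the patch is aiming for. It is not the agent code: the
function signature, the FenceError exception and the numeric value given to
EC_STATUS are assumptions made only for illustration, and the real agent
talks to the management module via conn.log_expect() instead of taking
booleans.

```python
# Illustrative sketch only; names below are hypothetical stand-ins,
# not the real fence_bladecenter / fencing library API.

EC_STATUS = 8  # assumed placeholder value for the status-failure exit code


class FenceError(Exception):
    """Raised where the real agent would call fail(EC_STATUS)."""

    def __init__(self, code):
        super().__init__("fence agent failed with code %d" % code)
        self.code = code


def get_power_status(blade_present, blade_powered, options):
    """Return "on" or "off" for a blade slot.

    blade_present=False models the management module answering with the
    plain "system>" prompt, i.e. the slot is reported as empty.
    The "-M" key in options models the proposed --missing-as-off flag.
    """
    if not blade_present:
        if "-M" in options:
            # Opt-in behaviour: an empty slot is treated as powered off.
            return "off"
        # Default behaviour: refuse to report a status for a missing blade.
        raise FenceError(EC_STATUS)
    return "on" if blade_powered else "off"
```

With "-M" set, an empty slot reports "off" and fencing can proceed; without
it the agent keeps failing, as the other agents do for a port that does not
exist.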
The typical issue with nodes removed for maintenance is that startup
fencing tries to fence them and can't, for any agent. The solution to
that has always been manual override or removing the node from
cluster.conf.
m,