Hello,

With a Pacemaker 1.1.13/Corosync 2.3.5 cluster, is it possible to define a 
relationship between two resources so that:

1.      B depends on A (a normal order constraint)

AND

2.      If either fails, they both need to be stopped and restarted, in the 
order defined above (B stops, A stops, A starts, then B starts)



In the normal configuration, if A fails, then A and B will be restarted, 
because B depends on A.  However, if B fails, only B is restarted, because A 
does not depend on it.  In most cases this is fine, but we have a case where 
B sometimes fails precisely because A, underneath it, has already failed (and 
A's monitor has not detected that yet).
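
To make the difference concrete, here is a toy Python model of the two 
behaviors.  It is only a sketch of the recovery sequences described above, not 
Pacemaker's actual scheduler; the function name restart_plan and the flag 
restart_all_on_any_failure are made up for illustration.

```python
def restart_plan(failed, order, restart_all_on_any_failure=False):
    """Return the (action, resource) steps used to recover from a failure.

    order lists resources in start order, e.g. ["A", "B"] means B depends on A.
    This is an illustrative model only, not how Pacemaker computes transitions.
    """
    if failed not in order:
        raise ValueError("unknown resource: %s" % failed)

    if restart_all_on_any_failure:
        # Desired behavior: any failure restarts the whole chain.
        affected = list(order)
    else:
        # Normal ordering: the failed resource and everything that depends on it.
        affected = order[order.index(failed):]

    # Stop in reverse start order, then start in start order.
    stops = [("stop", r) for r in reversed(affected)]
    starts = [("start", r) for r in affected]
    return stops + starts


if __name__ == "__main__":
    print(restart_plan("B", ["A", "B"]))        # normal: only B is restarted
    print(restart_plan("B", ["A", "B"], True))  # desired: B and A are restarted
```

With the flag off, a failure of B yields only a stop and start of B; with it 
on, the same failure yields B stop, A stop, A start, B start, which is the 
sequence I am after.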



The order constraint takes care of the start/stop ordering (together with a 
colocation constraint so the resources stay on the same node).



The problem I am trying to address is the case where the monitor for B fires 
first and the cluster attempts to restart B, but that restart cannot succeed 
until A has been restarted as well.



Case in point, LVM and Filesystem2 resources.



If LVM needs to be refreshed, the Filesystem above it stops working (e.g. I/O 
fails).  However, if Filesystem notices the problem first, LVM never gets a 
chance to detect that it has a problem as well.  Filesystem will therefore keep 
trying to restart itself until it exhausts its retries.  At that point, a 
cleanup is required to get things going again, and LVM has to be restarted 
manually.



We have a case where the LVM cache needs to be refreshed and the volumes 
reactivated.  Paths in a SAN going down and coming back up leave the LVM VG in 
a compromised state, and that LVM problem causes the Filesystem I/O to fail.  
Filesystem notices first: its monitor fails, it is stopped, and it then tries 
in vain to restart, because it cannot start until the LVM resource has been 
restarted.



I made the monitor interval for Filesystem longer than the one for LVM, which 
makes it more likely that LVM finds the problem first, but that isn't foolproof.
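
A small Python sketch shows why the interval trick can still lose the race.  
It assumes each monitor fires at a fixed period from some starting offset and 
that a monitor detects the fault on its first run after the fault occurs; the 
intervals (10s and 30s) and offsets are hypothetical numbers, not my real 
settings.

```python
import math


def next_monitor(fault_time, interval, offset=0):
    """Time of the first monitor run at or after fault_time.

    Monitors are modeled as firing at offset, offset + interval, ...
    (a simplifying assumption for illustration).
    """
    if fault_time <= offset:
        return offset
    runs = math.ceil((fault_time - offset) / interval)
    return offset + runs * interval


def first_to_notice(fault_time, lvm=(10, 0), fs=(30, 5)):
    """Report which resource's monitor fires first after the fault.

    lvm and fs are (interval, offset) pairs in seconds, hypothetical values.
    """
    lvm_time = next_monitor(fault_time, *lvm)
    fs_time = next_monitor(fault_time, *fs)
    if lvm_time < fs_time:
        return "LVM"
    if fs_time < lvm_time:
        return "Filesystem"
    return "tie"


if __name__ == "__main__":
    for t in (12, 32):
        print(t, first_to_notice(t))
```

In this model a fault at t=12 is seen by LVM first (LVM runs at 20, Filesystem 
at 35), but a fault at t=32 is seen by Filesystem first (Filesystem runs at 35, 
LVM at 40), even though the LVM interval is three times shorter.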

If there were a rule that whenever a Filesystem resource has to be stopped and 
started, the LVM resource it depends on must be restarted first, I should be 
able to avoid the problem entirely.



In essence, what I'm asking is whether I can make two resources start and stop 
in a particular order, and also define that if one has to be stopped or started, 
the other must be as well (in my defined order).



Thanks.



Greg Neitzert | Lead Software Engineer | RTC Software Engineering 2B - 
Middleware

Unisys Corp


