Hi Everyone,

  I would like to thank Florian, Andreas and Dejan for making 
suggestions and pointing out some additional changes I should make. At 
this point the following additional changes have been made:

- A test case in the validation function for ocf_is_probe has been 
reversed to ! ocf_is_probe, and the "test"/"[ ]" wrappers removed, to 
ensure the validation does not occur if the partition is not mounted or 
the action is a probe.
- An extraneous return code has been removed from the "else" clause of 
the probe test, to ensure the rest of the validation can finish.
- The call to the DHCPD daemon itself during the start phase has been 
wrapped with the ocf_run helper function, to ensure that it is somewhat 
standardized.
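
For anyone following along without the script in front of them, the 
first two changes could be sketched roughly as below. This is only an 
illustration, not the actual agent code: the function name 
dhcpd_validate_all, the parameter names OCF_RESKEY_chrooted_path and 
OCF_RESKEY_config, and the specific checks are assumptions, and the 
ocf_is_probe stand-in is a simplified stub so the sketch runs outside a 
cluster (a real agent would source ocf-shellfuncs instead).

```shell
#!/bin/sh
# Return codes as defined by the OCF resource agent API.
OCF_SUCCESS=0
OCF_ERR_INSTALLED=5

# Simplified stand-in, defined only when ocf-shellfuncs is not loaded:
# treat a monitor action with interval 0 as a probe.
if ! type ocf_is_probe >/dev/null 2>&1; then
    ocf_is_probe() {
        [ "${__OCF_ACTION:-}" = "monitor" ] &&
            [ "${OCF_RESKEY_CRM_meta_interval:-0}" = "0" ]
    }
fi

# Hypothetical validation function; names and checks are illustrative.
dhcpd_validate_all() {
    # ocf_is_probe is called directly, not wrapped in test/[ ], so its
    # exit status decides the branch.
    if ! ocf_is_probe; then
        # Only a non-probe action requires the chroot path, which may
        # live on a partition that is not mounted on a passive node.
        if [ ! -d "${OCF_RESKEY_chrooted_path:-}" ]; then
            return $OCF_ERR_INSTALLED
        fi
    fi
    # No return in an "else" clause above, so the remaining checks
    # always run.
    if [ ! -r "${OCF_RESKEY_config:-}" ]; then
        return $OCF_ERR_INSTALLED
    fi
    return $OCF_SUCCESS
}
```

With this shape, a probe on a node where the chroot partition is not 
mounted passes validation, while a start on the same node still 
reports "not installed".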

The first two changes corrected the "Failed Action... Not installed" 
issue on the secondary node, as well as the fail-over itself. I've been 
able to fail over to secondary and primary nodes multiple times and the 
service follows the rest of the grouped services.

There are a few things I'd like to add to the script, now that the main 
issues/code changes have been addressed, and they are as follows:

- Add a means of copying /etc/dhcpd.conf from node1 to node2...nodeX 
from within the script. The logic behind this is as follows:

1. It is possible for an admin to use a 3rd party management tool to 
add/remove/update addresses in the /etc/dhcpd.conf file while the 
cluster is live. There needs to be a means of detecting those updates, 
and ensuring they are propagated to the remaining nodes.

2. While a user may be using drbd to handle the [chrooted_path] 
partition to propagate lease information across nodes, there's no 
guarantee they are using drbd to manage more than just that area. For 
instance, I am not using drbd to manage the /etc/ path, but only 
/var/lib/dhcp.

The script already ensures the /etc/dhcpd.conf file is copied into the 
chrooted environment, as is the standard for the current DHCPD init 
scripts already used on many Linux distributions in a non-clustered 
environment.
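
As a starting point for the detection part, something like the helper 
below might work. It is a minimal sketch, not code from the script: the 
name sync_if_changed and its return codes are made up for illustration, 
and it only handles a local copy. Pushing the file to node2...nodeX 
would mean replacing the cp step with whatever transport is available 
between the nodes, which I have not settled on.

```shell
#!/bin/sh
# Hypothetical helper: copy src over dst only when the contents differ.
# Returns 0 if a copy was made, 1 if dst was already current,
# 2 if src is unreadable or the copy failed.
sync_if_changed() {
    src=$1
    dst=$2
    [ -r "$src" ] || return 2
    # cmp -s is silent and exits 0 only when both files are identical.
    if [ -f "$dst" ] && cmp -s "$src" "$dst"; then
        return 1
    fi
    # -p preserves mode and timestamps on the copy.
    cp -p "$src" "$dst" || return 2
    return 0
}
```

The monitor action could call this on each interval and log when it 
returns 0, so an update made by an outside tool is noticed without 
restarting the resource.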

- I need to find a means to add additional monitoring to the script to 
do more than simply test if the daemon is live. I've had cases where the 
dhcpd daemon was live but not handing out IPs, and it would be nice to 
fence the node out if I could find a way to validate that it's not 
responding to IP requests, in addition to a daemon failure. The issue is 
that dhcpcd, using the -T parameter, cannot run on the same Ethernet 
interface (for single NIC nodes) as the dhcpd process is running on, as 
it will never get a response.

Is it possible to have another node execute this, and restrict that part 
of the test to only the passive node(s)?
_______________________________________________________
Linux-HA-Dev: Linux-HA-Dev@lists.linux-ha.org
http://lists.linux-ha.org/mailman/listinfo/linux-ha-dev
Home Page: http://linux-ha.org/
