Hi Everyone,

I would like to thank Florian, Andreas and Dejan for making suggestions and pointing out some additional changes I should make. At this point the following additional changes have been made:

- A test case in the validation function for ocf_is_probe has been reversed to ! ocf_is_probe, and the "test"/"[ ]" wrappers removed, to ensure the validation is not performed if the partition is not mounted or during a probe.
- An extraneous return code has been removed from the "else" clause of the probe test, to ensure the rest of the validation can finish.
- The call to the DHCPD daemon itself during the start phase has been wrapped with the ocf_run helper function, to ensure that it is somewhat standardized.

The first two changes corrected the "Failed Action... Not installed" issue on the secondary node, as well as the fail-over itself. I've been able to fail over to the secondary and primary nodes multiple times, and the service follows the rest of the grouped services.

There are a few things I'd like to add to the script, now that the main issues/code changes have been addressed, and they are as follows:

- Add a means of copying /etc/dhcpd.conf from node1 to node2...nodeX from within the script. The logic behind this is as follows:
  1. It is possible for an admin to use a 3rd party management tool to add/remove/update addresses in the /etc/dhcpd.conf file while the cluster is live. There needs to be a means of detecting those updates and ensuring they are propagated to the remaining nodes.
  2. While a user may be using drbd to handle the [chrooted_path] partition to propagate lease information across nodes, there's no guarantee they are using drbd to manage more than just that area. For instance, I am not using drbd to manage the /etc/ path, but simply /var/lib/dhcp.
  The script already ensures the /etc/dhcpd.conf file is copied into the chrooted environment, as is the standard for the current DHCPD init scripts already used on many Linux distributions in a non-clustered environment.
- I need to find a means to add additional monitoring to the script to do more than simply test if the daemon is live.
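
On the dhcpd.conf propagation point, a minimal sketch of the change-detection half might look like the following. This is not in the current script: the function name sync_conf_if_changed and its return codes are hypothetical, it only handles a local source/destination pair using cmp and cp, and actually pushing the file to the other nodes (for example over ssh) is left out entirely.

```shell
#!/bin/sh
# Hypothetical helper, NOT part of the current resource agent.
# Copies the file $1 over $2 only when the two differ.
# Return codes (assumed for this sketch):
#   0 = destination was updated
#   1 = files were already identical, nothing copied
#   2 = source unreadable or copy failed
sync_conf_if_changed() {
    src=$1
    dst=$2

    # The source config must exist and be readable.
    [ -r "$src" ] || return 2

    # cmp -s prints nothing; exit status 0 means the files are identical.
    if [ -f "$dst" ] && cmp -s "$src" "$dst"; then
        return 1
    fi

    # -p preserves mode and timestamps on the copy.
    cp -p "$src" "$dst" || return 2
    return 0
}
```

Something like `sync_conf_if_changed /etc/dhcpd.conf "$staging_copy"` (where staging_copy is again a made-up name) could then tell the monitor action whether a push to the other nodes is needed at all.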

On the monitoring side, I've had cases where the dhcpd daemon was live but not handing out IPs, and it would be nice to fence the node if I could find a way to validate that it's not responding to IP requests, in addition to detecting a daemon failure. The issue is that dhcpcd, using the -T parameter, cannot run on the same Ethernet interface (for single-NIC nodes) as the dhcpd process is running on, as it will never get a response. Is it possible to have another node execute this, and restrict that part of the test to only the passive node(s)?

_______________________________________________________
Linux-HA-Dev: Linux-HA-Dev@lists.linux-ha.org
http://lists.linux-ha.org/mailman/listinfo/linux-ha-dev
Home Page: http://linux-ha.org/