Hi Florian, I've completed making most of the changes you requested of me for the dhcpd resource agent. To summarize the changes they are as follows:
- The script has been run through checkbashisms and all bash related elements have been cleared, the script should be strictly sh only at this point. - The path to the ocf-shellfuncs has been corrected. - All variables have had the dhcpd_ prefix removed. - The binary specific variable now has _binary as part of the name. - The pid variable has been renamed from OCF_RESKEY_dhcpd_pidfile_default / OCF_RESKEY_dhcpd_pidfile to OCF_RESKEY_dhcpd_pid_default / OCF_RESKEY_dhcpd_pid - The chrooted variable has been removed. - All _monitor variables and functions have been removed for now, as it was a failed attempt to validate DHCP itself. I need to re-think on that part. - The meta data entries have been updated to reflect these changes. - The usage function always returns success now. - Required and unique variables are now appropriately tagged as such. - The meta data description for the binary path has been reworded. - The unused dhcpd_getpid function has been removed. - The dhcpd_reload function has also been removed, it was simply doing a full restart, vs a configuration reload. - Additional error checking has been added to the dhcpd_stop function to catch missed conditions. - The redundant check on meta_timeout has been removed, this value is supposed to always be there. I want to wrap line 266 (of the updated dhcpd script) within an ocf_run call, but for some reason each time I do it returns an error, the line I had used was as follows: ocf_run "${OCF_RESKEY_binary} -cf ${OCF_RESKEY_config} $DHCPD_ARGS -pf ${OCF_RESKEY_pid} ${OCF_RESKEY_interface}" I kept getting a "file not found" error, and could not track down what part was kicking that error out. I even used a full path for the dhcpd daemon and got the same error, so it looks like it was trying to determine a path for one of the parameters... :/ In addition, you made the suggestion to warp my kill -9 call in the dhcpd_stop function with an ocf_run, but I need an indication as to what you have in mind to fully understand why that might be worth using (simply a familiarity issue on my part it all). Lastly, all testing thus far still shows the script fails to properly fail over to my secondary node, however, it will fail back to the primary node with out issue. I'm inclined to believe one of two things is happening: 1. The secondary node has something "stuck" in its tracking always telling the cluster that this script is "not installed": Failed actions: dhcpd_service_monitor_0 (node=dhcp-vm02, call=5, rc=5, status=complete): not installed 2. I am missing something in my script that will enable a clean fail over. Thanks Chris. _______________________________________________________ Linux-HA-Dev: Linux-HA-Dev@lists.linux-ha.org http://lists.linux-ha.org/mailman/listinfo/linux-ha-dev Home Page: http://linux-ha.org/