Hi Florian,

I've completed most of the changes you requested for the 
dhcpd resource agent. In summary:

- The script has been run through checkbashisms and all bash-specific 
elements have been removed; the script should be strictly sh at 
this point.
- The path to the ocf-shellfuncs has been corrected.
- All variables have had the dhcpd_ prefix removed.
- The binary specific variable now has _binary as part of the name.
- The pid variable has been renamed from 
OCF_RESKEY_dhcpd_pidfile_default / OCF_RESKEY_dhcpd_pidfile to 
OCF_RESKEY_dhcpd_pid_default / OCF_RESKEY_dhcpd_pid
- The chrooted variable has been removed.
- All _monitor variables and functions have been removed for now, as they 
were a failed attempt to validate DHCP itself. I need to rethink that 
part.
- The meta data entries have been updated to reflect these changes.
- The usage function always returns success now.
- Required and unique variables are now appropriately tagged as such.
- The meta data description for the binary path has been reworded.
- The unused dhcpd_getpid function has been removed.
- The dhcpd_reload function has also been removed; it was simply doing a 
full restart rather than a configuration reload.
- Additional error checking has been added to the dhcpd_stop function to 
catch missed conditions.
- The redundant check on meta_timeout has been removed; this value is 
supposed to always be there.

I want to wrap line 266 (of the updated dhcpd script) in an ocf_run 
call, but each time I do, it returns an error. The line I used was as 
follows:

     ocf_run "${OCF_RESKEY_binary} -cf ${OCF_RESKEY_config} $DHCPD_ARGS -pf ${OCF_RESKEY_pid} ${OCF_RESKEY_interface}"

I kept getting a "file not found" error, and could not track down what 
part was kicking that error out. I even used a full path for the dhcpd 
daemon and got the same error, so it looks like it was trying to 
determine a path for one of the parameters... :/
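One possible explanation, sketched under the assumption that ocf_run executes its arguments as "$@": when the whole command is passed as a single quoted string, the shell looks for a program whose name is that entire string, spaces and all, and reports it as not found. The run_wrapped function below is a hypothetical stand-in for ocf_run, not its real implementation, and echo stands in for the dhcpd binary.

```shell
#!/bin/sh
# Hypothetical stand-in for ocf_run (an assumption, not the real code):
# it runs its arguments as a single command via "$@" and returns that
# command's exit status.
run_wrapped() {
    output=$("$@" 2>&1)
    rc=$?
    [ -n "$output" ] && echo "$output"
    return $rc
}

# Whole command as ONE quoted string: the shell searches for a program
# named "echo dhcpd -cf /etc/dhcpd.conf" and cannot find it.
rc_quoted=0
run_wrapped "echo dhcpd -cf /etc/dhcpd.conf" >/dev/null || rc_quoted=$?

# Same command as separate words: "echo" is the program and the
# remaining words are its arguments.
rc_split=0
run_wrapped echo dhcpd -cf /etc/dhcpd.conf >/dev/null || rc_split=$?

echo "quoted as one string:     rc=$rc_quoted"
echo "passed as separate words: rc=$rc_split"
```

If that assumption holds, dropping the outer double quotes so that the binary and each option reach ocf_run as separate words may be enough to avoid the error.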

In addition, you suggested wrapping my kill -9 call in the 
dhcpd_stop function in an ocf_run, but I need an indication of what 
you have in mind to fully understand why that might be worth doing 
(simply a familiarity issue on my part).
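For reference, a rough sketch of what such a wrapper could buy: the command's output and exit status get captured and logged in one place, so a failed kill leaves a trace instead of passing silently. The run_logged function is a hypothetical stand-in written for illustration; the real ocf_run may log differently, and sleep stands in for the dhcpd process.

```shell
#!/bin/sh
# Hypothetical stand-in for ocf_run (an assumption): capture the output
# of the wrapped command, log it with its exit status, return the status.
run_logged() {
    output=$("$@" 2>&1)
    rc=$?
    if [ "$rc" -eq 0 ]; then
        echo "INFO: $* succeeded"
    else
        echo "ERROR: $* failed (rc=$rc): $output"
    fi
    return $rc
}

# Stand-in for the daemon: a background sleep whose PID we record.
sleep 30 &
pid=$!

# First kill succeeds and is logged as such.
rc_first=0
run_logged kill -9 "$pid" || rc_first=$?

# Reap the child so the PID no longer exists.
wait "$pid" 2>/dev/null || true

# Second kill fails; the wrapper logs the error text and the exit status.
rc_second=0
run_logged kill -9 "$pid" || rc_second=$?
```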

Lastly, all testing thus far still shows that the script fails to 
properly fail over to my secondary node; however, it will fail back to 
the primary node without issue. I'm inclined to believe one of two 
things is happening:

1. The secondary node has something "stuck" in its tracking always 
telling the cluster that this script is "not installed":

Failed actions:
     dhcpd_service_monitor_0 (node=dhcp-vm02, call=5, rc=5, status=complete): not installed

2. I am missing something in my script that will enable a clean failover.
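As a point of reference for the first possibility: rc=5 is the standard OCF return code OCF_ERR_INSTALLED, which an agent returns when something it requires is missing on that node. The sketch below shows the kind of prerequisite check that produces it. The check_installed helper and the paths are hypothetical, and whether the dhcpd script performs checks like these is an assumption.

```shell
#!/bin/sh
# Standard OCF return codes (normally provided by ocf-shellfuncs).
OCF_SUCCESS=0
OCF_ERR_INSTALLED=5

# Hypothetical helper: return OCF_ERR_INSTALLED when the binary is not
# executable or the config file is not readable on this node.
check_installed() {
    binary=$1
    config=$2
    if [ ! -x "$binary" ]; then
        echo "binary $binary is missing or not executable"
        return $OCF_ERR_INSTALLED
    fi
    if [ ! -r "$config" ]; then
        echo "config $config is missing or not readable"
        return $OCF_ERR_INSTALLED
    fi
    return $OCF_SUCCESS
}

# Paths that exist on most systems stand in for the dhcpd binary and config.
rc_ok=0
check_installed /bin/sh /etc/passwd || rc_ok=$?

# A path that does not exist triggers the "not installed" result.
rc_missing=0
check_installed /nonexistent/dhcpd /etc/passwd >/dev/null || rc_missing=$?

echo "present: rc=$rc_ok, missing: rc=$rc_missing"
```

If the script does something similar, comparing the configured binary and config paths on dhcp-vm02 against the primary node would be a reasonable first thing to check.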

Thanks
Chris.
