Every time a new VM is started, there is a 2-second DNS outage on the
router VM, which can cause problems for guest VMs that use the router VM for
DNS.

 

For CloudStack configurations that use both DHCP and DNS services on the
router VM (both implemented with dnsmasq), there is currently a 2-second DNS
service outage every time a new VM is instantiated.

 

The source of this outage is in edithosts.sh, which uses "service dnsmasq
restart" to pick up the freshly added DNS and DHCP entries.

Restarting the dnsmasq service triggers a sleep for 2 seconds after killing
dnsmasq before starting it back up again.

 

An obvious solution would be to replace "service dnsmasq restart" with "kill
-s 1 $pid" (SIGHUP) so that dnsmasq reads the new DHCP entries without
restarting, as in dnsmasq_edithosts.sh (external dhcp).
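
For illustration, a minimal sketch of the SIGHUP approach. The function
name and the default pid file path are my assumptions, not what edithosts.sh
currently contains. On SIGHUP dnsmasq re-reads /etc/hosts and any
dhcp-hostsfile / addn-hosts files, but not its main config file.

```shell
# Hypothetical helper: ask a running dnsmasq to reload its hosts files
# instead of restarting it. The default pid file path is an assumption.
reload_dnsmasq() {
    pidfile="${1:-/var/run/dnsmasq.pid}"
    # Fail if there is no readable pid file.
    [ -r "$pidfile" ] || return 1
    pid=$(cat "$pidfile")
    # kill -0 sends no signal; it only checks that the process exists.
    kill -0 "$pid" 2>/dev/null || return 1
    # SIGHUP (signal 1), equivalent to "kill -s 1 $pid".
    kill -s HUP "$pid"
}
```

A caller could fall back to "service dnsmasq restart" when reload_dnsmasq
returns non-zero, so a stopped dnsmasq still gets started.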

 

Unfortunately, this solution is flawed: on SIGHUP, dnsmasq does not expire
its in-memory DHCP leases, and all leases are infinite by default.

Thus, this will only work if the guest VM performs a DHCP release on
shutdown, which cannot always be guaranteed.
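
To see which leases would be affected, something like the following could
list the leases that never expire. This is only a sketch: it assumes the
usual dnsmasq lease file layout (expiry time, MAC, IP, hostname, client id,
with an expiry of 0 meaning infinite) and the Debian default lease file
path. The function name is made up.

```shell
# Hypothetical helper: print "IP MAC" for every lease with no expiry.
# Assumes field 1 is the expiry time and 0 means an infinite lease.
infinite_leases() {
    leasefile="${1:-/var/lib/misc/dnsmasq.leases}"
    awk '$1 == 0 { print $3, $2 }' "$leasefile"
}
```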

 

A few possible solutions off the top of my head:

1.       Separate the DNS and DHCP services.  DHCP would still experience
an outage during VM instantiation, but DNS would not be affected as long as
the DNS service does not have to be restarted.

2.       Use SIGHUP with dnsmasq and implement a removeDhcpEntry interface
for network appliances to force a DHCP release whenever a NIC / IP is
deallocated.  This can use dhcp_release to simulate a DHCP release on the
router VM.
Catch: dhcp_release is not available for Debian 6.0.  The System VM needs to
be updated to at least Debian 7.0, or the dnsmasq-utils .deb from 7.0 would
need to be included in the System VM image.

3.       Change DHCP to use a shorter lease, and track de-allocation of IPs
separately from VM destruction.
Catch: This may cause occasional IP pool exhaustion depending on allocation
of the guest IP range and the rate of VM destruction / instantiation in the
network.
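
As a rough sketch of what option 2 might look like on the router VM: the
wrapper below is hypothetical (the name release_lease and the way it would
be wired to a removeDhcpEntry call are my assumptions). It relies only on
the documented dhcp_release usage, "dhcp_release <interface> <address>
<MAC address>".

```shell
# Hypothetical hook for a removeDhcpEntry-style call on the router VM.
# Usage: release_lease <interface> <ip> <mac>
release_lease() {
    iface="$1"; ip="$2"; mac="$3"
    if [ -z "$iface" ] || [ -z "$ip" ] || [ -z "$mac" ]; then
        echo "usage: release_lease <interface> <ip> <mac>" >&2
        return 2
    fi
    # dhcp_release may be missing on older System VM images.
    if ! command -v dhcp_release >/dev/null 2>&1; then
        echo "dhcp_release not installed" >&2
        return 127
    fi
    # Sends a DHCP release for this lease to the local dnsmasq.
    dhcp_release "$iface" "$ip" "$mac"
}
```

For example, "release_lease eth0 10.1.1.5 02:00:00:00:00:01" (made-up
values) would be run when the NIC / IP is deallocated, before the address
is handed out again.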

 

Thoughts?
