Public bug reported:

TL;DR:

nova-dhcpbridge init generates the leases file using the instance's updated_at instead of the fixed_ips updated_at, so every lease appears to have expired several months ago. As a result, every time dnsmasq is restarted it sends a DHCPNAK the first time a client asks to renew its lease.

Long version:

dnsmasq expects the leases format to have:
 * one line per lease
 * the first column is the expiry date in seconds since the epoch
 * we don't care about the other columns for this issue :)

This is the case for dnsmasq 2.59 and 2.65 (the precise and raring versions).

But the output of nova-dhcpbridge init (which is called by dnsmasq to load the leases) is:

  1352420950 xx:xx:3e:01:7d:xx 10.0.0.3 app01.domain *
  1352421657 xx:xx:3e:7e:0a:xx 10.0.0.4 app02.domain *
  [...]

So the expiry date for those entries is 9 November 2012 around 1am. The script was run on 25 January 2013 at 9am (UTC). With our lease time of 1 day, we expected an expiry date of 25 January 2013 at 10am.

So on load, dnsmasq reads all the leases, finds them all expired, and discards them. This causes dnsmasq to reply with DHCPNAK to a DHCPREQUEST (since, for dnsmasq, the requested lease doesn't exist because it has expired).

Fortunately, when the client comes back with a DHCPDISCOVER, dnsmasq gets the information from the configuration file (/var/lib/nova/networks/nova-brxxx.conf) and creates a lease with a correct expiry time. But this lease is only tracked in memory, so subsequent DHCPREQUESTs will work only until the next restart of dnsmasq.

In the end, every time dnsmasq is restarted, a client that tries to renew its lease gets a DHCPNAK and its interface goes down (losing all its IPs). Even though the DHCPDISCOVER sent right afterwards re-adds the IP, this can disturb some services (in our case, pacemaker, which manages a virtual IP).
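To make the expiry check concrete, here is a minimal sketch that decodes the first column of the two sample lines quoted above and compares it with the time the script was run. It is only an illustration: the helper names (parse_lease_expiry, is_expired) are hypothetical and are not part of nova or dnsmasq, and the comparison is an assumption about how an expired lease would be detected, not dnsmasq's actual code.

```python
from datetime import datetime, timezone


def parse_lease_expiry(line):
    """Return the expiry time (UTC) taken from the first column of a lease line."""
    # The first whitespace-separated field is the expiry in seconds since the epoch.
    return datetime.fromtimestamp(int(line.split()[0]), tz=timezone.utc)


def is_expired(line, now):
    """Hypothetical check: a lease is expired if its expiry time is not in the future."""
    return parse_lease_expiry(line) <= now


# The two sample lines from the nova-dhcpbridge init output in this report.
lease_lines = [
    "1352420950 xx:xx:3e:01:7d:xx 10.0.0.3 app01.domain *",
    "1352421657 xx:xx:3e:7e:0a:xx 10.0.0.4 app02.domain *",
]

# Time at which the script was run, according to this report.
now = datetime(2013, 1, 25, 9, 0, tzinfo=timezone.utc)

for line in lease_lines:
    state = "expired" if is_expired(line, now) else "valid"
    print(parse_lease_expiry(line).isoformat(), state)
```

Both sample lines decode to the early hours of 9 November 2012 (printed in UTC), more than two months before the run time, so both are reported as expired.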
Digging a bit into how nova-dhcpbridge generates the leases file, it seems to come from:
 * the _host_lease function in nova/network/linux_net.py:

    if data['instance_updated']:
        timestamp = data['instance_updated']
    else:
        timestamp = data['instance_created']

    seconds_since_epoch = calendar.timegm(timestamp.utctimetuple())

    return '%d %s %s %s *' % (seconds_since_epoch + FLAGS.dhcp_lease_time,
                              data['vif_address'],
                              data['address'],
                              data['instance_hostname'] or '*')

data['instance_updated'] is taken from the instances table, and it matches the date seen in the output of nova-dhcpbridge init. It's also the creation date of our machine (give or take a few minutes, probably the end of the first boot).

From my understanding of how nova-dhcpbridge works, every time dnsmasq replies to a client with a new lease, it calls the nova-dhcpbridge script, which updates the database (table fixed_ips, column updated_at).

So I think that instead of "instance_updated", we should use "fixed_ips.updated_at" when generating the leases.

Version of software (Ubuntu version):
 * Ubuntu 12.04 (precise) amd64
 * nova-* 2012.1.3+stable-20120827-4d2a4afe-0ubuntu1
 * dnsmasq 2.65-1~precise1

The way the leases file is generated by nova-dhcpbridge (the _host_lease function in nova/network/linux_net.py) is present in the nova git repository at both tag 2012.1.3 (b00f759) and master (97a5274, dated Thu Jan 24).

** Affects: nova
     Importance: Undecided
         Status: New

** Affects: nova (Ubuntu)
     Importance: Undecided
         Status: Confirmed

** Also affects: nova (Ubuntu)
   Importance: Undecided
       Status: New

--
You received this bug notification because you are a member of Ubuntu Bugs, which is subscribed to Ubuntu.
https://bugs.launchpad.net/bugs/1104915

Title:
  Wrong expire date in nova-dhcpbridge init output

To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1104915/+subscriptions

--
ubuntu-bugs mailing list
ubuntu-bugs@lists.ubuntu.com
https://lists.ubuntu.com/mailman/listinfo/ubuntu-bugs