I attempted to reproduce the bind9 issue by running the following in two
separate sessions:

# Queue 10,000 concurrent reloads (also tried removing the & to make it
# less parallel)
i=0; while [ $i -lt 10000 ]; do (/usr/sbin/rndc reload&); let i=$i+1; done

# Hammer the DNS server with queries
while [ 1 ]; do dig @127.0.0.1 <maas-hostname>; done

Everything works properly when I run only these two loops. But if I have
parallel reload requests running *and* I make manual changes to the DNS
zones in /etc/bind/maas, I have observed bind9 behaving badly, including
(eventually) what seemed to be the deadlock (my bind9 was older, though,
so my debug symbols didn't match).[1] Then I observed a similar state
where, after I updated the zone file, it was as if nothing had changed:
bind9 kept returning old data, and that didn't resolve itself until I
ran "service bind9 restart".

It's my impression that the problem is worse when I do reloads in
parallel. So this is more evidence pointing to "we should ensure MAAS
never tries to reload bind9 twice in parallel".
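
One way to get that guarantee would be to take an exclusive lock around
each reload. The following is only a sketch under assumptions, not MAAS
code: the lock file path and the "serialized" helper name are made up
for illustration, and it relies on flock(1) from util-linux being
installed.

```shell
#!/bin/bash
# Hypothetical helper (not part of MAAS or bind9): run a command while
# holding an exclusive lock, so at most one instance runs at a time.
# The default lock file path is an assumption for illustration.
LOCKFILE="${LOCKFILE:-/tmp/rndc-reload.lock}"

serialized() {
    # flock(1) blocks until the lock on $LOCKFILE is free, runs the
    # given command, and releases the lock when the command exits.
    flock "$LOCKFILE" "$@"
}

# Intended use (commented out so that sourcing this file has no effect):
# serialized /usr/sbin/rndc reload
```

With this wrapper, the 10,000 queued reloads from the loop above would
still all run, but one after another instead of concurrently. It does
not coalesce redundant reloads; that would need extra logic.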

[1]:
First I observed extreme sluggishness in resolving queries, which
cleared up after several seconds.
Then I observed a crash (from which the system subsequently recovered):
http://paste.ubuntu.com/25293751/
Then I observed a deadlock with the same symptoms.

-- 
You received this bug notification because you are a member of Ubuntu
Bugs, which is subscribed to Ubuntu.
https://bugs.launchpad.net/bugs/1710278

Title:
  [2.3a1] named stuck on reload, DNS broken
