I attempted to reproduce the bind9 issue by doing the following (in two separate sessions):

    # Queue 10,000 concurrent reloads (also tried removing the & to make it less parallel)
    i=0; while [ $i -lt 10000 ]; do (/usr/sbin/rndc reload&); let i=$i+1; done

    # Hammer the DNS server with queries
    while [ 1 ]; do dig @127.0.0.1 <maas-hostname>; done

Everything works properly when I run only these two loops. But if I have parallel reload requests running *and* I make manual changes to the DNS zones in /etc/bind/maas, I have observed bind9 behaving badly, including (eventually) what seemed to be the deadlock (but my bind9 was older, so my debug symbols didn't match).[1]

Then I observed a similar state where, after I updated the zone file, it was as if nothing had changed: bind9 kept returning old data, and this didn't resolve itself until I did "service bind9 restart".

It's my impression that the problem is worse when I do reloads in parallel. So this is more evidence pointing to "we should ensure MAAS never tries to reload bind9 twice in parallel".

[1]: I first observed extreme sluggishness in resolving queries, which resolved itself after several seconds. Then I observed a crash (which the system subsequently recovered from):

    http://paste.ubuntu.com/25293751/

Then I observed a deadlock with the same symptoms.

-- 
https://bugs.launchpad.net/bugs/1710278

Title:
  [2.3a1] named stuck on reload, DNS broken
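
A minimal sketch of what "never reload bind9 twice in parallel" could look like at the shell level, assuming flock(1) from util-linux is available. This is not how MAAS issues reloads; the names serialized_reload, LOCK_FILE and RELOAD_CMD, and the lock file path, are hypothetical and chosen only for this example.

```shell
#!/bin/bash
# Sketch only: serialize reload requests so that two never run at once.
# LOCK_FILE and RELOAD_CMD are hypothetical names for this example.
LOCK_FILE="${LOCK_FILE:-/tmp/rndc-reload.lock}"
RELOAD_CMD="${RELOAD_CMD:-/usr/sbin/rndc reload}"

serialized_reload() {
    (
        # Block until this subshell holds an exclusive lock on fd 9.
        flock -x 9 || exit 1
        # Only one caller at a time reaches this point.
        $RELOAD_CMD
    ) 9>"$LOCK_FILE"   # the lock is released when the subshell exits
}
```

Replacing the bare "/usr/sbin/rndc reload&" in the first loop above with "serialized_reload &" would still queue 10,000 requests, but only one rndc invocation would run at a time. This only keeps the rndc calls from overlapping; whether that alone is enough to avoid the named misbehavior described here is an assumption and has not been tested.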