[Bug 1710278] Re: [2.3a1] named stuck on reload, DNS broken
The Eoan Ermine has reached end of life, so this bug will not be fixed for that release ** Changed in: bind9 (Ubuntu Eoan) Status: In Progress => Won't Fix -- You received this bug notification because you are a member of Ubuntu Bugs, which is subscribed to Ubuntu. https://bugs.launchpad.net/bugs/1710278 Title: [2.3a1] named stuck on reload, DNS broken To manage notifications about this bug go to: https://bugs.launchpad.net/bind/+bug/1710278/+subscriptions -- ubuntu-bugs mailing list ubuntu-bugs@lists.ubuntu.com https://lists.ubuntu.com/mailman/listinfo/ubuntu-bugs
[Bug 1710278] Re: [2.3a1] named stuck on reload, DNS broken
** Changed in: bind9 (Ubuntu Disco) Status: In Progress => Won't Fix
[Bug 1710278] Re: [2.3a1] named stuck on reload, DNS broken
** Changed in: maas Status: Fix Committed => Fix Released
[Bug 1710278] Re: [2.3a1] named stuck on reload, DNS broken
** Changed in: bind Status: New => Fix Released
[Bug 1710278] Re: [2.3a1] named stuck on reload, DNS broken
Release 2.4.3 debdiff.

** Description changed:

- Am running an HA MAAS, but every few days named gets stuck on one of the
- region controllers.
+ [Impact]

- systemd thinks the service is running, but it doe not respond to any
+  * systemd thinks the service is running, but it does not respond to any
  commands or requests. Also, it doesn't respond to signals other than
  kill -9. service restarts hang, rndc hangs.

- I have attached logs and a core dump of named.
+  * being that the deadlock is in bind9 that ships in bionic the fix
+ needs to be backported to 2.4.3 so MAAS from the archive in bionic can
+ handle the deadlock and get bind9 unstuck.
+
+  * change in MAAS watches for this case to occur with bind9, then MAAS
+ will force kill the service and restart it.
+
+ [Test Case]
+
+  * very hard to reproduce but the issue occurs when bind9 deadlocks, it
+ responds to nothing over the network or rndc. SIGTERM does not kill it,
+ only SIGKILL works to force kill the process and get systemd to restart
+ it.
+
+
+ [Regression Potential]
+
+  * possible that bind9 will not be started correctly or possible that
+ bind9 will be placed into a forever restart loop

** Patch added: "maas_2.4.3.debdiff"
   https://bugs.launchpad.net/maas/+bug/1710278/+attachment/5302068/+files/maas_2.4.3.debdiff
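The watch-and-restart behaviour described in the [Impact] section above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the code from the MAAS debdiff: the function names, the timeout value, and the idea of probing with a command such as "rndc status" are all hypothetical choices made for the example.

```python
import subprocess


def is_responsive(probe_cmd, timeout_s=5.0):
    """Return True if probe_cmd exits with status 0 within timeout_s seconds.

    A probe that hangs (as rndc does when named is deadlocked) is killed by
    subprocess.run once the timeout expires and is reported as unresponsive.
    """
    try:
        result = subprocess.run(
            probe_cmd,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            timeout=timeout_s,
        )
    except (subprocess.TimeoutExpired, OSError):
        return False
    return result.returncode == 0


def recover_if_stuck(probe_cmd, kill_cmd, restart_cmd, timeout_s=5.0,
                     run=subprocess.run):
    """Force-kill and restart the service when the probe hangs or fails.

    kill_cmd and restart_cmd are supplied by the caller (hypothetical
    parameters). Returns True if a recovery was attempted, False otherwise.
    """
    if is_responsive(probe_cmd, timeout_s):
        return False
    # SIGTERM is reported not to work on a deadlocked named, so the caller
    # is expected to pass a command that delivers SIGKILL.
    run(kill_cmd, check=False)
    run(restart_cmd, check=False)
    return True
```

A caller might invoke recover_if_stuck(["rndc", "status"], ["systemctl", "kill", "--signal=SIGKILL", "bind9"], ["systemctl", "restart", "bind9"]); the unit name "bind9" is an assumption here. Note that this sketch treats any failing probe as "stuck", which also illustrates the restart-loop risk named under [Regression Potential]: a real implementation would want a back-off or retry limit.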
[Bug 1710278] Re: [2.3a1] named stuck on reload, DNS broken
** Changed in: maas (Ubuntu Bionic)
       Status: Confirmed => In Progress

** Changed in: maas (Ubuntu Bionic)
     Assignee: (unassigned) => Blake Rouse (blake-rouse)

** Also affects: maas/2.4
   Importance: Undecided
       Status: New

** Changed in: maas/2.4
       Status: New => Fix Committed

** Changed in: maas/2.4
    Milestone: None => 2.4.3

** Changed in: maas/2.4
     Assignee: (unassigned) => Blake Rouse (blake-rouse)
[Bug 1710278] Re: [2.3a1] named stuck on reload, DNS broken
** Merge proposal linked: https://code.launchpad.net/~blake-rouse/maas/+git/maas/+merge/372692
[Bug 1710278] Re: [2.3a1] named stuck on reload, DNS broken
** Changed in: maas/2.6 Status: Fix Committed => Fix Released
[Bug 1710278] Re: [2.3a1] named stuck on reload, DNS broken
Additionally, any idea when the proposed 2.6.1 (including the DNS reload fix) will become available in MAAS/stable?
[Bug 1710278] Re: [2.3a1] named stuck on reload, DNS broken
Additionally, any idea when 2.6 will become available in MAAS/stable?
[Bug 1710278] Re: [2.3a1] named stuck on reload, DNS broken
We are currently working on getting this backported to 2.4.
[Bug 1710278] Re: [2.3a1] named stuck on reload, DNS broken
Hi, I see that the backport fix is released and/or committed to MAAS 2.2, 2.6 and 2.7. Can we get it backported to 2.4 as well? It is currently affecting a customer in production. Thank you!
[Bug 1710278] Re: [2.3a1] named stuck on reload, DNS broken
Status changed to 'Confirmed' because the bug affects multiple users.

** Changed in: maas (Ubuntu Bionic) Status: New => Confirmed
[Bug 1710278] Re: [2.3a1] named stuck on reload, DNS broken
Status changed to 'Confirmed' because the bug affects multiple users.

** Changed in: maas (Ubuntu) Status: New => Confirmed
[Bug 1710278] Re: [2.3a1] named stuck on reload, DNS broken
** Also affects: maas (Ubuntu)
   Importance: Undecided
       Status: New

** No longer affects: maas (Ubuntu Xenial)

** No longer affects: maas (Ubuntu Eoan)

** No longer affects: maas (Ubuntu Disco)
[Bug 1710278] Re: [2.3a1] named stuck on reload, DNS broken
** Merge proposal linked: https://code.launchpad.net/~ltrager/maas/+git/maas/+merge/371488
[Bug 1710278] Re: [2.3a1] named stuck on reload, DNS broken
** Changed in: maas/2.6 Status: New => Fix Committed
[Bug 1710278] Re: [2.3a1] named stuck on reload, DNS broken
** Merge proposal linked: https://code.launchpad.net/~blake-rouse/maas/+git/maas/+merge/371228
[Bug 1710278] Re: [2.3a1] named stuck on reload, DNS broken
** Changed in: maas/2.7 Status: In Progress => Fix Committed
[Bug 1710278] Re: [2.3a1] named stuck on reload, DNS broken
** Merge proposal linked:
   https://code.launchpad.net/~blake-rouse/maas/+git/maas/+merge/371218

** Changed in: maas/2.7
       Status: New => In Progress
[Bug 1710278] Re: [2.3a1] named stuck on reload, DNS broken
** Changed in: bind9 (Ubuntu Xenial)
     Assignee: Eric Desrochers (slashd) => (unassigned)

** Changed in: bind9 (Ubuntu Bionic)
     Assignee: Eric Desrochers (slashd) => (unassigned)

** Changed in: bind9 (Ubuntu Disco)
     Assignee: Eric Desrochers (slashd) => (unassigned)

** Changed in: bind9 (Ubuntu Eoan)
     Assignee: Eric Desrochers (slashd) => (unassigned)
[Bug 1710278] Re: [2.3a1] named stuck on reload, DNS broken
Building a single-thread package of bind9 is proving complex, and I keep running into new challenges (fix one problem, find another one, fix it, find another one, and so on). That is without counting the complexity of maintaining both multi-thread and single-thread package types until the deadlock situation is fixed upstream, plus the fact that version 9.14 no longer offers single-thread (9.14 is possibly landing in the next release, 19.10, and/or in 20.04, which will also be an LTS).

If 19.10 and/or 20.04 ship bind 9.14 or later and the deadlock situation is not fixed, we will have to find another approach anyway, as single-thread will no longer be an option.

Fixing the deadlocks will take time, as IIRC upstream has to entirely refactor the locking mechanism inside bind9 (which is not trivial), so we may end up having to maintain whatever solution we pick for quite some time.

With these new parameters, maybe we should reconsider our approach. Possibly the server team/MAAS team should take over at this point and have a cross-team discussion about this situation?

- Eric
[Bug 1710278] Re: [2.3a1] named stuck on reload, DNS broken
$ ldd /usr/sbin/named | grep -i thread
        libpthread.so.0 => /lib/x86_64-linux-gnu/libpthread.so.0 (0x7f6495778000)
[Bug 1710278] Re: [2.3a1] named stuck on reload, DNS broken
OK, next challenge: the single-thread named dumps core as follows:

# named -f -u bind
named: ../nptl/pthread_mutex_lock.c:79: __pthread_mutex_lock: Assertion `mutex->__data.__owner == 0' failed.
Aborted (core dumped)

This happens although named is clearly single-threaded:

# named -V
BIND -Ubuntu (Extended Support Version)
running on Linux x86_64 5.0.0-20-generic #21-Ubuntu SMP Mon Jun 24 09:32:09 UTC 2019
built by make with '--build=x86_64-linux-gnu' '--prefix=/usr' '--includedir=/usr/include' '--mandir=/usr/share/man' '--infodir=/usr/share/info' '--sysconfdir=/etc' '--localstatedir=/var' '--disable-silent-rules' '--libdir=/usr/lib/x86_64-linux-gnu' '--libexecdir=/usr/lib/x86_64-linux-gnu' '--disable-maintainer-mode' '--disable-dependency-tracking' '--libdir=/usr/lib/x86_64-linux-gnu' '--sysconfdir=/etc/bind' '--with-python=python3' '--localstatedir=/' '--disable-threads' '--enable-largefile' '--with-libtool' '--enable-shared' '--enable-static' '--with-gost=no' '--with-openssl=/usr' '--with-gssapi=/usr' '--with-libjson=/usr' '--without-lmdb' '--with-gnu-ld' '--with-geoip=/usr' '--with-atf=no' '--enable-ipv6' '--enable-rrl' '--enable-filter-' '--disable-native-pkcs11' '--with-pkcs11=no' '--with-randomdev=/dev/urandom' '--with-eddsa=no' 'build_alias=x86_64-linux-gnu' 'CFLAGS=-g -O2 -fdebug-prefix-map=/build/bind9-MQj2Su/bind9-9.11.3+dfsg=. -fstack-protector-strong -Wformat -Werror=format-security -fno-strict-aliasing -fno-delete-null-pointer-checks -DNO_VERSION_DATE -DDIG_SIGCHASE' 'LDFLAGS=-Wl,-Bsymbolic-functions -Wl,-z,relro -Wl,-z,now' 'CPPFLAGS=-Wdate-time -D_FORTIFY_SOURCE=2'
compiled by GCC 7.4.0
compiled with OpenSSL version: OpenSSL 1.1.1 11 Sep 2018
linked to OpenSSL version: OpenSSL 1.1.1 11 Sep 2018
compiled with libxml2 version: 2.9.4
linked to libxml2 version: 20904
compiled with libjson-c version: 0.12.1
linked to libjson-c version: 0.12.1
compiled with zlib version: 1.2.11
linked to zlib version: 1.2.11
threads support is disabled

I'll build/publish the debug symbols and see what is missing (possibly the libs built from the bind9 source need to be single-threaded too).
[Bug 1710278] Re: [2.3a1] named stuck on reload, DNS broken
*** Important note *** for eventual upgrades to newer versions of bind in later Ubuntu releases.

https://ftp.isc.org/isc/bind9/9.14.0/RELEASE-NOTES-bind-9.14.0.html

  Previously, it was possible to build BIND without thread support for old architectures and systems without threads support. BIND now requires threading support (either POSIX or Windows) from the operating system, and it cannot be built without threads.

It seems we can use single-thread only in versions before 9.14; from 9.14 on it is no longer offered as a configuration option. So until the upstream deadlocks are fixed, and considering that --disable-threads is our current workaround/best option, we should not move to 9.14 or later until we have a better solution using multi-threading and/or upstream fixes the deadlock issues that MAAS suffers from.
[Bug 1710278] Re: [2.3a1] named stuck on reload, DNS broken
I just confirmed the above ^ :

$ git checkout v9_13_0
$ ./configure --help | grep -i "enable-thread"
  --enable-threads        enable multithreading

$ git checkout v9_14_0
$ ./configure --help | grep -i "enable-thread"
==> NO MORE OPTION.
[Bug 1710278] Re: [2.3a1] named stuck on reload, DNS broken
OK, I made some progress: bind9-single-thread now reports "threads support is disabled" according to "named -V", and bind9 still reports "threads support is enabled", also according to "named -V".

There are still a few things to fix here and there due to the recent change, but at least I think I found what dh_install needed to separate the binaries into their respective binary packages.

It is a bit more difficult since it involves multiple recompilations of the same software, each with different configuration options.

I still need some time to refine the configuration and test again.
[Bug 1710278] Re: [2.3a1] named stuck on reload, DNS broken
Which explains why the multi-thread binary package and the single-thread binary package have the same named binary.
[Bug 1710278] Re: [2.3a1] named stuck on reload, DNS broken
  In the multiple binary package case, the files are instead installed into debian/tmp/, and should be moved from there to the appropriate package build directory using dh_install(1).

  From debhelper compatibility level 7 on, dh_install will fall back to looking in debian/tmp for files, if it doesn't find them in the current directory (or wherever you've told it to look using --sourcedir).

That is the problem:

    dh_auto_install -B build --destdir=$(CURDIR)/debian/tmp
    dh_auto_install -B build-singlethread --destdir=$(CURDIR)/debian/tmp-singlethread

I have to specify the source dir to dh_install, because the current setup relies only on debian/tmp, the default when multiple binary packages are in place.
[Bug 1710278] Re: [2.3a1] named stuck on reload, DNS broken
The build works as expected. I think "dh_install" is where the named binary gets overwritten, because "-pPACKAGE" is not specified.

Testing with "fakeroot debian/rules build" revealed:

$ md5sum build/bin/named/named
9f9fad4761dccb84801351a32c8c1a4f  build/bin/named/named

$ md5sum build-singlethread/bin/named/named
bbdff642ecbf573521c6143a7d21db15  build-singlethread/bin/named/named

After installing bind9 or bind9-single-thread, however, the named binary has the same md5sum for both.

So the build part is definitely working as expected, but the installation/copy goes wrong.

Hopefully this will do the trick:

- dh_auto_install -B build --destdir=$(CURDIR)/debian/tmp
- dh_auto_install -B build-singlethread --destdir=$(CURDIR)/debian/tmp-singlethread
+ dh_auto_install -pbind9 -B build --destdir=$(CURDIR)/debian/tmp
+ dh_auto_install -pbind9-single-thread -B build-singlethread --destdir=$(CURDIR)/debian/tmp-singlethread
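The md5sum comparison above could be automated as a small post-install check. The following is a minimal sketch, assuming one simply wants to know whether two files have identical content; the function names are made up for this example and are not part of the packaging.

```python
import hashlib


def file_md5(path, chunk_size=65536):
    """Return the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def same_content(path_a, path_b):
    """Return True when both files hash to the same MD5 digest."""
    return file_md5(path_a) == file_md5(path_b)
```

Run against build/bin/named/named and build-singlethread/bin/named/named, same_content would be expected to return False given the two digests shown above; returning True for the binaries extracted from the two .deb files would reproduce the overwrite problem being described.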
[Bug 1710278] Re: [2.3a1] named stuck on reload, DNS broken
I have double-checked the buildlog, and everything seems to indicate that multi-threading is turned off:

# buildlog
.
   2689 dh_auto_configure -B build-singlethread -- \
   2690 --libdir=/usr/lib/x86_64-linux-gnu \
   2691 --sysconfdir=/etc/bind \
   2692 --with-python=python3 \
   2693 --localstatedir=/ \
=> 2694 --disable-threads \
...
   3908 Features disabled or unavailable on this platform:
=> 3909 Multiprocessing support (--enable-threads)
.

Need to investigate more.
[Bug 1710278] Re: [2.3a1] named stuck on reload, DNS broken
Please wait before testing my PPA [ppa:slashd/lp1710278].

I was doing some tests and noticed that named is still multi-threaded although I passed the --disable-threads parameter.

$ ls /proc/1791/task/
1791 1793 1794 1795 1796 1797 1798 1799 1800 1801 1802

$ ps -fL -C named
UID    PID  PPID   LWP  C NLWP STIME TTY      TIME     CMD
bind  1791     1  1791  0   11 Aug01 ?        00:00:00 /usr/sbin/named -f -u bind
bind  1791     1  1793  0   11 Aug01 ?        00:00:00 /usr/sbin/named -f -u bind
bind  1791     1  1794  0   11 Aug01 ?        00:00:00 /usr/sbin/named -f -u bind
bind  1791     1  1795  0   11 Aug01 ?        00:00:00 /usr/sbin/named -f -u bind
bind  1791     1  1796  0   11 Aug01 ?        00:00:00 /usr/sbin/named -f -u bind
bind  1791     1  1797  0   11 Aug01 ?        00:00:00 /usr/sbin/named -f -u bind
bind  1791     1  1798  0   11 Aug01 ?        00:00:00 /usr/sbin/named -f -u bind
bind  1791     1  1799  0   11 Aug01 ?        00:00:00 /usr/sbin/named -f -u bind
bind  1791     1  1800  0   11 Aug01 ?        00:00:00 /usr/sbin/named -f -u bind
bind  1791     1  1801  0   11 Aug01 ?        00:00:00 /usr/sbin/named -f -u bind
bind  1791     1  1802  0   11 Aug01 ?        00:00:00 /usr/sbin/named -f -u bind

I'll fix that and produce a new binary package to test.

- Eric
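The /proc/<pid>/task check used above can be scripted when testing a new package build. A rough sketch follows, Linux-only and with helper names invented for this example:

```python
import os


def thread_ids(pid):
    """Return the sorted thread (LWP) IDs of a process.

    Reads /proc/<pid>/task, which has one entry per thread (Linux only).
    """
    return sorted(int(name) for name in os.listdir(f"/proc/{pid}/task"))


def is_single_threaded(pid):
    """Return True when the process currently has exactly one thread."""
    return len(thread_ids(pid)) == 1
```

For the named process shown above (PID 1791), thread_ids(1791) would list the same eleven IDs as the ls output, so is_single_threaded(1791) would be False. The check only reflects the moment it runs, so it is best done once named has finished starting up.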
[Bug 1710278] Re: [2.3a1] named stuck on reload, DNS broken
In reply to Seth's suggestion:

> Am I reading this bug correctly, that MAAS currently asks BIND to reload its
> entire configure file on every machine provision and removal?
>
> This seems like a problem worth solving rather than trying to work around.
>
> At least PowerDNS provides several mechanisms for dynamically adding and
> removing records from a zone:
>
> - dnsupdate: https://doc.powerdns.com/authoritative/dnsupdate.html
[...]
> Since dnsupdate is an RFC-standardized protocol there's a pretty good shot
> BIND supports it as well. Was this tried and found lacking? The API and SQL
> approaches are likely to not have equivalents in BIND.
>
> I'm not sure what your DNSSEC goals are, but PowerDNS's documentation
> describes choices, including pkcs#11 in case that's important:
> https://doc.powerdns.com/authoritative/dnssec/index.html

Yes, bind even has a tool for RFC 2136 packaged [1]. A little howto mentioning DNSSEC in that regard can be found at [2]. It also mentions an AppArmor deny with that setup, but if that turned out to be the blocker I'm sure we could come up with a safe rule that can be added.

This might really be much closer to the design of the DNS server than high-frequency restart/reload. So giving this a thought/experiment on the MAAS side might be great.

[1]: http://manpages.ubuntu.com/manpages/bionic/man1/nsupdate.1.html
[2]: https://dnns.no/dynamic-dns-with-bind-and-nsupdate.html
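To make the RFC 2136 idea above a little more concrete, here is a hedged sketch of how a batch of record changes could be turned into input for nsupdate instead of triggering a reload. It is not something MAAS does according to this thread; the function name, the zone "maas.example" and the addresses are all hypothetical. Only the nsupdate commands themselves (server, zone, update add, update delete, send) come from the nsupdate manual.

```python
def build_nsupdate_script(server, zone, additions=(), deletions=()):
    """Build the text of an nsupdate batch for one zone.

    additions: iterable of (name, ttl, rtype, data) tuples.
    deletions: iterable of (name, rtype) tuples.
    Deletions are emitted first so a record can be replaced in one batch.
    """
    lines = [f"server {server}", f"zone {zone}"]
    for name, rtype in deletions:
        lines.append(f"update delete {name} {rtype}")
    for name, ttl, rtype, data in additions:
        lines.append(f"update add {name} {ttl} {rtype} {data}")
    lines.append("send")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Hypothetical example: replace the A record of one machine.
    print(build_nsupdate_script(
        "127.0.0.1",
        "maas.example",
        additions=[("node1.maas.example.", 30, "A", "10.0.0.5")],
        deletions=[("node1.maas.example.", "A")],
    ), end="")
```

The resulting text could be fed to nsupdate on standard input (for instance with "nsupdate -k <keyfile>"), provided the zone is configured to allow dynamic updates; whether that fits MAAS's zone generation is exactly the experiment being suggested.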
[Bug 1710278] Re: [2.3a1] named stuck on reload, DNS broken
As mentioned in a previous comment, I had to turn off pkcs11:

# debian/rules:
dh_auto_configure -B build-singlethread -- \
    --disable-threads \
    --disable-native-pkcs11 \
    --with-pkcs11=no \

but --with-openssl has been preserved. I don't know yet how much this can impact DNSSEC (but we would definitely need to pay attention to that).

# ./configure --help
..
  --with-openssl=PATH     Build with OpenSSL [yes|no|path]. (Crypto is required for DNSSEC)
..
[Bug 1710278] Re: [2.3a1] named stuck on reload, DNS broken
I accidentally inverted the output. To rectify:

* IF bind9 is installed, the expected behaviour is the following:

# apt install bind9-single-thread
..
The following packages will be REMOVED:
  bind9
The following NEW packages will be installed:
  bind9-single-thread

* IF bind9-single-thread is installed, the expected behaviour is the following:

# apt install bind9
..
The following packages will be REMOVED:
  bind9-single-thread
The following NEW packages will be installed:
  bind9
[Bug 1710278] Re: [2.3a1] named stuck on reload, DNS broken
# Quick note before anyone tests this package:

* This is a test package for others to test, ONLY made to determine that it works as expected and fixes the current situation (pre-SRU).
* This is NOT a long-term or final solution yet; please wait until the package is found in the official Ubuntu archive before considering this official (post-SRU).
* DO NOT test in a production environment.
* This package is subject to change as we progress.

# PPA instructions:

sudo add-apt-repository ppa:slashd/lp1710278
sudo apt-get update
sudo apt install bind9-single-thread

As I write this, the version is: 1:9.11.3+dfsg-1ubuntu1.8+hfv20190801lp1710278b2

# Testing:

* Test MAAS with DNSSEC on|off|automatic with a significant number of machines/VMs (at least 50 from what I heard/can tell).

* Validate that only one of bind9 or bind9-single-thread can be installed at a time on the same system (basically the two bind packages can't co-exist and are not co-installable on the same machine).
  - I already tested that part, and it did the trick for me, but having more eyes on it won't hurt.

Expected behaviour:

- If bind9 is installed and one tries to install bind9-single-thread:
"
The following packages will be REMOVED:
  bind9-single-thread
The following NEW packages will be installed:
  bind9
"

- If bind9-single-thread is installed and one tries to install bind9:
"
The following packages will be REMOVED:
  bind9
The following NEW packages will be installed:
  bind9-single-thread
"

That way the packages conflict, but users are not blocked from switching from one to the other thanks to the "Replaces:" put in place. Users/package maintainers depending on one or the other will have to read carefully what apt reports and know what they are doing here.

If you think of anything else to test, please do so. The more testing/feedback the better.

- Eric
[Bug 1710278] Re: [2.3a1] named stuck on reload, DNS broken
OK, I have something ready and installable for further testing.

This introduces a new binary package called "bind9-single-thread". If someone objects to the package name, please let me know, but "bind9-single-thread" is what I found most self-explanatory without having to look at the description or changelog. The name is purely aesthetic and can be changed at any time before the SRU, so I'm all ears if someone comes up with a better one.

* With the following, "bind9" and "bind9-single-thread" can't co-exist on the same machine (not co-installable). To achieve that I have put two things in place:

# d/control:
Package: bind9
Conflicts: bind9-single-thread
Replaces: bind9-single-thread

Package: bind9-single-thread
Conflicts: bind9
Replaces: bind9

References:

https://www.debian.org/doc/debian-policy/ch-relationships.html#s-conflicts
# when two packages provide the same file and will continue to do so,

https://www.debian.org/doc/debian-policy/ch-relationships.html#s-replaces
Second, Replaces allows the packaging system to resolve which package should be removed when there is a conflict (see Conflicting binary packages - Conflicts). This usage only takes effect when the two packages do conflict, so that the two usages of this field do not interfere with each other.

The next step is to test MAAS against "bind9-single-thread" with DNSSEC, as that seems to be the way to trigger the deadlocks in bind9 and is the reason we are introducing a single-threaded binary package.

I'll share the PPA later today.

- Eric
[Bug 1710278] Re: [2.3a1] named stuck on reload, DNS broken
Am I reading this bug correctly, that MAAS currently asks BIND to reload its entire configuration file on every machine provision and removal? This seems like a problem worth solving rather than trying to work around.

At least PowerDNS provides several mechanisms for dynamically adding and removing records from a zone:

- dnsupdate: https://doc.powerdns.com/authoritative/dnsupdate.html
- REST api: https://doc.powerdns.com/authoritative/http-api/index.html
- direct SQL to a backing database: https://doc.powerdns.com/authoritative/migration.html

Since dnsupdate is an RFC-standardized protocol, there's a pretty good chance BIND supports it as well. Was this tried and found lacking? The API and SQL approaches are unlikely to have equivalents in BIND.

I'm not sure what your DNSSEC goals are, but PowerDNS's documentation describes the choices, including pkcs#11 in case that's important: https://doc.powerdns.com/authoritative/dnssec/index.html

Thanks
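On the dnsupdate question: BIND does ship an RFC 2136 client, nsupdate, so a per-record update is at least expressible without a full reload. Whether that fits MAAS is not settled anywhere in this thread. The following is only a minimal sketch of what a single A-record replacement could look like as nsupdate input. The function name, the zone "maas", the host "node-1" and the addresses are hypothetical, and the zone would also need dynamic updates enabled (allow-update or update-policy) on the server side.

```python
def build_nsupdate_script(server, zone, hostname, address, ttl=30):
    """Build nsupdate input that replaces the A record for one host.

    All argument values are illustrative; nothing here is taken from MAAS.
    """
    zone_fqdn = zone.rstrip(".") + "."
    host_fqdn = f"{hostname}.{zone_fqdn}"
    commands = [
        f"server {server}",                          # where to send the update
        f"zone {zone_fqdn}",                         # zone being modified
        f"update delete {host_fqdn} A",              # drop any old A records
        f"update add {host_fqdn} {ttl} A {address}", # add the new one
        "send",                                      # transmit the update
    ]
    return "\n".join(commands) + "\n"


# Example: text that could be piped to `nsupdate` on standard input.
print(build_nsupdate_script("127.0.0.1", "maas", "node-1", "10.0.0.5"), end="")
```

The point of the sketch is only that one deploy or release maps to one small update message rather than a reload of every zone.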
[Bug 1710278] Re: [2.3a1] named stuck on reload, DNS broken
After reading more, I think what matters most for DNSSEC is --with-openssl. I don't quite get the goal of PKCS11, but if I read this correctly it is not as important as I first thought.

I'll try to get an installable bind9-single-thread binary package and test DNSSEC afterwards.

Reference:
https://github.com/isc-projects/bind9/blob/master/README#L185-L191

# ./configure --help
--with-openssl=PATH     Build with OpenSSL [yes|no|path]. (Crypto is required for DNSSEC)
[Bug 1710278] Re: [2.3a1] named stuck on reload, DNS broken
I think an external provider can be specified via '-E engine-name' (see NAMED(8)):

-E engine-name
    When applicable, specifies the hardware to use for cryptographic operations, such as a secure key store used for signing.

    When BIND is built with OpenSSL PKCS#11 support, this defaults to the string "pkcs11", which identifies an OpenSSL engine that can drive a cryptographic accelerator or hardware service module. When BIND is built with native PKCS#11 cryptography (--enable-native-pkcs11), it defaults to the path of the PKCS#11 provider library specified via "--with-pkcs11".

I'll have a look at our options once I have a binary package ready to be installed and tested.
[Bug 1710278] Re: [2.3a1] named stuck on reload, DNS broken
I have started looking into the new single-thread binary approach.

So far it seems that bind9 won't be able to use the pkcs11 library provider in that build, so I'm afraid there will be no DNSSEC capabilities in the single-thread binary package, as pkcs11 seems to require pthreads.
[Bug 1710278] Re: [2.3a1] named stuck on reload, DNS broken
For now I have to use the following in debian/rules to make the build work:

# debian/rules:
dh_auto_configure -B build-singlethread -- \
    --disable-threads \
    --disable-native-pkcs11 \
    --with-pkcs11=no \
[Bug 1710278] Re: [2.3a1] named stuck on reload, DNS broken
** Tags added: sts

** Changed in: bind9 (Ubuntu Bionic)
   Assignee: Dan Streetman (ddstreet) => Eric Desrochers (slashd)

** Changed in: bind9 (Ubuntu Disco)
   Assignee: Dan Streetman (ddstreet) => Eric Desrochers (slashd)

** Changed in: bind9 (Ubuntu Eoan)
   Assignee: Dan Streetman (ddstreet) => Eric Desrochers (slashd)

** Changed in: bind9 (Ubuntu Xenial)
   Assignee: Dan Streetman (ddstreet) => Eric Desrochers (slashd)
[Bug 1710278] Re: [2.3a1] named stuck on reload, DNS broken
Last resolution update:

The bind9 source package should be modified to generate two binary versions of bind9:

bind9 -> standard multi-threaded bind9 (main)
bind9-single -> single-threaded bind9 (universe)

This will allow security updates to still be handled through the bind9 source package, which generates both versions of the binary package.

Once bind9-single is in the archive, MAAS will update its dependencies to depend on either bind9 or bind9-single. That allows bind9-single to be installed in place of bind9, and MAAS will not try to pull in the default bind9 when upgraded.

Note: "bind9-single" is just a name I am using for this comment.
[Bug 1710278] Re: [2.3a1] named stuck on reload, DNS broken
> Looks like that was the last one for this particular bug.

Spoke too soon. The next deadlock is in the same place for thread A, dns_resolver_shutdown, while thread B is similar: fctx_create->dns_view_findzonecut->dns_view_findzonecut2, where it hangs on the view->lock.

Really, dns_resolver_shutdown (which holds the view->lock and iterates through taking each of the view's bucket locks) and fctx_create (which requires the caller to hold the bucket lock, and calls lots of functions that take the view->lock) will continue to deadlock like this until upstream makes large changes to locking, which is what they're doing:
https://gitlab.isc.org/isc-projects/bind9/merge_requests/2132

However, that's still WIP, and the changes so far are large (30 commits as of this comment):
https://gitlab.isc.org/isc-projects/bind9/merge_requests/2132/commits

So backporting all of that may be outside the scope of normal SRUs.
[Bug 1710278] Re: [2.3a1] named stuck on reload, DNS broken
** Also affects: bind9 (Ubuntu Xenial)
   Importance: Undecided
   Status: New

** Changed in: bind9 (Ubuntu Xenial)
   Importance: Undecided => Medium

** Changed in: bind9 (Ubuntu Xenial)
   Status: New => In Progress

** Changed in: bind9 (Ubuntu Xenial)
   Assignee: (unassigned) => Dan Streetman (ddstreet)
[Bug 1710278] Re: [2.3a1] named stuck on reload, DNS broken
Looks like that was the last one for this particular bug.
[Bug 1710278] Re: [2.3a1] named stuck on reload, DNS broken
Another test build ready.

https://launchpad.net/~ddstreet/+archive/ubuntu/lp1710278

Where will the next deadlock be? :)
[Bug 1710278] Re: [2.3a1] named stuck on reload, DNS broken
The first deadlock from comment 27 is fixed/worked around. The next deadlock is:

Thread A is the same as in comment 27, holding view->lock and waiting for the bucket lock in dns_resolver_shutdown.

Now, thread B calls dispatch->authvalidated->nsecvalidate->create_fetch->dns_resolver_createfetch->dns_resolver_createfetch3, which does:

    LOCK(&res->buckets[bucketnum].lock);

then calls fctx_create->fcount_incr, which does:

    LOCK(&fctx->res->lock);

where fctx->res is the view. So again, deadlock.
[Bug 1710278] Re: [2.3a1] named stuck on reload, DNS broken
** Changed in: bind9 (Ubuntu Eoan)
   Assignee: Blake Rouse (blake-rouse) => Dan Streetman (ddstreet)

** Changed in: bind9 (Ubuntu Disco)
   Assignee: (unassigned) => Dan Streetman (ddstreet)

** Changed in: bind9 (Ubuntu Bionic)
   Assignee: (unassigned) => Dan Streetman (ddstreet)

** Changed in: bind9 (Ubuntu Disco)
   Importance: Undecided => Medium

** Changed in: bind9 (Ubuntu Bionic)
   Importance: Undecided => Medium

** Changed in: bind9 (Ubuntu Eoan)
   Status: Triaged => In Progress

** Changed in: bind9 (Ubuntu Disco)
   Status: New => In Progress

** Changed in: bind9 (Ubuntu Bionic)
   Status: New => In Progress

** Changed in: bind9 (Ubuntu Eoan)
   Importance: High => Medium
[Bug 1710278] Re: [2.3a1] named stuck on reload, DNS broken
OK, I updated the test build in the PPA with locking for view->attributes as well; this should fix this particular bind9 deadlock.

** Also affects: bind9 (Ubuntu Disco)
   Importance: Undecided
   Status: New

** Also affects: bind9 (Ubuntu Eoan)
   Importance: High
   Assignee: Blake Rouse (blake-rouse)
   Status: Triaged

** Also affects: bind9 (Ubuntu Bionic)
   Importance: Undecided
   Status: New
[Bug 1710278] Re: [2.3a1] named stuck on reload, DNS broken
> Test build in ppa:

Eh, there is still a view->attributes field that needs lock protection; this isn't ready yet.
[Bug 1710278] Re: [2.3a1] named stuck on reload, DNS broken
Test build in ppa:
https://launchpad.net/~ddstreet/+archive/ubuntu/lp1710278
[Bug 1710278] Re: [2.3a1] named stuck on reload, DNS broken
https://gitlab.isc.org/isc-projects/bind9/issues/1148

** Bug watch added: gitlab.isc.org/isc-projects/bind9/issues #1148
   https://gitlab.isc.org/isc-projects/bind9/issues/1148
[Bug 1710278] Re: [2.3a1] named stuck on reload, DNS broken
This deadlock doesn't appear to be fixed in the latest upstream.
[Bug 1710278] Re: [2.3a1] named stuck on reload, DNS broken
This is a deadlock in bind9 code.

Thread A runs ns_client_endrequest->dns_view_detach->view_flushanddetach, which includes:

    LOCK(&view->lock);
    if (!RESSHUTDOWN(view))
        dns_resolver_shutdown(view->resolver);

Inside dns_resolver_shutdown, for each resolver bucket, it does:

    for (i = 0; i < res->nbuckets; i++) {
        LOCK(&res->buckets[i].lock);

At this point, one of the bucket locks is already held elsewhere, so thread A is blocked holding view->lock but waiting for view->resolver->buckets[i].lock.

Meanwhile, thread B runs dispatch->validated, and does:

    bucketnum = fctx->bucketnum;
    LOCK(&res->buckets[bucketnum].lock);

then, while still holding that lock, calls dns_validator_destroy->destroy->dns_view_weakdetach, which does:

    LOCK(&view->lock);

This leaves thread A and thread B in a deadlock, with thread A waiting for the bucket lock that thread B holds, and thread B waiting for the view->lock that thread A holds.
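The pattern described above is a classic lock-order inversion, and it can be shown outside of bind9 with two plain locks. The following is only an illustrative sketch, not bind9 code: view_lock and bucket_lock stand in for view->lock and res->buckets[i].lock, the function name is made up, and a timeout is used so the demonstration terminates, whereas LOCK() in named would wait forever.

```python
import threading


def run_lock_order_inversion(timeout=0.2):
    """Take two locks in opposite order on two threads.

    Returns a dict mapping thread name ("A" or "B") to whether that
    thread managed to take its second lock before the timeout.
    """
    view_lock = threading.Lock()     # stand-in for view->lock
    bucket_lock = threading.Lock()   # stand-in for res->buckets[i].lock
    both_hold_first = threading.Barrier(2)
    both_tried_second = threading.Barrier(2)
    got_second = {}

    def worker(name, first, second):
        with first:
            # Make sure the other thread also holds its first lock.
            both_hold_first.wait()
            # Unbounded in named; bounded here so the sketch can finish.
            acquired = second.acquire(timeout=timeout)
            got_second[name] = acquired
            # Keep the first lock until the other side has given up too.
            both_tried_second.wait()
            if acquired:
                second.release()

    # Thread A: view lock first, then bucket lock (dns_resolver_shutdown path).
    a = threading.Thread(target=worker, args=("A", view_lock, bucket_lock))
    # Thread B: bucket lock first, then view lock (validated path).
    b = threading.Thread(target=worker, args=("B", bucket_lock, view_lock))
    a.start()
    b.start()
    a.join()
    b.join()
    return got_second


result = run_lock_order_inversion()
print("A got bucket lock:", result["A"], "| B got view lock:", result["B"])
```

Neither thread obtains its second lock. Without the timeout, both would block forever, which matches a named process that no longer answers `rndc status`.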
[Bug 1710278] Re: [2.3a1] named stuck on reload, DNS broken
** Changed in: maas/2.7
   Status: Fix Released => New

** Changed in: maas/2.6
   Assignee: (unassigned) => Blake Rouse (blake-rouse)

** Changed in: maas/2.6
   Milestone: None => 2.6.1

** Changed in: maas/2.7
   Milestone: 2.3.0alpha2 => 2.7.0alpha1

** Changed in: bind9 (Ubuntu)
   Assignee: (unassigned) => Blake Rouse (blake-rouse)
[Bug 1710278] Re: [2.3a1] named stuck on reload, DNS broken
** Also affects: maas/2.6
   Importance: Undecided
   Status: New

** Also affects: maas/2.7
   Importance: Critical
   Assignee: Blake Rouse (blake-rouse)
   Status: Fix Released

** Changed in: maas/2.6
   Importance: Undecided => Critical
[Bug 1710278] Re: [2.3a1] named stuck on reload, DNS broken
I'm not sure why a "broken" Upstream DNS helps repro this bug, but I was not able to repro when the Upstream DNS was working.
[Bug 1710278] Re: [2.3a1] named stuck on reload, DNS broken
repro.py attempts to trigger DNS queries during DNS reloads. It does so by first deploying all 50 machines. Then, one by one (not all at once!), it releases a machine, waits, deploys the machine again, and moves on to the next machine.

At some point a machine will be releasing (reloads) while others are starting to deploy (DNS queries). This is the sweet spot.

If one simply deploys all 50 machines simultaneously, the DNS reload would occur but without any DNS queries (because none of the machines has PXE booted yet).
[Bug 1710278] Re: [2.3a1] named stuck on reload, DNS broken
repro.py attached

** Attachment added: "repro.py"
   https://bugs.launchpad.net/ubuntu/+source/bind9/+bug/1710278/+attachment/5276146/+files/repro.py
[Bug 1710278] Re: [2.3a1] named stuck on reload, DNS broken
OK - I was able to repro again, and this time with MAAS 2.6. Here are the steps.

PREP WORK
1) Have 50 machines in Ready state with one interface enabled, configured as 'Autoassign' on the Default VLAN PXE subnet (auto assign so that every deploy/release causes MAAS to reload DNS)
2) Clear out any DNS entries in the PXE subnet (this forces nodes to send DNS queries to MAAS)
3) Settings -> Network Services -> DNS -> Upstream DNS -> enter a valid upstream DNS IP
4) Settings -> Network Services -> DNS -> DNSSEC -> Automatic (for some reason this breaks Upstream DNS)
5) Verify that Upstream DNS is broken
   a) Rescue Mode one machine
   b) ssh to the Rescue machine
   c) dig www.google.com
   d) (dig should time out/fail)
   e) MAAS -> Settings -> Network Services -> DNS -> DNSSEC -> Disable
   f) dig www.google.com
   g) (dig should succeed)
   h) MAAS -> Settings -> Network Services -> DNS -> DNSSEC -> Automatic
   i) Release the Rescue machine

REPRO
1) Run repro.py (attached; WARNING: this code will use all machines available to MAAS)
2) Wait up to 3 hours, checking whether bind9 is hung by regularly running `sudo rndc status` on MAAS

Monitoring steps (optional)
(See DNS query activity) In one ssh window to MAAS, run:
sudo tcpdump -i ens3 dst port 53

(See DNS reloads, and why) In another ssh window to MAAS, run:
sudo tail -f /var/log/maas/regiond.log | grep Reloaded -A 3
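The "regularly run `sudo rndc status`" check in REPRO step 2 can be automated so that a hung named does not also hang the checker. This is only a rough sketch and is not part of repro.py: the function name and the 10-second default are assumptions, and it relies on nothing beyond the fact that a command which does not return within a time limit is treated as unresponsive.

```python
import subprocess


def is_responsive(command, timeout_seconds=10.0):
    """Return True if `command` exits with status 0 within the timeout.

    A timeout, a non-zero exit status, or a missing executable all count
    as "not responsive".
    """
    try:
        completed = subprocess.run(
            command,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            timeout=timeout_seconds,  # the child is killed when this expires
        )
    except subprocess.TimeoutExpired:
        return False
    except OSError:
        return False  # e.g. the executable does not exist
    return completed.returncode == 0
```

Calling `is_responsive(["sudo", "rndc", "status"])` from a loop or a cron job on the MAAS host would then flag the moment bind9 stops answering its control channel, assuming sudo is configured not to prompt for a password.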
Re: [Bug 1710278] Re: [2.3a1] named stuck on reload, DNS broken
OK, interesting. I really don't like the reloading strategy but am not sure that BIND gives us many better options. Let us know what you find.

Mark
[Bug 1710278] Re: [2.3a1] named stuck on reload, DNS broken
Hi Mark,

Still seeing it with 18.04 and 2.6. The sweet spot seems to be when MAAS is receiving lots of DNS requests while simultaneously doing DNS reloads (as you alluded to in this case).

I'm attempting to set up a simplified repro scenario which basically will do this:
1) Enlist 50+ new machines on an untagged subnet *with DNS left blank*, forcing nodes to send DNS queries to MAAS
2) Leave the machines' PXE interface on Autoassign IP (so every deploy/release forces a DNS reload)
3) Deploy and release (repeat until error)

Will report back with findings.
Re: [Bug 1710278] Re: [2.3a1] named stuck on reload, DNS broken
On 6/26/19 5:12 PM, Sam Lee wrote:
> When I compare our v2.5.3 install from our v2.4.2 install, the amount of
> rndc reloads is vastly more on v2.5.3.

Hi Sam,

I don't see these issues any more, on 18.04 and 2.6. I see reloads every few minutes on a stable MAAS (i.e. without a lot of activity). Since 2.6 is brand new (2.6.0) you might want to hold off on upgrading unless your cluster is for test purposes, but let us know if you still see this on 2.6 when you get there.

Mark
[Bug 1710278] Re: [2.3a1] named stuck on reload, DNS broken
Mark,

Do you have any updated repro steps? I'm seeing this failure with MAAS v2.5.3. I suspect that when v2.5 moved the DNS logic from the region to the rack controller, some of the mitigation logic was lost, and thus this bug manifests more frequently.

When I compare our v2.5.3 install with our v2.4.2 install, the number of rndc reloads is vastly higher on v2.5.3.

[2.4.2]
journalctl -b -u bind9.service |grep received.control
Jun 22 00:22:05 wdc1-p01-s01-maas-18 named[907]: received control channel command 'reload'
Jun 22 00:22:08 wdc1-p01-s01-maas-18 named[907]: received control channel command 'reload'
Jun 22 00:22:54 wdc1-p01-s01-maas-18 named[907]: received control channel command 'reload'
Jun 24 16:27:06 wdc1-p01-s01-maas-18 named[907]: received control channel command 'reload'
Jun 25 13:53:34 wdc1-p01-s01-maas-18 named[907]: received control channel command 'reload'
Jun 25 13:53:41 wdc1-p01-s01-maas-18 named[907]: received control channel command 'reload'
Jun 25 13:54:51 wdc1-p01-s01-maas-18 named[907]: received control channel command 'reload'
Jun 25 13:55:22 wdc1-p01-s01-maas-18 named[907]: received control channel command 'reload'

[2.5.3]
journalctl -b -u bind9.service |grep received.control
Jun 26 14:23:59 ch31-p01-s01-maas-18 named[1041]: received control channel command 'reload'
Jun 26 14:24:04 ch31-p01-s01-maas-18 named[1041]: received control channel command 'reload'
Jun 26 14:24:09 ch31-p01-s01-maas-18 named[1041]: received control channel command 'reload'
Jun 26 14:24:11 ch31-p01-s01-maas-18 named[1041]: received control channel command 'reload'
Jun 26 14:24:15 ch31-p01-s01-maas-18 named[1041]: received control channel command 'reload'
Jun 26 14:24:18 ch31-p01-s01-maas-18 named[1041]: received control channel command 'reload'
Jun 26 14:24:22 ch31-p01-s01-maas-18 named[1041]: received control channel command 'reload'
Jun 26 14:24:27 ch31-p01-s01-maas-18 named[1041]: received control channel command 'reload'
Jun 26 14:24:31 ch31-p01-s01-maas-18 named[1041]: received control channel command 'reload'
Jun 26 14:24:36 ch31-p01-s01-maas-18 named[1041]: received control channel command 'reload'
Jun 26 14:24:40 ch31-p01-s01-maas-18 named[1041]: received control channel command 'reload'
Jun 26 14:24:42 ch31-p01-s01-maas-18 named[1041]: received control channel command 'reload'

I had to trim the 2.5.3 output because it was way too long to fit in this comment, but as you can see 2.5.3 is spamming reload compared to 2.4.2. 2.4.2 may reload 4 times in the _entire day_, whereas 2.5.3 is doing hundreds if not thousands a day.
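To put numbers on the comparison above without counting lines by eye, the journal output can be tallied per day. This is a small sketch, assuming the log lines have exactly the shape quoted in the comment. The function name and regular expression are illustrative, and the sample is the 2.4.2 excerpt from above.

```python
import re
from collections import Counter

# Matches e.g. "Jun 22 00:22:05 host named[907]: received control channel
# command 'reload'" and captures the "Jun 22" part as the day.
RELOAD_RE = re.compile(
    r"^(?P<day>[A-Z][a-z]{2}\s+\d{1,2}) \d{2}:\d{2}:\d{2} \S+ named\[\d+\]: "
    r"received control channel command 'reload'"
)


def reloads_per_day(lines):
    """Count 'reload' control-channel commands per calendar day."""
    counts = Counter()
    for line in lines:
        match = RELOAD_RE.match(line.strip())
        if match:
            counts[match.group("day")] += 1
    return counts


SAMPLE_2_4_2 = """\
Jun 22 00:22:05 wdc1-p01-s01-maas-18 named[907]: received control channel command 'reload'
Jun 22 00:22:08 wdc1-p01-s01-maas-18 named[907]: received control channel command 'reload'
Jun 22 00:22:54 wdc1-p01-s01-maas-18 named[907]: received control channel command 'reload'
Jun 24 16:27:06 wdc1-p01-s01-maas-18 named[907]: received control channel command 'reload'
Jun 25 13:53:34 wdc1-p01-s01-maas-18 named[907]: received control channel command 'reload'
Jun 25 13:53:41 wdc1-p01-s01-maas-18 named[907]: received control channel command 'reload'
Jun 25 13:54:51 wdc1-p01-s01-maas-18 named[907]: received control channel command 'reload'
Jun 25 13:55:22 wdc1-p01-s01-maas-18 named[907]: received control channel command 'reload'
""".splitlines()

for day, count in sorted(reloads_per_day(SAMPLE_2_4_2).items()):
    print(day, count)
```

Feeding the full `journalctl -b -u bind9.service` output of each install through the same function would give a like-for-like daily count for 2.4.2 and 2.5.3.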
[Bug 1710278] Re: [2.3a1] named stuck on reload, DNS broken
Hey Mark, was cleaning up bug tags; still consider this an issue.
Re: [Bug 1710278] Re: [2.3a1] named stuck on reload, DNS broken
Hold on, I think this bug is still problematic for MAAS and Ubuntu.
[Bug 1710278] Re: [2.3a1] named stuck on reload, DNS broken
** Tags removed: server-next
[Bug 1710278] Re: [2.3a1] named stuck on reload, DNS broken
** Changed in: maas/2.2
   Status: Fix Committed => Fix Released
[Bug 1710278] Re: [2.3a1] named stuck on reload, DNS broken
** Changed in: maas Status: Fix Committed => Fix Released
[Bug 1710278] Re: [2.3a1] named stuck on reload, DNS broken
** Changed in: maas/2.2 Status: In Progress => Fix Committed
[Bug 1710278] Re: [2.3a1] named stuck on reload, DNS broken
** Merge proposal linked: https://code.launchpad.net/~blake-rouse/maas/+git/maas/+merge/329366
[Bug 1710278] Re: [2.3a1] named stuck on reload, DNS broken
** Changed in: maas/2.2 Status: New => In Progress
[Bug 1710278] Re: [2.3a1] named stuck on reload, DNS broken
** Changed in: maas Milestone: 2.3.0 => 2.3.0alpha2
[Bug 1710278] Re: [2.3a1] named stuck on reload, DNS broken
** Changed in: maas Status: In Progress => Fix Committed
[Bug 1710278] Re: [2.3a1] named stuck on reload, DNS broken
Thanks. I wonder if we shouldn't fix this in Ubuntu by tweaking the systemd control files so that the timeout values are more acceptable for production use.
[Bug 1710278] Re: [2.3a1] named stuck on reload, DNS broken
OK, restarting via 'sudo service bind9 restart' does work in the end, it just takes a long time. The downside is that MAAS is not going to have an effective DNS for a few minutes, which is unacceptable.
[Bug 1710278] Re: [2.3a1] named stuck on reload, DNS broken
** Changed in: maas Status: Triaged => In Progress
** Changed in: maas Assignee: (unassigned) => Blake Rouse (blake-rouse)
** Changed in: maas Milestone: None => 2.3.0
** Also affects: maas/2.2 Importance: Undecided Status: New
** Changed in: maas/2.2 Importance: Undecided => Critical
** Changed in: maas/2.2 Assignee: (unassigned) => Blake Rouse (blake-rouse)
** Changed in: maas/2.2 Milestone: None => 2.2.3
** Merge proposal linked: https://code.launchpad.net/~blake-rouse/maas/+git/maas/+merge/329260
Re: [Bug 1710278] Re: [2.3a1] named stuck on reload, DNS broken
OK, will do, thanks :)
[Bug 1710278] Re: [2.3a1] named stuck on reload, DNS broken
Mark, if you observe the deadlock again, can you run "systemctl stop bind9", wait a few minutes (at least 2, but maybe up to 5), and then check if bind9 successfully stops? It looks like systemd will (by default) resort to more aggressive methods to kill a service if it doesn't stop after ~90 seconds.

If the normal method of killing the bind9 service works, we can still avoid adding that scope and risk to MAAS. Rather, if we detect bind9 behaving badly, a stop/start cycle would also allow bind9 to properly shut down in most cases, and avoid any other bugs in BIND we might see as a side-effect of a "kill -9" approach. (A human operator could troubleshoot those side effects, but it's more difficult for MAAS to anticipate, for example, why BIND might now fail to start up because of a lock file that was left on the filesystem when the 'kill -9' occurred.)
Re: [Bug 1710278] Re: [2.3a1] named stuck on reload, DNS broken
Maybe DNS dynamic updates could be used instead of zone reloads if just a few IPs were added or removed.

On Aug 12, 2017 06:22, "Mark Shuttleworth" <1710...@bugs.launchpad.net> wrote:
> To avoid reloads in parallel, I think we should:
>
> * verify the reload happened (perhaps checking zone serial?)
> * make sure we defer any subsequent reload by at least 10 seconds
>
> Mark
Re: [Bug 1710278] Re: [2.3a1] named stuck on reload, DNS broken
To avoid reloads in parallel, I think we should:

* verify the reload happened (perhaps checking zone serial?)
* make sure we defer any subsequent reload by at least 10 seconds

Mark
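The two bullets above could be sketched roughly as follows. This is only an illustration under stated assumptions, not MAAS's actual code: `ReloadThrottle`, `reload_fn`, `clock` and `sleep` are hypothetical names, and `reload_fn` stands in for whatever triggers the reload (and, optionally, verifies it by checking the zone serial).

```python
import threading
import time


class ReloadThrottle:
    """Serialise reload requests and keep them at least min_interval apart.

    reload_fn is a hypothetical caller-supplied callable that performs one
    reload (and may verify it, e.g. by comparing zone serials).
    """

    def __init__(self, reload_fn, min_interval=10.0,
                 clock=time.monotonic, sleep=time.sleep):
        self._reload_fn = reload_fn
        self._min_interval = min_interval
        self._clock = clock
        self._sleep = sleep
        self._lock = threading.Lock()  # only one reload in flight at a time
        self._last_finished = None

    def reload(self):
        with self._lock:
            if self._last_finished is not None:
                # Defer until min_interval has passed since the last reload ended.
                wait = self._min_interval - (self._clock() - self._last_finished)
                if wait > 0:
                    self._sleep(wait)
            try:
                return self._reload_fn()
            finally:
                self._last_finished = self._clock()
```

Because the lock is held for the whole call, a second caller blocks until the first reload has returned, so two reloads are never issued in parallel through this object. The 10 second default comes from the suggestion above; the right value is a judgment call.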
Re: [Bug 1710278] Re: [2.3a1] named stuck on reload, DNS broken
On 12/08/17 01:11, Mike Pontillo wrote:
> Finally, I think your last bullet requires more discussion before we can
> work on it. MAAS currently uses sudoers rules specific to the init
> system to start and stop services like bind9; we do not currently have
> permission to 'kill -9' arbitrary processes. I'm concerned that if we go
> down that road, we would open up the possibility that MAAS could
> erroneously (or due to a malicious attack) believe that bind9 isn't
> working and repeatedly kill it without good cause, or be convinced to
> 'kill -9' an incorrect process.

This bug causes named to be unresponsive to anything other than kill -9. MAAS installed, configured, and started named, and validates its behaviour. Assume there is no operator. Since kill -9 is necessary on occasion, it follows that MAAS must have and must use that ability.

I could see MAAS trying it a few times and then giving up with a big alert to the operators. But I absolutely think MAAS should treat this as a bug in named which should be logged and managed nicely but nonetheless handled transparently to users.

Mark
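The "try a few times, then give up with a big alert" idea could look something like the sketch below. It is a hedged illustration only: `recover_dns` and its four hooks (`is_healthy`, `restart`, `force_restart`, `alert`) are hypothetical names supplied by the caller, and nothing here reflects how MAAS actually manages bind9.

```python
import logging

log = logging.getLogger("dns-watchdog")  # hypothetical logger name


def recover_dns(is_healthy, restart, force_restart, alert, max_attempts=3):
    """Try to recover an unresponsive DNS service, then give up loudly.

    Hypothetical hooks: is_healthy() -> bool; restart() does a normal
    stop/start; force_restart() kills the process and starts it again;
    alert(message) notifies the operators.
    Returns True once the service is healthy, False after giving up.
    """
    if is_healthy():
        return True
    for attempt in range(1, max_attempts + 1):
        # Start with the normal restart and escalate on later attempts.
        if attempt == 1:
            label, action = "restart", restart
        else:
            label, action = "forced restart", force_restart
        log.error("DNS unhealthy; attempt %d of %d: %s",
                  attempt, max_attempts, label)
        action()
        if is_healthy():
            log.warning("DNS recovered after %d attempt(s)", attempt)
            return True
    alert("DNS still unhealthy after %d recovery attempts" % max_attempts)
    return False
```

Bounding the number of attempts is one way to address the concern in the quoted text about killing the service repeatedly without good cause.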
[Bug 1710278] Re: [2.3a1] named stuck on reload, DNS broken
I attempted to reproduce the bind9 issue by doing the following (in two separate sessions):

# Queue 10,000 concurrent reloads (also tried removing the & to make it less parallel)
i=0; while [ $i -lt 10000 ]; do (/usr/sbin/rndc reload&); let i=$i+1; done

# Hammer the DNS server with queries
while [ 1 ]; do dig @127.0.0.1 ; done

Everything works properly when I do this by itself. But if I have parallel reload requests running *and* I make manual changes to the DNS zones in /etc/bind/maas, I have observed bind9 behaving badly, including (eventually) what seemed to be the deadlock (but my bind9 was older, so my debug symbols didn't match).[1]

Then I observed a similar state where after I updated the zone file, it was as if nothing changed (bind9 was returning old data, which didn't resolve itself until I did "service bind9 restart").

It's my impression that the problem is worse when I do reloads in parallel. So this is more evidence pointing to "we should ensure MAAS never tries to reload bind9 twice in parallel".

[1]: First observed extreme sluggishness in resolving queries, which resolved itself after several seconds. Then observed a crash (which the system subsequently recovered from): http://paste.ubuntu.com/25293751/ Then observed a deadlock with the same symptoms.
[Bug 1710278] Re: [2.3a1] named stuck on reload, DNS broken
I'm +1 on throttling reloads; I think that is the most obvious and critical work item for the MAAS team to address. I have filed that as bug #1710308.

I'm also +1 on better service monitoring using actual queries; I've filed that as bug #1710310. I think something equivalent to 'dig @127.0.0.1 ' on the region should be enough to detect a deadlock condition, but I like the idea of monitoring it from the rack's perspective as well (though that feels more like a non-fatal warning, because we don't want to restart bind in the event of random firewall hiccups).

Finally, I think your last bullet requires more discussion before we can work on it. MAAS currently uses sudoers rules specific to the init system to start and stop services like bind9; we do not currently have permission to 'kill -9' arbitrary processes. I'm concerned that if we go down that road, we would open up the possibility that MAAS could erroneously (or due to a malicious attack) believe that bind9 isn't working and repeatedly kill it without good cause, or be convinced to 'kill -9' an incorrect process.

In summary, I think the most urgent thing for MAAS to do is throttle reloads. That should greatly reduce the window of opportunity for the deadlock to occur. In parallel, this should be addressed upstream in bind9.
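As a rough illustration of "monitoring using actual queries", the sketch below sends one DNS A-record query over UDP and reports whether any matching reply arrives before a timeout. It uses only the Python standard library. The function names, the default timeout and the queried name are assumptions made for the example, not anything taken from MAAS.

```python
import socket
import struct


def build_query(name, query_id=0x1234):
    """Build a minimal DNS query for an A record (RFC 1035 wire format)."""
    # Header: ID, flags (recursion desired), 1 question, no other records.
    header = struct.pack("!HHHHHH", query_id, 0x0100, 1, 0, 0, 0)
    # QNAME: each label prefixed by its length, terminated by a zero byte.
    qname = b"".join(
        bytes([len(label)]) + label.encode("ascii")
        for label in name.rstrip(".").split(".")
    ) + b"\x00"
    return header + qname + struct.pack("!HH", 1, 1)  # QTYPE=A, QCLASS=IN


def dns_responds(server, name, port=53, timeout=2.0, query_id=0x1234):
    """Return True if the server answers the query within the timeout."""
    packet = build_query(name, query_id)
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        try:
            sock.sendto(packet, (server, port))
            reply, _ = sock.recvfrom(4096)
        except OSError:  # covers timeouts and refused connections
            return False
    # Any reply carrying our query ID counts as "the server is alive".
    return len(reply) >= 2 and reply[:2] == packet[:2]
```

A call such as `dns_responds("127.0.0.1", "maas.example")` (a hypothetical name) would return False against a deadlocked server that never answers. The check deliberately ignores the response code, since the goal is only to detect an unresponsive named, not to validate zone contents.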
Re: [Bug 1710278] Re: [2.3a1] named stuck on reload, DNS broken
In MAAS, we should:

* throttle reloads (at least make sure a reload is complete before we trigger the next one)
* monitor the actual service from the perspective of the rackds (perhaps have each rackd do a dig @region-controller for a name we send them whenever they talk to the region controller)
* log loudly, kill and restart when the service monitoring fails

Mark
[Bug 1710278] Re: [2.3a1] named stuck on reload, DNS broken
** Changed in: bind9 (Ubuntu) Status: New => Triaged
** Changed in: bind9 (Ubuntu) Importance: Undecided => High
[Bug 1710278] Re: [2.3a1] named stuck on reload, DNS broken
Right; to attempt to reproduce the issue, I would aggressively reload (changing the zone files each time) while at the same time sending a large number of queries to the server (for records in locally authoritative zones?).
[Bug 1710278] Re: [2.3a1] named stuck on reload, DNS broken
named is being asked to reload its zones quite frequently, sometimes within the same second:

Aug 11 16:31:08 maas named[3174]: received control channel command 'reload'
Aug 11 16:31:17 maas named[3174]: received control channel command 'reload'
Aug 11 16:31:18 maas named[3174]: received control channel command 'reload'
Aug 11 16:31:22 maas named[3174]: received control channel command 'reload'
Aug 11 16:31:26 maas named[3174]: received control channel command 'reload'
Aug 11 16:31:29 maas named[3174]: received control channel command 'reload'
Aug 11 16:31:30 maas named[3174]: received control channel command 'reload'
(...)
Aug 11 17:07:35 maas named[3174]: received control channel command 'reload'
Aug 11 17:07:35 maas named[3174]: received control channel command 'reload'

Eventually it gets stuck:

Aug 11 17:15:16 maas named[3174]: received control channel command 'reload'
Aug 11 17:15:16 maas named[3174]: loading configuration from '/etc/bind/named.conf'
Aug 11 17:15:16 maas named[3174]: reading built-in trusted keys from file '/etc/bind/bind.keys'

An idea to try to reproduce this would be to issue such aggressive reloads on a multi-core machine.
[Bug 1710278] Re: [2.3a1] named stuck on reload, DNS broken
** Tags added: server-next
[Bug 1710278] Re: [2.3a1] named stuck on reload, DNS broken
This is technically Invalid for MAAS unless there is something unsupported about how we're using BIND, but I'm marking it Triaged for now so we don't lose visibility (in case a fix in MAAS itself turns out to be required). It would be nice if the service monitoring in MAAS detected this condition, but that feels like it should be handled in a separate bug.
[Bug 1710278] Re: [2.3a1] named stuck on reload, DNS broken
Assuming the debug symbols I grabbed[1] for my install of bind9 on Xenial match yours (I have bind9 version 1:9.10.3.dfsg.P4-8ubuntu1.7 installed per "apt-cache policy bind9"), I did the following to grab a traceback:

$ sudo apt-get install bind9-dbgsym libdns162-dbgsym libisc160-dbgsym
$ gdb /usr/sbin/named core
(gdb) set pagination off
(gdb) thread apply all bt
... [2] ...

Looking at the backtrace in [2], the interesting parts to me are threads 8, 11 and 20, which are possibly involved in a deadlock[3]. Looks like one of the threads is reloading the configuration (something we would expect MAAS to do), and the other is calling dns_resolver_shutdown() via view_flushanddetach().

[1]: https://wiki.ubuntu.com/Debug%20Symbol%20Packages
[2]: http://paste.ubuntu.com/25292729/
[3]:

Thread 8 (Thread 0x7f95226aa700 (LWP 3203)):
#0 __lll_lock_wait () at ../sysdeps/unix/sysv/linux/x86_64/lowlevellock.S:135
#1 0x7f952a351efe in __GI___pthread_mutex_lock (mutex=mutex@entry=0x7f94d4014fe8) at ../nptl/pthread_mutex_lock.c:135
#2 0x7f952b7a0794 in dns_view_weakdetach (viewp=viewp@entry=0x7f9504389780) at ../../../lib/dns/view.c:597
#3 0x7f952b7993de in destroy (val=0x7f9504389750) at ../../../lib/dns/validator.c:3891
#4 0x7f952b79927b in dns_validator_destroy (validatorp=validatorp@entry=0x7f9519462628) at ../../../lib/dns/validator.c:3915
#5 0x7f952b76b9d1 in validated (task=, event=0x7f95194625d0) at ../../../lib/dns/resolver.c:4722
#6 0x7f952a9a6360 in dispatch (manager=0x7f952be3b010) at ../../../lib/isc/task.c:1130
#7 run (uap=0x7f952be3b010) at ../../../lib/isc/task.c:1302
#8 0x7f952a34f6ba in start_thread (arg=0x7f95226aa700) at pthread_create.c:333
#9 0x7f9529a993dd in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:109

Thread 11 (Thread 0x7f9520ea7700 (LWP 3206)):
#0 pthread_cond_wait@@GLIBC_2.3.2 () at ../sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:185
#1 0x7f952a9a516b in isc__task_beginexclusive (task0=) at ../../../lib/isc/task.c:1717
#2 0x557c34997dc1 in load_configuration (filename=, server=server@entry=0x7f952be44010, first_time=first_time@entry=isc_boolean_false) at ../../../bin/named/server.c:5651
#3 0x557c3499a826 in loadconfig (server=0x7f952be44010) at ../../../bin/named/server.c:7162
#4 0x557c3499ad48 in reload (server=0x7f952be44010) at ../../../bin/named/server.c:7183
#5 ns_server_reloadcommand (server=0x7f952be44010, args=args@entry=0x7f94fc120af0 "reload", text=text@entry=0x7f9520ea6590) at ../../../bin/named/server.c:7416
#6 0x557c34975db5 in ns_control_docommand (message=, text=text@entry=0x7f9520ea6590) at ../../../bin/named/control.c:102
#7 0x557c34978b97 in control_recvmessage (task=0x7f952be51010, event=) at ../../../bin/named/controlconf.c:458
#8 0x7f952a9a6360 in dispatch (manager=0x7f952be3b010) at ../../../lib/isc/task.c:1130
#9 run (uap=0x7f952be3b010) at ../../../lib/isc/task.c:1302
#10 0x7f952a34f6ba in start_thread (arg=0x7f9520ea7700) at pthread_create.c:333
#11 0x7f9529a993dd in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:109

Thread 20 (Thread 0x7f951c69e700 (LWP 3215)):
#0 __lll_lock_wait () at ../sysdeps/unix/sysv/linux/x86_64/lowlevellock.S:135
#1 0x7f952a351efe in __GI___pthread_mutex_lock (mutex=0x7f94d43429f8) at ../nptl/pthread_mutex_lock.c:135
#2 0x7f952b7642a7 in dns_resolver_shutdown (res=0x7f952be56690) at ../../../lib/dns/resolver.c:9035
#3 0x7f952b7a06a1 in view_flushanddetach (viewp=viewp@entry=0x7f94fc02cc30, flush=flush@entry=isc_boolean_false) at ../../../lib/dns/view.c:508
#4 0x7f952b7a0757 in dns_view_detach (viewp=viewp@entry=0x7f94fc02cc30) at ../../../lib/dns/view.c:557
#5 0x557c3496e55b in ns_client_endrequest (client=0x7f94fc02cbe0) at ../../../bin/named/client.c:694
#6 exit_check (client=0x7f94fc02cbe0) at ../../../bin/named/client.c:382
#7 0x557c34970150 in ns_client_detach (clientp=clientp@entry=0x7f951c69ce18) at ../../../bin/named/client.c:2833
#8 0x557c34989ee2 in query_find (client=0x0, event=0x0, event@entry=0x7f9518a862b0, qtype=, qtype@entry=0) at ../../../bin/named/query.c:8328
#9 0x557c349921cf in query_resume (task=, event=0x7f9518a862b0) at ../../../bin/named/query.c:3826
#10 0x7f952a9a6360 in dispatch (manager=0x7f952be3b010) at ../../../lib/isc/task.c:1130
#11 run (uap=0x7f952be3b010) at ../../../lib/isc/task.c:1302
#12 0x7f952a34f6ba in start_thread (arg=0x7f951c69e700) at pthread_create.c:333
#13 0x7f9529a993dd in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:109

** Also affects: bind Importance: Undecided Status: New
** Also affects: bind9 (Ubuntu) Importance: Undecided Status: New
** Changed in: maas Status: New => Triaged
** Changed in: maas Importance: Undecided => Critical