High IOWAIT when running multiple rndc addzone / delzone causing dropped queries
Hello,

I was wondering if someone on this list can help me figure this out. I am trying to run rndc addzone / delzone for many domains at once on a set of name servers. When I do, the load on the box goes very high and the process slows almost to a halt (dropping queries). Are there settings I can change so that BIND runs more efficiently?

Linux kernel: 3.1.10 x86_64 Intel(R) Xeon(R) CPU E3110 @ 3.00GHz GenuineIntel GNU/Linux

While the rndc addzone / delzone commands are running I see this:

    Cpu(s): 0.3%us, 1.8%sy, 0.0%ni, 9.7%id, 88.2%wa, 0.0%hi, 0.0%si, 0.0%st

I believe I have given named enough file handles:

    # lsof -n | grep named | wc -l
    1232
    # su - named
    named@b1123 / $ ulimit -Hn
    65536
    named@b1123 / $ ulimit -Sn
    65536

When I strace the different threads, I notice that one thread is constantly redoing the 3bf305731dd26307.nzf and reading the JNL files. Another thread is just spewing out the following as fast as possible:

    futex(0x7ff72930e07c, FUTEX_WAIT_BITSET_PRIVATE|FUTEX_CLOCK_REALTIME, 677931537, {1361978981, 779057000}, ) = -1 ETIMEDOUT (Connection timed out)
    futex(0x7ff72930e028, FUTEX_WAKE_PRIVATE, 1) = 0
    futex(0x7ff72930e07c, FUTEX_WAIT_BITSET_PRIVATE|FUTEX_CLOCK_REALTIME, 677931539, {1361978981, 89129}, ) = -1 ETIMEDOUT (Connection timed out)
    futex(0x7ff72930e028, FUTEX_WAKE_PRIVATE, 1) = 0
    futex(0x7ff72930e07c, FUTEX_WAIT_BITSET_PRIVATE|FUTEX_CLOCK_REALTIME, 677931541, {1361978982, 44187000}, ) = -1 ETIMEDOUT (Connection timed out)
    futex(0x7ff72930e028, FUTEX_WAKE_PRIVATE, 1) = 0
    futex(0x7ff72930e07c, FUTEX_WAIT_BITSET_PRIVATE|FUTEX_CLOCK_REALTIME, 677931543, {1361978982, 100744000}, ) = -1 ETIMEDOUT (Connection timed out)
    futex(0x7ff72930e028, FUTEX_WAKE_PRIVATE, 1) = 0
    futex(0x7ff72930e07c, FUTEX_WAIT_BITSET_PRIVATE|FUTEX_CLOCK_REALTIME, 677931545, {1361978982, 266914000}, ) = 0
    futex(0x7ff72930e028, FUTEX_WAIT_PRIVATE, 2, NULL) = 0
    futex(0x7ff72930e028, FUTEX_WAKE_PRIVATE, 1) = 0
    futex(0x7ff72930e07c, FUTEX_WAIT_BITSET_PRIVATE|FUTEX_CLOCK_REALTIME, 677931547, {1361978982, 186681000}, ) = -1 ETIMEDOUT (Connection timed out)
    futex(0x7ff72930e028, FUTEX_WAKE_PRIVATE, 1) = 0
    futex(0x7ff72930e07c, FUTEX_WAIT_BITSET_PRIVATE|FUTEX_CLOCK_REALTIME, 677931549, {1361978982, 206752000}, ) = -1 ETIMEDOUT (Connection timed out)
    futex(0x7ff72930e028, FUTEX_WAKE_PRIVATE, 1) = 0
    futex(0x7ff72930e07c, FUTEX_WAIT_BITSET_PRIVATE|FUTEX_CLOCK_REALTIME, 677931551, {1361978982, 226846000}, ) = -1 ETIMEDOUT (Connection timed out)
    futex(0x7ff72930e028, FUTEX_WAKE_PRIVATE, 1) = 0
    futex(0x7ff72930e07c, FUTEX_WAIT_BITSET_PRIVATE|FUTEX_CLOCK_REALTIME, 677931553, {1361978982, 24694}, ) = -1 ETIMEDOUT (Connection timed out)
    futex(0x7ff72930e028, FUTEX_WAKE_PRIVATE, 1) = 0
    futex(0x7ff72930e07c, FUTEX_WAIT_BITSET_PRIVATE|FUTEX_CLOCK_REALTIME, 677931555, {1361978982, 266914000}, ) = -1 ETIMEDOUT (Connection timed out)

Please let me know what you think I need to do.

Thank you!
-Ted

___
Please visit https://lists.isc.org/mailman/listinfo/bind-users to unsubscribe from this list

bind-users mailing list
bind-users@lists.isc.org
https://lists.isc.org/mailman/listinfo/bind-users
Re: rndc addzone|delzone: some questions
Evan,

On Sun Jan 27 2013 at 00:10:28 CET, Evan Hunt wrote:

> Delzone just means delete the zone from named, not delete the zone file
> from the filesystem. (And I reckon we can do a good deal more harm by
> deleting files you wanted to keep than by leaving files for you to delete
> yourself...)

What named giveth named may taketh :)

I understand your reasoning. I just wanted to avoid writing a cleanup process.

-JP
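For readers who do end up writing that cleanup step, here is a minimal sketch of what it might look like. It assumes the caller already knows the path that was given in the `file` clause at addzone time (named does not report it), and the function name `delzone_and_cleanup` and its `run` parameter are hypothetical, introduced only for this illustration.

```python
import subprocess
from pathlib import Path


def delzone_and_cleanup(zone, zone_file, run=subprocess.run):
    """Remove `zone` from named, then delete its zone file.

    `zone_file` is an assumption: the caller must remember which file
    was named in the original `rndc addzone` configuration.
    `run` is injectable so the logic can be exercised without rndc.
    """
    # rndc delzone only detaches the zone from the running server;
    # it leaves the zone file on disk.
    result = run(["rndc", "delzone", zone])
    if result.returncode != 0:
        # named refused or could not be reached: keep the file.
        return False
    # missing_ok (Python 3.8+) tolerates a file that was never written.
    Path(zone_file).unlink(missing_ok=True)
    return True
```

The file is removed only after rndc reports success, which follows the reasoning quoted above: deleting a file that should have been kept does more harm than leaving one behind.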
Re: rndc addzone|delzone: some questions
> 1. Is named 'deaf' during an `rndc addzone'? I don't think so, but I'm
>    finding it hard to determine definitely. I'm primarily concerned with
>    named being able to handle any NOTIFYs it gets.

The addzone task (like several other rndc commands) will temporarily acquire exclusive control of the named process so nothing else can happen at the same time. I confess I don't know whether notifies that arrive during this window would be dropped or queued... but my guess is dropped.

> 2. When I `rndc addzone ... type "slave"; ...' named immediately picks
>    that up, transfers the zone and creates the specified file. However,
>    `rndc delzone', while it drops the zone from named, does not remove
>    the zone file from the file system. Is that a bug or was that
>    implemented intentionally?
>
>    It seems a bit illogical to me that the zone file isn't removed from
>    the file system, but perhaps I'm interpreting 'delzone' too strongly? :)

Delzone just means delete the zone from named, not delete the zone file from the filesystem. (And I reckon we can do a good deal more harm by deleting files you wanted to keep than by leaving files for you to delete yourself...)

> 3. If I direct `rndc addzone|delzone' to the same named instance from
>    multiple processes (from the same source IP address), is there any
>    danger of the .nzf file being corrupted?

No. (Or, if so, it would be a serious flaw, and I haven't seen any bug reports about that.)

--
Evan Hunt -- e...@isc.org
Internet Systems Consortium, Inc.
rndc addzone|delzone: some questions
Hello,

we have a few BIND (9.9) slave servers, each slaving a couple of hundred thousand small zones (a dozen records in each). A file included into named.conf is periodically generated from a database, and named is reconfigured (rndc reconfig) to load new slave zones.

I'm considering replacing this scheme with calls to `rndc addzone' to add the slave zone to named on the fly, because we're seeing NOTIFYs going unanswered (for existing zones) while BIND is reloading.

I'd appreciate it if you could help me clarify a few things, please.

1. Is named 'deaf' during an `rndc addzone'? I don't think so, but I'm finding it hard to determine definitely. I'm primarily concerned with named being able to handle any NOTIFYs it gets.

2. When I `rndc addzone ... type "slave"; ...' named immediately picks that up, transfers the zone and creates the specified file. However, `rndc delzone', while it drops the zone from named, does not remove the zone file from the file system. Is that a bug or was that implemented intentionally?

   It seems a bit illogical to me that the zone file isn't removed from the file system, but perhaps I'm interpreting 'delzone' too strongly? :)

3. If I direct `rndc addzone|delzone' to the same named instance from multiple processes (from the same source IP address), is there any danger of the .nzf file being corrupted?

Thank you for your time.

Regards,
-JP
Re: rndc addzone|delzone
On Sun, 01 Jan 2012 13:39:21 +0100, "Carsten Strotmann (private)" wrote:

> This is an improvement over the situation without rndc addzone, where one
> had to use some kind of remote access to change the named.conf on the
> secondaries.

I have been doing it this way with my DNS systems.

Thanks.
Re: rndc addzone|delzone
On 1/1/12 1:18 PM, DNSbed.com wrote:
> On Sun, 1 Jan 2012 13:05:41 +0100, Jan-Piet Mens wrote:
>>> Has anyone tried the new features of rndc addzone|delzone with
>>> BIND-9.7? Will the zone added|deleted get transferred between
>>> master and slaves?
>>
>> No, the newly added (or deleted) zone will not be automatically
>> added to (deleted from) slave servers. (Slaves require a
>> different zone definition containing at least the master
>> servers.)
>
> Thanks for the info. If the result can't be transferred between
> master/slaves, I doubt it has practical use.

It can be used in scripting solutions on a hidden master, for example:

1) script creates new master zone file and named.conf "zone" definition, reloads hidden master DNS
2) script uses rndc addzone to add the new zone to all secondary/slave servers

This is an improvement over the situation without rndc addzone, where one had to use some kind of remote access to change the named.conf on the secondaries.

-- Carsten

Happy New Year
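Step 2 of the scheme above could be sketched roughly as follows. This is an illustration under several assumptions, not a tested deployment: every secondary is assumed to have `allow-new-zones yes;` set and to accept rndc connections from the host running the script, and the function names, host names, and addresses are hypothetical.

```python
import subprocess


def slave_zone_config(masters, zone_file):
    """Build the zone definition string passed to `rndc addzone`."""
    master_list = " ".join(f"{ip};" for ip in masters)
    return f'{{ type slave; masters {{ {master_list} }}; file "{zone_file}"; }};'


def addzone_commands(zone, secondaries, masters, zone_file):
    """Return one rndc command (as an argv list) per secondary server."""
    config = slave_zone_config(masters, zone_file)
    # `rndc -s <server>` directs the command at a specific server.
    return [["rndc", "-s", server, "addzone", zone, config]
            for server in secondaries]


def add_zone_everywhere(zone, secondaries, masters, zone_file,
                        run=subprocess.run):
    """Run addzone on every secondary; return the servers that failed."""
    failed = []
    for cmd in addzone_commands(zone, secondaries, masters, zone_file):
        if run(cmd).returncode != 0:
            failed.append(cmd[2])  # cmd[2] is the server name
    return failed
```

Passing the command as an argument list avoids shell quoting problems with the braces and double quotes in the zone definition. Returning the list of failed servers lets the calling script decide whether to retry.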
Re: rndc addzone|delzone
On Sun, 1 Jan 2012 13:05:41 +0100, Jan-Piet Mens wrote:

>> Has anyone tried the new features of rndc addzone|delzone with
>> BIND-9.7? Will the zone added|deleted get transferred between
>> master and slaves?
>
> No, the newly added (or deleted) zone will not be automatically added to
> (deleted from) slave servers. (Slaves require a different zone definition
> containing at least the master servers.)

Thanks for the info. If the result can't be transferred between master/slaves, I doubt it has practical use.

Regards.
Re: rndc addzone|delzone
> Has anyone tried the new features of rndc addzone|delzone with
> BIND-9.7?
> Will the zone added|deleted get transferred between master and slaves?

No, the newly added (or deleted) zone will not be automatically added to (deleted from) slave servers. (Slaves require a different zone definition containing at least the master servers.)

-JP
rndc addzone|delzone
Hello,

Has anyone tried the new features of rndc addzone|delzone with BIND-9.7? Will the zone added|deleted get transferred between master and slaves?

Thanks.
RE: rndc addzone/delzone in 9.7.2rc1 (was: rndc reconfig delays)
Seems to me that if you stick with this, a couple of things are necessary for manageability:

o Some command to translate a zone file name to a view/zone name, and vice-versa. That would enable people to debug based on file contents...

o A method to migrate zones from today's 'named.conf-configured' to 'named-managed'. I think this needs to be scalable to Rob's 10k* zones. Perhaps a migration renames a zone file to the new scheme, and writes a stub file with a magic token in a file with the old name to tell named to ignore the named.conf entry and look for the new file? This way, named.conf can be cleaned of the old entries at leisure...

o And, as I think I mentioned before, I'd really prefer to see this function added to the RFC2136 protocol than added under rndc. Rndc is not easy to automate reliably (as Rob notes). And of course it will drive similar non-standardized approaches in the other nameservers - which is a hassle for management tools. If you stick with rndc as the mechanism, I'd at least like to see a perl library that talks the rndc protocol and provides reliable communications and useful status. (Of course if 2136 were used, extending Net::DNS (::SEC) would make this easier.)

I have always managed my zones as dynamic - and I think DNSSEC will drive many others to do the same. I'm all in favor of making it possible to add/delete zones dynamically - but it has to be possible to manage/troubleshoot the result.

(Other interesting operations are 'rename', and perhaps 'copy')

This communication may not represent my employer's views, if any, on the matters discussed.

-----Original Message-----
From: Rob Foehl [mailto:r...@loonybin.net]
Sent: Friday, August 27, 2010 18:46
To: Evan Hunt
Cc: bind-users@lists.isc.org
Subject: Re: rndc addzone/delzone in 9.7.2rc1 (was: rndc reconfig delays)

On Fri, 27 Aug 2010, Evan Hunt wrote:

> "Non-obvious" isn't the point. We thought of having the file be named
> directly after the view, but view names are allowed to include
> characters that are forbidden in file names. Before opening the file
> we'd have to check the name's legality, ensure it doesn't include
> "../" at the beginning, etc. Rather than deal with that, I decided to
> just hash the view name, and get a guaranteed-unique, guaranteed-legal
> filename for each view.

How does this compare with the defaults for, say, the managed keys zones for each view?

In any case, 3bf305731dd26307.nzf isn't obvious, having more than one configured view will make troubleshooting more difficult for the uninitiated, and something like dynamic-zones.conf.viewname (where 'viewname' is a sanitized version of such -- say all non-alphanumerics replaced with underscores or dashes) should be simple enough.

> We needed a unique filename for each view because views can't share
> new-zone files. (In the prior version, this wasn't explicitly
> disallowed, but it caused big ugly failure modes if you tried it.)

Shouldn't named explicitly check for overlap, then? That seems in line with many of the other sanity checks named does during normal operation...

>> Why take away the ability to remove arbitrary zones from the current
>> configuration?
>
> There are two parts to removing a zone: removing it from the currently
> running server, and removing it from the configuration file so that it
> doesn't come back when you restart.
>
> The second part can only be done with zones that are in the new-zone file.
> (You wouldn't want named to be directly editing named.conf.)
>
> If you haven't done the second part, then the zone isn't really
> "removed", just temporarily disabled. I felt that if we can't do both
> parts, we shouldn't do the first. If you have a strong argument
> otherwise, though, I'm listening...

I have a process that implements very careful zone configuration management and bulk zone updates, which currently triggers per-zone rndc reloads for existing zones followed by an rndc reconfig if zones have been added or removed. The problem I've run into is that rndc reconfig is intolerably slow past 50,000 or so configured zones, and I'm trying to determine whether addzone/delzone would be a viable option.

So, I explicitly don't want named to be managing the config. Changing the current server state without touching a config would be a drop-in change here, whereas having named manage the config removes most of the visibility I have into whether or not changes were successful. The boolean error status available from rndc is insufficiently robust for this purpose, unfortunately; my process makes a number of decisions about whether or not it should retry an ope
Re: rndc addzone/delzone in 9.7.2rc1 (was: rndc reconfig delays)
On Fri, 27 Aug 2010, Evan Hunt wrote:

> "Non-obvious" isn't the point. We thought of having the file be named
> directly after the view, but view names are allowed to include
> characters that are forbidden in file names. Before opening the file
> we'd have to check the name's legality, ensure it doesn't include
> "../" at the beginning, etc. Rather than deal with that, I decided to
> just hash the view name, and get a guaranteed-unique, guaranteed-legal
> filename for each view.

How does this compare with the defaults for, say, the managed keys zones for each view?

In any case, 3bf305731dd26307.nzf isn't obvious, having more than one configured view will make troubleshooting more difficult for the uninitiated, and something like dynamic-zones.conf.viewname (where 'viewname' is a sanitized version of such -- say all non-alphanumerics replaced with underscores or dashes) should be simple enough.

> We needed a unique filename for each view because views can't share
> new-zone files. (In the prior version, this wasn't explicitly
> disallowed, but it caused big ugly failure modes if you tried it.)

Shouldn't named explicitly check for overlap, then? That seems in line with many of the other sanity checks named does during normal operation...

>> Why take away the ability to remove arbitrary zones from the current
>> configuration?
>
> There are two parts to removing a zone: removing it from the currently
> running server, and removing it from the configuration file so that it
> doesn't come back when you restart.
>
> The second part can only be done with zones that are in the new-zone file.
> (You wouldn't want named to be directly editing named.conf.)
>
> If you haven't done the second part, then the zone isn't really
> "removed", just temporarily disabled. I felt that if we can't do both
> parts, we shouldn't do the first. If you have a strong argument
> otherwise, though, I'm listening...

I have a process that implements very careful zone configuration management and bulk zone updates, which currently triggers per-zone rndc reloads for existing zones followed by an rndc reconfig if zones have been added or removed. The problem I've run into is that rndc reconfig is intolerably slow past 50,000 or so configured zones, and I'm trying to determine whether addzone/delzone would be a viable option.

So, I explicitly don't want named to be managing the config. Changing the current server state without touching a config would be a drop-in change here, whereas having named manage the config removes most of the visibility I have into whether or not changes were successful. The boolean error status available from rndc is insufficiently robust for this purpose, unfortunately; my process makes a number of decisions about whether or not it should retry an operation based on how it failed.

Of course, none of this would matter if reconfig wasn't a problem with this many zones, so I'm still interested in that question too... :)

-Rob
Re: rndc addzone/delzone in 9.7.2rc1 (was: rndc reconfig delays)
> I'm having a hard time following the motivation behind these changes. Why
> is the filename non-configurable and non-obvious?

"Non-configurable" may change.

"Non-obvious" isn't the point. We thought of having the file be named directly after the view, but view names are allowed to include characters that are forbidden in file names. Before opening the file we'd have to check the name's legality, ensure it doesn't include "../" at the beginning, etc. Rather than deal with that, I decided to just hash the view name, and get a guaranteed-unique, guaranteed-legal filename for each view.

We needed a unique filename for each view because views can't share new-zone files. (In the prior version, this wasn't explicitly disallowed, but it caused big ugly failure modes if you tried it.)

> Why take away the ability to remove arbitrary zones from the current
> configuration?

There are two parts to removing a zone: removing it from the currently running server, and removing it from the configuration file so that it doesn't come back when you restart.

The second part can only be done with zones that are in the new-zone file. (You wouldn't want named to be directly editing named.conf.)

If you haven't done the second part, then the zone isn't really "removed", just temporarily disabled. I felt that if we can't do both parts, we shouldn't do the first. If you have a strong argument otherwise, though, I'm listening...

--
Evan Hunt -- e...@isc.org
Internet Systems Consortium, Inc.
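The trade-off between the two naming schemes discussed in this thread can be illustrated with a short sketch. It is only an illustration: SHA-256 truncated to 16 hex digits is an arbitrary stand-in and is not the hash named actually uses, and both function names are hypothetical.

```python
import hashlib
import re


def sanitized_name(view):
    """Readable scheme: replace every non-alphanumeric with an underscore."""
    return "dynamic-zones.conf." + re.sub(r"[^A-Za-z0-9]", "_", view)


def hashed_name(view):
    """Hash the view name into a fixed-length hex filename.

    SHA-256 truncated to 16 hex digits is an illustrative choice only;
    it is NOT the hash function named uses.
    """
    digest = hashlib.sha256(view.encode("utf-8")).hexdigest()
    return digest[:16] + ".nzf"


# Two distinct view names that sanitize to the same readable filename.
collision = sanitized_name("a/b") == sanitized_name("a_b")
# The hashed names for the same two views differ.
distinct = hashed_name("a/b") != hashed_name("a_b")
```

Under these assumptions, sanitizing keeps the filename readable but maps the views `a/b` and `a_b` to the same file, so named would still need a separate overlap check. The hashed name is always a legal filename and distinct in practice, at the cost of being opaque when troubleshooting.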
rndc addzone/delzone in 9.7.2rc1 (was: rndc reconfig delays)
On Thu, 26 Aug 2010, Rob Foehl wrote:

> My next step is going to be to experiment with the rndc addzone/delzone
> feature in the 9.7.2 betas, which hopefully should avoid any need to
> attempt a reconfig during normal use. That aside, is there anything else
> I could be doing to speed things up?

I suppose it's fortuitous that 9.7.2rc1 was released so shortly after I'd written the above; what may have been an ideal solution is no longer so with this change:

    2936. [func] Improved configuration syntax and multiple-view
          support for addzone/delzone feature (see change
          #2930). Removed "new-zone-file" option, replaced
          with "allow-new-zones (yes|no)". The new-zone-file
          for each view is now created automatically, with
          a filename generated from a hash of the view name.
          It is no longer necessary to "include" the
          new-zone-file in named.conf; this happens
          automatically. Zones that were not added via
          "rndc addzone" can no longer be removed with
          "rndc delzone". [RT #19447]

I'm having a hard time following the motivation behind these changes. Why is the filename non-configurable and non-obvious? Why take away the ability to remove arbitrary zones from the current configuration?

This change makes this feature irrelevant when dealing with zone counts in the six figure range, as my interest here was in the ability to maintain the configuration and the server state in parallel without a reconfig.

I have not yet done any testing of the responsiveness of addzone/delzone vs. reconfig with a full set of zones loaded, so this may be entirely irrelevant anyway, but I'd love to get a better idea of the future direction of this feature before I go any further down this path.

Thanks,

-Rob