RE: Unable to query the nameserver
You should first verify that the packets are actually arriving at ns1.example.de - tcpdump should do the job. Then enable the query log and make sure that BIND sees the query. Again, the logs are your friends.

-----Original Message-----
From: Dotan Cohen [mailto:dotanco...@gmail.com]
Sent: Monday, October 04, 2010 11:09 PM
To: bind-users@lists.isc.org
Subject: Unable to query the nameserver

I am configuring BIND on two servers: ns1.example.de on a server with IP address 1.1.1.1 and ns2.example.de on a server with IP address 1.1.2.2. BIND starts fine on both servers, but when I try to configure my domain name in the registrar's control panel I get this error:

"""
Error : Unable to query the nameserver ns1.example.de
"""

Of course I have been googling this for hours and I've been reading BIND manuals for about two weeks now! I'm really stuck. Here are my configuration files:

// On 1.1.1.1
[r...@1.1.1.1]# cat /etc/named.conf
options {
    directory "/etc";
    pid-file "/var/run/named/named.pid";
    listen-on { any; };
};
zone "." {
    type hint;
    file "/etc/db.cache";
};
zone "example.de" {
    type master;
    file "/var/named/example.de.hosts";
    notify yes;
    allow-query { any; };
};
zone "example.eu" {
    type master;
    file "/var/named/example.eu.hosts";
};

[r...@1.1.1.1]# cat /var/named/example.de.hosts
$ORIGIN example.de.
$TTL 86400
example.de. IN SOA example.de. foo.example.de. (
    2010100401 ; Serial - increment me
    10800
    3600
    604800
    38400 )
    IN NS ns1.example.de.
    IN NS ns2.example.de.
    IN A 1.1.1.1
www IN A 1.1.1.1
ns1 IN A 1.1.1.1
ns2 IN A 1.1.2.2

// On 1.1.2.2
[r...@1.1.2.2]# cat /etc/named.conf
options {
    directory "/etc";
    pid-file "/var/run/named/named.pid";
    listen-on { any; };
};
zone "." {
    type hint;
    file "/etc/db.cache";
};
zone "example.de" {
    type slave;
    masters { 1.1.1.1; };
    allow-update { 1.1.1.1; };
    file "/var/named/example.de.hosts";
    notify yes;
    allow-query { any; };
    allow-notify { 1.1.2.2; };
};

[r...@1.1.2.2]# cat /var/named/example.de.hosts
$ORIGIN example.de.
$TTL 86400
example.de. IN SOA example.de. foo.example.de. (
    2010100401 ; Serial - increment me
    10800
    3600
    604800
    38400 )
    IN NS ns2.example.de.
ns2 IN A 1.1.2.2

Of course, when I make a change to a hosts file I increment the serial number and restart bind. I also restart bind after making a change to named.conf.

What am I doing wrong? Thanks!

--
Dotan Cohen
http://gibberish.co.il
http://what-is-what.com
___
bind-users mailing list
bind-users@lists.isc.org
https://lists.isc.org/mailman/listinfo/bind-users
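The checks suggested in the reply above could be sketched roughly as follows. This is only an outline, assuming a Linux host with tcpdump, rndc and dig installed; the function names (dns_filter, watch_dns, toggle_querylog, probe_soa) are made up for illustration, and the IP address and zone are the placeholders used in the thread.

```shell
#!/bin/sh
# Placeholders taken from the thread; adjust to the real server and zone.
NS_IP="1.1.1.1"
ZONE="example.de"

# Build a capture filter for DNS traffic to or from one host.
# "port 53" matches both UDP and TCP.
dns_filter() {
    printf 'port 53 and host %s\n' "$1"
}

# Step 1 (as root, on ns1): watch whether the registrar's queries arrive at all.
# "-i any" is Linux-specific; name a single interface on other systems.
watch_dns() {
    tcpdump -n -i any "$(dns_filter "$NS_IP")"
}

# Step 2 (on ns1): toggle BIND's query logging, then watch the log
# while the registrar check is repeated.
toggle_querylog() {
    rndc querylog
}

# Step 3 (from an outside host): ask the server directly, without recursion,
# to see whether it answers for the zone.
probe_soa() {
    dig @"$NS_IP" "$ZONE" SOA +norecurse
}
```

If watch_dns shows nothing while the registrar check runs, the packets are being dropped before they reach BIND (a firewall is a likely suspect); if packets arrive but no query is logged, the problem is more likely in the named configuration.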
RE: Bind won't start: /etc/named.conf
What do the logs say? Is the server chrooted or not? Also, I think you want to use "type slave;" for that zone if this is a secondary server.

-----Original Message-----
From: Dotan Cohen [mailto:dotanco...@gmail.com]
Sent: Wednesday, September 29, 2010 12:53 AM
To: Imri Zvik
Cc: bind-users@lists.isc.org
Subject: Re: Bind won't start: /etc/named.conf

On Tue, Sep 28, 2010 at 23:49, Imri Zvik wrote:
> What are you trying to achieve? An empty named.conf file means named will
> use defaults for everything, and will probably just work out-of-the-box (as
> a simple resolver) so you should give more information about the goal and
> problem (including log entries, troubleshooting data etc.).
>

The goal is for the server to be the second name server for a FQDN. This is the relevant zone file:

[r...@venus ~]# cat /var/named/example.de.hosts
$ORIGIN example.de.
$TTL 86400
example.de. IN SOA example.de. foo.example.de. (
    2010092801 ; Serial - increment me
    10800
    3600
    604800
    38400 )
    IN NS ns2.example.de.
ns2 IN A x.x.x.168

This is the non-working named.conf that I pieced together from other working files on other servers:

[r...@venus ~]# cat /etc/named.conf
options {
    directory "/etc";
    pid-file "/var/run/named/named.pid";
    listen-on { any; };
};
zone "." {
    type hint;
    file "/etc/db.cache";
};
zone "example.de" {
    type master;
    file "/var/named/example.de.hosts";
};

--
Dotan Cohen
http://gibberish.co.il
http://what-is-what.com
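One way to answer the chroot question is sketched below. It assumes the Red Hat convention of a ROOTDIR= setting in /etc/sysconfig/named, which may not hold on every install; named_chroot is a hypothetical helper name, not part of BIND.

```shell
#!/bin/sh
# Print the chroot directory configured for named, or an empty line if none.
# Assumption: the init script reads ROOTDIR from /etc/sysconfig/named.
named_chroot() {
    conf="${1:-/etc/sysconfig/named}"
    # No readable settings file: treat as "not chrooted" and print nothing.
    [ -r "$conf" ] || return 0
    # Source the file in a subshell so its variables do not leak out.
    ( ROOTDIR=""; . "$conf"; printf '%s\n' "$ROOTDIR" )
}
```

If this prints a directory, named looks for its files under that directory rather than under the real /etc and /var/named. Either way, running named-checkconf against the named.conf that is actually in use reports syntax errors, and on CentOS the named startup messages usually end up in /var/log/messages.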
RE: Bind won't start: /etc/named.conf
What are you trying to achieve? An empty named.conf file means named will use defaults for everything, and will probably just work out-of-the-box (as a simple resolver), so you should give more information about the goal and the problem (including log entries, troubleshooting data etc.).

-----Original Message-----
From: Dotan Cohen [mailto:dotanco...@gmail.com]
Sent: Tuesday, September 28, 2010 11:11 PM
To: bind-users@lists.isc.org
Subject: Bind won't start: /etc/named.conf

I have just installed bind on a CentOS 5 machine but it won't start without /etc/named.conf:

[r...@venus etc]# /etc/init.d/named start
Locating //etc/named.conf failed: [FAILED]
[r...@venus etc]# touch /etc/named.conf
[r...@venus etc]# /etc/init.d/named start
Starting named: [ OK ]

Now, a blank named.conf isn't helpful, but I cannot use the named.conf from another server as a template because it references other files (specifically /etc/db.cache). What is the "default" named.conf file for CentOS? I have tried to google for it but have not been able to find something that works.

Thanks in advance.

--
Dotan Cohen
http://gibberish.co.il
http://what-is-what.com
RE: File Descriptor limit and malfunction bind
Hi :)

While I agree with you that 4096 should be sufficient (what is your definition of a highly loaded server?), there are a couple of situations where a server might use more sockets than it normally would:

1. A DoS attack.
2. Higher latency while resolving recursive queries.
3. A server with a flushed/unprimed cache.

I think that the main issue here is why BIND freezes when it runs out of sockets. Bottom line, even if there is another, transient, issue which is causing the higher socket usage, raising the limits will at least help avoid the hang.

Regarding epoll - I already mentioned that epoll is the immediate suspect for this and some other issues in 9.4.3 (see my "9.4.3 oddities" thread). Please note that I've tried that myself (recompiling with --disable-epoll) on 9.4.3-P*, and ran into this error:

05-Jan-2010 20:54:33.798 general: critical: socket.c:3138: fatal error:
05-Jan-2010 20:54:33.806 general: critical: exiting (due to fatal error in library)

Also, my server returned a lot of SERVFAIL errors. At the time I was more interested in getting my service back to acceptable levels than in debugging/troubleshooting this issue, so I downgraded to 9.4.2, which worked flawlessly.

-----Original Message-----
From: JINMEI Tatuya / 神明達哉 [mailto:jin...@isc.org]
Sent: Friday, January 08, 2010 8:55 AM
To: Imri Zvik
Cc: bind-users@lists.isc.org
Subject: Re: File Descriptor limit and malfunction bind

At Tue, 05 Jan 2010 10:36:27 +0200, Imri Zvik wrote:

> > i have a high load DNS server running bind 9.4.3 on RH -
> > yesterday we experienced a problem with the bind (the bind froze) , and
> > when looking at the logs i saw the following error :
> > named error: socket: file descriptor exceeds limit (4096/4096)
> > i looked at my OS file descriptor limit and using ulimit -n - 1024 .
> > where the number 4096 come from?

It's the hard-coded default maximum number of file descriptor (which is nearly equal to the maximum allowable number of open sockets).

> If I'm not mistaken, you should either recompile with a higher value for
> ISC_SOCKET_MAXSOCKETS or restart named with the -S argument.

I'm afraid it's yes and no. Yes, you can raise the hard coded default value by the -S command line option. (I'm afraid) no, I suspect it won't solve the problem. From my past experiences, 4096 should be sufficient even for a very busy server. If it still consumes all available sockets, it's more likely to mean there's some unexpected serious error (bug) which can't be mitigated by raising that limit.

I've heard of similar reports (seemingly consuming all available sockets and named "freezes"), but unfortunately I couldn't reproduce it myself and since it seems to be quite rare I've not figured out the problem.

One possible workaround one may want to try is to *disable* epoll, the efficient version of I/O API for Linux:

./configure --disable-epoll

This means named will use the inefficient API of select, but depending on the machine power and the server load, it may provide acceptable performance and rather stabler behavior as select is (seemingly) stabler API.

---
JINMEI, Tatuya
Internet Systems Consortium, Inc.
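To see how close a running named actually is to the limit discussed above, something along these lines may help. It is a rough sketch, assuming Linux with /proc mounted and enough privileges to read the process's fd directory; fd_count and fd_report are hypothetical helper names, and the 4096 default simply mirrors the hard-coded maximum mentioned in this thread.

```shell
#!/bin/sh
# Count the file descriptors currently open by a process (Linux /proc only).
# Prints 0 if the directory cannot be read.
fd_count() {
    ls "/proc/$1/fd" 2>/dev/null | wc -l | tr -d ' '
}

# Report descriptor usage for a process against a limit.
# The default of 4096 is the hard-coded maximum quoted in the thread.
fd_report() {
    pid="$1"
    limit="${2:-4096}"
    used=$(fd_count "$pid")
    echo "pid $pid: $used of $limit descriptors in use"
}
```

For example, fd_report "$(pidof named)" (assuming a single named process) run periodically would show whether usage creeps towards the limit before a freeze. Raising the limit itself is done with the -S option mentioned above, e.g. starting named with -S 8192, where 8192 is just an illustrative value.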
Re: 9.4.3 oddities
On Wednesday 06 January 2010 12:49:46 Cathy Almond wrote:
> That's what I think is possibly happening in your case - one potential
> contributing factor being the configuration settings I suggested you
> check for. Somewhat obscure - sorry :-/

No need to be sorry - thank you for taking the time to try and help :)

Anyhow, I don't define anything with *source* in my configuration, and everything is OK, with the exact same configuration, in 9.4.2-P2...
Re: 9.4.3 oddities
On Wednesday 06 January 2010 11:56:13 Cathy Almond wrote:
> Do you use any of the following in your configuration:
>
> transfer-source
> transfer-source-v6
> notify-source
> notify-source-v6
> query-source
> query-source-v6

No :) My configuration is '*source*' free. And anyhow, even if I had it in my configuration, it still wouldn't explain the 'rndc reconfig' oddity.
9.4.3 oddities
Hi,

We've recently upgraded our caching servers to 9.4.3-P4/P3 (2 of them running 9.4.3-P4 and 2 running 9.4.3-P3). A few days ago I noticed something strange: when the server is loaded, some queries randomly fail (SERVFAIL). It seems that only queries for which the answer is NOT cached are affected. I've verified with host/dig and tcpdump that there is no network issue (no unanswered packets).

Digging deeper into the issue, I found that the problem appears when the number of sockets used by named approaches roughly 1024 (checked with netstat/lsof). The weirdest part is that if I run "rndc reconfig", named is suddenly able to use more than 1024 sockets (I've seen it using roughly 4000-5000 sockets), and the problem goes away for about an hour. If I downgrade to 9.4.2-P2 the problem goes away.

I used the following command to reproduce the problem:

for i in {1..10}; do dig mx www.cnn.com @localhost |grep status |grep -v NOERROR; done

My servers are running RHEL 5.4 (2.6.18-164.9.1.el5) and FreeBSD 7.0 (the problem is seen on both), and they are split across two unrelated networks in two separate physical locations.

I've compiled bind from the vanilla ISC sources using the following configure command:

./configure --enable-threads --enable-largefile --prefix=/usr/local

I've also tried the following (I've also raised the OS limits, of course), as I was seeing the "general: error: socket: file descriptor exceeds limit (4096/4096)" error a couple of days ago:

STD_CDEFINES="-DISC_SOCKET_FDSETSIZE=1048576" ./configure --enable-threads --enable-largefile --prefix=/usr/local

My best guess is that the problem is related to the recent move to epoll...

Any ideas on how I should proceed from here?
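The reproduction loop above only prints the failing status lines; a variant that counts them makes runs easier to compare between versions. This is a minimal sketch assuming dig's usual ";; ->>HEADER<<- ... status: X, ..." header line; dig_status and count_failures are hypothetical helper names.

```shell
#!/bin/sh
# Extract the response code (NOERROR, SERVFAIL, ...) from dig output on stdin.
# Prints nothing if no header line is present, e.g. when dig timed out.
dig_status() {
    sed -n 's/.*status: \([A-Z]*\),.*/\1/p'
}

# Query NAME for its MX records at SERVER, TRIES times, and print how many
# attempts did not come back as NOERROR. Timeouts count as failures too,
# because they yield an empty status.
count_failures() {
    name="$1"
    server="$2"
    tries="$3"
    fails=0
    i=0
    while [ "$i" -lt "$tries" ]; do
        status=$(dig @"$server" "$name" MX | dig_status)
        [ "$status" = "NOERROR" ] || fails=$((fails + 1))
        i=$((i + 1))
    done
    echo "$fails"
}
```

For instance, count_failures www.cnn.com localhost 10 mirrors the original loop and prints a single number, which can be recorded before and after an "rndc reconfig" or a downgrade.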
Re: File Descriptor limit and malfunction bind
On Sunday 03 January 2010 16:36:06 Ram Akuka wrote:
> i have a high load DNS server running bind 9.4.3 on RH -
> yesterday we experienced a problem with the bind (the bind froze) , and
> when looking at the logs i saw the following error :
> named error: socket: file descriptor exceeds limit (4096/4096)
> i looked at my OS file descriptor limit and using ulimit -n - 1024 .
> where the number 4096 come from?

If I'm not mistaken, you should either recompile with a higher value for ISC_SOCKET_MAXSOCKETS or restart named with the -S argument.
bind 9.6.1 under perform after running for a couple of hours
Hi,

After a couple of hours, the performance of bind 9.6.1 suddenly drops. While the server remains responsive, the response time increases, the rate of failed queries increases, and CPU usage and load average increase. Restarting named solves the problem.

I cannot find anything useful in the logs, but a quick search in this mailing list archive shows that other users reported somewhat similar problems with this version of BIND :(

The operating system is Linux (Linux ns1 2.6.18-128.el5 #1 SMP Wed Dec 17 11:41:38 EST 2008 x86_64 x86_64 x86_64 GNU/Linux), Red Hat Enterprise Linux Server release 5.3 (Tikanga).

Output of named -V:

BIND 9.6.1 built with '--enable-threads' '--enable-largefile' '--prefix=/usr/local'

/usr/local/sbin/named: ELF 64-bit LSB executable, AMD x86-64, version 1 (SYSV), for GNU/Linux 2.6.9, dynamically linked (uses shared libs), for GNU/Linux 2.6.9, not stripped

It is important to state that we just upgraded from 9.4.3-P2.

Any ideas?