RE: Unable to query the nameserver

2010-10-04 Thread Imri Zvik
You should first verify that you see the packets arriving at ns1.example.de
- tcpdump should do the job.
Then, enable the query log and ensure that BIND sees the query.
Again, the logs are your friends.
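
As a rough sketch only, the two checks could look something like this. The
interface name "eth0" is an assumption (use the server's real interface), and
the helper function names are made up for illustration; tcpdump, rndc
querylog and dig are the standard tools.

```shell
# Hypothetical helpers; the first two are meant to run as root on ns1.example.de.

# 1. Watch for DNS packets reaching the box (UDP and TCP port 53).
#    "eth0" is an assumed default interface name.
watch_dns_packets() {
    tcpdump -n -i "${1:-eth0}" port 53
}

# 2. Toggle BIND's query logging so named records each query it receives.
toggle_query_log() {
    rndc querylog
}

# 3. From an outside machine, ask the server directly (no recursion)
#    for the zone's SOA record. $1 = server address, $2 = zone name.
query_soa() {
    dig +norecurse @"$1" "$2" SOA
}
```

For example, run watch_dns_packets in one terminal and
"query_soa 1.1.1.1 example.de" from another host. If tcpdump shows nothing,
the problem is probably in front of BIND (firewall, routing); if packets
arrive but nothing shows up in the query log, look at named itself.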


-Original Message-
From: Dotan Cohen [mailto:dotanco...@gmail.com] 
Sent: Monday, October 04, 2010 11:09 PM
To: bind-users@lists.isc.org
Subject: Unable to query the nameserver

I am configuring BIND on two servers: ns1.example.de on a server with
IP address 1.1.1.1 and ns2.example.de on a server with IP address
1.1.2.2. BIND starts fine on both servers, but when I try to configure
my domain name in the registrar's control panel I get this error:
"""
Error : Unable to query the nameserver ns1.example.de
"""

Of course I have been googling this for hours and I've been reading
BIND manuals for about two weeks now! I'm really stuck. Here are my
configuration files:

// On 1.1.1.1
[r...@1.1.1.1]# cat /etc/named.conf
options {
directory "/etc";
pid-file "/var/run/named/named.pid";
listen-on {
any;
};
};

zone "." {
type hint;
file "/etc/db.cache";
};

zone "example.de" {
type master;
file "/var/named/example.de.hosts";
notify yes;
allow-query { any; };
};
zone "example.eu" {
type master;
file "/var/named/example.eu.hosts";
};
[r...@1.1.1.1]# cat /var/named/example.de.hosts
$ORIGIN example.de.
$TTL 86400
example.de. IN  SOA example.de. foo.example.de. (
2010100401; Serial - increment me
10800
3600
604800
38400 )
        IN  NS  ns1.example.de.
        IN  NS  ns2.example.de.
        IN  A   1.1.1.1
www     IN  A   1.1.1.1
ns1     IN  A   1.1.1.1
ns2     IN  A   1.1.2.2




// On 1.1.2.2
[r...@1.1.2.2]# cat /etc/named.conf
options {
directory "/etc";
pid-file "/var/run/named/named.pid";
listen-on {
any;
};
};

zone "." {
type hint;
file "/etc/db.cache";
};

zone "example.de" {
type slave;
masters { 1.1.1.1; };
allow-update { 1.1.1.1; };
file "/var/named/example.de.hosts";
notify yes;
allow-query { any; };
allow-notify { 1.1.2.2; };
};
[r...@1.1.2.2]# cat /var/named/example.de.hosts
$ORIGIN example.de.
$TTL 86400
example.de. IN  SOA example.de. foo.example.de. (
2010100401; Serial - increment me
10800
3600
604800
38400 )
        IN  NS  ns2.example.de.
ns2     IN  A   1.1.2.2




Of course, when I make a change to a hosts file I increment the serial
number and restart bind. I also restart bind after making a change to
named.conf. What am I doing wrong? Thanks!

-- 
Dotan Cohen

http://gibberish.co.il
http://what-is-what.com
___
bind-users mailing list
bind-users@lists.isc.org
https://lists.isc.org/mailman/listinfo/bind-users



RE: Bind won't start: /etc/named.conf

2010-10-02 Thread Imri Zvik
What do the logs say?
Is the server chrooted or not?
And I think you want to use "type slave;" for that zone, if this is a secondary 
server.



-Original Message-
From: Dotan Cohen [mailto:dotanco...@gmail.com] 
Sent: Wednesday, September 29, 2010 12:53 AM
To: Imri Zvik
Cc: bind-users@lists.isc.org
Subject: Re: Bind won't start: /etc/named.conf

On Tue, Sep 28, 2010 at 23:49, Imri Zvik  wrote:
> What are you trying to achieve? An empty named.conf file means named will
> use defaults for everything, and will probably just work out-of-the-box (as
> a simple resolver) so you should give more information about the goal and
> problem (including log entries, troubleshooting data etc.).
>

The goal is for the server to be the second name server for an FQDN.
This is the relevant zone file:

[r...@venus ~]# cat /var/named/example.de.hosts
$ORIGIN example.de.
$TTL 86400
example.de. IN  SOA example.de. foo.example.de. (
2010092801; Serial - increment me
10800
3600
604800
38400 )
        IN  NS  ns2.example.de.
ns2     IN  A   x.x.x.168



This is the non-working named.conf that I pieced together from working
files on other servers:

[r...@venus ~]# cat /etc/named.conf
options {
directory "/etc";
pid-file "/var/run/named/named.pid";
listen-on {
any;
};
};

zone "." {
type hint;
file "/etc/db.cache";
};

zone "example.de" {
type master;
file "/var/named/example.de.hosts";
};


-- 
Dotan Cohen

http://gibberish.co.il
http://what-is-what.com



RE: Bind won't start: /etc/named.conf

2010-09-28 Thread Imri Zvik
What are you trying to achieve? An empty named.conf file means named will
use defaults for everything, and will probably just work out-of-the-box (as
a simple resolver) so you should give more information about the goal and
problem (including log entries, troubleshooting data etc.).



-Original Message-
From: Dotan Cohen [mailto:dotanco...@gmail.com] 
Sent: Tuesday, September 28, 2010 11:11 PM
To: bind-users@lists.isc.org
Subject: Bind won't start: /etc/named.conf

I have just installed bind on a CentOS 5 machine but it won't start
without /etc/named.conf:

[r...@venus etc]# /etc/init.d/named start
Locating //etc/named.conf failed:
   [FAILED]
[r...@venus etc]# touch /etc/named.conf
[r...@venus etc]# /etc/init.d/named start
Starting named:[  OK  ]

Now, a blank named.conf isn't helpful, but I cannot use the named.conf
from another server as a template because it references other files
(specifically /etc/db.cache). What is the "default" named.conf file
for CentOS? I have tried to google for it but have not been able to
find something that works.

Thanks in advance.

-- 
Dotan Cohen

http://gibberish.co.il
http://what-is-what.com


RE: File Descriptor limit and malfunction bind

2010-01-08 Thread Imri Zvik
Hi :)

While I agree with you that 4096 should be sufficient (what is your definition 
of a highly loaded server?), there are a couple of situations where a server 
might use more sockets than it normally would:

1. A DoS attack.
2. Higher latency while resolving recursive queries.
3. A server with a flushed/unprimed cache.

I think the main issue here is why bind freezes when it runs out of sockets.

Bottom line: even if there is another, transient issue causing the higher 
socket usage, raising the limits will at least help avoid the hang.

Regarding epoll - I already mentioned that epoll is the immediate suspect for 
this and some other issues in 9.4.3 (see my 9.4.3 oddities thread).

Please note that I've tried that myself (recompiling with --disable-epoll) on 
9.4.3-P*, and ran into this error:
05-Jan-2010 20:54:33.798 general: critical: socket.c:3138: fatal error:
05-Jan-2010 20:54:33.806 general: critical: exiting (due to fatal error in library)

Also, my server returned a lot of SERVFAIL errors.

At the time I was more interested in getting my service back to acceptable 
levels than debugging/troubleshooting this issue, so I downgraded to 9.4.2, 
which worked flawlessly.



-Original Message-
From: JINMEI Tatuya / 神明達哉 [mailto:jin...@isc.org] 
Sent: Friday, January 08, 2010 8:55 AM
To: Imri Zvik
Cc: bind-users@lists.isc.org
Subject: Re: File Descriptor limit and malfunction bind

At Tue, 05 Jan 2010 10:36:27 +0200,
Imri Zvik  wrote:

> > i have a high load DNS server running bind 9.4.3 on RH -
> > yesterday we experienced a problem with the bind  (the bind froze) , and
> > when looking at the logs i saw the following error :
> > named error: socket: file descriptor exceeds limit (4096/4096)
> > i looked at my OS file descriptor limit and using ulimit -n   - 1024 .
> > where the number 4096 come from?

It's the hard-coded default maximum number of file descriptors (which
is nearly equal to the maximum allowable number of open sockets).

> If I'm not mistaken, you should either recompile with a higher value for 
> ISC_SOCKET_MAXSOCKETS or restart named with the -S  argument.

I'm afraid it's yes and no.  Yes, you can raise the hard coded default
value by the -S command line option.  (I'm afraid) no, I suspect it
won't solve the problem.  From my past experiences, 4096 should be
sufficient even for a very busy server.  If it still consumes all
available sockets, it's more likely to mean there's some unexpected
serious error (bug) which can't be mitigated by raising that limit.

I've heard of similar reports (seemingly consuming all available
sockets and named "freezes"), but unfortunately I couldn't reproduce
it myself and since it seems to be quite rare I've not figured out the
problem.

One possible workaround one may want to try is to *disable* epoll, the
efficient version of I/O API for Linux:
./configure --disable-epoll

This means named will use the less efficient select API, but depending
on the machine power and the server load, it may provide acceptable
performance and more stable behavior, as select is (seemingly) a more
stable API.

---
JINMEI, Tatuya
Internet Systems Consortium, Inc.


Re: 9.4.3 oddities

2010-01-06 Thread Imri Zvik
On Wednesday 06 January 2010 12:49:46 Cathy Almond wrote:
> That's what I think is possibly happening in your case - one potential
> contributing factor being the configuration settings I suggested you
> check for.  Somewhat obscure - sorry :-/

No need to be sorry - thank you for taking the time to try and help :)

Anyhow, I don't define anything with *source* in my configuration, and 
everything is OK, with the exact same configuration, in 9.4.2-P2...








Re: 9.4.3 oddities

2010-01-06 Thread Imri Zvik
On Wednesday 06 January 2010 11:56:13 Cathy Almond wrote:
> Do you use any of the following in your configuration:
>
> transfer-source
> transfer-source-v6
> notify-source
> notify-source-v6
> query-source
> query-source-v6

No :) my configuration is '*source*' free. And anyhow, even if I had any of 
them in my configuration, it still wouldn't explain the 'rndc reconfig' oddity.




9.4.3 oddities

2010-01-05 Thread Imri Zvik
Hi,

We've recently upgraded our caching servers to 9.4.3-P4/P3 (2 of them running 
9.4.3-P4 and 2 running 9.4.3-P3). A few days ago I noticed something 
strange: when the server is loaded, some queries randomly fail (SERVFAIL). 
It seems that only queries for which the answer is NOT cached are affected.
I've verified with host/dig and tcpdump that there is no network issue (no 
unanswered packets). Digging deeper, I found that the issue appears when the 
number of sockets used by named approaches roughly 1024 (checked with 
netstat/lsof). The weirdest part is that if I run "rndc reconfig", named is 
suddenly able to use more than 1024 sockets (I've seen it using roughly 
4000-5000 sockets), and the problem goes away for about an hour.
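
A rough sketch of how such a count can be taken on the Linux boxes, assuming 
/proc is available. The helper names are made up for illustration, and this 
is not necessarily the exact netstat/lsof invocation used for the numbers 
above:

```shell
# Count all file descriptors a process currently holds (Linux /proc only).
# Hypothetical helper; $1 is a PID. Reading another user's process needs root.
count_fds() {
    ls "/proc/$1/fd" 2>/dev/null | wc -l
}

# Count only the descriptors that are sockets; in /proc/PID/fd these are
# symlinks whose target looks like "socket:[inode]".
count_sockets() {
    ls -l "/proc/$1/fd" 2>/dev/null | grep -c 'socket:'
}
```

For example: count_sockets "$(pidof named)", repeated before and after 
"rndc reconfig". This does not apply to the FreeBSD servers.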

If I downgrade to 9.4.2-P2 the problem goes away.

I used the following command to reproduce the problem:
for i in {1..10}; do dig mx www.cnn.com @localhost | grep status | grep -v NOERROR; done
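
To put a number on the failure rate rather than eyeballing the loop above, 
the dig output can be tallied per status. This is only a sketch: the helper 
name is made up, and it assumes dig's usual header line of the form 
";; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: ...".

```shell
# Tally the "status:" field of dig header lines read on stdin.
# Hypothetical helper: prints one "COUNT STATUS" line per distinct status.
tally_status() {
    sed -n 's/.*status: \([A-Z]*\),.*/\1/p' | sort | uniq -c | awk '{print $1, $2}'
}

# Possible use (100 is an arbitrary number of queries):
#   for i in $(seq 1 100); do dig mx www.cnn.com @localhost; done | tally_status
```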

My servers are running RHEL 5.4 (2.6.18-164.9.1.el5) and FreeBSD 7.0 (the 
problem is seen on both), and they are split across two unrelated networks 
in two separate physical locations.

I've compiled bind from the vanilla ISC sources using the following configure 
command:

./configure --enable-threads --enable-largefile --prefix=/usr/local

I've also tried the following (I've also raised the OS limits, of course), 
because I was seeing the "general: error: socket: file descriptor exceeds 
limit (4096/4096)" error a couple of days ago:
STD_CDEFINES="-DISC_SOCKET_FDSETSIZE=1048576" ./configure --enable-threads --enable-largefile --prefix=/usr/local

My best guess is that the problem is related to the recent move to epoll...

Any ideas on how I should proceed from here? 


Re: File Descriptor limit and malfunction bind

2010-01-05 Thread Imri Zvik
On Sunday 03 January 2010 16:36:06 Ram Akuka wrote:
> i have a high load DNS server running bind 9.4.3 on RH -
> yesterday we experienced a problem with the bind  (the bind froze) , and
> when looking at the logs i saw the following error :
> named error: socket: file descriptor exceeds limit (4096/4096)
> i looked at my OS file descriptor limit and using ulimit -n   - 1024 .
> where the number 4096 come from?

If I'm not mistaken, you should either recompile with a higher value for 
ISC_SOCKET_MAXSOCKETS or restart named with the -S  argument.
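
Something along these lines, as a sketch only. The value 8192 and the 
"-u named" user are assumptions for illustration, not recommendations, and 
the helper name is made up:

```shell
# Compare the shell's open-file limit with the socket ceiling we would
# like named to use. Hypothetical helper: $1 is the desired maximum;
# prints "ok" or "raise ulimit".
check_fd_limit() {
    want=$1
    have=$(ulimit -n)
    if [ "$have" = "unlimited" ] || [ "$have" -ge "$want" ]; then
        echo "ok"
    else
        echo "raise ulimit"
    fi
}

# Example start-up, with 8192 as an arbitrary illustrative value:
#   ulimit -n 8192 && named -u named -S 8192
```

Whether the OS limit also has to be raised by hand probably depends on how 
named is started on your system, so treat the ulimit part as something to 
verify rather than a given.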


bind 9.6.1 underperforms after running for a couple of hours

2009-07-08 Thread Imri Zvik
Hi,

 

After a couple of hours, performance of bind 9.6.1 suddenly drops. While the
server remains responsive, the response time increases, the rate of the
failed queries increases, and CPU/load average usage increases. Restarting
named solves the problem.

 

I cannot find anything useful in the logs, but a quick search in this
mailing list archive shows that other users reported somewhat similar
problems with this version of BIND :(

 

The operating system is Linux (Linux ns1 2.6.18-128.el5 #1 SMP Wed Dec 17
11:41:38 EST 2008 x86_64 x86_64 x86_64 GNU/Linux) , Red Hat Enterprise Linux
Server release 5.3 (Tikanga).

 

Output of named -V:

BIND 9.6.1 built with '--enable-threads' '--enable-largefile'
'--prefix=/usr/local'

 

/usr/local/sbin/named: ELF 64-bit LSB executable, AMD x86-64, version 1
(SYSV), for GNU/Linux 2.6.9, dynamically linked (uses shared libs), for
GNU/Linux 2.6.9, not stripped

 

It is important to state that we just upgraded from 9.4.3-P2.

 

Any ideas?