Re: Bind 9.7.0-P1 socket: file descriptor exceeds limit / assertion failure

2010-05-03 Thread JINMEI Tatuya / 神明達哉
At Thu, 29 Apr 2010 14:53:44 -0700,
Dale Kiefling wrote:

> We have a Bind 9.7.0-P1 instance that is throwing the following errors:
> 21-Apr-2010 16:59:00.173 general: error: socket: file descriptor exceeds limit (1024/1024)

The fact that the FD limit is 1024 suggests your named uses select
instead of epoll.  As far as I know Linux kernel 2.6 should support
epoll, so your named may have been built with --disable-epoll.  What's
the result of named -V?

> $ uname -a
> Linux ha1.example.com 2.6.18-128.1.10.el5PAE #1 SMP Thu May 7 11:14:31 EDT 2009 i686 athlon i386 GNU/Linux

For a busy recursive server that could consume more than 1024 open
sockets, select won't work well anyway.  Even if you increase the FD
limit, it's quite likely that the server will hit other scalability
issues.  So, if your named was built with --disable-epoll, I'd suggest
rebuilding it with epoll enabled (which should be the default on your
Linux system) and trying again.
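
As a rough illustration of the check suggested above, the following
shell sketch looks for --disable-epoll in the configure arguments that
named -V prints.  The function name check_epoll_flag and its messages
are made up for this example; it reads the named -V text on stdin so
it can also be run against saved output.

```shell
# Report whether the configure arguments printed by `named -V` include
# --disable-epoll. Reads the `named -V` text on stdin.
check_epoll_flag() {
    # -e keeps grep from treating the leading dashes as an option.
    if grep -q -e '--disable-epoll'; then
        echo "built with --disable-epoll (select-based socket code likely)"
    else
        echo "no --disable-epoll in configure arguments"
    fi
}

# Usage on the server itself:
#   named -V | check_epoll_flag
```

If the flag is absent but the 1024 limit still shows up in the logs,
the configure arguments alone do not explain it, and the full named -V
output is worth posting.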

In any case, the assertion failure must be a bug, but right now I
have no idea how it happened.

---
JINMEI, Tatuya
Internet Systems Consortium, Inc.
___
bind-users mailing list
bind-users@lists.isc.org
https://lists.isc.org/mailman/listinfo/bind-users


Re: Bind 9.7.0-P1 socket: file descriptor exceeds limit / assertion failure

2010-05-01 Thread Ezra Taylor
Dale:

Sorry, I emailed you directly.  I'm sending my response to
the group.


Dale:
The limits.conf file only sets the hard and soft limits when
you log in.  Once you log out, the open file limit goes back to its
default value.  Read the man page for limits.conf.  The issue below has
caught all of us.


*Excerpt is below*

 In general, individual limits have priority over group limits, so if
   you impose no limits for admin group, but one of the members in this
   group have a limits line, the user will have its limits set according
   to this line.

   Also, please note that all limit settings are set per login. They are
   not global, nor are they permanent; existing only for the duration of
   the session.
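
Because these limits apply per login session, a named started at boot
does not necessarily run with what ulimit reports in your shell.  One
way to see what a running process actually got is to read
/proc/<pid>/limits, sketched below under the assumption that the kernel
provides that file (older kernels may not).  The function name
open_files_limit is made up for this example.

```shell
# Print the soft and hard "Max open files" limits of a running process,
# read from /proc/<pid>/limits (Linux only; assumes the kernel provides
# that file).
open_files_limit() {
    # Columns in that file: "Max open files", soft limit, hard limit, units.
    awk '/^Max open files/ { print "soft=" $4, "hard=" $5 }' "/proc/$1/limits"
}

# Usage, assuming the pid-file path from the named.conf in this thread:
#   open_files_limit "$(cat /etc/named.pid)"
```

Comparing that output with ulimit -Sn and ulimit -Hn in a login shell
shows whether the limits.conf settings ever reached the daemon.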


On Fri, Apr 30, 2010 at 7:32 PM, Dale Kiefling wrote:

> Hey Ezra,
> Thanks for the reply.
>
> ulimit -Hn and ulimit -Sn report 8192.
>
> Wasn't sure if limits.conf would help or not.
>
> Dale
>
> On Apr 30, 2010, at 4:18 PM, Ezra Taylor wrote:
>
> Dale:
>
>  The limits.conf file is not going to solve your problem.  Read
> the man page for initscript and inittab.

Re: Bind 9.7.0-P1 socket: file descriptor exceeds limit / assertion failure

2010-04-30 Thread Dale Kiefling

Hey Ezra,
Thanks for the reply.

ulimit -Hn and ulimit -Sn report 8192.

Wasn't sure if limits.conf would help or not.

Dale

On Apr 30, 2010, at 4:18 PM, Ezra Taylor wrote:


> Dale:
>
> The limits.conf file is not going to solve your problem.  Read
> the man page for initscript and inittab.



Re: Bind 9.7.0-P1 socket: file descriptor exceeds limit / assertion failure

2010-04-30 Thread Ezra Taylor
Dale:

 The limits.conf file is not going to solve your problem.  Read
the man page for initscript and inittab.


Bind 9.7.0-P1 socket: file descriptor exceeds limit / assertion failure

2010-04-29 Thread Dale Kiefling

We have a Bind 9.7.0-P1 instance that is throwing the following errors:
21-Apr-2010 16:59:00.173 general: error: socket: file descriptor exceeds limit (1024/1024)
21-Apr-2010 17:00:00.122 general: error: socket: file descriptor exceeds limit (1024/1024)
21-Apr-2010 17:00:00.123 general: error: socket: file descriptor exceeds limit (1024/1024)

When we try to increase the socket value we are seeing assertion failures.

Restarted named with the option -S 8192:
Apr 26 19:20:54 ha1 named[3891]: socket.c:2781: INSIST(!sock->pending_recv) failed, back trace
Apr 26 19:20:54 ha1 named[3891]: #0 0x806525b in ??
Apr 26 19:20:54 ha1 named[3891]: #1 0x7b4b57 in ??
Apr 26 19:20:54 ha1 named[3891]: #2 0x7dfc03 in ??
Apr 26 19:20:54 ha1 named[3891]: #3 0x7e16f9 in ??
Apr 26 19:20:54 ha1 named[3891]: #4 0x7e1979 in ??
Apr 26 19:20:54 ha1 named[3891]: #5 0x7e1be7 in ??
Apr 26 19:20:54 ha1 named[3891]: #6 0x61a49b in ??
Apr 26 19:20:54 ha1 named[3891]: #7 0x6fd42e in ??
Apr 26 19:20:54 ha1 named[3891]: exiting (due to assertion failure)

Any advice given the info provided below?  Let me know if I can provide 
more info.


Dale


$ dig +short version.bind chaos txt
"9.7.0-P1"

$ uname -a
Linux ha1.example.com 2.6.18-128.1.10.el5PAE #1 SMP Thu May 7 11:14:31 EDT 2009 i686 athlon i386 GNU/Linux


$ cat /etc/redhat-release
CentOS release 5.3 (Final)


$ cat /etc/security/limits.conf
*       hard    nofile  8192
*       soft    nofile  8192
ntp     -       memlock 32768


$ cat named.conf
...
options {
   directory "/var/opt/named";
   pid-file  "/etc/named.pid";
   notify yes;
   also-notify {
   };
   recursion yes;
   allow-query { any; };
   //edns-udp-size 512;
};
...


ulimit -a reports:
open files  (-n) 8192


recent rndc stats:
+++ Statistics Dump +++ (1271794427)
++ Incoming Requests ++
   108267159 QUERY
         313 NOTIFY
++ Incoming Queries ++
    91731351 A
      314215 NS
       10840 SOA
     2704323 PTR
     4367570 MX
          81 TXT
         325 X25
     9135705
        1072 SRV
           6 IXFR
        1453 AXFR
         218 ANY
++ Outgoing Queries ++
[View: default]
     3077427 A
        5991 NS
        2113 SOA
       44931 PTR
     7552045 MX
          53 TXT
          41 X25
     3218008
         426 SRV
          18 ANY
[View: _bind]
[View: _meta]
++ Name Server Statistics ++
   108267472 IPv4 requests received
        3342 requests with EDNS(0) received
        5600 TCP requests received
   108051102 responses sent
        4972 truncated responses sent
        3342 responses with EDNS(0) sent
    98180939 queries resulted in successful answer
   101089523 queries resulted in authoritative answer
     5075782 queries resulted in non authoritative answer
           7 queries resulted in referral answer
     3987640 queries resulted in nxrrset
     1885481 queries resulted in SERVFAIL
     3996719 queries resulted in NXDOMAIN
     5660199 queries caused recursion
      207266 duplicate queries received
        7610 queries dropped
        1456 requested transfers completed
++ Zone Maintenance Statistics ++
        9833 IPv4 notifies sent
         301 IPv4 notifies received
         268 notifies rejected
      315214 IPv4 SOA queries sent
           6 IPv4 AXFR requested
          23 IPv4 IXFR requested
          29 transfer requests succeeded
++ Resolver Statistics ++
[Common]
         570 mismatch responses received
      151245 failures in opening query sockets
[View: default]
    13714283 IPv4 queries sent
      186770 IPv6 queries sent
    10815900 IPv4 responses received
          31 IPv6 responses received
      123548 NXDOMAIN received
      955379 SERVFAIL received
       33013 FORMERR received
      806336 other errors received
      382773 EDNS(0) query failures
         442 truncated responses received
      751147 lame delegations received
     4759160 query retries
     3103740 query timeouts
      546721 IPv4 NS address fetches
     1168510 IPv6 NS address fetches
       80562 IPv4 NS address fetch failed
     1158909 IPv6 NS address fetch failed
     1527841 queries with RTT < 10ms
     4509306 queries with RTT 10-100ms
     3619163 queries with RTT 100-500ms
      518078 queries with RTT 500-800ms
      493598 queries with RTT 800-1600ms
      147945 queries with RTT > 1600ms
[View: _bind]
[View: _meta]
++ Cache DB RRsets ++
[View: default