Re: BIND 9.6.1-P1 crashing

2010-01-08 Thread JINMEI Tatuya / 神明達哉
At Tue, 05 Jan 2010 08:24:16 +0100,
Dario Miculinic wrote:

> I don't have the same core dump, but this is from one that happened yesterday:

Thanks, but unfortunately the detailed stack traces don't seem to
provide a useful hint for the race.

If you can help debug this further, could you apply the patch copied
below, rebuild named and run it?  It *may* catch the race condition
closer to its real cause.  (Note: this patch is purely diagnostic, so
it will not fix the problem.)
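
To illustrate why checks of this kind can help, here is a minimal C sketch. It is not BIND code: it is a toy 1-based heap of pointers whose comparison callback dereferences its arguments, so a NULL slot would only crash later, inside a comparison, as in the traces in this thread where the callback is entered with v1=0x0. All names (toy_heap, toy_sooner, toy_float_up, toy_insert, toy_min) are hypothetical, and the standard assert() stands in for INSIST. The assertion after each store stops the program if a NULL is ever written into a slot.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_CAP 16

/* Hypothetical toy heap of pointers; slot 0 is unused (1-based heap). */
struct toy_heap {
    const int *slot[TOY_CAP + 1];
    size_t last;
};

/* Comparison callback: dereferences both arguments, so a NULL slot
 * would crash here, far from where the NULL was stored. */
static int toy_sooner(const int *a, const int *b) {
    return *a < *b;
}

/* Move elt up from position i until its parent sorts before it. */
static void toy_float_up(struct toy_heap *h, size_t i, const int *elt) {
    size_t p;
    for (p = i / 2; i > 1 && toy_sooner(elt, h->slot[p]); i = p, p = i / 2) {
        h->slot[i] = h->slot[p];
        assert(h->slot[i] != NULL); /* plays the role of the added INSIST */
    }
    h->slot[i] = elt;
    assert(h->slot[i] != NULL);     /* stop if a NULL was just stored */
}

/* Append elt and restore the heap property. */
static void toy_insert(struct toy_heap *h, const int *elt) {
    assert(h->last < TOY_CAP);
    h->last++;
    toy_float_up(h, h->last, elt);
}

/* Smallest element currently in the heap (heap must be non-empty). */
static int toy_min(const struct toy_heap *h) {
    assert(h->last >= 1);
    return *h->slot[1];
}
```

Inserting pointers to 5, 3 and 8 into an empty toy_heap and then calling toy_min() yields 3. The sketch only shows where such an assertion fires; it says nothing about how a NULL gets into the real heap.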

Alternatively, if you need a workaround that *may* help, you could
rebuild named with atomic operations disabled:
    ./configure --disable-atomic [...other options]
I'm not sure whether this stops the problem, but I believe it's worth
trying.

---
JINMEI, Tatuya
Internet Systems Consortium, Inc.

Index: heap.c
===================================================================
RCS file: /proj/cvs/prod/bind9/lib/isc/heap.c,v
retrieving revision 1.37
diff -u -r1.37 heap.c
--- heap.c	19 Oct 2007 17:15:53 -0000	1.37
+++ heap.c	8 Jan 2010 08:01:19 -0000
@@ -149,10 +149,12 @@
 	     i > 1 && heap->compare(elt, heap->array[p]) ;
 	     i = p, p = heap_parent(i)) {
 		heap->array[i] = heap->array[p];
+		INSIST(heap->array[i] != NULL);
 		if (heap->index != NULL)
 			(heap->index)(heap->array[i], i);
 	}
 	heap->array[i] = elt;
+	INSIST(heap->array[i] != NULL);
 	if (heap->index != NULL)
 		(heap->index)(heap->array[i], i);
 
@@ -173,11 +175,13 @@
 		if (heap->compare(elt, heap->array[j]))
 			break;
 		heap->array[i] = heap->array[j];
+		INSIST(heap->array[i] != NULL);
 		if (heap->index != NULL)
 			(heap->index)(heap->array[i], i);
 		i = j;
 	}
 	heap->array[i] = elt;
+	INSIST(heap->array[i] != NULL);
 	if (heap->index != NULL)
 		(heap->index)(heap->array[i], i);
 
@@ -217,6 +221,7 @@
 
 		less = heap->compare(elt, heap->array[index]);
 		heap->array[index] = elt;
+		INSIST(heap->array[index] != NULL);
 		if (less)
 			float_up(heap, index, heap->array[index]);
 		else
___
bind-users mailing list
bind-users@lists.isc.org
https://lists.isc.org/mailman/listinfo/bind-users


Re: BIND 9.6.1-P1 crashing

2010-01-04 Thread Dario Miculinic

I don't have the same core dump, but this is from one that happened yesterday:

#0  0x080db986 in ttl_sooner (v1=0x0, v2=0x59375628) at rbtdb.c:752
752 ttl_sooner(void *v1, void *v2) {
(gdb) where
#0  0x080db986 in ttl_sooner (v1=0x0, v2=0x59375628) at rbtdb.c:752
#1  0x0819e708 in isc_heap_delete (heap=0xb0f54068, index=1) at heap.c:218
#2  0x080e039f in free_rdataset (rbtdb=0xb0f4f008, mctx=0x864bea0, 
rdataset=0x59375628) at rbtdb.c:1273
#3  0x080e04c3 in clean_stale_headers (rbtdb=0xb0f4f008, mctx=0x864bea0, 
top=0x4af6f3e0) at rbtdb.c:1331
#4  0x080e10c4 in decrement_reference (rbtdb=0xb0f4f008, node=0x411b1368, 
least_serial=0,
nlock=isc_rwlocktype_read, tlock=isc_rwlocktype_none, 
pruning=isc_boolean_false) at rbtdb.c:1348
#5  0x080ea711 in detachnode (db=0xb0f4f008, targetp=0xb42fe2e4) at rbtdb.c:4877
#6  0x080ea9b1 in rdataset_disassociate (rdataset=0xb05c5a48) at rbtdb.c:7173
#7  0x0812e55a in dns_rdataset_disassociate (rdataset=0xb05c5a48) at 
rdataset.c:101
#8  0x08132f9e in fctx_destroy (fctx=0xb05c5988) at resolver.c:3081
#9  0x0813548e in fctx_doshutdown (task=0xb0f0ea30, event=0xb05c59e0) at 
resolver.c:3246
#10 0x081b9221 in run (uap=0xb7f09008) at task.c:862
#11 0x0094c73b in start_thread () from /lib/libpthread.so.0
#12 0x008a1cfe in clone () from /lib/libc.so.6


This is the output of the "thread apply all bt full" command (it's quite long):

(gdb) thread apply all bt full

Thread 11 (process 11988):
#0  0x00fe4410 in __kernel_vsyscall ()
No symbol table info available.
#1  0x007f9367 in sigsuspend () from /lib/libc.so.6
No symbol table info available.
#2  0x081bcc74 in isc_app_run () at app.c:534
event = (isc_event_t *) 0x0
next_event = <value optimized out>
task = (isc_task_t *) 0x0
sset = {__val = {0 <repeats 32 times>}}
strbuf = 
"č\220a\b\000\020\005\000\000đ˙˙\000\000\003\000\030\021ńˇy\000\000\000\002\000\000\000ü\023ňˇ\000\000\000\000\000\000\003\000P \03...@qđˇž\000\000\000\030\000\000\000ô\037\221\000@1\221\000Pqđˇ\bR˝ż\207˝\203\000\021\000\000\000\024\000\000\000`\000\000\000Đěa\by.\000\000Pqđˇ8R˝ż\bŻ\031\bČs\001\000\b_ňˇÔ.\000\000č\220a\b\024\000\000"

#3  0x08059f7c in main (argc=0, argv=0xbfbd53c4) at ./main.c:932
result = <value optimized out>

Thread 10 (process 11989):
#0  0x00fe4410 in __kernel_vsyscall ()
No symbol table info available.
#1  0x009509e5 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib/libpthread.so.0
No symbol table info available.
#2  0x081b90ae in run (uap=0xb7f09008) at task.c:810
No locals.
#3  0x0094c73b in start_thread () from /lib/libpthread.so.0
No symbol table info available.
#4  0x008a1cfe in clone () from /lib/libc.so.6
No symbol table info available.

Thread 9 (process 11990):
#0  0x00fe4410 in __kernel_vsyscall ()
No symbol table info available.
#1  0x009509e5 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib/libpthread.so.0
No symbol table info available.
#2  0x081b90ae in run (uap=0xb7f09008) at task.c:810
No locals.
#3  0x0094c73b in start_thread () from /lib/libpthread.so.0
No symbol table info available.
#4  0x008a1cfe in clone () from /lib/libc.so.6
No symbol table info available.

Thread 8 (process 11991):
#0  0x00fe4410 in __kernel_vsyscall ()
No symbol table info available.
#1  0x009509e5 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib/libpthread.so.0
No symbol table info available.
#2  0x081b90ae in run (uap=0xb7f09008) at task.c:810
No locals.
#3  0x0094c73b in start_thread () from /lib/libpthread.so.0
No symbol table info available.
#4  0x008a1cfe in clone () from /lib/libc.so.6
No symbol table info available.

Thread 7 (process 11992):
#0  0x00fe4410 in __kernel_vsyscall ()
No symbol table info available.
#1  0x009509e5 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib/libpthread.so.0
No symbol table info available.
#2  0x081b90ae in run (uap=0xb7f09008) at task.c:810
No locals.
#3  0x0094c73b in start_thread () from /lib/libpthread.so.0
No symbol table info available.
#4  0x008a1cfe in clone () from /lib/libc.so.6
No symbol table info available.

Thread 6 (process 11993):
#0  0x00fe4410 in __kernel_vsyscall ()
No symbol table info available.
#1  0x009509e5 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib/libpthread.so.0
No symbol table info available.
#2  0x081b90ae in run (uap=0xb7f09008) at task.c:810
No locals.
#3  0x0094c73b in start_thread () from /lib/libpthread.so.0
No symbol table info available.
#4  0x008a1cfe in clone () from /lib/libc.so.6
No symbol table info available.

Thread 5 (process 11994):
#0  0x00fe4410 in __kernel_vsyscall ()
No symbol table info available.
#1  0x009509e5 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib/libpthread.so.0
No symbol table info available.
#2  0x081b90ae in run (uap=0xb7f09008) at task.c:810
No locals.
#3  0x0094c73b in start_thread () from /lib/libpthread.so.0
No symbol table info available.
#4  0x008a1cfe in clone () from /lib/libc.so.6
No symbol table info available.

Thread 4 (pr

Re: BIND 9.6.1-P1 crashing

2010-01-04 Thread JINMEI Tatuya / 神明達哉
At Wed, 30 Dec 2009 10:23:17 +0100,
Dario Miculinic wrote:

> I'm administering 4 DNS servers running CentOS release 5.4 and Red Hat
> Enterprise Linux Server release 5.2 with BIND version 9.6.1-P1. On 3 of
> them BIND crashed 7 times in the last 10 days. There's nothing in the
> log files, but we have a core dump file. I found this in the core dump:
> 
> #0  0x080db986 in ttl_sooner (v1=0x0, v2=0x3385b628) at rbtdb.c:752
> 752 ttl_sooner(void *v1, void *v2) {
> (gdb) where
> #0  0x080db986 in ttl_sooner (v1=0x0, v2=0x3385b628) at rbtdb.c:752

What's the result of the following gdb command?

(gdb) thread apply all bt full

We've seen crashes like this one, but we haven't figured out how they
happen.  This is very likely an inter-thread race, and it may be
tricky to track down.  Given the v1/v2 values in your stack trace, a
full backtrace including the other threads may provide a more useful
hint.

If you need an immediate workaround rather than chasing the bug,
rebuilding named with --disable-atomic may help (we cannot be sure
because we don't yet know how this bug happens in the first place).
This makes named use locks in a more conservative way and may avoid
the tricky race condition at the cost of lower performance (so if you
try it, you'll also need to watch the server load).

---
JINMEI, Tatuya
Internet Systems Consortium, Inc.


BIND 9.6.1-P1 crashing

2009-12-30 Thread Dario Miculinic

Hello, all.

I'm administering 4 DNS servers running CentOS release 5.4 and Red Hat Enterprise Linux Server release 5.2 with BIND
version 9.6.1-P1. On 3 of them BIND crashed 7 times in the last 10 days. There's nothing in the log files, but we have a
core dump file. I found this in the core dump:


#0  0x080db986 in ttl_sooner (v1=0x0, v2=0x3385b628) at rbtdb.c:752
752 ttl_sooner(void *v1, void *v2) {
(gdb) where
#0  0x080db986 in ttl_sooner (v1=0x0, v2=0x3385b628) at rbtdb.c:752
#1  0x0819e708 in isc_heap_delete (heap=0xb0f751a8, index=2) at heap.c:218
#2  0x080e039f in free_rdataset (rbtdb=0xb0f70008, mctx=0x86c9e98, 
rdataset=0x3385b628) at rbtdb.c:1273
#3  0x080e04c3 in clean_stale_headers (rbtdb=0xb0f70008, mctx=0x86c9e98, 
top=0x7fa67700) at rbtdb.c:1331
#4  0x080e10c4 in decrement_reference (rbtdb=0xb0f70008, node=0x36c159f0, 
least_serial=0,
nlock=isc_rwlocktype_read, tlock=isc_rwlocktype_none, 
pruning=isc_boolean_false) at rbtdb.c:1348
#5  0x080ea711 in detachnode (db=0xb0f70008, targetp=0xb4d1f404) at rbtdb.c:4877
#6  0x080ea9b1 in rdataset_disassociate (rdataset=0xb03f22d8) at rbtdb.c:7173
#7  0x0812e55a in dns_rdataset_disassociate (rdataset=0xb03f22d8) at 
rdataset.c:101
#8  0x080c7dfa in msgresetnames (msg=0xb03e1b60, first_section=<value optimized out>) at message.c:463
#9  0x080cb3c5 in msgreset (msg=0x0, everything=isc_boolean_false) at 
message.c:545
#10 0x080cbd05 in dns_message_reset (msg=0xb03e1b60, intent=1) at message.c:800
#11 0x0804dbfc in exit_check (client=0xb03fca70) at client.c:639
#12 0x0806007c in query_find (client=0xb03fca70, event=0x0, qtype=1) at 
query.c:4914
#13 0x08063490 in query_resume (task=0xad4f6bd0, event=0x98cbdb8) at 
query.c:3171
#14 0x081b9221 in run (uap=0xb7f2a008) at task.c:862
#15 0x0059645b in pthread_create@@GLIBC_2.1 () from /lib/libpthread.so.0
#16 0x004ee24e in profil_counter () from /lib/libc.so.6


I couldn't get a core dump file from the Red Hat server, but every core dump file on CentOS mentions the ttl_sooner
function. Does anyone know what this could be and how to fix it?


Thanks in advance.


Re: Bind 9.6.1-P1 ignoring listen-on directive

2009-09-09 Thread Kevin Darcy
Syntax. The parser is matching on "localhost" before it sees the negated 
elements.
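
As a rough illustration of that first-match behaviour, the following C sketch models an address match list as an ordered array and returns the verdict of the first element that matches. It is not BIND's implementation: toy_elem, toy_listen and the two arrays are hypothetical, and it assumes that "localhost" matches every address of the local host, not just 127.0.0.1.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical model of one address match list element. */
struct toy_elem {
    const char *addr;  /* literal address, or NULL for the "localhost" acl */
    int negated;       /* 1 for a "!address" element */
};

/* Returns 1 (listen) or 0 (do not listen) for a local interface address.
 * Elements are tried in order and the first one that matches decides. */
static int toy_listen(const struct toy_elem *list, size_t n, const char *ifaddr) {
    size_t i;
    for (i = 0; i < n; i++) {
        /* Assumption: "localhost" matches every address of the local host. */
        int hit = (list[i].addr == NULL) || strcmp(list[i].addr, ifaddr) == 0;
        if (hit)
            return !list[i].negated;
    }
    return 0; /* nothing matched */
}

/* listen-on { localhost; 153.104.92.2; !153.104.92.4; !10.104.36.20; }; */
static const struct toy_elem toy_original[] = {
    { NULL, 0 }, { "153.104.92.2", 0 }, { "153.104.92.4", 1 }, { "10.104.36.20", 1 },
};

/* The same elements with the negated ones moved to the front. */
static const struct toy_elem toy_reordered[] = {
    { "153.104.92.4", 1 }, { "10.104.36.20", 1 }, { NULL, 0 }, { "153.104.92.2", 0 },
};
```

Under this model, toy_listen(toy_original, 4, "10.104.36.20") returns 1 because "localhost" matches before the negated element is reached, while toy_listen(toy_reordered, 4, "10.104.36.20") returns 0.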


- Kevin



Re: Bind 9.6.1-P1 ignoring listen-on directive

2009-09-09 Thread John Center
Of course, right after hitting enter on this message, I came across a
message from last year about localhost mapping to all interfaces, not
just 127.0.0.1.  I created a "loopback" acl and used it instead, and
that worked.  Sorry for the noise.


-John




Bind 9.6.1-P1 ignoring listen-on directive

2009-09-09 Thread John Center

Hi,

I'm testing Bind 9.6.1-P1 on Solaris 10 SPARC (64bit/Sun Studio 12.1) & 
I noticed this in the logs:


Sep  9 13:15:31 ns3a/ns3a named[23042]: [ID 873579 daemon.info] 
listening on IPv4 interface lo0, 127.0.0.1#53
Sep  9 13:15:31 ns3a/ns3a named[23042]: [ID 873579 daemon.info] 
listening on IPv4 interface bge0, 153.104.92.2#53
Sep  9 13:15:31 ns3a/ns3a named[23042]: [ID 873579 daemon.info] 
listening on IPv4 interface bge0:1, 153.104.92.4#53
Sep  9 13:15:31 ns3a/ns3a named[23042]: [ID 873579 daemon.info] 
listening on IPv4 interface bge1, 10.104.36.20#53


I only wanted named to listen on one interface + the loopback, so I 
added a listen-on statement in named.conf:


acl testnets { 153.104.244.0/24; 153.104.248.0/24; };
options {
  directory "/opt/isc/bind/var/db";
  allow-query { testnets; };
  listen-on { localhost; 153.104.92.2; };
  listen-on-v6 { none; };
};
zone "0.0.127.in-addr.arpa" in {
  type master;
  file "db.127.0.0";
  notify no;
};

But, I still have the same log entries when I start named.  I then 
modified named.conf to specifically exclude the other interfaces:


listen-on { localhost; 153.104.92.2; !153.104.92.4; !10.104.36.20; };

But, again, I'm still seeing it state that it is listening on the
excluded interfaces.  I tried increasing the debug level, but I didn't
see any additional info pertaining to this.  I know that it is listening
on the excluded interfaces because I see queries on the 10.104.36.20
interface:


Sep  9 13:09:16 ns3a/ns3a named[22867]: [ID 873579 daemon.info] client 
10.104.109.0#1041: query (cache) 'ATF/A/IN' denied
Sep  9 13:09:16 ns3a/ns3a named[22867]: [ID 873579 daemon.info] client 
10.104.109.0#1046: query (cache) 'ATP.villanova.edu/A/IN' denied


Is this a known problem?  It's an issue for us because we restrict DNS 
queries to particular interfaces.  If it isn't a known bug, I'd be glad 
to help troubleshoot this problem.


Thanks.

-John

--
John Center
Villanova University


Re: BIND 9.6.1-P1

2009-07-31 Thread Martin.Wismer.

SUN Freeware
http://sunfreeware.com/index.html
With many thanks to Steve Christensen.

> Does anyone know if there is any Solaris .pkg distribution for BIND
> 9.6.1-P1?
>
> I'm looking to replace old versions as per:
>
> https://www.isc.org/node/474
>
> Thank you,
>
> Julian



BIND 9.6.1-P1

2009-07-31 Thread ic.nssip


Does anyone know if there is any Solaris .pkg distribution for BIND 9.6.1-P1?

I'm looking to replace old versions as per:
https://www.isc.org/node/474

Thank you,
Julian


ISC BIND 9.6.1-P1 is now available

2009-07-28 Thread Evan Hunt

 BIND 9.6.1-P1 is now available.

BIND 9.6.1-P1 is a SECURITY PATCH for BIND 9.6.1.  It addresses a
denial-of-service bug in which a malformed UPDATE packet caused
named to crash.

Bugs should be reported to bind9-b...@isc.org.

BIND 9.6.1-P1 can be downloaded from:

ftp://ftp.isc.org/isc/bind9/9.6.1-P1/bind-9.6.1-P1.tar.gz

PGP signatures of the distribution are at:

ftp://ftp.isc.org/isc/bind9/9.6.1-P1/bind-9.6.1-P1.tar.gz.asc
ftp://ftp.isc.org/isc/bind9/9.6.1-P1/bind-9.6.1-P1.tar.gz.sha256.asc
ftp://ftp.isc.org/isc/bind9/9.6.1-P1/bind-9.6.1-P1.tar.gz.sha512.asc

The signatures were generated with the ISC public key, which is
available at https://www.isc.org/about/openpgp

A binary kit for Windows XP, Windows 2003 and Windows 2008 is at:

ftp://ftp.isc.org/isc/bind9/9.6.1-P1/BIND9.6.1-P1.zip
ftp://ftp.isc.org/isc/bind9/9.6.1-P1/BIND9.6.1-P1.debug.zip

PGP signatures of the binary kit are at:

ftp://ftp.isc.org/isc/bind9/9.6.1-P1/BIND9.6.1-P1.zip.asc
ftp://ftp.isc.org/isc/bind9/9.6.1-P1/BIND9.6.1-P1.zip.sha256.asc
ftp://ftp.isc.org/isc/bind9/9.6.1-P1/BIND9.6.1-P1.zip.sha512.asc
ftp://ftp.isc.org/isc/bind9/9.6.1-P1/BIND9.6.1-P1.debug.zip.asc
ftp://ftp.isc.org/isc/bind9/9.6.1-P1/BIND9.6.1-P1.debug.zip.sha256.asc
ftp://ftp.isc.org/isc/bind9/9.6.1-P1/BIND9.6.1-P1.debug.zip.sha512.asc

Changes since 9.6.1:

2640.   [security]  A specially crafted update packet will cause named
                    to exit. [RT #2]

-- 
Evan Hunt -- e...@isc.org
Internet Systems Consortium, Inc.