Re: BIND 9.6.1-P1 crashing
At Tue, 05 Jan 2010 08:24:16 +0100,
Dario Miculinic wrote:

> I don't have the same core dump, but this is from one that happened yesterday:

Thanks, but unfortunately the detailed stack traces don't seem to provide a useful hint about the race.

If you can help debug this further, could you apply the patch copied below, rebuild named, and run it? It *may* catch the race condition at a point closer to the real cause. (Note: this patch is diagnostic only, so it will not fix the problem.)

Or, if you need a workaround that *may* work, you may want to rebuild named with atomic operations disabled:

./configure --disable-atomic [...other options]

I'm not sure whether this stops the problem, but I believe it's worth trying.

---
JINMEI, Tatuya
Internet Systems Consortium, Inc.

Index: heap.c
===
RCS file: /proj/cvs/prod/bind9/lib/isc/heap.c,v
retrieving revision 1.37
diff -u -r1.37 heap.c
--- heap.c 19 Oct 2007 17:15:53 - 1.37
+++ heap.c 8 Jan 2010 08:01:19 -
@@ -149,10 +149,12 @@
 	     i > 1 && heap->compare(elt, heap->array[p]) ;
 	     i = p, p = heap_parent(i)) {
 		heap->array[i] = heap->array[p];
+		INSIST(heap->array[i] != NULL);
 		if (heap->index != NULL)
 			(heap->index)(heap->array[i], i);
 	}
 	heap->array[i] = elt;
+	INSIST(heap->array[i] != NULL);
 	if (heap->index != NULL)
 		(heap->index)(heap->array[i], i);
@@ -173,11 +175,13 @@
 		if (heap->compare(elt, heap->array[j]))
 			break;
 		heap->array[i] = heap->array[j];
+		INSIST(heap->array[i] != NULL);
 		if (heap->index != NULL)
 			(heap->index)(heap->array[i], i);
 		i = j;
 	}
 	heap->array[i] = elt;
+	INSIST(heap->array[i] != NULL);
 	if (heap->index != NULL)
 		(heap->index)(heap->array[i], i);
@@ -217,6 +221,7 @@
 	less = heap->compare(elt, heap->array[index]);
 	heap->array[index] = elt;
+	INSIST(heap->array[index] != NULL);
 	if (less)
 		float_up(heap, index, heap->array[index]);
 	else

___
bind-users mailing list
bind-users@lists.isc.org
https://lists.isc.org/mailman/listinfo/bind-users
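As a rough illustration of what the diagnostic patch above checks, here is a minimal, self-contained sketch of a 1-based binary heap with the same kind of assertion after each store. It is not BIND's heap.c: every name prefixed with sketch_ is hypothetical, SKETCH_INSIST is only a stand-in for INSIST built on assert(), and the comparator merely mimics the shape of ttl_sooner() by dereferencing both arguments. The point is that a NULL slot does no harm when it is stored, only later when the comparator dereferences it, so asserting at the store should move the failure closer to the cause.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for BIND's INSIST(): abort if the condition is false. */
#define SKETCH_INSIST(cond) assert(cond)

typedef int (*sketch_compare_fn)(void *a, void *b);

typedef struct {
    void **array;              /* 1-based storage: array[0] is unused */
    size_t last;               /* index of the last element in use */
    sketch_compare_fn compare; /* nonzero if a has higher priority than b */
} sketch_heap;

/* Comparator shaped like ttl_sooner(): it dereferences both arguments,
 * so a NULL element faults here, far from where the NULL was stored. */
static int sketch_ttl_sooner(void *v1, void *v2) {
    const unsigned int *t1 = v1;
    const unsigned int *t2 = v2;
    return *t1 < *t2;
}

/* Move elt up from slot i until its parent has higher priority. */
static void sketch_float_up(sketch_heap *heap, size_t i, void *elt) {
    size_t p;

    for (p = i / 2;
         i > 1 && heap->compare(elt, heap->array[p]);
         i = p, p = i / 2) {
        heap->array[i] = heap->array[p];
        /* Same idea as the patch: check the slot right after the store. */
        SKETCH_INSIST(heap->array[i] != NULL);
    }
    heap->array[i] = elt;
    SKETCH_INSIST(heap->array[i] != NULL);
}

/* Append elt and restore heap order; the caller guarantees capacity. */
static void sketch_insert(sketch_heap *heap, void *elt) {
    heap->last++;
    sketch_float_up(heap, heap->last, elt);
}
```

To try it, point array at a zeroed void * buffer with at least one more slot than the number of elements, set last to 0 and compare to sketch_ttl_sooner, then call sketch_insert() for each element; the smallest TTL ends up in array[1]. Passing a NULL element trips the assertion at the store instead of faulting later inside the comparator.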
Re: BIND 9.6.1-P1 crashing
I don't have the same core dump, but this is from one that happened yesterday:

#0  0x080db986 in ttl_sooner (v1=0x0, v2=0x59375628) at rbtdb.c:752
752     ttl_sooner(void *v1, void *v2) {
(gdb) where
#0  0x080db986 in ttl_sooner (v1=0x0, v2=0x59375628) at rbtdb.c:752
#1  0x0819e708 in isc_heap_delete (heap=0xb0f54068, index=1) at heap.c:218
#2  0x080e039f in free_rdataset (rbtdb=0xb0f4f008, mctx=0x864bea0, rdataset=0x59375628) at rbtdb.c:1273
#3  0x080e04c3 in clean_stale_headers (rbtdb=0xb0f4f008, mctx=0x864bea0, top=0x4af6f3e0) at rbtdb.c:1331
#4  0x080e10c4 in decrement_reference (rbtdb=0xb0f4f008, node=0x411b1368, least_serial=0, nlock=isc_rwlocktype_read, tlock=isc_rwlocktype_none, pruning=isc_boolean_false) at rbtdb.c:1348
#5  0x080ea711 in detachnode (db=0xb0f4f008, targetp=0xb42fe2e4) at rbtdb.c:4877
#6  0x080ea9b1 in rdataset_disassociate (rdataset=0xb05c5a48) at rbtdb.c:7173
#7  0x0812e55a in dns_rdataset_disassociate (rdataset=0xb05c5a48) at rdataset.c:101
#8  0x08132f9e in fctx_destroy (fctx=0xb05c5988) at resolver.c:3081
#9  0x0813548e in fctx_doshutdown (task=0xb0f0ea30, event=0xb05c59e0) at resolver.c:3246
#10 0x081b9221 in run (uap=0xb7f09008) at task.c:862
#11 0x0094c73b in start_thread () from /lib/libpthread.so.0
#12 0x008a1cfe in clone () from /lib/libc.so.6

This is the output of the "thread apply all bt full" command (it's quite long):

(gdb) thread apply all bt full

Thread 11 (process 11988):
#0  0x00fe4410 in __kernel_vsyscall ()
No symbol table info available.
#1  0x007f9367 in sigsuspend () from /lib/libc.so.6
No symbol table info available.
#2  0x081bcc74 in isc_app_run () at app.c:534
        event = (isc_event_t *) 0x0
        next_event =
        task = (isc_task_t *) 0x0
        sset = {__val = {0 }}
        strbuf = "č\220a\b\000\020\005\000\000đ˙˙\000\000\003\000\030\021ńˇy\000\000\000\002\000\000\000ü\023ňˇ\000\000\000\000\000\000\003\000P \03...@qđˇž\000\000\000\030\000\000\000ô\037\221\000@1\221\000Pqđˇ\bR˝ż\207˝\203\000\021\000\000\000\024\000\000\000`\000\000\000Đěa\by.\000\000Pqđˇ8R˝ż\bŻ\031\bČs\001\000\b_ňˇÔ.\000\000č\220a\b\024\000\000"
#3  0x08059f7c in main (argc=0, argv=0xbfbd53c4) at ./main.c:932
        result =

Thread 10 (process 11989):
#0  0x00fe4410 in __kernel_vsyscall ()
No symbol table info available.
#1  0x009509e5 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib/libpthread.so.0
No symbol table info available.
#2  0x081b90ae in run (uap=0xb7f09008) at task.c:810
No locals.
#3  0x0094c73b in start_thread () from /lib/libpthread.so.0
No symbol table info available.
#4  0x008a1cfe in clone () from /lib/libc.so.6
No symbol table info available.

Thread 9 (process 11990):
#0  0x00fe4410 in __kernel_vsyscall ()
No symbol table info available.
#1  0x009509e5 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib/libpthread.so.0
No symbol table info available.
#2  0x081b90ae in run (uap=0xb7f09008) at task.c:810
No locals.
#3  0x0094c73b in start_thread () from /lib/libpthread.so.0
No symbol table info available.
#4  0x008a1cfe in clone () from /lib/libc.so.6
No symbol table info available.

Thread 8 (process 11991):
#0  0x00fe4410 in __kernel_vsyscall ()
No symbol table info available.
#1  0x009509e5 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib/libpthread.so.0
No symbol table info available.
#2  0x081b90ae in run (uap=0xb7f09008) at task.c:810
No locals.
#3  0x0094c73b in start_thread () from /lib/libpthread.so.0
No symbol table info available.
#4  0x008a1cfe in clone () from /lib/libc.so.6
No symbol table info available.

Thread 7 (process 11992):
#0  0x00fe4410 in __kernel_vsyscall ()
No symbol table info available.
#1  0x009509e5 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib/libpthread.so.0
No symbol table info available.
#2  0x081b90ae in run (uap=0xb7f09008) at task.c:810
No locals.
#3  0x0094c73b in start_thread () from /lib/libpthread.so.0
No symbol table info available.
#4  0x008a1cfe in clone () from /lib/libc.so.6
No symbol table info available.

Thread 6 (process 11993):
#0  0x00fe4410 in __kernel_vsyscall ()
No symbol table info available.
#1  0x009509e5 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib/libpthread.so.0
No symbol table info available.
#2  0x081b90ae in run (uap=0xb7f09008) at task.c:810
No locals.
#3  0x0094c73b in start_thread () from /lib/libpthread.so.0
No symbol table info available.
#4  0x008a1cfe in clone () from /lib/libc.so.6
No symbol table info available.

Thread 5 (process 11994):
#0  0x00fe4410 in __kernel_vsyscall ()
No symbol table info available.
#1  0x009509e5 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib/libpthread.so.0
No symbol table info available.
#2  0x081b90ae in run (uap=0xb7f09008) at task.c:810
No locals.
#3  0x0094c73b in start_thread () from /lib/libpthread.so.0
No symbol table info available.
#4  0x008a1cfe in clone () from /lib/libc.so.6
No symbol table info available.

Thread 4 (pr
Re: BIND 9.6.1-P1 crashing
At Wed, 30 Dec 2009 10:23:17 +0100,
Dario Miculinic wrote:

> I'm administrating 4 DNS servers running CentOS release 5.4 and Red Hat
> Enterprise Linux Server release 5.2. with BIND
> version 9.6.1-P1. On 3 of them BIND crashed 7 times in last 10 days. There's
> nothing in log files, but we have core dump
> file. I found this in the core dump:
>
> #0 0x080db986 in ttl_sooner (v1=0x0, v2=0x3385b628) at rbtdb.c:752
> 752 ttl_sooner(void *v1, void *v2) {
> (gdb) where
> #0 0x080db986 in ttl_sooner (v1=0x0, v2=0x3385b628) at rbtdb.c:752

What's the result of the following gdb command?

(gdb) thread apply all bt full

We've seen crashes like this one, but we've not figured out how they happen. This is very likely an inter-thread race, and it may be tricky. According to the v1/v2 values in your stack trace, a full backtrace with information about the other threads may provide a more useful hint.

If you need an immediate workaround rather than chasing the bug, rebuilding named with --disable-atomic may help (we cannot be sure because we don't yet know how this bug happens in the first place). This will use locks in a more conservative way and may avoid the tricky race condition at the cost of lower performance (so if you want to try that, you'll also need to watch the server load).

---
JINMEI, Tatuya
Internet Systems Consortium, Inc.
BIND 9.6.1-P1 crashing
Hello, all.

I administer 4 DNS servers running CentOS release 5.4 and Red Hat Enterprise Linux Server release 5.2 with BIND version 9.6.1-P1. On 3 of them BIND crashed 7 times in the last 10 days. There's nothing in the log files, but we have a core dump file. I found this in the core dump:

#0  0x080db986 in ttl_sooner (v1=0x0, v2=0x3385b628) at rbtdb.c:752
752     ttl_sooner(void *v1, void *v2) {
(gdb) where
#0  0x080db986 in ttl_sooner (v1=0x0, v2=0x3385b628) at rbtdb.c:752
#1  0x0819e708 in isc_heap_delete (heap=0xb0f751a8, index=2) at heap.c:218
#2  0x080e039f in free_rdataset (rbtdb=0xb0f70008, mctx=0x86c9e98, rdataset=0x3385b628) at rbtdb.c:1273
#3  0x080e04c3 in clean_stale_headers (rbtdb=0xb0f70008, mctx=0x86c9e98, top=0x7fa67700) at rbtdb.c:1331
#4  0x080e10c4 in decrement_reference (rbtdb=0xb0f70008, node=0x36c159f0, least_serial=0, nlock=isc_rwlocktype_read, tlock=isc_rwlocktype_none, pruning=isc_boolean_false) at rbtdb.c:1348
#5  0x080ea711 in detachnode (db=0xb0f70008, targetp=0xb4d1f404) at rbtdb.c:4877
#6  0x080ea9b1 in rdataset_disassociate (rdataset=0xb03f22d8) at rbtdb.c:7173
#7  0x0812e55a in dns_rdataset_disassociate (rdataset=0xb03f22d8) at rdataset.c:101
#8  0x080c7dfa in msgresetnames (msg=0xb03e1b60, first_section=) at message.c:463
#9  0x080cb3c5 in msgreset (msg=0x0, everything=isc_boolean_false) at message.c:545
#10 0x080cbd05 in dns_message_reset (msg=0xb03e1b60, intent=1) at message.c:800
#11 0x0804dbfc in exit_check (client=0xb03fca70) at client.c:639
#12 0x0806007c in query_find (client=0xb03fca70, event=0x0, qtype=1) at query.c:4914
#13 0x08063490 in query_resume (task=0xad4f6bd0, event=0x98cbdb8) at query.c:3171
#14 0x081b9221 in run (uap=0xb7f2a008) at task.c:862
#15 0x0059645b in pthread_create@@GLIBC_2.1 () from /lib/libpthread.so.0
#16 0x004ee24e in profil_counter () from /lib/libc.so.6

I couldn't get a core dump file from the Red Hat server, but the ttl_sooner function appears in every core dump file on CentOS.

Does anyone know what this could be and how to fix it?

Thanks in advance.
Re: Bind 9.6.1-P1 ignoring listen-on directive
Syntax. The parser is matching on "localhost" before it sees the negated elements.

- Kevin

John Center wrote:
> [...]
> I then modified named.conf to specifically exclude the other interfaces:
>
> listen-on { localhost; 153.104.92.2; !153.104.92.4; !10.104.36.20; };
>
> But, again, I'm still seeing it state that it is listening on the
> excluded interfaces.
> [...]
Re: Bind 9.6.1-P1 ignoring listen-on directive
Of course, right after hitting enter on this message, I came across a message from last year about localhost mapping to all interfaces, not just 127.0.0.1. I created a "loopback" acl and used it instead, and that worked. Sorry for the noise.

-John

On 09/09/2009 03:04 PM, John Center wrote:
> [...]
> I only wanted named to listen on one interface + the loopback, so I
> added a listen-on statement in named.conf:
> [...]
> listen-on { localhost; 153.104.92.2; };
> [...]
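The behaviour discussed in this thread follows from two things: an address match list is evaluated in order and the first matching element decides, and the built-in localhost acl matches every address configured on the host. The sketch below models only that evaluation rule and is not BIND code: the sketch_ names are hypothetical, and addresses are compared as plain strings rather than parsed prefixes.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* One element of a hypothetical address match list. */
typedef struct {
    const char *addr;       /* literal address, or NULL when set is used */
    const char *const *set; /* NULL-terminated named set, or NULL */
    bool negated;           /* true for "!element" */
} sketch_acl_elt;

/* Return true if ip appears in the NULL-terminated set. */
static bool sketch_in_set(const char *const *set, const char *ip) {
    for (; *set != NULL; set++) {
        if (strcmp(*set, ip) == 0)
            return true;
    }
    return false;
}

/* First-match evaluation: the first element that matches decides the
 * outcome, so elements after it are never consulted for that address. */
static bool sketch_acl_allows(const sketch_acl_elt *list, size_t n,
                              const char *ip) {
    for (size_t i = 0; i < n; i++) {
        bool hit = list[i].set != NULL
                       ? sketch_in_set(list[i].set, ip)
                       : strcmp(list[i].addr, ip) == 0;
        if (hit)
            return !list[i].negated;
    }
    return false; /* nothing matched */
}
```

If the first element is a set holding all four interface addresses from the log (the role localhost plays), 10.104.36.20 and 153.104.92.4 are accepted before the negated entries are ever reached. If the first element is instead a set holding only 127.0.0.1 (presumably what the "loopback" acl contained), those two addresses fall through the list and are rejected.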
Bind 9.6.1-P1 ignoring listen-on directive
Hi,

I'm testing Bind 9.6.1-P1 on Solaris 10 SPARC (64bit/Sun Studio 12.1) & I noticed this in the logs:

Sep 9 13:15:31 ns3a/ns3a named[23042]: [ID 873579 daemon.info] listening on IPv4 interface lo0, 127.0.0.1#53
Sep 9 13:15:31 ns3a/ns3a named[23042]: [ID 873579 daemon.info] listening on IPv4 interface bge0, 153.104.92.2#53
Sep 9 13:15:31 ns3a/ns3a named[23042]: [ID 873579 daemon.info] listening on IPv4 interface bge0:1, 153.104.92.4#53
Sep 9 13:15:31 ns3a/ns3a named[23042]: [ID 873579 daemon.info] listening on IPv4 interface bge1, 10.104.36.20#53

I only wanted named to listen on one interface + the loopback, so I added a listen-on statement in named.conf:

acl testnets { 153.104.244.0/24; 153.104.248.0/24; };

options {
        directory "/opt/isc/bind/var/db";
        allow-query { testnets; };
        listen-on { localhost; 153.104.92.2; };
        listen-on-v6 { none; };
};

zone "0.0.127.in-addr.arpa" in {
        type master;
        file "db.127.0.0";
        notify no;
};

But, I still have the same log entries when I start named. I then modified named.conf to specifically exclude the other interfaces:

listen-on { localhost; 153.104.92.2; !153.104.92.4; !10.104.36.20; };

But, again, I'm still seeing it state that it is listening on the excluded interfaces. I tried increasing the debug level, but I didn't see any additional info pertaining to this. I know that it is listening on the excluded interfaces because I see queries on the 10.104.36.20 interface:

Sep 9 13:09:16 ns3a/ns3a named[22867]: [ID 873579 daemon.info] client 10.104.109.0#1041: query (cache) 'ATF/A/IN' denied
Sep 9 13:09:16 ns3a/ns3a named[22867]: [ID 873579 daemon.info] client 10.104.109.0#1046: query (cache) 'ATP.villanova.edu/A/IN' denied

Is this a known problem? It's an issue for us because we restrict DNS queries to particular interfaces. If it isn't a known bug, I'd be glad to help troubleshoot this problem.

Thanks.

-John

--
John Center
Villanova University
Re: BIND 9.6.1-P1
SUN Freeware
http://sunfreeware.com/index.html

With many thanks to Steve Christensen.

> Does anyone know if there is any Solaris .pkg distribution for BIND 9.6.1-P1?
> I'm looking to replace old versions as per:
> https://www.isc.org/node/474
>
> Thank you,
> Julian
BIND 9.6.1-P1
Does anyone know if there is any Solaris .pkg distribution for BIND 9.6.1-P1?

I'm looking to replace old versions as per:
https://www.isc.org/node/474

Thank you,
Julian
ISC BIND 9.6.1-P1 is now available
BIND 9.6.1-P1 is now available.

BIND 9.6.1-P1 is a SECURITY PATCH for BIND 9.6.1. It addresses a denial-of-service bug in which a malformed UPDATE packet caused named to crash.

Bugs should be reported to bind9-b...@isc.org.

BIND 9.6.1-P1 can be downloaded from:

    ftp://ftp.isc.org/isc/bind9/9.6.1-P1/bind-9.6.1-P1.tar.gz

PGP signatures of the distribution are at:

    ftp://ftp.isc.org/isc/bind9/9.6.1-P1/bind-9.6.1-P1.tar.gz.asc
    ftp://ftp.isc.org/isc/bind9/9.6.1-P1/bind-9.6.1-P1.tar.gz.sha256.asc
    ftp://ftp.isc.org/isc/bind9/9.6.1-P1/bind-9.6.1-P1.tar.gz.sha512.asc

The signatures were generated with the ISC public key, which is available at https://www.isc.org/about/openpgp

A binary kit for Windows XP, Windows 2003 and Windows 2008 is at:

    ftp://ftp.isc.org/isc/bind9/9.6.1-P1/BIND9.6.1-P1.zip
    ftp://ftp.isc.org/isc/bind9/9.6.1-P1/BIND9.6.1-P1.debug.zip

PGP signatures of the binary kit are at:

    ftp://ftp.isc.org/isc/bind9/9.6.1-P1/BIND9.6.1-P1.zip.asc
    ftp://ftp.isc.org/isc/bind9/9.6.1-P1/BIND9.6.1-P1.zip.sha256.asc
    ftp://ftp.isc.org/isc/bind9/9.6.1-P1/BIND9.6.1-P1.zip.sha512.asc
    ftp://ftp.isc.org/isc/bind9/9.6.1-P1/BIND9.6.1-P1.debug.zip.asc
    ftp://ftp.isc.org/isc/bind9/9.6.1-P1/BIND9.6.1-P1.debug.zip.sha256.asc
    ftp://ftp.isc.org/isc/bind9/9.6.1-P1/BIND9.6.1-P1.debug.zip.sha512.asc

Changes since 9.6.1:

2640. [security] A specially crafted update packet will cause named to exit. [RT #2]

--
Evan Hunt -- e...@isc.org
Internet Systems Consortium, Inc.