Re: named daemon hangs
Hi,

Thank you all for your help. This fix surely made the difference :).

    echo "1" >/proc/sys/net/core/xfrm_larval_drop

Nelson Vale

On Mon, May 4, 2009 at 8:18 AM, Adam Tkac wrote:
> On Sat, May 02, 2009 at 04:06:18PM +0100, Nelson Vale wrote:
> > Hi all,
> >
> > I've been facing a problem in my private network which I have not been
> > able to fix yet.
> >
> > In my gateway (Debian-like Linux) I have BIND 9.5 installed and running,
> > and I have one IPSec tunnel to another gateway over the internet. It also
> > has a forward zone configured whose name server is the other gateway's
> > internal address (accessible through the IPSec tunnel only).
> >
> > Recently the other IPSec endpoint was shut down and, of course, my
> > queries to the forward domain started failing. Nothing strange here...
> >
> > The real problem is that I was suddenly unable to resolve any other
> > DNS queries, like www.google.com, from inside my network:
> >
> > "host www.google.com
> > ;; connection timed out; no servers could be reached"
> >
> > I took a look at the named daemon and I see that it does not respond to
> > anything as long as the IPSec tunnel is down, but only if it's the other
> > endpoint that is down. I've tried stopping my endpoint, and this problem
> > does not occur as long as I restart named. I think this happens because,
> > as long as my endpoint is up, the routes to the other endpoint are set
> > and named tries to query the forward domain name server. The problem is
> > that the queries do not time out and named hangs there:
>
> Please check this:
> - https://bugzilla.redhat.com/show_bug.cgi?id=427629
> - http://lkml.org/lkml/2007/12/4/260
> - http://lkml.org/lkml/2008/4/17/474
>
> $ echo "1" >/proc/sys/net/core/xfrm_larval_drop
>
> should help you.
>
> Adam
>
> --
> Adam Tkac, Red Hat, Inc.

___
bind-users mailing list
bind-users@lists.isc.org
https://lists.isc.org/mailman/listinfo/bind-users
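A note on the fix confirmed in this message: a value written under /proc/sys affects only the running kernel, so it will probably need to be reapplied after a reboot. The sketch below is not part of the original thread. It checks the current value and lists, as comments, the usual ways to set it. The function name larval_drop_enabled and its optional path argument are hypothetical, added only so the check can be tried against an ordinary file. Whether /etc/sysctl.conf is the right place to persist the setting on a given distribution is an assumption.

```shell
#!/bin/sh
# Succeeds only if the given file is readable and contains exactly "1".
# Defaults to the proc entry used in this thread; the argument is a
# hypothetical convenience for trying the check on a plain file.
larval_drop_enabled() {
    file="${1:-/proc/sys/net/core/xfrm_larval_drop}"
    [ -r "$file" ] && [ "$(cat "$file")" = "1" ]
}

if larval_drop_enabled; then
    echo "xfrm_larval_drop is enabled"
else
    echo "xfrm_larval_drop is disabled or unavailable"
fi

# To enable it for the running kernel (root required), either form works:
#   echo 1 > /proc/sys/net/core/xfrm_larval_drop
#   sysctl -w net.core.xfrm_larval_drop=1
# To keep it across reboots (assuming the system reads /etc/sysctl.conf
# at boot), add this line to that file:
#   net.core.xfrm_larval_drop = 1
```

The write commands are left as comments so that running the script only reads the setting and never changes the system.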
Re: named daemon hangs
On Sat, May 2, 2009 at 9:39 PM, Jonathan Petersson wrote:
> Could you please provide a copy of your config, I'm guessing that you
> have a general forwarder in place or haven't turned on recursion.

The options and the forward zone are as follows:

    acl internal {
        127.0.0.1/8;
        192.168.9.0/24;
    }

    options {
        directory "/etc/namedb";
        pid-file "/var/run/named.pid";
        statistics-file "/var/run/named.stats";
        forwarders {
            x.x.x.x; (ISP DNS server)
            x.x.x.x; (ISP DNS server)
        };
        forward first;
        max-transfer-time-in 120;
        max-transfer-time-out 120;
        transfer-format many-answers;
    };

    zone "mylan.loc" {
        type forward;
        forwarders { 192.168.90.254; };
    };

    zone "anothernet.no-ip.org" {
        type master;
        file "anothernet.no-ip.org";
        allow-query { internal; };
        allow-transfer { none; };
        allow-update { none; };
    };

    zone "9.168.192.IN-ADDR.ARPA" {
        type master;
        file "another.no-ip.org.rev";
        allow-query { internal; };
        allow-transfer { none; };
        allow-update { none; };
    };
    ...

>
> /Jonathan
>
> On Sat, May 2, 2009 at 8:06 AM, Nelson Vale wrote:
> > Hi all,
> >
> > I've been facing a problem in my private network which I have not been
> > able to fix yet.
> >
> > In my gateway (Debian-like Linux) I have BIND 9.5 installed and running,
> > and I have one IPSec tunnel to another gateway over the internet. It also
> > has a forward zone configured whose name server is the other gateway's
> > internal address (accessible through the IPSec tunnel only).
> >
> > Recently the other IPSec endpoint was shut down and, of course, my
> > queries to the forward domain started failing. Nothing strange here...
> >
> > The real problem is that I was suddenly unable to resolve any other
> > DNS queries, like www.google.com, from inside my network:
> >
> > "host www.google.com
> > ;; connection timed out; no servers could be reached"
> >
> > I took a look at the named daemon and I see that it does not respond to
> > anything as long as the IPSec tunnel is down, but only if it's the other
> > endpoint that is down. I've tried stopping my endpoint, and this problem
> > does not occur as long as I restart named. I think this happens because,
> > as long as my endpoint is up, the routes to the other endpoint are set
> > and named tries to query the forward domain name server. The problem is
> > that the queries do not time out and named hangs there:
> >
> > The configuration I have is:
> >
> > Bind: BIND 9.5.0-P2
> > IP Address (private): 192.168.9.254
> > Forwarders: ADSL provider (2 forwarders)
> > Forward Zone: mylan.loc
> > Name Server: 192.168.90.254
> >
> > After it starts, if I try to query one of the forward zone records
> > (box.mylan.loc) it displays:
> >
> > "...
> > 02-May-2009 14:22:21.843 socket 0xb7bd5548: dispatch_recv: event 0xb7be3d28 -> task 0xb7b74d18
> > 02-May-2009 14:22:21.844 socket 0xb7bd5548: internal_recv: task 0xb7b74d18 got event 0xb7bd559c
> > 02-May-2009 14:22:21.844 socket 0xb7bd5548 192.168.9.2#47869: packet received correctly
> > 02-May-2009 14:22:21.844 socket 0xb7bd5548: processing cmsg 0xb7bb2120
> > 02-May-2009 14:22:21.844 client 192.168.9.2#47869: UDP request
> > 02-May-2009 14:22:21.844 client 192.168.9.2#47869: using view '_default'
> > 02-May-2009 14:22:21.845 client 192.168.9.2#47869: request is not signed
> > 02-May-2009 14:22:21.845 client 192.168.9.2#47869: recursion available
> > 02-May-2009 14:22:21.845 client 192.168.9.2#47869: query
> > 02-May-2009 14:22:21.845 client 192.168.9.2#47869: ns_client_attach: ref = 1
> > 02-May-2009 14:22:21.845 client 192.168.9.2#47869: query (cache) 'box.mylan.loc/A/IN' approved
> > 02-May-2009 14:22:21.845 client 192.168.9.2#47869: replace
> > 02-May-2009 14:22:21.845 clientmgr @0xb7baa608: createclients
> > 02-May-2009 14:22:21.846 clientmgr @0xb7baa608: recycle
> > 02-May-2009 14:22:21.846 createfetch: box.mylan.loc A
> > 02-May-2009 14:22:21.846 fctx 0xb7bae408(box.mylan.loc/A'): create
> > 02-May-2009 14:22:21.846 fctx 0xb7bae408(box.mylan.loc/A'): join
> > 02-May-2009 14:22:21.846 fetch 0xb7bb4148 (fctx 0xb7bae408(box.mylan.l
named daemon hangs
d378: detach: refcount 0
02-May-2009 14:23:46.774 dispatch 0xb7b6d378: got packet: requests 0, buffers 1, recvs 1
02-May-2009 14:23:46.775 dispatch 0xb7b6d378: shutting down; detaching from sock 0xb7b79938, task 0xb7b74d70
02-May-2009 14:23:46.775 socket 0xb7b79938: destroying
02-May-2009 14:23:46.775 dispatchmgr 0xb7bbb168: destroy_mgr_ok: shuttingdown=0, listnonempty=1, epool=10, rpool=0, dpool=10
02-May-2009 14:23:46.775 shutting down
02-May-2009 14:23:46.775 stopping command channel on 127.0.0.1#953
02-May-2009 14:23:46.776 res 0xb7bbe200: shutdown
02-May-2009 14:23:46.776 res 0xb7bbe200: exiting
02-May-2009 14:23:46.776 dns_requestmgr_shutdown: 0xb7b75008
02-May-2009 14:23:46.776 send_shutdown_events: 0xb7b75008
02-May-2009 14:23:46.777 no longer listening on 127.0.0.1#53
02-May-2009 14:23:46.777 clientmgr @0xb7baa3f8: destroy
02-May-2009 14:23:46.777 no longer listening on 192.167.200.254#53
02-May-2009 14:23:46.777 clientmgr @0xb7baa548: destroy
..."

If anybody could give me a hand on this I would surely appreciate it.

Nelson Vale