I got a trouble ticket on this too.

From the looks of things, Cisco is using GSSes (Global Site Selectors) to load-balance this site. A GSS returns SERVFAIL if all of the resources behind the load balancer are down (which it determines via a heartbeat mechanism). So I think this is a "simple" case of a website (or cluster) going down. It was down earlier today, then up again; as of this writing, it is down again.

DNS doesn't really have a response code of "requested resource not available", so SERVFAIL is Cisco's closest approximation. It has the drawback, however, of often making other sorts of problems appear to be DNS problems. That's just a cross that we DNS admins have to bear...
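To make the point about response codes concrete, here is a minimal sketch of the basic RCODE values from RFC 1035 and how one is read out of the header flags word. The names `RCODES` and `rcode_name` are just illustrative, not part of any library. Nothing in the table means "the name exists but the service behind it is down", which is why SERVFAIL ends up being used.

```python
# Basic RCODE values from RFC 1035, section 4.1.1.
# The RCODE is the low 4 bits of the 16-bit flags word in the DNS header.
RCODES = {
    0: "NOERROR",   # no error condition
    1: "FORMERR",   # server could not interpret the query
    2: "SERVFAIL",  # server was unable to process the query
    3: "NXDOMAIN",  # the name does not exist
    4: "NOTIMP",    # query type not supported
    5: "REFUSED",   # server refuses for policy reasons
}


def rcode_name(flags: int) -> str:
    """Return the RCODE mnemonic for a 16-bit DNS header flags word."""
    code = flags & 0x000F
    return RCODES.get(code, f"RCODE{code}")


# QR (0x8000) and RD (0x0100) set, RCODE = 2
print(rcode_name(0x8102))  # prints SERVFAIL
```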

- Kevin

On 3/1/2011 4:08 PM, Mike Bernhardt wrote:
I should add that tools.cisco.com was resolvable at one time, so either
Cisco's behavior has changed, or our firewall's behavior has changed. We
obviously haven't upgraded our BIND version in a while (9.4.3P3), so I don't
think the problem is BIND.

-----Original Message-----
From: Mike Bernhardt [mailto:bernha...@bart.gov]
Sent: Tuesday, March 01, 2011 12:40 PM
To: bind-users@lists.isc.org
Subject: Help with unresolvable domain (subdomain, actually)

For some reason, we can no longer resolve tools.cisco.com. There are several
clues to the problem, but I can't put them together. Here is some dig output.
I know that the time stamps don't all match up below, but the results are
typical:

[root@ns1 ~]# dig +trace -b 148.165.3.10 tools.cisco.com

; <<>> DiG 9.4.3-P3 <<>> +trace -b 148.165.3.10 tools.cisco.com
;; global options:  printcmd
.                       90550   IN      NS      i.root-servers.net.
.                       90550   IN      NS      h.root-servers.net.
.                       90550   IN      NS      e.root-servers.net.
.                       90550   IN      NS      d.root-servers.net.
.                       90550   IN      NS      j.root-servers.net.
.                       90550   IN      NS      k.root-servers.net.
.                       90550   IN      NS      l.root-servers.net.
.                       90550   IN      NS      g.root-servers.net.
.                       90550   IN      NS      f.root-servers.net.
.                       90550   IN      NS      a.root-servers.net.
.                       90550   IN      NS      m.root-servers.net.
.                       90550   IN      NS      c.root-servers.net.
.                       90550   IN      NS      b.root-servers.net.
;; Received 512 bytes from 148.165.3.10#53(148.165.3.10) in 0 ms

com.                    172800  IN      NS      l.gtld-servers.net.
com.                    172800  IN      NS      e.gtld-servers.net.
com.                    172800  IN      NS      k.gtld-servers.net.
com.                    172800  IN      NS      i.gtld-servers.net.
com.                    172800  IN      NS      m.gtld-servers.net.
com.                    172800  IN      NS      j.gtld-servers.net.
com.                    172800  IN      NS      a.gtld-servers.net.
com.                    172800  IN      NS      g.gtld-servers.net.
com.                    172800  IN      NS      c.gtld-servers.net.
com.                    172800  IN      NS      f.gtld-servers.net.
com.                    172800  IN      NS      b.gtld-servers.net.
com.                    172800  IN      NS      d.gtld-servers.net.
com.                    172800  IN      NS      h.gtld-servers.net.
;; Received 505 bytes from 198.41.0.4#53(a.root-servers.net) in 13 ms

cisco.com.              172800  IN      NS      ns1.cisco.com.
cisco.com.              172800  IN      NS      ns2.cisco.com.
;; Received 101 bytes from 192.54.112.30#53(h.gtld-servers.net) in 154 ms

tools.cisco.com.        86400   IN      NS      rcdn9-14p-dcz05n-gss1.cisco.com.
tools.cisco.com.        86400   IN      NS      rtp5-dmz-gss1.cisco.com.
tools.cisco.com.        86400   IN      NS      sjck-dmz-gss1.cisco.com.
tools.cisco.com.        86400   IN      NS      cax01-bb14-dcz01n-gss1.cisco.com.
;; Received 226 bytes from 64.102.255.44#53(ns2.cisco.com) in 75 ms

;; Received 33 bytes from 72.163.4.28#53(rcdn9-14p-dcz05n-gss1.cisco.com) in 47 ms

Now, focusing in on rtp5-dmz-gss1.cisco.com for further analysis (just
picked it out of the group):
[root@ns1 ~]# dig -b 148.165.3.10 @rtp5-dmz-gss1.cisco.com tools.cisco.com

; <<>> DiG 9.4.3-P3 <<>> -b 148.165.3.10 @rtp5-dmz-gss1.cisco.com tools.cisco.com
; (1 server found)
;; global options:  printcmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: SERVFAIL, id: 5165
;; flags: qr rd; QUERY: 1, ANSWER: 0, AUTHORITY: 0, ADDITIONAL: 0
;; WARNING: recursion requested but not available

;; QUESTION SECTION:
;tools.cisco.com.               IN      A

;; Query time: 75 msec
;; SERVER: 64.102.246.5#53(64.102.246.5)
;; WHEN: Tue Mar  1 12:22:57 2011
;; MSG SIZE  rcvd: 33
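As a sanity check on the reply above, the following sketch rebuilds a message with the same header values (id 5165, flags qr rd, status SERVFAIL, one question, no records) and confirms that it comes to 33 bytes, i.e. the reply is just the header plus the echoed question. The helper names (`encode_name`, `build_message`, `parse_header`) are made up for illustration, and the flags word 0x8102 is my reading of "qr rd" with RCODE 2, not something captured off the wire.

```python
import struct


def encode_name(name: str) -> bytes:
    """Encode a domain name as length-prefixed labels ending in a zero byte."""
    out = b""
    for label in name.rstrip(".").split("."):
        raw = label.encode("ascii")
        out += bytes([len(raw)]) + raw
    return out + b"\x00"


def build_message(msg_id: int, flags: int, name: str,
                  qtype: int = 1, qclass: int = 1) -> bytes:
    """Build a DNS message with one question and no resource records."""
    header = struct.pack("!HHHHHH", msg_id, flags, 1, 0, 0, 0)
    question = encode_name(name) + struct.pack("!HH", qtype, qclass)
    return header + question


def parse_header(msg: bytes) -> dict:
    """Decode the fixed 12-byte DNS header."""
    msg_id, flags, qd, an, ns, ar = struct.unpack("!HHHHHH", msg[:12])
    return {
        "id": msg_id,
        "qr": bool(flags & 0x8000),   # this is a response
        "rd": bool(flags & 0x0100),   # recursion desired (copied from query)
        "ra": bool(flags & 0x0080),   # recursion available
        "rcode": flags & 0x000F,      # 2 = SERVFAIL
        "counts": (qd, an, ns, ar),   # QUERY, ANSWER, AUTHORITY, ADDITIONAL
    }


reply = build_message(5165, 0x8102, "tools.cisco.com")
print(len(reply))           # 12-byte header + 21-byte question
print(parse_header(reply))
```

With rd set and ra clear, dig prints the "recursion requested but not available" warning, which is expected when querying an authoritative-only server directly.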


Here is the output of tcpdump on my server, querying the same server via
nslookup elsewhere:
[root@ns1 ~]# tcpdump host -i bond0 64.102.246.5 -n -p -vvv
tcpdump: listening on bond0, link-type EN10MB (Ethernet), capture size 96 bytes
12:14:53.373614 IP (tos 0x0, ttl  64, id 45237, offset 0, flags [none], proto: UDP (17), length: 61) 148.165.3.10.18673 > 64.102.246.5.domain: [bad udp cksum a78b!]  26095 A? tools.cisco.com. (33)
12:14:53.455684 IP (tos 0x0, ttl  54, id 7623, offset 0, flags [DF], proto: UDP (17), length: 61) 64.102.246.5.domain > 148.165.3.10.18673: [udp sum ok] 26095 ServFail- q: A? tools.cisco.com. 0/0/0 (33)
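The lengths in the capture are internally consistent, which can be checked with a little arithmetic. This is only a sketch: it assumes a 20-byte IPv4 header with no options, and the constant and function names are illustrative.

```python
IPV4_HEADER_BYTES = 20  # assumption: no IP options present
UDP_HEADER_BYTES = 8    # fixed size of the UDP header


def udp_length_field(dns_payload_bytes: int) -> int:
    """Value expected in the UDP header's length field (header + payload)."""
    return UDP_HEADER_BYTES + dns_payload_bytes


def ip_total_length(dns_payload_bytes: int) -> int:
    """Value expected in the IPv4 total-length field."""
    return IPV4_HEADER_BYTES + udp_length_field(dns_payload_bytes)


# tcpdump reports a 33-byte DNS message in each direction
print(udp_length_field(33))  # prints 41
print(ip_total_length(33))   # prints 61, matching "length: 61" in the capture
```

Both the query and the ServFail reply are 33 bytes of DNS payload in a 61-byte IP packet, so as seen by tcpdump on this host the reply looks well-formed.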

Lastly, I see on our firewall log that we have a Checkpoint Smart Defense
log entry due to its belief that Cisco is sending us a malformed query
packet, and it's being dropped. I don't know why they're sending the query
in the first place.
Number:                 2595791
Date:                           1Mar2011
Time:                           12:22:53
Type:                           Log
Action:                         Drop
Service:                        domain-udp (53)
Source Port:            domain-udp
Source:                         rtp5-dmz-gss1.cisco.com
Destination:            ns
Protocol:                       udp
Information:            Packet info: Packet data size: 28
Attack:                         Malformed Packet
Attack Information:     UDP length error


Any ideas as to where the problem lies so I can pursue it further?



_______________________________________________
bind-users mailing list
bind-users@lists.isc.org
https://lists.isc.org/mailman/listinfo/bind-users




