Re: Help with unresolvable domain (subdomain, actually)

2011-03-04 Thread John Wobus

Then the load balancer should return default records or 0.0.0.0/:: to
indicate the name is good but doesn't currently have an address.

I like that solution, actually. Even if the client doesn't recognize it
as a special address, hopefully if it tries to connect to it, the
packet won't make it past the first router or switch hop...

Has anyone proposed this to the load-balancer vendors?


Isn't this just a specific instance of configuring a load balancer's
fallback address?  E.g., when servers A and B are both down, give the
address of server C.  Some load balancers allow configuration of a
server D to be used only if C is down as well.  Address C or D could be
configured to be 0.0.0.0, with no test for up-ness.

(Not that I'm completely happy with 0.0.0.0 or any other address that
local folks could conceivably have figured out some crazy use for.)
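
To make the fallback idea concrete, here is a minimal Python sketch. It is
not any vendor's configuration model; the Target class, the pick_answers
function and the 192.0.2.x addresses are all made up for illustration. A
target with no health probe is always considered eligible, which is how
the last-resort 0.0.0.0 entry is modelled.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Sequence


@dataclass
class Target:
    address: str
    # Health probe; None means "no test for up-ness", so always eligible.
    is_up: Optional[Callable[[], bool]] = None


def pick_answers(tiers: Sequence[Sequence[Target]]) -> List[str]:
    """Return the addresses of the first tier that has a usable target."""
    for tier in tiers:
        usable = [t.address for t in tier if t.is_up is None or t.is_up()]
        if usable:
            return usable
    return []


def down() -> bool:
    return False


tiers = [
    [Target("192.0.2.10", down), Target("192.0.2.11", down)],  # servers A and B
    [Target("192.0.2.20", down)],                              # fallback server C
    [Target("0.0.0.0")],                                       # last resort, never probed
]
print(pick_answers(tiers))
```

With every probed server down, only the unprobed last-resort tier is left,
so the answer carries 0.0.0.0 instead of an error code.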

John
___
bind-users mailing list
bind-users@lists.isc.org
https://lists.isc.org/mailman/listinfo/bind-users


Re: Help with unresolvable domain (subdomain, actually)

2011-03-02 Thread David Sparro



On 3/1/2011 5:27 PM, Kevin Darcy wrote:

See my other post. This is designed-in behavior for Cisco GSSes, since
there is no "service unavailable, try again later" RCODE.

- Kevin



When the question is "what is the IP address of 'foo'", an answer of
"the web server is down" is nonsensical.


--
Dave


Re: Help with unresolvable domain (subdomain, actually)

2011-03-02 Thread Warren Kumari


On Mar 1, 2011, at 5:27 PM, Kevin Darcy wrote:

See my other post. This is designed-in behavior for Cisco GSSes,
since there is no "service unavailable, try again later" RCODE.


Yes[0].

W

[0]: There is no "service unavailable, try again later" RCODE.






   - Kevin

On 3/1/2011 4:25 PM, Mark Andrews wrote:

Ring Cisco and complain that their nameservers are broken for the
zone.

;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: SERVFAIL, id: 13389
;; flags: qr; QUERY: 1, ANSWER: 0, AUTHORITY: 0, ADDITIONAL: 0

;; QUESTION SECTION:
;tools.cisco.com.   IN  A

;; Query time: 204 msec
;; SERVER: 72.163.4.28#53(rcdn9-14p-dcz05n-gss1.cisco.com)
;; WHEN: Wed Mar  2 08:23:59 2011
;; MSG SIZE  rcvd: 33







--
There are only 10 types of people in this world -- those who  
understand binary arithmetic and those who don't.





Re: Help with unresolvable domain (subdomain, actually)

2011-03-02 Thread Kevin Darcy

On 3/2/2011 10:34 AM, David Sparro wrote:



On 3/1/2011 5:27 PM, Kevin Darcy wrote:

See my other post. This is designed-in behavior for Cisco GSSes, since
there is no "service unavailable, try again later" RCODE.



When the question is "what is the IP address of 'foo'", an answer of
"the web server is down" is nonsensical.


Hmmm... matter of perspective I suppose. Load-balancer architecture sees
DNS as just the externally-visible portion of a whole subsystem. The
SERVFAIL, in their view, does not communicate a DNS problem _per_se_,
but a problem with the whole subsystem. It's more of a "what you're
trying to get to is unavailable right now" message, communicated, in
their view, _through_ DNS (as a sort of conduit), not necessarily
_about_ DNS. They don't see it as specifically meaning "I've got a DNS
problem".


I'm not saying I agree with this perspective, only that I've dealt with 
load-balancer vendors enough (Cisco in particular) to understand that 
this is where they're coming from.


Besides, what alternative is there? If the load-balancer returns an
address that it knows to not be working, then it's purposely causing the
client to go into a relatively slow connection-timeout failure mode. Is
that responsible behavior? If it gives a "normal" response that is
lacking answer information (NODATA, NXDOMAIN), then this response gets
negatively cached, and the negative cache entry may delay clients from
retrying the resource even after it recovers. So, what's left? NOTIMP?
FORMERR? REFUSED? NOTAUTH? Those aren't any better than SERVFAIL from a
strictly functional perspective, and are even more misleading and
confusing with respect to the real source of the problem.
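
The caching concern can be illustrated with a toy model. This is a
deliberately simplified Python sketch, not how BIND or any real resolver
is implemented: the ToyCache class and the outcome labels are made up, and
the TTL passed to store() stands in for whatever negative TTL a real
resolver would derive. It only shows the asymmetry being argued about:
NXDOMAIN/NODATA are remembered until they expire, while SERVFAIL is
assumed here not to be cached at all.

```python
import time
from typing import Callable, Dict, Optional, Tuple

# Simplified outcome labels for a lookup (illustrative, not wire values).
ANSWER, NODATA, NXDOMAIN, SERVFAIL = "ANSWER", "NODATA", "NXDOMAIN", "SERVFAIL"


class ToyCache:
    """Hypothetical resolver cache: remembers everything except SERVFAIL."""

    def __init__(self, clock: Callable[[], float] = time.monotonic) -> None:
        self._clock = clock
        self._entries: Dict[str, Tuple[str, float]] = {}

    def store(self, name: str, outcome: str, ttl: float) -> None:
        if outcome == SERVFAIL:
            return  # not cached here, so the next lookup goes back upstream
        self._entries[name] = (outcome, self._clock() + ttl)

    def lookup(self, name: str) -> Optional[str]:
        entry = self._entries.get(name)
        if entry is None:
            return None
        outcome, expires = entry
        if self._clock() >= expires:
            del self._entries[name]  # expired, forget it
            return None
        return outcome
```

In this model a name that got NXDOMAIN stays "dead" for the whole TTL even
if the service recovers a second later, whereas a name that got SERVFAIL
is retried on the very next lookup.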




- Kevin





RE: Help with unresolvable domain (subdomain, actually)

2011-03-02 Thread Mike Bernhardt
What's really strange is that when we attempt a query, be it DIG or an
attempt to browse tools.cisco.com, they send some sort of query back to us
from/to UDP 53. We drop it at the firewall due to some sort of sanity
check so I can't see the contents. This is in addition to the SERVFAIL
message.

Although I get SERVFAIL, Kloth.net does not, even if we DIG the same server:
cax01-bb14-dcz01n-gss1.cisco.com
From Kloth
; <<>> DiG 9.3.2 <<>> @cax01-bb14-dcz01n-gss1.cisco.com tools.cisco.com A
; (1 server found)
;; global options:  printcmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 41388
;; flags: qr aa rd; QUERY: 1, ANSWER: 1, AUTHORITY: 0, ADDITIONAL: 0

;; QUESTION SECTION:
;tools.cisco.com.   IN  A

;; ANSWER SECTION:
tools.cisco.com.   20  IN  A   72.163.4.38

;; Query time: 131 msec
;; SERVER: 173.37.144.100#53(173.37.144.100)
;; WHEN: Wed Mar  2 19:15:04 2011
;; MSG SIZE  rcvd: 49

From Us
[root@ns1 ~]# dig -b 148.165.3.10 @cax01-bb14-dcz01n-gss1.cisco.com tools.cisco.com

; <<>> DiG 9.4.3-P3 <<>> -b 148.165.3.10 @cax01-bb14-dcz01n-gss1.cisco.com tools.cisco.com
; (1 server found)
;; global options:  printcmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: SERVFAIL, id: 26463
;; flags: qr rd; QUERY: 1, ANSWER: 0, AUTHORITY: 0, ADDITIONAL: 0
;; WARNING: recursion requested but not available

;; QUESTION SECTION:
;tools.cisco.com.   IN  A

;; Query time: 45 msec
;; SERVER: 173.37.144.100#53(173.37.144.100)
;; WHEN: Wed Mar  2 10:15:31 2011
;; MSG SIZE  rcvd: 33
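
The two MSG SIZE values are what one would expect from a reply that
carries only the question (33 bytes) versus one that also carries a single
A record (49 bytes). A quick Python check of that arithmetic, assuming the
standard RFC 1035 wire layout and assuming the answer's owner name is sent
as a 2-byte compression pointer (common, but not something the dig output
itself shows); the function names are made up for this sketch:

```python
HEADER_BYTES = 12          # fixed DNS header
QTYPE_QCLASS_BYTES = 4     # 2 bytes QTYPE + 2 bytes QCLASS


def qname_wire_length(name: str) -> int:
    """Length of a name on the wire: one length byte per label, plus root."""
    labels = [label for label in name.rstrip(".").split(".") if label]
    return sum(1 + len(label) for label in labels) + 1


def question_only_size(name: str) -> int:
    return HEADER_BYTES + qname_wire_length(name) + QTYPE_QCLASS_BYTES


def size_with_one_a_record(name: str) -> int:
    # Compressed owner name (2) + TYPE (2) + CLASS (2) + TTL (4)
    # + RDLENGTH (2) + IPv4 address (4).
    return question_only_size(name) + 2 + 2 + 2 + 4 + 2 + 4


print(question_only_size("tools.cisco.com"))      # size of the empty reply
print(size_with_one_a_record("tools.cisco.com"))  # size with one A record
```

So the SERVFAIL reply is a bare echo of the question with the RCODE set,
and nothing else.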


So I wonder if the query they make is some kind of authentication attempt?


-----Original Message-----
From: Mark Andrews [mailto:ma...@isc.org] 
Sent: Tuesday, March 01, 2011 3:31 PM
To: Kevin Darcy
Cc: bind-us...@isc.org
Subject: Re: Help with unresolvable domain (subdomain, actually)


In message 4d6d7268.1080...@chrysler.com, Kevin Darcy writes:
 I got a trouble ticket on this too.
 
  From the looks of things, Cisco is using GSSes to load-balance this 
 site. GSSes return SERVFAIL if all of the resources behind the 
 load-balancer are down (which it determines via a heartbeat mechanism). 
 So I think this is a simple case of a website (or cluster) going down. 
 It was down earlier today, then up again, as of this writing, it is down 
 again.
 
 DNS doesn't really have a response code of "requested resource not
 available", so SERVFAIL is Cisco's closest approximation. It has the
 drawback, however, of often making other sorts of problems appear to be 
 DNS problems. That's just a cross that we DNS admins have to bear...
  
  - Kevin

Then the load balancer should return default records or 0.0.0.0/:: to
indicate the name is good but doesn't currently have an address.

Mark
-- 
Mark Andrews, ISC
1 Seymour St., Dundas Valley, NSW 2117, Australia
PHONE: +61 2 9871 4742 INTERNET: ma...@isc.org




Re: Help with unresolvable domain (subdomain, actually)

2011-03-02 Thread David Sparro

On 3/2/2011 1:20 PM, Kevin Darcy wrote:


I'm not saying I agree with this perspective, only that I've dealt with
load-balancer vendors enough (Cisco in particular) to understand that
this is where they're coming from.

Besides, what alternative is there? If the load-balancer returns an
address that it knows to not be working, then it's purposely causing the
client to go into a relatively-slow connection-timeout failure mode. Is
that responsible behavior?


Short answer: yes.  The DNS side of the load-balancer doesn't know
why it got the query.  Maybe I was trying to ping the endpoint, I could
have been trying to make an FTP connection, or HTTPS, etc.  In order for
it to be consistent, it would have to be able to figure out that a
SERVFAIL should be returned for the query from my gopher:// connection,
but an IP should be returned for http://.



If it gives a "normal" response that is
lacking answer information (NODATA, NXDOMAIN), then this response gets
negatively cached, and the negative cache entry may delay clients from
re-trying the resource even after it recovers. So, what's left? NOTIMP?
FORMERR? REFUSED? NOTAUTH? Those aren't any better than SERVFAIL from a
strictly functional perspective, and are even more misleading and
confusing with respect to the real source of the problem.


SERVFAIL caching is coming to a BIND server release this year.  (I
listened to the BIND 9.8 features webinar this morning.  I don't
remember which version (9.9 or 9.10) had this attached to it on the
"What's Next" slide.)


--
Dave


Re: Help with unresolvable domain (subdomain, actually)

2011-03-02 Thread Warren Kumari


On Mar 2, 2011, at 1:20 PM, Kevin Darcy wrote:


On 3/2/2011 10:34 AM, David Sparro wrote:



On 3/1/2011 5:27 PM, Kevin Darcy wrote:
See my other post. This is designed-in behavior for Cisco GSSes,
since there is no "service unavailable, try again later" RCODE.



When the question is "what is the IP address of 'foo'", an answer of
"the web server is down" is nonsensical.


Hmmm... matter of perspective I suppose. Load-balancer architecture
sees DNS as just the externally-visible portion of a whole
subsystem. The SERVFAIL, in their view, does not communicate a DNS
problem _per_se_, but a problem with the whole subsystem. It's more
of a "what you're trying to get to is unavailable right now"
message, communicated, in their view, _through_ DNS (as a sort of
conduit), not necessarily _about_ DNS. They don't see it as
specifically meaning "I've got a DNS problem".


But, everyone else *will*.



I'm not saying I agree with this perspective, only that I've dealt  
with load-balancer vendors enough (Cisco in particular) to  
understand that this is where they're coming from.


Besides, what alternative is there? If the load-balancer returns an  
address that it knows to not be working, then it's purposely causing  
the client to go into a relatively-slow connection-timeout failure  
mode. Is that responsible behavior? If it gives a "normal" response
that is lacking answer information (NODATA, NXDOMAIN), then this  
response gets negatively cached, and the negative cache entry may  
delay clients from re-trying the resource even after it recovers.
So, what's left? NOTIMP? FORMERR? REFUSED? NOTAUTH? Those aren't any  
better than SERVFAIL from a strictly functional perspective, and are  
even more misleading and confusing with respect to the real source  
of the problem.


A few options:
1: once the LB knows that all back-ends are down, it can continue to
answer with the correct A, but drop the TTL to be much shorter -- this
allows things to recover faster.
2: have the LB itself serve a 'sorry' page -- the ability to serve
static content locally should be simple, but if it is not able to do so
it can always return a set of 'sorry' servers optimized for this
purpose.
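
Option 1 could look roughly like the following Python sketch. It is only
an illustration of the idea, not a description of any real load balancer:
ARecord, build_answer and the 5-second degraded TTL are invented here (the
20-second normal TTL matches what the dig output in this thread shows for
tools.cisco.com), and the 192.0.2.x addresses are documentation
placeholders.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class ARecord:
    name: str
    ttl: int
    address: str


def build_answer(name: str, backends: Dict[str, bool],
                 normal_ttl: int = 20, degraded_ttl: int = 5) -> List[ARecord]:
    """backends maps each address to its last health-check result."""
    healthy = [addr for addr, up in backends.items() if up]
    if healthy:
        return [ARecord(name, normal_ttl, addr) for addr in healthy]
    # Every back-end failed its health check: still hand out the real
    # addresses, but with a short TTL so caches come back and re-ask soon.
    return [ARecord(name, degraded_ttl, addr) for addr in backends]


all_down = {"192.0.2.1": False, "192.0.2.2": False}
for record in build_answer("tools.example.", all_down):
    print(record)
```

The client may still hit a connection timeout, but it never sees a DNS
error, and the short TTL bounds how long a stale answer lingers in caches.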


You shouldn't be breaking both your serving *and* 'sorry' backends  
often enough for there to be special handling needed (and, if you are,  
you shouldn't make things worse by making other folk waste their time  
debugging your problem).


W





   - Kevin





--
I had no shoes and wept.  Then I met a man who had no feet.  So I  
said, Hey man, got any shoes you're not using?





Re: Help with unresolvable domain (subdomain, actually)

2011-03-02 Thread Kevin Darcy

On 3/1/2011 6:30 PM, Mark Andrews wrote:

In message 4d6d7268.1080...@chrysler.com, Kevin Darcy writes:

I got a trouble ticket on this too.

  From the looks of things, Cisco is using GSSes to load-balance this
site. GSSes return SERVFAIL if all of the resources behind the
load-balancer are down (which it determines via a heartbeat mechanism).
So I think this is a simple case of a website (or cluster) going down.
It was down earlier today, then up again, as of this writing, it is down
again.

DNS doesn't really have a response code of "requested resource not
available", so SERVFAIL is Cisco's closest approximation. It has the
drawback, however, of often making other sorts of problems appear to be
DNS problems. That's just a cross that we DNS admins have to bear...

  - Kevin

Then the load balancer should return default records or 0.0.0.0/:: to
indicate the name is good but doesn't currently have an address.

I like that solution, actually. Even if the client doesn't recognize it
as a special address, hopefully if it tries to connect to it, the
packet won't make it past the first router or switch hop...
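
A client that did want to recognize the special address could do so with
very little code. The Python sketch below uses the standard ipaddress
module; the function names are made up, and treating an all-placeholder
answer as "name exists, nothing to connect to right now" is the
interpretation proposed in this thread, not an established convention.

```python
import ipaddress
from typing import Iterable, List


def is_placeholder(address: str) -> bool:
    """True for the unspecified address, i.e. 0.0.0.0 or ::."""
    return ipaddress.ip_address(address).is_unspecified


def usable_addresses(answers: Iterable[str]) -> List[str]:
    """Drop placeholder addresses; an empty result means 'try again later'."""
    return [addr for addr in answers if not is_placeholder(addr)]


print(usable_addresses(["0.0.0.0", "::"]))
print(usable_addresses(["192.0.2.7", "0.0.0.0"]))
```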


Has anyone proposed this to the load-balancer vendors?


- Kevin




RE: Help with unresolvable domain (subdomain, actually)

2011-03-02 Thread Mike Bernhardt
 A few options:
1: once the LB knows that all back-ends are down, it can continue to answer
with the correct A, but drop the TTL to be much shorter -- this allows
things to recover faster.

This would work well because the actual web site wasn't down, at least not
yesterday. If I substituted the IP address for the domain name, it was
reachable and links maintained the domain portion of the URL in dotted
decimal format. It seems only DNS is hosed.



Re: Help with unresolvable domain (subdomain, actually)

2011-03-02 Thread Warren Kumari


On Mar 2, 2011, at 1:21 PM, Mike Bernhardt wrote:


What's really strange is that when we attempt a query, be it DIG or an
attempt to browse tools.cisco.com, they send some sort of query back
to us from/to UDP 53


Many GSLB solutions attempt to figure out what the best location to
serve from is by sending a query to the server that just queried
*them* -- this allows them to figure out latency and decide which
cluster might be closest.
I suspect (although I avoid Cisco LB like the plague and so am
not sure) that this is the cause.
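
As a rough illustration of that kind of probing (a generic sketch, not a
description of how Cisco's product actually works), the selection step
might boil down to something like the Python below. The cluster names and
the closest_cluster function are hypothetical; a round-trip time of None
stands for a probe that got no reply, for example because a firewall
dropped it.

```python
from typing import Dict, Optional


def closest_cluster(rtt_ms: Dict[str, Optional[float]], default: str) -> str:
    """Pick the cluster whose probe of the querying resolver came back fastest.

    rtt_ms maps cluster name -> measured round-trip time in milliseconds,
    or None when that cluster's probe went unanswered.
    """
    measured = {name: rtt for name, rtt in rtt_ms.items() if rtt is not None}
    if not measured:
        return default  # no probe answered, fall back to a static choice
    return min(measured, key=lambda name: measured[name])


print(closest_cluster({"cluster-east": 75.0, "cluster-west": 45.0}, "cluster-east"))
print(closest_cluster({"cluster-east": None, "cluster-west": None}, "cluster-east"))
```

If the probes are dropped at the resolver's firewall, a scheme like this
has nothing to measure and has to fall back to some default behavior.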



The other possibility: I ran tcpdump to see if I could see what the
query might be, and found that I was getting a FormErr response to my
initial query, causing me to requery without DNSSEC / EDNS0. Maybe
you are not actually seeing a query from them; maybe it's a FormErr
response that your FW is noting?


W

wkumari@vimes:~/src/perl/IODEF$ dig +edns=0 tools.cisco.com @128.107.227.197

; <<>> DiG 9.7.2-P3 <<>> +edns=0 tools.cisco.com @128.107.227.197
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: FORMERR, id: 41568
;; flags: qr rd; QUERY: 1, ANSWER: 0, AUTHORITY: 0, ADDITIONAL: 0
;; WARNING: recursion requested but not available

;; QUESTION SECTION:
;tools.cisco.com.   IN  A

;; Query time: 75 msec
;; SERVER: 128.107.227.197#53(128.107.227.197)
;; WHEN: Wed Mar  2 14:17:38 2011
;; MSG SIZE  rcvd: 33

wkumari@vimes:~/src/perl/IODEF$ dig  tools.cisco.com @128.107.227.197

; <<>> DiG 9.7.2-P3 <<>> tools.cisco.com @128.107.227.197
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 54960
;; flags: qr aa rd; QUERY: 1, ANSWER: 1, AUTHORITY: 0, ADDITIONAL: 0
;; WARNING: recursion requested but not available

;; QUESTION SECTION:
;tools.cisco.com.   IN  A

;; ANSWER SECTION:
tools.cisco.com.   20  IN  A   173.37.145.8

;; Query time: 75 msec
;; SERVER: 128.107.227.197#53(128.107.227.197)
;; WHEN: Wed Mar  2 14:17:45 2011
;; MSG SIZE  rcvd: 49
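
The requery behavior shown by the two dig runs (EDNS0 gets FORMERR, plain
DNS gets an answer) can be sketched in Python as below. This is a minimal
illustration, not resolver code: the packet layout follows RFC 1035 and
the EDNS0 OPT pseudo-record, but the function names are invented, the
4096-byte payload size is an arbitrary choice, and the network is
abstracted behind a caller-supplied send function so the logic can be read
(and exercised) without touching a real server.

```python
import struct
from typing import Callable

FORMERR = 1                      # RCODE 1: format error
TYPE_A, TYPE_OPT, CLASS_IN = 1, 41, 1


def encode_name(name: str) -> bytes:
    """Encode a domain name as length-prefixed labels ending in a zero byte."""
    out = b""
    for label in name.rstrip(".").split("."):
        raw = label.encode("ascii")
        out += bytes([len(raw)]) + raw
    return out + b"\x00"


def build_query(name: str, query_id: int, edns: bool) -> bytes:
    arcount = 1 if edns else 0
    # Flags 0x0100: a standard query with the RD bit set.
    header = struct.pack("!HHHHHH", query_id, 0x0100, 1, 0, 0, arcount)
    question = encode_name(name) + struct.pack("!HH", TYPE_A, CLASS_IN)
    opt = b""
    if edns:
        # OPT pseudo-RR: root name, TYPE 41, CLASS = UDP payload size,
        # TTL field 0 (EDNS version 0, no flags), empty RDATA.
        opt = b"\x00" + struct.pack("!HHIH", TYPE_OPT, 4096, 0, 0)
    return header + question + opt


def rcode_of(response: bytes) -> int:
    """The RCODE is the low 4 bits of the 16-bit flags word."""
    (flags,) = struct.unpack("!H", response[2:4])
    return flags & 0x000F


def query_with_fallback(send: Callable[[bytes], bytes], name: str,
                        query_id: int = 0x1234) -> bytes:
    """Try with EDNS0 first; on FORMERR, retry once as plain DNS."""
    response = send(build_query(name, query_id, edns=True))
    if rcode_of(response) == FORMERR:
        response = send(build_query(name, query_id, edns=False))
    return response
```

Here send would be whatever actually puts the bytes on the wire (for
example a UDP socket round trip). A firewall that logs or drops the first
FORMERR exchange would see an extra packet pair per lookup, which fits the
suspicion above.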









--
Eagles soar but a weasel will never get sucked

Re: Help with unresolvable domain (subdomain, actually)

2011-03-01 Thread Kevin Darcy

I got a trouble ticket on this too.

From the looks of things, Cisco is using GSSes to load-balance this 
site. GSSes return SERVFAIL if all of the resources behind the 
load-balancer are down (which it determines via a heartbeat mechanism). 
So I think this is a simple case of a website (or cluster) going down. 
It was down earlier today, then up again; as of this writing, it is down
again.

DNS doesn't really have a response code of "requested resource not
available", so SERVFAIL is Cisco's closest approximation. It has the
drawback, however, of often making other sorts of problems appear to be
DNS problems. That's just a cross that we DNS admins have to bear...




- Kevin


On 3/1/2011 4:08 PM, Mike Bernhardt wrote:

I should add that tools.cisco.com was resolvable at one time, so either
Cisco's behavior has changed, or our firewall's behavior has changed. We
obviously haven't upgraded our BIND version in a while (9.4.3P3), so I don't
think the problem is BIND.

-----Original Message-----
From: Mike Bernhardt [mailto:bernha...@bart.gov]
Sent: Tuesday, March 01, 2011 12:40 PM
To: bind-users@lists.isc.org
Subject: Help with unresolvable domain (subdomain, actually)

For some reason, we can no longer resolve tools.cisco.com. There are several
clues to the problem but I can't put them together. Here is some dig output.
I know that the time stamps don't all match up below, but the results are
typical:

[root@ns1 ~]# dig +trace -b 148.165.3.10 tools.cisco.com

; <<>> DiG 9.4.3-P3 <<>> +trace -b 148.165.3.10 tools.cisco.com
;; global options:  printcmd
.   90550   IN  NS  i.root-servers.net.
.   90550   IN  NS  h.root-servers.net.
.   90550   IN  NS  e.root-servers.net.
.   90550   IN  NS  d.root-servers.net.
.   90550   IN  NS  j.root-servers.net.
.   90550   IN  NS  k.root-servers.net.
.   90550   IN  NS  l.root-servers.net.
.   90550   IN  NS  g.root-servers.net.
.   90550   IN  NS  f.root-servers.net.
.   90550   IN  NS  a.root-servers.net.
.   90550   IN  NS  m.root-servers.net.
.   90550   IN  NS  c.root-servers.net.
.   90550   IN  NS  b.root-servers.net.
;; Received 512 bytes from 148.165.3.10#53(148.165.3.10) in 0 ms

com.    172800  IN  NS  l.gtld-servers.net.
com.    172800  IN  NS  e.gtld-servers.net.
com.    172800  IN  NS  k.gtld-servers.net.
com.    172800  IN  NS  i.gtld-servers.net.
com.    172800  IN  NS  m.gtld-servers.net.
com.    172800  IN  NS  j.gtld-servers.net.
com.    172800  IN  NS  a.gtld-servers.net.
com.    172800  IN  NS  g.gtld-servers.net.
com.    172800  IN  NS  c.gtld-servers.net.
com.    172800  IN  NS  f.gtld-servers.net.
com.    172800  IN  NS  b.gtld-servers.net.
com.    172800  IN  NS  d.gtld-servers.net.
com.    172800  IN  NS  h.gtld-servers.net.
;; Received 505 bytes from 198.41.0.4#53(a.root-servers.net) in 13 ms

cisco.com.  172800  IN  NS  ns1.cisco.com.
cisco.com.  172800  IN  NS  ns2.cisco.com.
;; Received 101 bytes from 192.54.112.30#53(h.gtld-servers.net) in 154 ms

tools.cisco.com.    86400   IN  NS  rcdn9-14p-dcz05n-gss1.cisco.com.
tools.cisco.com.    86400   IN  NS  rtp5-dmz-gss1.cisco.com.
tools.cisco.com.    86400   IN  NS  sjck-dmz-gss1.cisco.com.
tools.cisco.com.    86400   IN  NS  cax01-bb14-dcz01n-gss1.cisco.com.
;; Received 226 bytes from 64.102.255.44#53(ns2.cisco.com) in 75 ms

;; Received 33 bytes from 72.163.4.28#53(rcdn9-14p-dcz05n-gss1.cisco.com) in 47 ms

Now, focusing in on rtp5-dmz-gss1.cisco.com for further analysis (just
picked it out of the group):
[root@ns1 ~]# dig -b 148.165.3.10 @rtp5-dmz-gss1.cisco.com tools.cisco.com

; <<>> DiG 9.4.3-P3 <<>> -b 148.165.3.10 @rtp5-dmz-gss1.cisco.com tools.cisco.com
; (1 server found)
;; global options:  printcmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: SERVFAIL, id: 5165
;; flags: qr rd; QUERY: 1, ANSWER: 0, AUTHORITY: 0, ADDITIONAL: 0
;; WARNING: recursion requested but not available

;; QUESTION SECTION:
;tools.cisco.com.   IN  A

;; Query time: 75 msec
;; SERVER: 64.102.246.5#53(64.102.246.5)
;; WHEN: Tue Mar  1 12:22:57 2011
;; MSG SIZE  rcvd: 33


Here

Re: Help with unresolvable domain (subdomain, actually)

2011-03-01 Thread Kevin Darcy
See my other post. This is designed-in behavior for Cisco GSSes, since
there is no "service unavailable, try again later" RCODE.




- Kevin


On 3/1/2011 4:25 PM, Mark Andrews wrote:

Ring Cisco and complain that their nameservers are broken for the
zone.

;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: SERVFAIL, id: 13389
;; flags: qr; QUERY: 1, ANSWER: 0, AUTHORITY: 0, ADDITIONAL: 0

;; QUESTION SECTION:
;tools.cisco.com.   IN  A

;; Query time: 204 msec
;; SERVER: 72.163.4.28#53(rcdn9-14p-dcz05n-gss1.cisco.com)
;; WHEN: Wed Mar  2 08:23:59 2011
;; MSG SIZE  rcvd: 33






Re: Help with unresolvable domain (subdomain, actually)

2011-03-01 Thread Mark Andrews

In message 4d6d7268.1080...@chrysler.com, Kevin Darcy writes:
 I got a trouble ticket on this too.
 
  From the looks of things, Cisco is using GSSes to load-balance this 
 site. GSSes return SERVFAIL if all of the resources behind the 
 load-balancer are down (which it determines via a heartbeat mechanism). 
 So I think this is a simple case of a website (or cluster) going down. 
 It was down earlier today, then up again, as of this writing, it is down 
 again.
 
 DNS doesn't really have a response code of "requested resource not
 available", so SERVFAIL is Cisco's closest approximation. It has the
 drawback, however, of often making other sorts of problems appear to be 
 DNS problems. That's just a cross that we DNS admins have to bear...
  
  - Kevin

Then the load balancer should return default records or 0.0.0.0/:: to
indicate the name is good but doesn't currently have an address.

Mark
-- 
Mark Andrews, ISC
1 Seymour St., Dundas Valley, NSW 2117, Australia
PHONE: +61 2 9871 4742 INTERNET: ma...@isc.org