On Mar 2, 2011, at 1:20 PM, Kevin Darcy wrote:
On 3/2/2011 10:34 AM, David Sparro wrote:
On 3/1/2011 5:27 PM, Kevin Darcy wrote:
See my other post. This is designed-in behavior for Cisco GSSes,
since there is no "service unavailable, try again later" RCODE.
When the question is "what is the IP address of 'foo'", an answer of
"the web server is down" is nonsensical.
Hmmm... matter of perspective I suppose. Load-balancer architecture
sees DNS as just the externally-visible portion of a whole
subsystem. The SERVFAIL, in their view, does not communicate a DNS
problem _per_se_, but a problem with the whole subsystem. It's more
of a "what you're trying to get to is unavailable right now"
message, communicated, in their view, _through_ DNS (as a sort of
conduit), not necessarily _about_ DNS. They don't see it as
specifically meaning "I've got a DNS problem".
But, everyone else *will*.
I'm not saying I agree with this perspective, only that I've dealt
with load-balancer vendors enough (Cisco in particular) to
understand that this is where they're coming from.
Besides, what alternative is there? If the load-balancer returns an
address that it knows to not be working, then it's purposely causing
the client to go into a relatively-slow connection-timeout failure
mode. Is that responsible behavior? If it gives a "normal" response
that is lacking answer information (NODATA, NXDOMAIN), then this
response gets negatively cached, and the negative cache entry may
delay clients from re-trying the resource even after it recovers.
So, what's left? NOTIMP? FORMERR? REFUSED? NOTAUTH? Those aren't any
better than SERVFAIL from a strictly functional perspective, and are
even more misleading and confusing with respect to the real source
of the problem.
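
To put a number on the negative-caching concern: under RFC 2308, a
resolver may cache an NXDOMAIN or NODATA answer for the smaller of the
SOA record's own TTL and its MINIMUM field. A minimal Python sketch of
that rule follows; the function name and the sample values are
illustrative assumptions, not taken from any particular zone or
product.

```python
def negative_cache_ttl(soa_ttl: int, soa_minimum: int) -> int:
    """Seconds a resolver may cache an NXDOMAIN/NODATA answer.

    RFC 2308 takes the smaller of the SOA record's TTL and the
    SOA MINIMUM field.
    """
    return min(soa_ttl, soa_minimum)


# Illustrative values only: SOA TTL of 3600 s, SOA MINIMUM of 900 s.
print(negative_cache_ttl(3600, 900))  # 900
```

With those sample values, a client that received NXDOMAIN during an
outage could keep failing for up to 15 minutes after the backends
recover.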
A few options:
1: once the LB knows that all back-ends are down, it can continue to
answer with the correct A, but drop the TTL to be much shorter -- this
allows things to recover faster.
2: have the LB itself serve a 'sorry' page -- the ability to serve
static content locally should be simple, but if it is not able to do so
it can always return a set of 'sorry' servers optimized for this
purpose.
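
The two options above can be sketched as a single decision function.
This is only a rough illustration in Python, not how a GSS or any
other load balancer is actually configured: the names choose_answer,
NORMAL_TTL and DEGRADED_TTL, the TTL values, and the addresses (taken
from the documentation ranges) are all assumptions.

```python
from dataclasses import dataclass
from typing import List

NORMAL_TTL = 300    # assumed TTL while at least one backend is healthy
DEGRADED_TTL = 10   # assumed short TTL while every backend is down


@dataclass
class Answer:
    addresses: List[str]
    ttl: int


def choose_answer(healthy: List[str],
                  configured: List[str],
                  sorry_servers: List[str]) -> Answer:
    """Pick the A records and TTL to return for one query."""
    if healthy:
        # Normal case: hand out only the backends known to be up.
        return Answer(list(healthy), NORMAL_TTL)
    if sorry_servers:
        # Option 2: point clients at dedicated 'sorry' servers.
        return Answer(list(sorry_servers), DEGRADED_TTL)
    # Option 1: keep answering with the real addresses, but with a
    # short TTL so clients re-query soon after recovery.
    return Answer(list(configured), DEGRADED_TTL)


backends = ["192.0.2.10", "192.0.2.11"]
print(choose_answer([], backends, ["198.51.100.1"]))
print(choose_answer([], backends, []))
```

In both degraded cases the client still gets a normal answer with a
short TTL, so nothing is negatively cached and no SERVFAIL is sent.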
You shouldn't be breaking both your serving *and* 'sorry' backends
often enough for there to be special handling needed (and, if you are,
you shouldn't make things worse by making other folk waste their time
debugging your problem).
W
- Kevin
_______________________________________________
bind-users mailing list
bind-users@lists.isc.org
https://lists.isc.org/mailman/listinfo/bind-users
--
I had no shoes and wept. Then I met a man who had no feet. So I
said, "Hey man, got any shoes you're not using?"