Re: NS failover as opposed to A record failover
Thank you for the feedback, Tony. I think a better understanding of what's going on under the hood will prove useful both in designing my operational management strategy and in talking me down off the ledge. :) Much obliged. :)

Scott

From: Tony Finch
Sent: February 26, 2020 10:05 AM
To: Scott A. Wozny
Cc: bind-users@lists.isc.org
Subject: Re: NS failover as opposed to A record failover

Scott A. Wozny wrote:
>
> Failures aside, I’m worried about creating a bad user experience EVERY
> time I need to take a DNS server down for patching.

I generally let resolvers handle retry/failover when I'm patching my authoritative servers. Each resolver that encounters an authoritative server that is down will retry on another server within a few seconds, and should send follow-up queries to more responsive auth servers. There are several retries within the libc resolver timeout, so there are multiple opportunities to deal with an outage automatically within a reasonable amount of time. So the badness isn't that terrible (i.e. less than the load time for a web page with megabytes of JavaScript). I reckon this should be good enough for you, because it's a similar amount of badness to what your users will encounter from your DNS UPDATE web server failover setup.

If you want something better: on my recursive servers I use keepalived to move the service IP addresses off servers while they are being patched. You can do something similar for auth servers, if you have a little cluster in each location. On your web servers, keepalived and HAProxy are supposed to be a good combination (though I have not tried it myself). For servers that are too far apart for layer 2 failover to work, you'll need to get funky with anycast.

Tony.
--
f.anthony.n.finch  http://dotat.at/
German Bight: Northwest backing southwest later 4 to 6. Slight or moderate. Showers. Good, occasionally moderate.

___
Please visit https://lists.isc.org/mailman/listinfo/bind-users to unsubscribe from this list

bind-users mailing list
bind-users@lists.isc.org
https://lists.isc.org/mailman/listinfo/bind-users
Re: NS failover as opposed to A record failover
Thanks for the feedback, Bob. This is encouraging news. I think now I need to do some testing to see what works best for my application.

Scott

From: Bob Harold
Sent: February 26, 2020 9:02 AM
To: Mark Andrews
Cc: Scott A. Wozny; bind-users@lists.isc.org
Subject: Re: NS failover as opposed to A record failover

On Tue, Feb 25, 2020 at 6:38 PM Mark Andrews <ma...@isc.org> wrote:

> On 26 Feb 2020, at 09:51, Scott A. Wozny <sawo...@hotmail.com> wrote:
>
> I know this isn’t a question ABOUT BIND, per se, but I think it is still a question bind-users might have an answer to. I’ve seen various failover questions on the list, but nothing that talks specifically about NS records (at least nothing in the last decade), so I thought I’d inquire here.
>
> I’m familiar with round-robin DNS and using multiple A records for the same name. I also understand that most clients, if the top server on the list doesn’t respond, will wait ~30 seconds before trying the next address on the list. This is pretty good, as far as automatic failover goes, but still, having X% of your users (X being down servers / all A records offered) wait an extra 30 seconds is not great, so I’m going to run a regular health check on my front-facing web servers from each BIND server and, if a server stops responding, change my zone file and reload until the server starts responding again, reversing the process. Then X% of my users will only need to wait 30 seconds until I fix the zone file (TTL will also be about the same frequency as the health checks, so the worst-case scenario will be 2xTTL for X% of users having to wait those extra 30 seconds). Overall I’m satisfied with this balance between complexity and resiliency, particularly considering I can do record manipulation in advance of planned maintenance, and then this problem only becomes an issue during unexpected outages.
There is nothing that requires clients to wait a full thirty seconds before moving on to the next address. In fact Happy Eyeballs (RFC 8305) uses sub-second delays before attempting connections to other addresses. Yes, you can have multiple connection attempts to different servers in flight at the same time and just take the one that connects first while dropping the others. The standard socket API supports this sort of behaviour, but most applications don’t use it that way.

https://users.isc.org/~marka/ has code samples, pre-dating Happy Eyeballs, that attempt to connect to multiple addresses at once with small delays. Once a connection is established the other attempts are aborted. There are examples using select(), poll() and multiple threads.

> This is all well and good until I think about failure or maintenance of the name servers themselves. I’ll need to give my registrar my NS IPs for my domain, but they will not be nearly as flexible regarding changes as I am running my own nameservers (TTL will probably be an hour, at the very least), which makes maintenance work a MUCH longer process for set-up and tear-down if I have to make NS record changes in coordination with my registrar. However, this made me wonder: is NS failure responded to in the same way as the failure of an A record? Various Internet randos have indicated some DNS clients and resolvers will do parallel lookups and take the first response, others have indicated that the “try the next record” parameter for NS comms is 5 to 10 seconds rather than 30, and still others claim it’s the same as A record failover at 30 seconds before trying the next candidate on the list. Is there a definitive answer to this or, because it’s client related, are the answers too widely varied to rely upon (which is why the answers on the Internet are all over the map)?

Well, you have the stub resolver and the recursive server to talk about.
Stub resolvers generally use low single-digit-second timeouts before moving on to the next address, but some just shotgun every recursive server.

Recursive servers can do the same thing.

The 30 seconds comes from the TCP connect timeout, where the TCP/IP stack makes multiple connection attempts over those 30 seconds before giving up.

DNS, initially, is UDP, and the client manages retransmission attempts. named uses sub-second initial timeouts. Most of the world is less than 200 ms RTT from any other point, though there are exceptions.

Mark

> Failures aside, I’m worried about creating a bad user experience EVERY time I need to take a DNS server down for patching. I can’t be the first person to run into this problem. Is it just something people live with (and shuffle NS records around all the time) or is NS failover really smoother than A record failover, and I should concentrate on keeping my A records current in case of failure OR planned maintenance?
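Mark's point about stub resolver timeouts is tunable on glibc systems via resolv.conf. A hedged illustration (the addresses are documentation placeholders and the values are examples, not recommendations): timeout:1 waits one second per server per attempt, attempts:2 makes two passes over the list, and rotate spreads queries across the servers instead of always starting with the first.

```conf
# /etc/resolv.conf -- illustrative values only
options timeout:1 attempts:2 rotate
nameserver 192.0.2.1
nameserver 192.0.2.2
```

With these settings a dead first server costs roughly a second per pass rather than the multi-second defaults, which is in line with the "low single-digit-second" behaviour Mark describes.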
Re: NS failover as opposed to A record failover
Thanks very much for the feedback. I clearly have more research to do. :)

Scott

From: Mark Andrews
Sent: February 25, 2020 6:38 PM
To: Scott A. Wozny
Cc: bind-users@lists.isc.org
Subject: Re: NS failover as opposed to A record failover

> On 26 Feb 2020, at 09:51, Scott A. Wozny wrote:
>
> I know this isn’t a question ABOUT BIND, per se, but I think it is still a question bind-users might have an answer to. I’ve seen various failover questions on the list, but nothing that talks specifically about NS records (at least nothing in the last decade), so I thought I’d inquire here.
>
> I’m familiar with round-robin DNS and using multiple A records for the same name. I also understand that most clients, if the top server on the list doesn’t respond, will wait ~30 seconds before trying the next address on the list. This is pretty good, as far as automatic failover goes, but still, having X% of your users (X being down servers / all A records offered) wait an extra 30 seconds is not great, so I’m going to run a regular health check on my front-facing web servers from each BIND server and, if a server stops responding, change my zone file and reload until the server starts responding again, reversing the process. Then X% of my users will only need to wait 30 seconds until I fix the zone file (TTL will also be about the same frequency as the health checks, so the worst-case scenario will be 2xTTL for X% of users having to wait those extra 30 seconds). Overall I’m satisfied with this balance between complexity and resiliency, particularly considering I can do record manipulation in advance of planned maintenance, and then this problem only becomes an issue during unexpected outages.

There is nothing that requires clients to wait a full thirty seconds before moving on to the next address. In fact Happy Eyeballs (RFC 8305) uses sub-second delays before attempting connections to other addresses.
Yes, you can have multiple connection attempts to different servers in flight at the same time and just take the one that connects first while dropping the others. The standard socket API supports this sort of behaviour, but most applications don’t use it that way.

https://users.isc.org/~marka/ has code samples, pre-dating Happy Eyeballs, that attempt to connect to multiple addresses at once with small delays. Once a connection is established the other attempts are aborted. There are examples using select(), poll() and multiple threads.

> This is all well and good until I think about failure or maintenance of the name servers themselves. I’ll need to give my registrar my NS IPs for my domain, but they will not be nearly as flexible regarding changes as I am running my own nameservers (TTL will probably be an hour, at the very least), which makes maintenance work a MUCH longer process for set-up and tear-down if I have to make NS record changes in coordination with my registrar. However, this made me wonder: is NS failure responded to in the same way as the failure of an A record? Various Internet randos have indicated some DNS clients and resolvers will do parallel lookups and take the first response, others have indicated that the “try the next record” parameter for NS comms is 5 to 10 seconds rather than 30, and still others claim it’s the same as A record failover at 30 seconds before trying the next candidate on the list. Is there a definitive answer to this or, because it’s client related, are the answers too widely varied to rely upon (which is why the answers on the Internet are all over the map)?

Well, you have the stub resolver and the recursive server to talk about.

Stub resolvers generally use low single-digit-second timeouts before moving on to the next address, but some just shotgun every recursive server.

Recursive servers can do the same thing.

The 30 seconds comes from the TCP connect timeout, where the TCP/IP stack makes multiple connection attempts over those 30 seconds before giving up.

DNS, initially, is UDP, and the client manages retransmission attempts. named uses sub-second initial timeouts. Most of the world is less than 200 ms RTT from any other point, though there are exceptions.

Mark

> Failures aside, I’m worried about creating a bad user experience EVERY time I need to take a DNS server down for patching. I can’t be the first person to run into this problem. Is it just something people live with (and shuffle NS records around all the time) or is NS failover really smoother than A record failover, and I should concentrate on keeping my A records current in case of failure OR planned maintenance?
>
> Any feedback would be greatly appreciated.
>
> Thanks,
>
> Scott
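Mark mentions that his samples include select(), poll() and multi-threaded variants. As a rough illustration of the threaded approach only (not Mark's actual code: stdlib Python rather than C, and simplified so that it waits for every attempt to report rather than aborting the losers early), something like:

```python
import queue
import socket
import threading

def connect_first(addrs, timeout=3.0):
    """Race one blocking connect per address; return the first socket
    whose TCP handshake completes, or None if every attempt fails."""
    results = queue.Queue()

    def attempt(addr):
        s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
        s.settimeout(timeout)
        try:
            s.connect(addr)
            results.put(s)
        except OSError:
            s.close()
            results.put(None)

    for addr in addrs:
        threading.Thread(target=attempt, args=(addr,), daemon=True).start()

    winner = None
    for _ in addrs:
        s = results.get()   # each thread reports exactly once
        if s is None:
            continue
        if winner is None:
            winner = s      # first successful handshake wins
        else:
            s.close()       # a slower attempt also succeeded; drop it
    return winner
```

Unlike the select()/poll() variants Mark describes, this sketch blocks until every attempt has finished (bounded by `timeout`) before returning, which keeps the socket cleanup trivial at the cost of latency.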
Re: NS failover as opposed to A record failover
Scott A. Wozny wrote:
>
> Failures aside, I’m worried about creating a bad user experience EVERY
> time I need to take a DNS server down for patching.

I generally let resolvers handle retry/failover when I'm patching my authoritative servers. Each resolver that encounters an authoritative server that is down will retry on another server within a few seconds, and should send follow-up queries to more responsive auth servers. There are several retries within the libc resolver timeout, so there are multiple opportunities to deal with an outage automatically within a reasonable amount of time. So the badness isn't that terrible (i.e. less than the load time for a web page with megabytes of JavaScript). I reckon this should be good enough for you, because it's a similar amount of badness to what your users will encounter from your DNS UPDATE web server failover setup.

If you want something better: on my recursive servers I use keepalived to move the service IP addresses off servers while they are being patched. You can do something similar for auth servers, if you have a little cluster in each location. On your web servers, keepalived and HAProxy are supposed to be a good combination (though I have not tried it myself). For servers that are too far apart for layer 2 failover to work, you'll need to get funky with anycast.

Tony.
--
f.anthony.n.finch  http://dotat.at/
German Bight: Northwest backing southwest later 4 to 6. Slight or moderate. Showers. Good, occasionally moderate.
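Tony's keepalived suggestion can be sketched roughly as below. This is a minimal, hypothetical /etc/keepalived/keepalived.conf fragment, not Tony's actual setup: the interface name, virtual_router_id, VIP address and the rndc-based health check are all placeholder assumptions.

```conf
# Minimal VRRP sketch: the VIP follows whichever node is healthy.
# All names and addresses below are illustrative placeholders.
vrrp_script chk_named {
    script "/usr/sbin/rndc status"   # is named answering control queries?
    interval 2
    fall 2
}

vrrp_instance DNS_VIP {
    state BACKUP
    interface eth0
    virtual_router_id 51
    priority 100
    advert_int 1
    virtual_ipaddress {
        192.0.2.53/24
    }
    track_script {
        chk_named
    }
}
```

Stopping named (or lowering priority before patching) causes the health check to fail, and the peer node takes over the virtual address within a few advert intervals, which is the "move the service IP addresses off servers" behaviour Tony describes.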
Re: NS failover as opposed to A record failover
On Tue, Feb 25, 2020 at 6:38 PM Mark Andrews wrote:

> On 26 Feb 2020, at 09:51, Scott A. Wozny wrote:
> >
> > I know this isn’t a question ABOUT BIND, per se, but I think it is still a question bind-users might have an answer to. I’ve seen various failover questions on the list, but nothing that talks specifically about NS records (at least nothing in the last decade), so I thought I’d inquire here.
> >
> > I’m familiar with round-robin DNS and using multiple A records for the same name. I also understand that most clients, if the top server on the list doesn’t respond, will wait ~30 seconds before trying the next address on the list. This is pretty good, as far as automatic failover goes, but still, having X% of your users (X being down servers / all A records offered) wait an extra 30 seconds is not great, so I’m going to run a regular health check on my front-facing web servers from each BIND server and, if a server stops responding, change my zone file and reload until the server starts responding again, reversing the process. Then X% of my users will only need to wait 30 seconds until I fix the zone file (TTL will also be about the same frequency as the health checks, so the worst-case scenario will be 2xTTL for X% of users having to wait those extra 30 seconds). Overall I’m satisfied with this balance between complexity and resiliency, particularly considering I can do record manipulation in advance of planned maintenance, and then this problem only becomes an issue during unexpected outages.
>
> There is nothing that requires clients to wait a full thirty seconds before moving on to the next address. In fact Happy Eyeballs (RFC 8305) uses sub-second delays before attempting connections to other addresses. Yes, you can have multiple connection attempts to different servers in flight at the same time and just take the one that connects first while dropping the others.
> The standard socket API supports this sort of behaviour, but most applications don’t use it that way.
>
> https://users.isc.org/~marka/ has code samples, pre-dating Happy Eyeballs, that attempt to connect to multiple addresses at once with small delays. Once a connection is established the other attempts are aborted. There are examples using select(), poll() and multiple threads.
>
> > This is all well and good until I think about failure or maintenance of the name servers themselves. I’ll need to give my registrar my NS IPs for my domain, but they will not be nearly as flexible regarding changes as I am running my own nameservers (TTL will probably be an hour, at the very least), which makes maintenance work a MUCH longer process for set-up and tear-down if I have to make NS record changes in coordination with my registrar. However, this made me wonder: is NS failure responded to in the same way as the failure of an A record? Various Internet randos have indicated some DNS clients and resolvers will do parallel lookups and take the first response, others have indicated that the “try the next record” parameter for NS comms is 5 to 10 seconds rather than 30, and still others claim it’s the same as A record failover at 30 seconds before trying the next candidate on the list. Is there a definitive answer to this or, because it’s client related, are the answers too widely varied to rely upon (which is why the answers on the Internet are all over the map)?
>
> Well, you have the stub resolver and the recursive server to talk about.
>
> Stub resolvers generally use low single-digit-second timeouts before moving on to the next address, but some just shotgun every recursive server.
>
> Recursive servers can do the same thing.
>
> The 30 seconds comes from the TCP connect timeout, where the TCP/IP stack makes multiple connection attempts over those 30 seconds before giving up.
> DNS, initially, is UDP, and the client manages retransmission attempts. named uses sub-second initial timeouts. Most of the world is less than 200 ms RTT from any other point, though there are exceptions.
>
> Mark
>
> > Failures aside, I’m worried about creating a bad user experience EVERY time I need to take a DNS server down for patching. I can’t be the first person to run into this problem. Is it just something people live with (and shuffle NS records around all the time) or is NS failover really smoother than A record failover, and I should concentrate on keeping my A records current in case of failure OR planned maintenance?
> >
> > Any feedback would be greatly appreciated.
> >
> > Thanks,
> >
> > Scott
>
> --
> Mark Andrews, ISC
> 1 Seymour St., Dundas Valley, NSW 2117, Australia
> PHONE: +61 2 9871 4742  INTERNET: ma...@isc.org
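Mark's contrast between the 30-second TCP connect timeout and client-driven UDP retransmission can be sketched as a retry/rotate loop. This is an illustrative stdlib-Python sketch only: it shuttles an opaque payload rather than building real DNS messages, and the escalating timeout ladder is an assumption, not what any particular resolver uses.

```python
import socket

def udp_query(servers, payload, timeouts=(0.4, 0.8, 1.6)):
    """Send `payload` to each server in turn with escalating sub-second
    waits, returning the first reply, or None if everything times out.
    This is the client-managed retry loop only; a real client would
    build and parse actual DNS messages and match reply IDs."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        for wait in timeouts:            # escalate the timeout each pass
            for server in servers:       # rotate across all servers
                sock.settimeout(wait)
                sock.sendto(payload, server)
                try:
                    reply, _ = sock.recvfrom(4096)
                    return reply
                except (socket.timeout, ConnectionError):
                    continue             # dead/slow server: try the next one
        return None
    finally:
        sock.close()
```

Because the client owns the retransmission policy here, a dead server costs a fraction of a second per pass instead of the kernel's 30-second TCP connect budget.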
Re: NS failover as opposed to A record failover
> On 26 Feb 2020, at 09:51, Scott A. Wozny wrote:
>
> I know this isn’t a question ABOUT BIND, per se, but I think it is still a question bind-users might have an answer to. I’ve seen various failover questions on the list, but nothing that talks specifically about NS records (at least nothing in the last decade), so I thought I’d inquire here.
>
> I’m familiar with round-robin DNS and using multiple A records for the same name. I also understand that most clients, if the top server on the list doesn’t respond, will wait ~30 seconds before trying the next address on the list. This is pretty good, as far as automatic failover goes, but still, having X% of your users (X being down servers / all A records offered) wait an extra 30 seconds is not great, so I’m going to run a regular health check on my front-facing web servers from each BIND server and, if a server stops responding, change my zone file and reload until the server starts responding again, reversing the process. Then X% of my users will only need to wait 30 seconds until I fix the zone file (TTL will also be about the same frequency as the health checks, so the worst-case scenario will be 2xTTL for X% of users having to wait those extra 30 seconds). Overall I’m satisfied with this balance between complexity and resiliency, particularly considering I can do record manipulation in advance of planned maintenance, and then this problem only becomes an issue during unexpected outages.

There is nothing that requires clients to wait a full thirty seconds before moving on to the next address. In fact Happy Eyeballs (RFC 8305) uses sub-second delays before attempting connections to other addresses. Yes, you can have multiple connection attempts to different servers in flight at the same time and just take the one that connects first while dropping the others. The standard socket API supports this sort of behaviour, but most applications don’t use it that way.
https://users.isc.org/~marka/ has code samples, pre-dating Happy Eyeballs, that attempt to connect to multiple addresses at once with small delays. Once a connection is established the other attempts are aborted. There are examples using select(), poll() and multiple threads.

> This is all well and good until I think about failure or maintenance of the name servers themselves. I’ll need to give my registrar my NS IPs for my domain, but they will not be nearly as flexible regarding changes as I am running my own nameservers (TTL will probably be an hour, at the very least), which makes maintenance work a MUCH longer process for set-up and tear-down if I have to make NS record changes in coordination with my registrar. However, this made me wonder: is NS failure responded to in the same way as the failure of an A record? Various Internet randos have indicated some DNS clients and resolvers will do parallel lookups and take the first response, others have indicated that the “try the next record” parameter for NS comms is 5 to 10 seconds rather than 30, and still others claim it’s the same as A record failover at 30 seconds before trying the next candidate on the list. Is there a definitive answer to this or, because it’s client related, are the answers too widely varied to rely upon (which is why the answers on the Internet are all over the map)?

Well, you have the stub resolver and the recursive server to talk about.

Stub resolvers generally use low single-digit-second timeouts before moving on to the next address, but some just shotgun every recursive server.

Recursive servers can do the same thing.

The 30 seconds comes from the TCP connect timeout, where the TCP/IP stack makes multiple connection attempts over those 30 seconds before giving up.

DNS, initially, is UDP, and the client manages retransmission attempts. named uses sub-second initial timeouts. Most of the world is less than 200 ms RTT from any other point, though there are exceptions.
Mark

> Failures aside, I’m worried about creating a bad user experience EVERY time I need to take a DNS server down for patching. I can’t be the first person to run into this problem. Is it just something people live with (and shuffle NS records around all the time) or is NS failover really smoother than A record failover, and I should concentrate on keeping my A records current in case of failure OR planned maintenance?
>
> Any feedback would be greatly appreciated.
>
> Thanks,
>
> Scott

--
Mark Andrews, ISC
1 Seymour St., Dundas Valley, NSW 2117, Australia
PHONE: +61 2 9871 4742  INTERNET: ma...@isc.org
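The Happy Eyeballs idea Mark cites (RFC 8305) staggers connection attempts with sub-second delays and keeps the first handshake that completes. A simplified stdlib-Python sketch of that idea follows; the 250 ms default stagger is illustrative, and the real algorithm also interleaves address families and sorts candidate addresses, which this sketch omits.

```python
import errno
import selectors
import socket
import time

def staggered_connect(addrs, stagger=0.25, overall=5.0):
    """Start a non-blocking connect to a new address every `stagger`
    seconds and return the first socket whose handshake completes,
    closing the losing attempts.  Sketch of the RFC 8305 idea only."""
    sel = selectors.DefaultSelector()
    todo = list(addrs)
    deadline = time.monotonic() + overall
    next_start = time.monotonic()
    winner = None
    try:
        while winner is None and time.monotonic() < deadline:
            # Kick off the next attempt once the stagger delay elapses.
            if todo and time.monotonic() >= next_start:
                s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
                s.setblocking(False)
                rc = s.connect_ex(todo.pop(0))
                if rc == 0:                          # connected instantly
                    winner = s
                    break
                if rc in (errno.EINPROGRESS, errno.EWOULDBLOCK):
                    sel.register(s, selectors.EVENT_WRITE)
                else:                                # immediate hard failure
                    s.close()
                next_start = time.monotonic() + stagger
            # Wait briefly for any in-flight handshake to resolve.
            wait = min(stagger, max(0.0, deadline - time.monotonic()))
            for key, _ in sel.select(wait):
                s = key.fileobj
                sel.unregister(s)
                if s.getsockopt(socket.SOL_SOCKET, socket.SO_ERROR) == 0:
                    winner = s                       # first handshake wins
                    break
                s.close()                            # refused/unreachable
    finally:
        for key in list(sel.get_map().values()):
            sel.unregister(key.fileobj)
            key.fileobj.close()                      # abort losing attempts
        sel.close()
    return winner
```

With a dead first address, the second attempt starts one stagger interval later, so the user-visible delay is a fraction of a second rather than the 30-second TCP connect timeout Mark mentions.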
NS failover as opposed to A record failover
I know this isn’t a question ABOUT BIND, per se, but I think it is still a question bind-users might have an answer to. I’ve seen various failover questions on the list, but nothing that talks specifically about NS records (at least nothing in the last decade), so I thought I’d inquire here.

I’m familiar with round-robin DNS and using multiple A records for the same name. I also understand that most clients, if the top server on the list doesn’t respond, will wait ~30 seconds before trying the next address on the list. This is pretty good, as far as automatic failover goes, but still, having X% of your users (X being down servers / all A records offered) wait an extra 30 seconds is not great, so I’m going to run a regular health check on my front-facing web servers from each BIND server and, if a server stops responding, change my zone file and reload until the server starts responding again, reversing the process. Then X% of my users will only need to wait 30 seconds until I fix the zone file (TTL will also be about the same frequency as the health checks, so the worst-case scenario will be 2xTTL for X% of users having to wait those extra 30 seconds). Overall I’m satisfied with this balance between complexity and resiliency, particularly considering I can do record manipulation in advance of planned maintenance, and then this problem only becomes an issue during unexpected outages.

This is all well and good until I think about failure or maintenance of the name servers themselves. I’ll need to give my registrar my NS IPs for my domain, but they will not be nearly as flexible regarding changes as I am running my own nameservers (TTL will probably be an hour, at the very least), which makes maintenance work a MUCH longer process for set-up and tear-down if I have to make NS record changes in coordination with my registrar. However, this made me wonder: is NS failure responded to in the same way as the failure of an A record?
Various Internet randos have indicated some DNS clients and resolvers will do parallel lookups and take the first response, others have indicated that the “try the next record” parameter for NS comms is 5 to 10 seconds rather than 30, and still others claim it’s the same as A record failover at 30 seconds before trying the next candidate on the list. Is there a definitive answer to this or, because it’s client related, are the answers too widely varied to rely upon (which is why the answers on the Internet are all over the map)?

Failures aside, I’m worried about creating a bad user experience EVERY time I need to take a DNS server down for patching. I can’t be the first person to run into this problem. Is it just something people live with (and shuffle NS records around all the time) or is NS failover really smoother than A record failover, and I should concentrate on keeping my A records current in case of failure OR planned maintenance?

Any feedback would be greatly appreciated.

Thanks,

Scott
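Scott's health-check plan (drop a web server's A record while it is down, restore it when it recovers) pairs naturally with DNS UPDATE, which Tony's reply also alludes to. A hedged sketch that only builds the nsupdate batch text: the zone, names, addresses and TTL below are hypothetical placeholders, and in practice you would pipe the output to `nsupdate -k <keyfile>` signed with a TSIG key rather than editing and reloading the zone file.

```python
def nsupdate_batch(zone, name, healthy, ttl=30):
    """Build an nsupdate script that replaces the round-robin A RRset
    for `name` with one record per currently healthy web server.
    All names and addresses used with this are placeholders."""
    lines = [f"zone {zone}", f"update delete {name} A"]
    for addr in healthy:
        # Short TTL, matching the health-check interval, bounds the
        # window during which clients can still see a dead server.
        lines.append(f"update add {name} {ttl} A {addr}")
    lines.append("send")
    return "\n".join(lines) + "\n"
```

A health-check loop would call this with the servers that passed the last probe, so the delete-then-add pair atomically swaps the RRset in one signed update.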