[OT] RE: DNS Request distribution and TCP NAT distribution For Tomcat Cluster
[Marked off-topic as this now has nothing to do with Tomcat.]

> From: Steve Kirk [mailto:[EMAIL PROTECTED]]
> Can I ask how sure you felt of what you say here please:

Uhhh... how about 'the little pixies told me, and I believe everything they say'? :-)

It's from a combination of knowing two folks who used to run an ISP's DNS services, which is where the details of the timeouts come from, plus a little bit of digging into the format of SOA and A records. In other words: it's my opinion, do not take it as canon! I'll try to explain my reasoning below.

> I tried to research it but could not get to the bottom of it with any real info from ISPs (the problem is that they seem to do their own thing to various extents).

That is *exactly* the problem. In essence, one cannot rely on some aspects of the DNS specification in the real world, as real-world ISPs hack their software to improve performance for their environment in ways that break the spec. An example: I took the cache expiry times on a zone down to 5 minutes several days before I knew I needed to move a service to a new IP address, then changed the DNS. Sure enough, some ISPs were still handing out the old address 20 hours later because they weren't respecting the stated expiry times and were substituting their own, and the old service was still getting hits.

> I have set up roundrobin DNS for an ecommerce site in the past without any complaints from users, and the balance of load between a pair of clustered servers seemed pretty even.

Good to know that it can work in the real world. I can make all the theoretical points I want, but the hard data in your statement is probably worth more than the rest of this email.

> I would expect any DNS server run by an ISP (such as AOL) to receive the zone records from SOA intact, i.e. these major DNS servers should know about all RR IPs for a given DNS name, and would therefore be able to RR distribute them to lower-tier DNS servers.

Your expectation is incorrect, I think - even the large DNS servers make standard requests for A records for the given FQDN, and cache the result. If the result contains a set of IP addresses in a particular order, then that's what is obtained. To my knowledge (my reasoning falls down if this is not the case, so this is the bit to check!) neither the returned A records themselves nor the returned SOA record contain any indication that they should be handed out in a round-robin fashion; and the SOA record would not typically be requested by another server.

> I would have thought that the level at which DNS servers do not pick up the fact that there is a RR DNS entry is where they do not do a zone transfer from a primary DNS server - they simply act as a client and cache what they get as a response, so they are unaware that there is even more than one IP.

Even high-level DNS servers don't do zone transfers unless they're secondaries for the zone. And, even then, the information about whether or not to use round-robin is an option set for the zone, not something that appears in the SOA record for immediate use by the secondary. Also, remember that many zones are configured to refuse zone transfer requests from addresses that are not configured as secondaries.

> So overall I guess I'm saying I'd be surprised if AOL's DNS servers only cached one entry of a RR set for a DNS name. What are your thoughts?

I've revised my position slightly. I think they'll cache the list in a particular order, rather than a single entry; but the ordering of that list will be fixed, as they won't know to serve it in round-robin fashion. If you can confirm or challenge that position, I'd be interested!

- Peter

- To unsubscribe, e-mail: [EMAIL PROTECTED] For additional commands, e-mail: [EMAIL PROTECTED]
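For anyone who wants to poke at the two behaviours described in this message (a cache that substitutes its own expiry time, and a cached list whose order never changes), here is a rough Python sketch. It is a toy model only, not how any particular ISP's software works: the class name, the `min_ttl` floor, the 24-hour value and the example addresses are all assumptions made up for illustration.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple


@dataclass
class CachedAnswer:
    addresses: List[str]  # kept in the order the upstream returned them
    expires_at: float     # time (seconds) after which we re-query


class CachingResolver:
    """Toy caching resolver; hypothetical, not a model of real DNS software."""

    def __init__(self, upstream: Callable[[str], Tuple[List[str], int]],
                 min_ttl: int = 0) -> None:
        self.upstream = upstream  # returns (addresses, ttl_seconds)
        self.min_ttl = min_ttl    # assumed local override of short TTLs
        self.cache: Dict[str, CachedAnswer] = {}

    def resolve(self, name: str, now: float) -> List[str]:
        entry = self.cache.get(name)
        if entry is None or now >= entry.expires_at:
            addresses, ttl = self.upstream(name)
            # A spec-respecting cache would use ttl as given; this one
            # substitutes its own floor when the stated TTL is shorter.
            effective_ttl = max(ttl, self.min_ttl)
            entry = CachedAnswer(list(addresses), now + effective_ttl)
            self.cache[name] = entry
        # Every cache hit returns the same list in the same order.
        return list(entry.addresses)


# Hypothetical zone data with a 5-minute TTL.
zone = {"www.example.com": (["192.0.2.10", "192.0.2.11"], 300)}


def authoritative(name: str) -> Tuple[List[str], int]:
    return zone[name]


isp = CachingResolver(authoritative, min_ttl=24 * 3600)
first = isp.resolve("www.example.com", now=0)

# The service moves to a new address shortly afterwards.
zone["www.example.com"] = (["198.51.100.20"], 300)
after_20h = isp.resolve("www.example.com", now=20 * 3600)
after_25h = isp.resolve("www.example.com", now=25 * 3600)

print(first, after_20h, after_25h)
```

With `min_ttl=0` the same class behaves like a cache that honours the stated expiry, which makes the difference easy to compare.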
RE: [OT] RE: DNS Request distribution and TCP NAT distribution For Tomcat Cluster
Thanks Peter, interesting. Your experience of it sounds similar to other experiences I've had when changing from one ISP to another (there seems to be a cutover time of up to 3 days where some 3rd party ISPs clearly still cached and served the old IP for our domain name). It was because of this that I investigated more at the time, but as you say, it's each ISP to their own practice.

>> I would expect any DNS server run by an ISP (such as AOL) to receive the zone records from SOA intact, i.e. these major DNS servers should know about all RR IPs for a given DNS name, and would therefore be able to RR distribute them to lower-tier DNS servers.

> Your expectation is incorrect, I think - even the large DNS servers make standard requests for A records for the given FQDN, and cache the result.

Yes, you're probably right there, now I think about it. I think these are referred to as caching servers as opposed to secondaries. It's the secondaries that receive the zone transfers. Having said that, I'd have thought that a large ISP such as AOL would have secondaries (inaccessible to Joe Public), but would also have caching servers, which are the ones they make public. Since they typically have several caching DNS servers, in theory there is a good chance that each of them will get a different one of the RR IPs from their secondary server, so in theory the RR goal is often achieved?

For example, I just used DOS nslookup to query my ISP's 2 main DNS servers for www.microsoft.com - they each returned a different address, although repeatedly querying each one returns the same answer every time. If I go through a local caching DNS on my LAN, that returns a third address for MS - again, the same one every time.

> If the result contains a set of IP addresses in a particular order, then that's what is obtained. To my knowledge (my reasoning falls down if this is not the case, so this is the bit to check!) neither the returned A records themselves nor the returned SOA record contain any indication that they should be handed out in a round-robin fashion; and the SOA record would not typically be requested by another server.

AFAIK that is correct: the DNS protocol does not say anything about how DNS servers should respond to clients when there are multiple IPs registered in DNS for a host. Likewise, if the DNS server only returns one IP all the time, the client protocol provides no way for the client to say "give me the next one", "give me number 3" or "give me them all".

So some caching DNS servers will always return the first one in the list; others will order the IPs according to their own rule (which meets the spec) but then always serve the first one in that order; and others will cycle through them in turn (which is RR). Basically, it's an internal feature of the DNS server to decide how it treats hostnames for which it has more than one IP. Of these 3 basic approaches, the first gives no RR, the second is slightly better, and the 3rd is the best. Of course they are all only rudimentary load balancing methods, and of course even the 3rd falls down if ISPs with millions of users happen to cache a single IP for a site, as you say.

Someone please correct me if any of this is wrong, as I'd like to understand this area better :)

PS this has rekindled my interest, so I just googled to refresh my mind on the basics. This seems a useful page that explains what we are talking about above:
http://www.onjava.com/pub/a/onjava/2001/09/26/load.html
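The three approaches listed in this message can be put side by side in a short Python sketch. This is illustrative only: the function names, the "local rule" used for the second approach (a reverse sort) and the example addresses are assumptions, and the client is assumed to be a naive one that always takes the first address in the answer.

```python
import itertools
from typing import Callable, Iterator, List

ADDRESSES = ["192.0.2.1", "192.0.2.2", "192.0.2.3"]  # hypothetical A records


def fixed_order(addresses: List[str]) -> Iterator[List[str]]:
    """Approach 1: always answer in the order the records were cached."""
    while True:
        yield list(addresses)


def sorted_order(addresses: List[str]) -> Iterator[List[str]]:
    """Approach 2: apply a local ordering rule once, then never vary.

    The rule here (reverse string sort) is arbitrary and only for illustration.
    """
    ordered = sorted(addresses, reverse=True)
    while True:
        yield list(ordered)


def round_robin(addresses: List[str]) -> Iterator[List[str]]:
    """Approach 3: rotate the list by one position per query."""
    for offset in itertools.count():
        k = offset % len(addresses)
        yield addresses[k:] + addresses[:k]


def first_choices(strategy: Callable[[List[str]], Iterator[List[str]]],
                  queries: int) -> List[str]:
    """Addresses a naive client (first record only) would end up using."""
    answers = strategy(ADDRESSES)
    return [next(answers)[0] for _ in range(queries)]


for strategy in (fixed_order, sorted_order, round_robin):
    print(strategy.__name__, first_choices(strategy, 4))
```

Only the third strategy spreads successive clients over all the addresses; the first two send every client of that server to a single address until the cache entry is refreshed.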
RE: [OT] RE: DNS Request distribution and TCP NAT distribution For Tomcat Cluster
> From: Steve Kirk [mailto:[EMAIL PROTECTED]]
> Thanks Peter, interesting.

Internet issues in the large tend to be - you get emergent behaviour that is often unexpected :-).

> I think these are referred to as caching servers as opposed to secondary. It's the secondaries that receive the zone transfers.

Yes. Note that these roles are per-zone; a given DNS server may act as a primary or secondary for some zones, and as a caching server for others.

> Having said that, I'd have thought that a large ISP such as AOL would have secondaries (inaccessible to Joe Public), but would also have caching servers, which are the ones they make public.

It would be difficult to persuade those secondaries to be effective - for what zones are they secondaries? Let's say AOL want to act as a secondary for foo.com. How do AOL contact the owners of foo.com in order to request that their secondary server is added to the list of allowed IPs for zone transfers? Other than that, AOL could then make use of those servers as forwarders from their caching servers, I accept.

> Since they typically have several caching DNS servers, in theory there is a good chance that each of them will get a different one of the RR IPs from their secondary server, so in theory the RR goal is often achieved?

Assuming they are independent and not configured to use the same forwarders, yes. You might be surprised how few DNS servers an organisation needs, though - Demon (my home ISP, and not a small one) has two, and could probably get away with one except for redundancy. I've not seen an ISP setup document yet that says to use primary and secondary DNS of ns47.isp.net and ns32.isp.net - they're almost all ns0 and ns1 or ns1 and ns2, indicating that there are probably very few in the organisation.

> For example, I just used DOS nslookup to query my ISP's 2 main DNS servers for www.microsoft.com - they each returned a different address, although repeatedly querying each one returns the same answer every time. If I go through a local caching DNS on my LAN, that returns a third address for MS - again, the same one every time.

Yup. So anyone using your ISP's DNS servers will get one of two IPs for www.microsoft.com at present, out of the however many they have. Lumpy load balancing in action :-). You likely haven't set up your own caching DNS to forward requests to your ISP's DNS servers; otherwise you'd have had one of the same answers.

> Basically, it's an internal feature of the DNS server to decide how it treats hostnames for which it has more than one IP.

Indeed.

> PS this has rekindled my interest, so I just googled to refresh my mind on the basics. This seems a useful page that explains what we are talking about above:
> http://www.onjava.com/pub/a/onjava/2001/09/26/load.html

Yes, that seems like a reasonable summary, although it doesn't really go into the caching effects we're discussing here.

- Peter
RE: [OT] RE: DNS Request distribution and TCP NAT distribution For Tomcat Cluster
> Yup. So anyone using your ISP's DNS servers will get one of two IPs for www.microsoft.com at present, out of the however many they have. Lumpy load balancing in action :-).

Yes, true, I hadn't thought of it like that. Where a site has more IPs for a host than an ISP has DNS servers, this is going to lead to lumpiness. I guess this is one of the key reasons why RR DNS is only ever a poor man's load balancer. OK-ish if you have 2 IPs, gets worse if you have more.

> You likely haven't set up your own caching DNS to forward requests to your ISP's DNS servers; otherwise you'd have had one of the same answers.

Funnily enough I have, and I use Demon too. I think my local DNS has maybe kept an MS entry cached and its refresh TTL is out of sync with the Demon DNS caches. But what you say is right - if I restart that local DNS, it will then get a fresh MS entry from one of the 2 cached at the Demon servers. In fact I just have, and it did.

Thanks again, that's clarified a few things I was a bit fuzzy on. Sorry John for the slight off-topic diversion, but I hope this diversion on RR DNS might have been of interest to you too.
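To put rough numbers on the lumpiness point (a site with more IPs than an ISP has DNS servers), here is a small Python sketch. Everything in it is made up for illustration: the ISP sizes, the addresses, and the assumptions that each caching server always hands out one fixed address and that an ISP's users are split evenly across its caching servers.

```python
from typing import Dict, List, Tuple

# Hypothetical site with four round-robin addresses.
SITE_IPS = ["192.0.2.1", "192.0.2.2", "192.0.2.3", "192.0.2.4"]

# Hypothetical ISPs: (number of users, address each caching server hands out).
ISPS: List[Tuple[int, List[str]]] = [
    (1_000_000, ["192.0.2.1", "192.0.2.2"]),  # large ISP, two caching servers
    (10_000, ["192.0.2.1", "192.0.2.3"]),     # small ISP, two caching servers
]


def load_per_address(site_ips: List[str],
                     isps: List[Tuple[int, List[str]]]) -> Dict[str, float]:
    """Estimate how many users land on each site address."""
    load = {ip: 0.0 for ip in site_ips}
    for users, cached in isps:
        # Assumption: users are spread evenly over the ISP's caching servers.
        share = users / len(cached)
        for ip in cached:
            load[ip] += share
    return load


load = load_per_address(SITE_IPS, ISPS)
for ip, users in load.items():
    print(f"{ip}: {users:.0f} users")
```

Under these assumptions two of the four addresses carry nearly all the traffic and one gets none until some cache happens to refresh with a different first address.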
RE: DNS Request distribution and TCP NAT distribution For Tomcat Cluster
> From: John MccLain [mailto:[EMAIL PROTECTED]]
> 1) for DNS Request Distribution - I don't understand. The browser sends a URL to the DNS, the DNS responds back with an IP address. But what if at that IP address, you have a web server listening on port 80? The browser talks to that Web server. If Tomcat is at that address also, Tomcat would have to listen on another port. Can the DNS distribute back to the browser the IP address AND the Tomcat port so the browser connects to Tomcat on a port other than 80?

Only if your original URL uses the name:port notation - there is nothing in this scheme to prevent that. The DNS answer itself carries only the address; the port comes from the URL.

> Also, is there a way to set up the DNS to Round Robin or check server load on the servers in the Tomcat cluster so it knows which Tomcat server ip:port to send back.

No standard way afaik. Worse, downstream DNS servers may (often do) cache the returned IPs for up to a day despite any cache expiry you put on them. If (say) the AOL DNS servers all get the same IP address in their cache, all your AOL visitors will visit the same IP address. DNS is a very lumpy way of doing load balancing.

> OR does this whole thing imply that you have an IP for each web server (IIS)

IP address yes; IIS depends on whether you want IIS or Tomcat at the business end of the cluster.

> and each web server is tied to each server in the Tomcat cluster via a jk2 redirector?

If you wish to use that architecture, yes.

> 2) TCP NAT distribution - Does this mean that when the browser connects to the IP address, that connection is intercepted and the request is distributed to a server in the Tomcat cluster?

Yes.

> If this is the case, then what does the interception?

Generically, a router that has this capability. It's that router that also does the NATing. Many mid- to high-end hardware routers and some software routing packages can do this.

> and how do you configure that thing to use a specific algorithm (server load, Round Robin, etc.) to choose which server to forward the request to?

That is router-specific. There is no standard (afaik) for the servers to return load information, so you're stuck with proprietary solutions *or* the router doesn't load-balance.

> can it forward to an IP:PORT or does it have to forward to an IP

That is router-specific. Given that the capability typically exists on mid- to high-end routers, most will also have the capability to change the internal port that is in use.

- Peter
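As a rough picture of what such a NAT-ing router does internally, here is a Python sketch of a round-robin mapping from incoming client connections to backend IP:PORT pairs. It is a conceptual model only, not the configuration or behaviour of any real router: the class name, the addresses and the choice of plain round-robin are assumptions for illustration.

```python
import itertools
from typing import Dict, List, Tuple

Endpoint = Tuple[str, int]  # (IP address, port)


class NatBalancer:
    """Toy model of a router doing NAT distribution (hypothetical)."""

    def __init__(self, public: Endpoint, backends: List[Endpoint]) -> None:
        self.public = public                           # address browsers connect to
        self._next_backend = itertools.cycle(backends)  # round-robin order
        self.connections: Dict[Endpoint, Endpoint] = {}

    def route(self, client: Endpoint) -> Endpoint:
        """Return the backend for a client connection.

        A backend is picked round-robin the first time a connection is
        seen; later packets on the same connection reuse that choice.
        """
        if client not in self.connections:
            self.connections[client] = next(self._next_backend)
        return self.connections[client]

    def close(self, client: Endpoint) -> None:
        """Forget a finished connection."""
        self.connections.pop(client, None)


# Public port 80 is rewritten to port 8080 on two hypothetical Tomcat hosts.
router = NatBalancer(("203.0.113.5", 80),
                     [("10.0.0.1", 8080), ("10.0.0.2", 8080)])

a = router.route(("198.51.100.7", 40001))
b = router.route(("198.51.100.8", 40002))
a_again = router.route(("198.51.100.7", 40001))
c = router.route(("198.51.100.9", 40003))
print(a, b, a_again, c)
```

The browser only ever sees the public address and port; the port change to 8080 happens inside the mapping, which is the "change the internal port" capability mentioned above. A load-aware algorithm would replace the `itertools.cycle` choice, and that part is where the router-specific, proprietary mechanisms come in.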
RE: DNS Request distribution and TCP NAT distribution For Tomcat Cluster
Peter, I agree that DNS is a very lumpy way of doing load balancing. But your comments interested me. Can I ask how sure you felt of what you say here please:

> No standard way afaik. Worse, downstream DNS servers may (often do) cache the returned IPs for up to a day despite any cache expiry you put on them. If (say) the AOL DNS servers all get the same IP address in their cache, all your AOL visitors will visit the same IP address.

I'm not for a minute suggesting that it is wrong :) and wouldn't dream of doing so, because I don't know all the facts myself. I tried to research it but could not get to the bottom of it with any real info from ISPs (the problem is that they seem to do their own thing to various extents). I'm just interested in comparing experiences/opinions.

I have set up roundrobin DNS for an ecommerce site in the past without any complaints from users, and the balance of load between a pair of clustered servers seemed pretty even.

I would expect any DNS server run by an ISP (such as AOL) to receive the zone records from SOA intact, i.e. these major DNS servers should know about all RR IPs for a given DNS name, and would therefore be able to RR distribute them to lower-tier DNS servers.

I would have thought that the level at which DNS servers do not pick up the fact that there is a RR DNS entry is where they do not do a zone transfer from a primary DNS server - they simply act as a client and cache what they get as a response, so they are unaware that there is even more than one IP. I'm speculating that these minor DNS servers belong to small ISPs, or private companies running their own DNS in-house?

So overall I guess I'm saying I'd be surprised if AOL's DNS servers only cached one entry of a RR set for a DNS name. What are your thoughts?
-----Original Message-----
From: Peter Crowther [mailto:[EMAIL PROTECTED]]
Sent: Wednesday 25 May 2005 17:15
To: Tomcat Users List; [EMAIL PROTECTED]
Subject: RE: DNS Request distribution and TCP NAT distribution For Tomcat Cluster