[OT] RE: DNS Request distribution and TCP NAT distribution For Tomcat Cluster

2005-05-26 Thread Peter Crowther
[Marked off-topic as this now has nothing to do with Tomcat.]

 From: Steve Kirk [mailto:[EMAIL PROTECTED] 
 Can I ask how sure you felt of what you
 say here please:

Uhhh... how about 'the little pixies told me, and I believe everything
they say'? :-)  It's from a combination of knowing two folks who used to
run an ISP's DNS services to get details of the timeouts, plus a little
bit of digging into the format of SOA and A records.  In other words:
it's my opinion, do not take it as canon!  I'll try to explain my
reasoning below.

 I tried to research it but could not get to the bottom of it with any
 real info from ISPs (the problem is that they seem to do their own
 thing to various extents).

That is *exactly* the problem.  In essence, one cannot rely on some
aspects of the DNS specification in the real world, as real-world ISPs
hack their software to improve performance for their environment in
ways that break the spec.  An example: several days before I knew I
needed to move a service to a new IP address, I took the cache expiry
time (TTL) on the zone down to 5 minutes, then changed the DNS.  Sure
enough, some ISPs were still handing out the old address 20 hours later
because they weren't respecting the stated expiry times and were
substituting their own, and the old service was still getting hits.
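
To make the effect concrete, here is a minimal Python sketch of a cache
that substitutes its own minimum expiry time.  The class names, the
one-day floor and the example name and address are assumptions for
illustration only, not a description of any real ISP's resolver.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class CacheEntry:
    address: str
    expires_at: float  # seconds, on the same clock as `now`


class ClampingCache:
    """Toy resolver cache that enforces its own minimum TTL (hypothetical)."""

    def __init__(self, min_ttl: float) -> None:
        self.min_ttl = min_ttl
        self.entries: Dict[str, CacheEntry] = {}

    def store(self, name: str, address: str, ttl: float, now: float) -> None:
        # A spec-following cache would use `ttl` as published; this one
        # substitutes its own floor whenever the published TTL is lower.
        effective_ttl = max(ttl, self.min_ttl)
        self.entries[name] = CacheEntry(address, now + effective_ttl)

    def lookup(self, name: str, now: float) -> Optional[str]:
        entry = self.entries.get(name)
        if entry is None or now >= entry.expires_at:
            return None  # expired: a real resolver would re-query upstream
        return entry.address


HOUR = 3600.0
honest = ClampingCache(min_ttl=0.0)
stubborn = ClampingCache(min_ttl=24 * HOUR)  # assumed one-day floor

# Both caches pick up the old address with a published TTL of 5 minutes.
for cache in (honest, stubborn):
    cache.store("www.example.com", "192.0.2.10", ttl=300.0, now=0.0)

# Twenty hours later, only the stubborn cache still answers from memory.
print(honest.lookup("www.example.com", now=20 * HOUR))    # None
print(stubborn.lookup("www.example.com", now=20 * HOUR))  # 192.0.2.10
```

The honest cache forgets the record after 5 minutes, so it would pick up
the new address on the next query; the stubborn one keeps sending users
to the old service.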

 I have set up round-robin DNS for an ecommerce site in the past without
 any complaints from users, and the balance of load between a pair of
 clustered servers seemed pretty even.

Good to know that it can work in the real world.  I can make all the
theoretical points I want, but the hard data in your statement is
probably worth more than the rest of this email.

 I would expect any DNS server run by an ISP (such as AOL) to receive
 the zone records from SOA intact, i.e. these major DNS servers should
 know about all RR IPs for a given DNS name, and would therefore be
 able to RR distribute them to lower-tier DNS servers.

Your expectation is incorrect, I think - even the large DNS servers make
standard requests for A records for the given FQDN, and cache the
result.  If the result contains a set of IP addresses in a particular
order, then that's what is obtained.  To my knowledge (my reasoning
falls down if this is not the case, so this is the bit to check!)
neither the returned A records themselves nor the returned SOA record
contain any indication that they should be handed out in a round-robin
fashion; and the SOA record would not typically be requested by another
server.

 I would have thought that the level at which DNS servers do not pick
 up the fact that there is a RR DNS entry is where they do not do a
 zone transfer from a primary DNS server - they simply act as a client
 and cache what they get as a response, so they are unaware that there
 is even more than one IP.

Even high-level DNS servers don't do zone transfers unless they're
secondaries for the zone.  And, even then, the information about whether
or not to use round-robin is an option set for the zone, not something
that appears in the SOA record for immediate use by the secondary.
Also, remember that many zones are configured to refuse zone transfer
requests from addresses that are not configured as secondaries.

 So overall I guess I'm saying I'd be surprised if AOL's DNS servers
 only cached one entry of a RR set for a DNS name.  What are your
 thoughts?

I've revised my position slightly.  I think they'll cache the list in a
particular order, rather than a single entry; but the ordering of that
list will be fixed as they won't know to serve it in round-robin
fashion.  If you can confirm or challenge that position, I'd be
interested!
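
A minimal Python sketch of that position, assuming an authoritative
server that rotates its A records and a downstream cache that replays
whatever list it first received.  Both classes and the addresses are
hypothetical, and expiry is ignored to keep the sketch short.

```python
from collections import Counter, deque
from typing import Deque, List


class RotatingAuthority:
    """Authoritative server that rotates its A records on every query."""

    def __init__(self, addresses: List[str]) -> None:
        self._records: Deque[str] = deque(addresses)

    def query(self) -> List[str]:
        answer = list(self._records)
        self._records.rotate(-1)  # the next query starts one address later
        return answer


class FixedOrderCache:
    """Cache that stores the answer once and replays it unchanged."""

    def __init__(self, upstream: RotatingAuthority) -> None:
        self._upstream = upstream
        self._cached: List[str] = []

    def query(self) -> List[str]:
        if not self._cached:  # expiry is deliberately not modelled
            self._cached = self._upstream.query()
        return list(self._cached)


authority = RotatingAuthority(["192.0.2.1", "192.0.2.2", "192.0.2.3"])
cache = FixedOrderCache(authority)

# Six clients ask the cache; each connects to the first address listed.
first_choices = Counter(cache.query()[0] for _ in range(6))
print(first_choices)  # Counter({'192.0.2.1': 6})
```

The whole list is cached, but because its order never changes, every
client behind this cache ends up on the same address.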

- Peter

-
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]



RE: [OT] RE: DNS Request distribution and TCP NAT distribution For Tomcat Cluster

2005-05-26 Thread Steve Kirk

Thanks Peter, interesting.  Your experience of it sounds similar to other
experiences I've had when changing from one ISP to another (there seems to
be a cutover time of up to 3 days where some 3rd party ISPs clearly still
cached and served the old IP for our domain name).  It was because of this
that I investigated more at the time, but as you say, it's each ISP to their
own practice.

  I would expect any DNS server run by an ISP (such as AOL) to receive
  the zone records from SOA intact, i.e. these major DNS servers should
  know about all RR IPs for a given DNS name, and would therefore be
  able to RR distribute them to lower-tier DNS servers.
 
 Your expectation is incorrect, I think - even the large DNS servers
 make standard requests for A records for the given FQDN, and cache the
 result.

Yes you're probably right there now I think about it.  I think these are
referred to as caching servers as opposed to secondary.  It's the
secondaries that receive the zone transfers.

Having said that, I'd have thought that a large ISP such as AOL would have
secondaries (inaccessible by Joe Public), but would also have caching
servers, which are the ones they make public.  Since they typically have
several caching DNS servers, in theory there is a good chance that each of
them will get a different one of the RR IPs from their secondary server, so
in theory the RR goal is often achieved?  For example, I just used DOS
nslookup to query my ISP's 2 main DNS servers for www.microsoft.com - they
each returned a different address, although repeatedly querying each one
returns the same answer every time.  If I go through a local caching DNS on
my LAN, that returns a third address for MS - again, the same one every
time.

 If the result contains a set of IP addresses in a particular
 order, then that's what is obtained.  To my knowledge (my reasoning
 falls down if this is not the case, so this is the bit to check!)
 neither the returned A records themselves nor the returned SOA record
 contain any indication that they should be handed out in a round-robin
 fashion; and the SOA record would not typically be requested by
 another server.

AFAIK that is correct: the DNS protocol does not say anything about how DNS
servers should respond to clients when there are multiple IPs registered in
DNS for a host.  Likewise, if the DNS server only returns one IP all the
time, the client protocol provides no way for the client to say "give me the
next one" or "give me number 3" or "give me them all".  So some caching DNS
servers will always return the first one in the list; others will order IPs
according to their own rule (which meets the spec) but then always serve the
first one in that order; and others will cycle through them in turn (which
is RR).  Basically, it's an internal feature of the DNS server to decide how
it treats hostnames for which it has more than one IP.

Of these 3 basic approaches, the first gives no RR, the second is slightly
better, the 3rd is the best.  Of course they are all only rudimentary load
balancing methods, and of course even the 3rd falls down if ISPs with
millions of users happen to cache a single IP for a site, as you say. 
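
The three approaches can be sketched in Python roughly as follows.  The
function names, the choice of a plain string sort as the "own rule" and
the addresses are assumptions for illustration; real DNS servers
implement this internally and differently.

```python
from itertools import count
from typing import Callable, List

Strategy = Callable[[List[str]], str]


def first_as_received(addresses: List[str]) -> str:
    """Approach 1: always hand out the first address as received."""
    return addresses[0]


def first_after_sorting(addresses: List[str]) -> str:
    """Approach 2: reorder by a local rule (here, a plain string sort)."""
    return sorted(addresses)[0]


def make_round_robin() -> Strategy:
    """Approach 3: cycle through the addresses on successive queries."""
    counter = count()

    def pick(addresses: List[str]) -> str:
        return addresses[next(counter) % len(addresses)]

    return pick


records = ["192.0.2.30", "192.0.2.10", "192.0.2.20"]
round_robin = make_round_robin()

for name, strategy in [
    ("as received", first_as_received),
    ("sorted", first_after_sorting),
    ("round robin", round_robin),
]:
    # Four successive queries against the same cached record set.
    print(name, [strategy(records) for _ in range(4)])
```

Only the third strategy ever hands out more than one of the addresses;
the first two differ only in which single address they pin.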

Someone please correct me if any of this is wrong, as I'd like to understand
this area better :)

PS this has rekindled my interest so I just googled to refresh my mind on
the basics, this seems a useful page that explains what we are talking about
above.
http://www.onjava.com/pub/a/onjava/2001/09/26/load.html 






RE: [OT] RE: DNS Request distribution and TCP NAT distribution For Tomcat Cluster

2005-05-26 Thread Peter Crowther
 From: Steve Kirk [mailto:[EMAIL PROTECTED] 
 Thanks Peter, interesting.

Internet issues in the large tend to be - you get emergent behaviour
that is often unexpected :-).

 I think these are
 referred to as caching servers as opposed to secondary.  It's the
 secondaries that receive the zone transfers.

Yes.  Note that these roles are per-zone; a given DNS server may act as
a primary or secondary for some zones, and as a caching server for
others.

 Having said that, I'd have thought that a large ISP such as AOL would
 have secondaries (inaccessible by Joe Public), but would also have
 caching servers, which are the ones they make public.

It would be difficult to persuade those secondaries to be effective -
for what zones are they secondaries?  Let's say AOL want to act as a
secondary for foo.com.  How do AOL contact the owners of foo.com in
order to request that their secondary server is added to the list of
allowed IPs for zone transfers?  Other than that, AOL could then make
use of those servers as forwarders from their caching servers, I accept.

 Since they typically have several caching DNS servers, in theory
 there is a good chance that each of them will get a different one of
 the RR IPs from their secondary server, so in theory the RR goal is
 often achieved?

Assuming they are independent and not configured to use the same
forwarders, yes.  You might be surprised how few DNS servers an
organisation needs, though - Demon (my home ISP, and not a small one)
has two, and could probably get away with one except for redundancy.
I've not seen an ISP setup document yet that says to use primary and
secondary DNS of ns47.isp.net and ns32.isp.net - they're almost all ns0
and ns1 or ns1 and ns2, indicating that there are probably very few in
the organisation.

 For example, I just used DOS nslookup to query my ISP's 2 main DNS
 servers for www.microsoft.com - they each returned a different
 address, although repeatedly querying each one returns the same
 answer every time.  If I go through a local caching DNS on my LAN,
 that returns a third address for MS - again, the same one every time.

Yup.  So anyone using your ISP's DNS servers will get one of two IPs for
www.microsoft.com at present, out of the however many they have.  Lumpy
load balancing in action :-).

You likely haven't set up your own caching DNS to forward requests to
your ISP's DNS servers; otherwise you'd have had one of the same
answers.

 Basically, it's an internal feature of the DNS server to decide how
 it treats hostnames for which it has more than one IP.

Indeed.

 PS this has rekindled my interest so I just googled to refresh my
 mind on the basics, this seems a useful page that explains what we
 are talking about above.
 http://www.onjava.com/pub/a/onjava/2001/09/26/load.html

Yes, that seems like a reasonable summary, although it doesn't really go
into the caching effects we're discussing here.

- Peter




RE: [OT] RE: DNS Request distribution and TCP NAT distribution For Tomcat Cluster

2005-05-26 Thread Steve Kirk

 Yup.  So anyone using your ISP's DNS servers will get one of two IPs
 for www.microsoft.com at present, out of the however many they have.
 Lumpy load balancing in action :-).

Yes true, hadn't thought of it like that.  Where a site has more IPs for a
host than an ISP has DNS servers, this is going to lead to lumpiness.

I guess this is one of the key reasons why RR DNS is only ever a poor man's
load balancer.  OK-ish if you have 2 IPs, gets worse if you have more.
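
A rough Python sketch of that lumpiness, under some loud assumptions:
each ISP resolver has pinned exactly one cached address, resolver i
happened to get the i-th site address, and 1000 users are spread evenly
across the resolvers.  The function name and all figures are made up
for illustration.

```python
from collections import Counter
from typing import Dict, List


def users_per_site_ip(site_ips: List[str], resolver_count: int) -> Dict[str, int]:
    """Count users per site IP when each resolver pins one cached answer."""
    users_per_resolver = 1000 // resolver_count  # assumed even split
    load: Counter = Counter()
    for i in range(resolver_count):
        # Assumption: resolver i cached the i-th address (wrapping round).
        load[site_ips[i % len(site_ips)]] += users_per_resolver
    return dict(load)


site = ["192.0.2.1", "192.0.2.2", "192.0.2.3", "192.0.2.4", "192.0.2.5"]
print(users_per_site_ip(site, resolver_count=2))
# {'192.0.2.1': 500, '192.0.2.2': 500}
```

With five site addresses and two resolvers, three of the five servers
see no traffic at all from this ISP's users.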

 You likely haven't set up your own caching DNS to forward requests to
 your ISP's DNS servers; otherwise you'd have had one of the same
 answers.

Funnily enough I have, and I use Demon too.  I think my local DNS has maybe
kept an MS entry cached and its refresh TTL is out of sync with the Demon
DNS caches.  But what you say is right - if I restart that local DNS, it
will then get a fresh MS entry from one of the 2 cached at the Demon
servers.  In fact I just have, and it did.

Thanks again, that's clarified a few things I was a bit fuzzy on.

Sorry John for the slight off-topic diversion but I hope this diversion on
RR DNS might have been of interest to you too.






RE: DNS Request distribution and TCP NAT distribution For Tomcat Cluster

2005-05-25 Thread Peter Crowther
 From: John MccLain [mailto:[EMAIL PROTECTED] 
 1) for DNS Request Distribution - I don't understand. The browser
 sends a URL to the DNS, the DNS responds back with an IP address. But
 what if at that IP address, you have a web server listening on port 80?

The browser talks to that Web server.

 If Tomcat is at that address also, Tomcat would have to listen on
 another port. Can the DNS distribute back to the browser the IP
 Address AND the Tomcat port so the browser connects to Tomcat on a
 non port 80 port?

Only if your original URL uses the name:port notation - there is nothing
in this scheme to prevent that.
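
As a small Python illustration of where the port comes from: the host
name in the URL is what gets looked up in DNS, and an A record carries
only an address, so any non-default port has to be in the URL itself.
The example URLs and port 8080 are assumptions.

```python
from urllib.parse import urlsplit

for url in ("http://www.example.com/app", "http://www.example.com:8080/app"):
    parts = urlsplit(url)
    # parts.hostname is the name sent to DNS; parts.port is None when
    # the URL gives no port, in which case HTTP falls back to port 80.
    port = parts.port if parts.port is not None else 80
    print(parts.hostname, port)
```

Both URLs resolve the same name; only the second one makes the browser
connect to a port other than 80.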

 Also, is there a way to setup the DNS to Round Robin or check server
 load on the servers in the Tomcat cluster so it knows which Tomcat
 server ip:port to send back.

No standard way afaik.  Worse, downstream DNS servers may (often do)
cache the returned IPs for up to a day despite any cache expiry you put
on them.  If (say) the AOL DNS servers all get the same IP address in
their cache, all your AOL visitors will visit the same IP address.

DNS is a very lumpy way of doing load balancing.

 OR does this whole thing imply that you have an IP for each web
 server (IIS)

IP address yes; IIS depends on whether you want IIS or Tomcat at the
business end of the cluster.

 and
 each web server is tied to each server in the Tomcat cluster via a jk2
 redirector?

If you wish to use that architecture, yes.

 2) TCP NAT distribution - Does this mean that when the browser
 connects to the IP address, that that connection is intercepted and
 the request is distributed to a server in the Tomcat cluster?

Yes.

 If this is the case, then
 what does the interception?

Generically, a router that has this capability.  It's that router that
also does the NATing.  Many mid- to high-end hardware routers and some
software routing packages can do this.

 and how do you configure that thing to use a specific algorithm
 (server load, Round Robin, etc.) to choose which server to forward
 the request to?

That is router-specific.  There is no standard (afaik) for the servers
to return load information, so you're stuck with proprietary solutions
*or* the router doesn't load-balance.

 can it forward to an IP:PORT or does it have to
 forward to an IP

That is router-specific.  Given that the capability typically exists on
mid- to high-end routers, most will also have the capability to change
the internal port that is in use.
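
A toy Python model of the idea, assuming a device that rewrites the
destination of connections to one public address and picks backends in
simple round-robin order.  The class, the addresses and the ports are
hypothetical; real routers are configured through their own
vendor-specific interfaces.

```python
from itertools import cycle
from typing import Iterator, List, Tuple

Backend = Tuple[str, int]  # (IP address, port)


class RoundRobinNat:
    """Toy model of a NAT device spreading connections across backends."""

    def __init__(self, public: Backend, backends: List[Backend]) -> None:
        self.public = public
        self._next_backend: Iterator[Backend] = cycle(backends)

    def translate(self, destination: Backend) -> Backend:
        # Only connections aimed at the public address are rewritten.
        if destination != self.public:
            return destination
        return next(self._next_backend)


nat = RoundRobinNat(
    public=("203.0.113.5", 80),
    backends=[("10.0.0.11", 8080), ("10.0.0.12", 8080)],
)

# Three browser connections to port 80 land on internal port 8080.
for _ in range(3):
    print(nat.translate(("203.0.113.5", 80)))
```

The browser only ever sees the public address and port 80; both the
backend choice and the port change happen inside the device.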

- Peter




RE: DNS Request distribution and TCP NAT distribution For Tomcat Cluster

2005-05-25 Thread Steve Kirk
Peter,

I agree that

 DNS is a very lumpy way of doing load balancing. 

But your comments interested me.  Can I ask how sure you felt of what you
say here please: 

 No standard way afaik.  Worse, downstream DNS servers may (often do)
 cache the returned IPs for up to a day despite any cache expiry you put
 on them.  If (say) the AOL DNS servers all get the same IP address in
 their cache, all your AOL visitors will visit the same IP address.

I'm not for a minute suggesting that it is wrong :) and wouldn't dream of
doing so, because I don't know all the facts myself.  I tried to research it
but could not get to the bottom of it with any real info from ISPs (the
problem is that they seem to do their own thing to various extents).  I'm
just interested in comparing experiences/opinions.

I have set up round-robin DNS for an ecommerce site in the past without any
complaints from users, and the balance of load between a pair of clustered
servers seemed pretty even.  I would expect any DNS server run by an ISP
(such as AOL) to receive the zone records from SOA intact, i.e. these
major DNS servers should know about all RR IPs for a given DNS name, and
would therefore be able to RR distribute them to lower-tier DNS servers.  I
would have thought that the level at which DNS servers do not pick up the
fact that there is a RR DNS entry is where they do not do a zone transfer
from a primary DNS server - they simply act as a client and cache what they
get as a response, so they are unaware that there is even more than one IP.
I'm speculating that these minor DNS servers belong to small ISPs, or
private companies running their own DNS in-house?

So overall I guess I'm saying I'd be surprised if AOL's DNS servers only
cached one entry of a RR set for a DNS name.  What are your thoughts?



