Yes, you are right. I had misunderstood what you meant; sorry for the 
confusion.
I was talking about the problem caused by DNS caching: round-robin does not 
happen for a cached result set of 14 IPs.
A cached DNS result will always return the set of 14 IPs in exactly the same 
order until the TTL expires, and I believe the first IP in the list is the 
unluckiest one. A TTL of 2700 sec makes this even worse.
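
To illustrate what I mean, here is a rough Python sketch, not a model of any 
real resolver. It assumes that the cache replays the stored answer unchanged 
(some resolvers may rotate cached answers themselves) and that every client 
simply uses the first address it gets. The class and function names and the 
192.0.2.x addresses are made up for the example.

```python
import itertools
from collections import Counter

TTL = 2700  # seconds, the TTL mentioned above


class CachingResolver:
    """Toy cache: stores one answer and replays it unchanged until the TTL expires."""

    def __init__(self, authoritative, ttl=TTL):
        self.authoritative = authoritative
        self.ttl = ttl
        self.cached = None
        self.expires = -1

    def resolve(self, now):
        # Ask the authoritative server only when nothing is cached or the TTL ran out.
        if self.cached is None or now >= self.expires:
            self.cached = self.authoritative()
            self.expires = now + self.ttl
        return self.cached


def make_authoritative(pool, size=14):
    """Hypothetical authoritative server: rotates the same set on every query."""
    counter = itertools.count()

    def answer():
        shift = next(counter) % size
        subset = pool[:size]
        return subset[shift:] + subset[:shift]

    return answer


def simulate(clients_per_second=1, duration=TTL):
    """Count which server each client ends up using (assumed: the first address)."""
    pool = [f"192.0.2.{i}" for i in range(1, 15)]
    resolver = CachingResolver(make_authoritative(pool))
    hits = Counter()
    for now in range(duration):
        for _ in range(clients_per_second):
            hits[resolver.resolve(now)[0]] += 1
    return hits


if __name__ == "__main__":
    print(simulate())
```

Under these assumptions every client behind that cache lands on the same 
server for a whole TTL period, and the next server only takes over once the 
cache refreshes.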

So why not modify the BIND source code to read a big list of IPs directly and 
return a different group of 14 or so IPs for each query, and also start using 
a lower TTL?
Or maybe 'pool.ntp.org' could be set up as a round-robin list of CNAMEs with 
a very low TTL instead of A records, pointing to '0.pool.ntp.org', 
'1.pool.ntp.org' and so on, each of which would hold a list of 14 IPs with a 
higher TTL. Would this not save bandwidth in DNS traffic, since the 
'N.pool.ntp.org' lists of 14 IPs would remain cached thanks to the high TTL 
and the authoritative DNS servers would mostly answer queries with the 
CNAMEs instead of 14 IPs?
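
A minimal sketch of the first idea, in Python rather than as a BIND patch: 
the server keeps the full list and draws a fresh group of 14 for every query. 
The function name, the group size parameter and the 10.0.x.x addresses are my 
own assumptions for the example, and this says nothing about how BIND handles 
record sets internally.

```python
import random


def pick_answer(pool, size=14, rng=random):
    """Return a different random group of `size` addresses for each query."""
    if len(pool) <= size:
        # Pool is small enough to return whole; just shuffle the order.
        answer = list(pool)
        rng.shuffle(answer)
        return answer
    # Draw `size` distinct addresses from the full pool.
    return rng.sample(pool, size)


if __name__ == "__main__":
    # 500 made-up addresses standing in for the full server list.
    pool = [f"10.0.{i // 250}.{i % 250 + 1}" for i in range(500)]
    seen = set()
    for _ in range(1000):
        seen.update(pick_answer(pool))
    print(f"{len(seen)} of {len(pool)} servers were handed out in 1000 queries")
```

Caching resolvers would still hold each group for the length of the TTL, so 
this only spreads the load across different caches unless the TTL is lowered 
as well.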

I also agree with you that these spikes from TT customers are not the fault 
of TT alone, as I said in another thread.
In my opinion, however, it is evident that those responsible at TT are 
configuring their customers to use NTP servers from the pool instead of TT's 
own NTP servers. Using their own servers would, I believe, be much more 
ethical (and technically more appropriate, I think) for such a large ISP, so 
I believe that part of this specific problem still falls on TT.

Rui


----- Original Message ----- 
From: "Rob Janssen" <[EMAIL PROTECTED]>
To: "Rui Ferreira" <[EMAIL PROTECTED]>
Cc: <[email protected]>
Sent: Tuesday, August 07, 2007 9:18 PM
Subject: Re: [time] What is happening here?


Rui Ferreira wrote:
>
> I believe you are somewhat wrong, as the dns servers actually make 
> round-robin on a per request basis.
> If you try "dig pool.ntp.org. @a.ntpns.org" several repeated times you will 
> see the round-robin working on a per request basis, that is, you will see the 
> returned ip's rotating on each request.
> The problem that you are talking about is related to the TTL of the results, 
> 2700 sec at this moment, that is, the result will remain in dns cache for 
> 2700 sec.
>   
No.  The problem is that the DNS returns 14 addresses from the pool for 
each domain name within the pool (e.g. pool.ntp.org, 
europe.pool.ntp.org, nl.pool.ntp.org) even when that part of the pool 
has many more than 14 servers.  The set of 14 servers remains the same 
for one hour, only the sequence within this set of 14 rotates.
So, when there are 500 servers in the pool and a large group of users 
tries to get time using simple NTP (a single request to retrieve the 
current time), all the requests from that large group of users go to 
only 14 out of the 500 servers.
The servers in that group of 14 see a "spike", and the remaining 486 
servers have nothing to complain about.

An hour later, 14 different servers see a "spike".
That is why I claim this spike is not caused by Türk Telecom but by our 
DNS system.  If the DNS really rotated over all 500 servers, the 
load would be distributed over 500 instead of 14 servers and the spike 
would be about 35 times lower.

Of course there is the problem that DNS typically uses caching servers 
and so you cannot rotate as fast as you would like.

Rob
_______________________________________________
timekeepers mailing list
[email protected]
https://fortytwo.ch/mailman/cgi-bin/listinfo/timekeepers
