Re: DNS caches that support partitioning ?

2012-08-19 Thread Gary Buhrmaster
Re: LRU badness

One approach is called adaptive replacement cache (ARC), which is
used by Oracle/Sun in ZFS, and was used in PostgreSQL for a time
(slightly modified, as I recall, to be more like 2Q due to
concerns over the IBM patent on the algorithm).  Unfortunately,
we do not have any implementations of the OPT (aka clairvoyant)
algorithm, so something like 2Q might be an interesting approach
to experiment with.
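
For the curious, a minimal sketch of the simplified-2Q idea might look
like the following (the class name, the 25% queue split, and the sizes
are illustrative assumptions, not PostgreSQL's or anyone's actual code):

```python
from collections import OrderedDict

class TwoQCache:
    """Simplified 2Q: first-time hits go into a FIFO (a1); a repeat
    hit promotes an entry into the LRU main queue (am).  One-shot
    items wash out of a1 without ever disturbing am."""

    def __init__(self, size, a1_frac=0.25):
        self.a1_size = max(1, int(size * a1_frac))
        self.am_size = size - self.a1_size
        self.a1 = OrderedDict()   # FIFO of first-time entries
        self.am = OrderedDict()   # LRU of proven-hot entries

    def get(self, key):
        if key in self.am:
            self.am.move_to_end(key)       # LRU touch
            return self.am[key]
        if key in self.a1:
            value = self.a1.pop(key)       # second hit: promote
            if len(self.am) >= self.am_size:
                self.am.popitem(last=False)  # evict LRU end
            self.am[key] = value
            return value
        return None

    def put(self, key, value):
        if key in self.am or key in self.a1:
            return
        if len(self.a1) >= self.a1_size:
            self.a1.popitem(last=False)    # FIFO eviction
        self.a1[key] = value
```

The point, relative to plain LRU: one-shot traffic (such as a stream of
DNSBL lookups) flows through the small FIFO and never displaces entries
that have proven themselves by being referenced twice.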

Gary



Re: DNS caches that support partitioning ?

2012-08-19 Thread Jimmy Hess
On 8/19/12, Mark Andrews  wrote:
> As for the original problem.  LRU replacement will keep "hot" items in
> the cache unless it is seriously undersized.
[snip]
Well, that's the problem.  Items that are not relatively "hot" will
be purged, even though they may be very popular RRs.  Cache
efficiency is not defined as "keeping the hot items".

Efficient caching is defined as maximizing the hit percentage.
The DNS cache may have a load of DNSBL queries, so all the entries
will be cold.  The problem in that case is not low utilization; it's
high utilization by queries that are useless to cache, because those
questions will only be asked once, and LRU keeps no buffer
maintaining an eviction history that would tell it so.

An example alternative strategy: you have a cache size of XX RR
buckets, and you keep a list of YY recent cache replacements (not
necessarily the entire RRs, just the label and a 1-byte count of
evictions).  The cache replacement policy then becomes: pick the
entry whose TTL has expired, or failing that, the entry with the
lowest eviction count that is least recently used or has the fewest
queries.
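
As a rough illustration of that strategy (field names, sizes, and the
exact tie-breaking order are my own assumptions; the description above
leaves XX/YY and the victim-selection details open):

```python
import time
from collections import OrderedDict

class HistoryLRUCache:
    """Bounded cache plus a small eviction history (name -> times
    evicted, capped at one byte).  Replacement prefers (1) an entry
    whose TTL has expired, then (2) the least recently used entry
    with the fewest hits."""

    def __init__(self, size, history_size):
        self.size = size
        self.history_size = history_size
        self.cache = OrderedDict()      # name -> (value, expiry, hits)
        self.evictions = OrderedDict()  # name -> eviction count (<= 255)

    def get(self, name, now=None):
        now = time.time() if now is None else now
        entry = self.cache.get(name)
        if entry is None or entry[1] <= now:
            return None                  # missing or TTL expired
        value, expiry, hits = entry
        self.cache[name] = (value, expiry, hits + 1)
        self.cache.move_to_end(name)     # LRU touch
        return value

    def put(self, name, value, ttl, now=None):
        now = time.time() if now is None else now
        if name not in self.cache and len(self.cache) >= self.size:
            self._evict(now)
        self.cache[name] = (value, now + ttl, 0)
        self.cache.move_to_end(name)

    def _evict(self, now):
        # Prefer an expired entry; OrderedDict iterates oldest-first,
        # so min() breaks hit-count ties in LRU order.
        victim = next((n for n, (_, exp, _) in self.cache.items()
                       if exp <= now), None)
        if victim is None:
            victim = min(self.cache, key=lambda n: self.cache[n][2])
        del self.cache[victim]
        self.evictions[victim] = min(self.evictions.pop(victim, 0) + 1, 255)
        while len(self.evictions) > self.history_size:
            self.evictions.popitem(last=False)
```

The history itself isn't consulted in this sketch; the obvious next
step (presumably the point of keeping it) is to bias admission or
residence time for names that keep getting evicted and re-requested.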

--
-JH



Re: DNS caches that support partitioning ?

2012-08-19 Thread William Herrin
On Sun, Aug 19, 2012 at 5:37 PM, Mark Andrews  wrote:
> As for the original problem.  LRU replacement will keep "hot" items in
> the cache unless it is seriously undersized.

Maybe. This discussion is reminiscent of the Linux swappiness debate.

Early in the 2.x-series Linux kernels, the guy responsible for the
virtual memory manager changed it to allow the disk cache to push
program code and data out of RAM if all other disk cache was more
recently touched than the program data. Previously, the disk cache
would only consume otherwise free memory; programs would only get
pushed to swap by memory demands from other programs.

The users went ape. Suddenly if you copied a bunch of data from one
disk to another, your machine would be sluggish and choppy for minutes
or hours afterward as programs recovered swapped pages from disk and
ran just long enough to hit the next section needing to be recovered
from swap. Some folks ditched swap entirely to get around the problem.

The guy insisted the users were wrong. He had the benchmarks,
meticulously collected data and careful analysis to prove that the
machines were more efficient with pure LRU swap. The math said he was
right. 2+2=4. But it didn't.

In the very common case of copy-a-bunch-of-files, simple LRU
expiration of memory pages was the wrong answer. It caused the machine
to behave badly. More work was required until a tuned and weighted LRU
algorithm solved the problem.


Whether John's solution of limiting the cache by zone subtree is
useful or not, he started an interesting discussion. Consider, for
example, what happens when you ask for www.google.com. You get a
7-day CNAME record for a 5-minute www.l.google.com A record, and the
resolver gets 2-day NS records for ns1.google.com, 4-day A records
for ns1.google.com, 2-day NS records for a.gtld-servers.com, etc.

Those authority records don't get touched again until www.l.google.com
expires. With a hypothetical simple least-recently-used (LRU)
algorithm, the 4-minute-old A record for ns1.google.com was last
touched longer ago than the 3-minute-old A record for
5.6.7.8.rbl.antispam.com. So when the resolver needs more cache for
4.3.2.1.rbl.antispam.com, which record gets kicked?

Then, of course, when www.l.google.com expires after five minutes the
entire chain has to be refetched because ns1.google.com was already
LRU'd out of the cache. This is distinctly slower than just refetching
www.l.google.com from the already known address of ns1.google.com and
the user sees a slight pause at their web browser while it happens.

Would a smarter, weighted LRU algorithm work better here? Something
where rarely used leaf data doesn't tend to evict equally rarely
used but much more important data from the lookup chain?
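
One way such a weighted LRU could look (purely a sketch; the weights
and the `infrastructure_A` label are invented for illustration): divide
each entry's idle time by a weight, so infrastructure records must sit
untouched several times longer than leaf answers before they become
eviction candidates.

```python
import time

# Hypothetical weights: records that anchor lookup chains (NS, glue)
# get roughly 4x the residence time of leaf answers.
WEIGHTS = {"NS": 4.0, "infrastructure_A": 4.0, "A": 1.0, "CNAME": 1.0}

def eviction_candidate(entries, now=None):
    """Pick the entry with the largest *weighted* idle time.
    entries: list of (name, rtype, last_used_timestamp) tuples."""
    now = time.time() if now is None else now
    return max(entries,
               key=lambda e: (now - e[2]) / WEIGHTS.get(e[1], 1.0))

# Plain LRU would evict the 4-minute-idle ns1.google.com A record
# before the 3-minute-idle DNSBL answer; weighting flips the choice.
entries = [
    ("ns1.google.com.", "infrastructure_A", 1000 - 240),
    ("5.6.7.8.rbl.antispam.com.", "A", 1000 - 180),
]
victim = eviction_candidate(entries, now=1000)  # the DNSBL entry
```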

Regards,
Bill Herrin


-- 
William D. Herrin  her...@dirtside.com  b...@herrin.us
3005 Crane Dr. .. Web: 
Falls Church, VA 22042-3004



Re: BGP Play broken?

2012-08-19 Thread John Kemp

OK.  I think we have something going at http://bgplay.routeviews.org/ again.

Thought I would change things up a bit since we were having problems
with some of the route-views2 collector data.  So the setup now
defaults to the data from the collectors:
route-views.paix.routeviews.org, route-views.eqix.routeviews.org,
route-views.saopaulo.routeviews.org, and
route-views.sydney.routeviews.org.  There are 30 days of data up at
the moment.  We'll try to extend that further if we can.  It updates
at 15-minute intervals.

Again, thanks to Roma Tre for allowing us to continue to run this
service.  I would also add that I'm happy to see the historical
BGPlay at RIPE offered as a service.  Nice work there.

We will continue to make a solid effort to support our instance of
BGPlay.

-- 
John Kemp (k...@routeviews.org)
RouteViews Engineer
NOC: n...@routeviews.org
MAIL: h...@routeviews.org
WWW: http://www.routeviews.org


On 8/15/12 3:11 PM, Anurag Bhatia wrote:
> Hi Frank
>
>
>
>
> On Wed, Aug 15, 2012 at 5:03 PM, Frank Bulk  wrote:
>
>> Here's another option: http://sga.ripe.net/hbgplay/
>>
>>
> This one looks good, though I liked the visible ASNs in BGPlay more than
> the blinking ones here (even the "Show/Hide AS" button somehow fails for me).
>
>
>
> Thanks anyway. Will look forward to other such interesting analysis
> tools.
>
>> -Original Message-
>> From: joel jaeggli [mailto:joe...@bogus.com]
>> Sent: Wednesday, August 15, 2012 12:52 PM
>> To: Robert Glover
>> Cc: NANOG Mailing List
>> Subject: Re: BGP Play broken?
>>
>> On 8/15/12 10:28 AM, Robert Glover wrote:
>>> On 08/15/2012 10:16 AM, Anurag Bhatia wrote:
 Seems like BGP Play - http://bgplay.routeviews.org/ does not work
>> anymore?
 It is not accepting prefixes and gives an error to check whether the
 prefix is announced globally or not.
>>> I sent an email to the contacts listed on the BGPlay feedback page back
>>> on July 20 letting them know it was broken.  I never received a
>>> response, so it has likely been broken since then.
>> BGPlay has, in my understanding, had several issues... the one in July
>> was addressed by migrating it to a higher-capacity server. The more
>> recent incident started a couple of days ago and I believe people are
>> working on it.
>>> -Robert




Re: DNS caches that support partitioning ?

2012-08-19 Thread Mark Andrews

In message , Chris Woodfield writes:
> What Patrick said. For large sites that offer services in multiple data
> centers on multiple IPs that can individually fail at any time, 300
> seconds is actually a bit on the long end.
> 
> -C

Which is why the DNS supports multiple address records.  Clients
don't have to wait a minute to fail over to a second address.  One
doesn't have to point all the addresses returned at the closest
data center.  One can get sub-second failover in clients, as Happy
Eyeballs (HE) code shows.
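
A sketch of what that client-side fallback can look like (the 300 ms
figure is an illustrative assumption, and real Happy Eyeballs
implementations per RFC 6555/8305 race attempts in parallel rather
than strictly in sequence):

```python
import socket

def connect_first(addresses, port, per_attempt_timeout=0.3):
    """Try each address record in turn with a short per-attempt
    timeout instead of waiting out the OS default (which can be tens
    of seconds); return the first socket that connects."""
    last_error = None
    for addr in addresses:
        try:
            return socket.create_connection(
                (addr, port), timeout=per_attempt_timeout)
        except OSError as err:
            last_error = err     # dead address: move on quickly
    raise last_error if last_error else OSError("no addresses supplied")
```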

As for the original problem.  LRU replacement will keep "hot" items in
the cache unless it is seriously undersized.

Mark

-- 
Mark Andrews, ISC
1 Seymour St., Dundas Valley, NSW 2117, Australia
PHONE: +61 2 9871 4742 INTERNET: ma...@isc.org



Re: DNS caches that support partitioning ?

2012-08-19 Thread Chris Woodfield
What Patrick said. For large sites that offer services in multiple data centers 
on multiple IPs that can individually fail at any time, 300 seconds is actually 
a bit on the long end.

-C

On Aug 18, 2012, at 3:43 PM, Patrick W. Gilmore  wrote:

> On Aug 18, 2012, at 8:44, Jimmy Hess  wrote:
> 
>> And I say that, because some very popular RRs have insanely low TTLs.
>> 
>> Case in point:
>> www.l.google.com.  300  IN  A  74.125.227.148
>> www.l.google.com.  300  IN  A  74.125.227.144
>> www.l.google.com.  300  IN  A  74.125.227.146
>> www.l.google.com.  300  IN  A  74.125.227.145
>> www.l.google.com.  300  IN  A  74.125.227.147
>> www.l.google.com.  300  IN  A  74.125.227.148
> 
> Different people have different points of view.
> 
> IMHO, if Google loses a datacenter and all users are stuck waiting for a 
> long TTL to run out, that is Very Bad.  In fact, I would call even 2.5 
> minutes (the average wait with a 5-minute TTL) Very Bad.  I'm impressed 
> they are comfortable with a 300-second TTL.
> 
> You obviously feel differently.  Feel free to set your TTL higher.
> 
> -- 
> TTFN,
> patrick