On Mar 16, 2009, at 9:07 PM, Amos Jeffries wrote:

Hi,

I'm looking into setting up cache peering. I currently have small
sets of reverse-proxy squids sitting behind a load balancer, with no
URI hashing or other content-based switching in play (thanks to a nice
bug/feature in Foundry's IOS that prevents "graceful" rehashing when
new servers are added to a VIP). So I'm looking at other ways to
scale our cache capacity horizontally (and increase hit rates as I go),
and cache peering in proxy-only mode seems to be a good solution.

Due to various reasons, it's looking like cache digests are going to
be the best way to go in our environment (Option #2 is multicast, but,
ew). However, one big question I have is this - are cache digests
intended to replace, or to supplement, normal ICP cache query behavior?

I believe it's "replace", though I may be wrong. I have not seen both in
action together yet.


Answered my own question with some lab testing - cache digests are *supplemental* to normal ICP behavior. When squid receives a request that is a local miss, it checks the cache digests first, then sends an ICP query, then goes direct. This is the behavior I was hoping it would have :) It even works with multicast ICP, which was a pleasant surprise.

mgr:peer_select even gives you a nice statistic showing how many queries were cache-digest hits vs. ICP hits:

...
Algorithm usage:
Cache Digest:    2390 ( 62%)
Icp:             1457 ( 38%)
Total:           3847 (100%)
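
The lookup order described above can be sketched roughly as follows. This is only an illustration of the ordering seen in the lab test, not Squid's actual code: `select_source`, `icp_query`, the peer name and the URLs are all made-up names, and each digest is modelled as a plain set.

```python
from collections import Counter

def select_source(url, peer_digests, icp_query):
    """Decide where to fetch a URL that missed the local cache.

    peer_digests: dict of peer name -> set of URLs from that peer's last digest
    icp_query:    callable(url) -> name of a peer that answered HIT, or None
    (both are hypothetical stand-ins for illustration)
    """
    # 1. Check the digests already held locally (no network round trip).
    for peer, digest in peer_digests.items():
        if url in digest:
            return ("cache_digest", peer)
    # 2. Not in any digest: fall back to a live ICP query.
    peer = icp_query(url)
    if peer is not None:
        return ("icp", peer)
    # 3. No peer has it: fetch it directly.
    return ("direct", None)

# Digest of squid B as last fetched by squid A, and B's live contents.
digest_of_b = {"/a.png"}
live_cache_of_b = {"/a.png", "/b.png"}  # /b.png was cached after the exchange

def icp_query(url):
    return "squidB" if url in live_cache_of_b else None

stats = Counter()
results = {}
for url in ["/a.png", "/b.png", "/c.png"]:
    results[url] = select_source(url, {"squidB": digest_of_b}, icp_query)
    stats[results[url][0]] += 1  # tally per algorithm, like the report above

print(results)
print(dict(stats))
```

The object that B cached after the exchange is still found through ICP, so it is not fetched a second time.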



For example, let's say squid A and squid B exchange cache digests
every 10 minutes. squid A has just retrieved a cache digest from squid
B, and squid B then gets a new request for an object one minute after
the digest exchange. One minute later (8 minutes before the next digest
exchange), squid A gets a request for the same URL. This object is a
local miss for squid A, but it is in cache on squid B, although it's not
in the latest digest that squid A has received from B.

Will squid A either 1. Do a normal ICP query to squid B due to the
fact that it's a cache miss, or 2. Presume that squid B doesn't have
the object since it wasn't in the last digest, and retrieve it itself?
In other words, do digest exchanges preclude ICP queries for object
requests that are local cache misses and are not in the most recent
cache digests that a squid has received?

Personally, I'm hoping the answer is #1, as #2 can easily result in
duplicated content between the squids, which is exactly what I'm
trying to avoid here.
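
To make the stale-digest window concrete, here is a small sketch. Squid's cache digests are based on a Bloom filter; everything else here (the `Digest` class, the 8192-bit size, the SHA-256 based hashing, the URLs) is an assumption chosen for illustration and does not match Squid's real digest format.

```python
import hashlib

class Digest:
    """Toy Bloom-filter snapshot of a cache's contents (illustrative only)."""

    BITS = 8192  # assumed size, much smaller than a real digest

    def __init__(self, urls):
        self.field = bytearray(self.BITS // 8)
        for url in urls:
            for pos in self._positions(url):
                self.field[pos // 8] |= 1 << (pos % 8)

    def _positions(self, url):
        # Derive four bit positions from one hash of the URL.
        h = hashlib.sha256(url.encode()).digest()
        return [int.from_bytes(h[i * 4:(i + 1) * 4], "big") % self.BITS
                for i in range(4)]

    def might_contain(self, url):
        # True means "probably cached"; False means "not in this snapshot".
        return all(self.field[pos // 8] & (1 << (pos % 8))
                   for pos in self._positions(url))

# t = 0 min: squid A fetches B's digest.
cache_b = {"/old.css"}
digest_held_by_a = Digest(cache_b)

# t = 1 min: squid B caches a new object; A's copy of the digest is unchanged.
cache_b.add("/new.js")

# t = 2 min: squid A misses locally and consults its (now stale) digest.
in_digest = digest_held_by_a.might_contain("/new.js")
in_live_cache = "/new.js" in cache_b
print("digest says:", in_digest, "| B actually has it:", in_live_cache)
```

Until the next exchange, only a live query such as ICP can tell squid A that B already holds the object.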

A 2-layer CARP mesh is the 'standard' topology recommended for this since
Wikipedia had such success with it. The underlayer does all the caching,
and a load-balancing Squid overlayer splits requests across the
underlayer using CARP.


I was really hoping I could do this with our existing load balancers, but Foundry boned the pony on their content-hashing functionality - there's no way to do a "graceful" hash redistribution when adding a new real server to the pool.
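
For reference, the "graceful" redistribution property can be sketched with highest-random-weight (rendezvous) hashing, the general idea behind CARP-style request splitting. This is a simplified stand-in and not Squid's actual CARP hash; `pick_backend`, the backend names and the URLs are hypothetical.

```python
import hashlib

def pick_backend(url, backends):
    """Return the backend with the highest hash score for this URL."""
    def score(backend):
        digest = hashlib.sha256(f"{backend}|{url}".encode()).digest()
        return int.from_bytes(digest[:8], "big")
    return max(backends, key=score)

urls = [f"/obj/{i}" for i in range(1000)]
before = {u: pick_backend(u, ["cache1", "cache2", "cache3"]) for u in urls}
after = {u: pick_backend(u, ["cache1", "cache2", "cache3", "cache4"]) for u in urls}

# URLs whose assignment changed when cache4 joined the pool.
moved = [u for u in urls if before[u] != after[u]]
print(len(moved), "of", len(urls), "URLs moved")
print("all moved to the new backend:", all(after[u] == "cache4" for u in moved))
```

Because each URL goes to whichever backend scores highest, adding a backend can only pull URLs onto the new member; assignments among the existing members are not reshuffled.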

Amos


