We run Squid accelerator setups and are trying to get away from using layer-7
balancers for URL hashing, so we thought we'd use CARP.
The basics: a pool of 18 servers, balanced round-robin. Each server runs a
CARP squid instance on port 80 and a caching squid instance on port 81.

Our carp cache_peer lines look like:
cache_peer 69.147.123.121 parent 81 7 carp no-query name=photocache201 monitorurl=/index.htm monitorinterval=60
cache_peer 69.147.123.122 parent 81 7 carp no-query name=photocache202 monitorurl=/index.htm monitorinterval=60
cache_peer 69.147.123.123 parent 81 7 carp no-query name=photocache203 monitorurl=/index.htm monitorinterval=60
cache_peer 69.147.123.124 parent 81 7 carp no-query name=photocache204 monitorurl=/index.htm monitorinterval=60
cache_peer 69.147.123.125 parent 81 7 carp no-query name=photocache205 monitorurl=/index.htm monitorinterval=60
cache_peer 69.147.123.126 parent 81 7 carp no-query name=photocache206 monitorurl=/index.htm monitorinterval=60
cache_peer 69.147.123.32 parent 81 7 carp no-query name=photocache207 monitorurl=/index.htm monitorinterval=60
cache_peer 69.147.123.33 parent 81 7 carp no-query name=photocache208 monitorurl=/index.htm monitorinterval=60
cache_peer 69.147.123.34 parent 81 7 carp no-query name=photocache209 monitorurl=/index.htm monitorinterval=60
cache_peer 69.147.123.35 parent 81 7 carp no-query name=photocache210 monitorurl=/index.htm monitorinterval=60
cache_peer 69.147.123.36 parent 81 7 carp no-query name=photocache211 monitorurl=/index.htm monitorinterval=60
cache_peer 69.147.123.37 parent 81 7 carp no-query name=photocache212 monitorurl=/index.htm monitorinterval=60
cache_peer 69.147.123.38 parent 81 7 carp no-query name=photocache213 monitorurl=/index.htm monitorinterval=60
cache_peer 69.147.123.39 parent 81 7 carp no-query name=photocache214 monitorurl=/index.htm monitorinterval=60
cache_peer 69.147.123.40 parent 81 7 carp no-query name=photocache215 monitorurl=/index.htm monitorinterval=60
cache_peer 69.147.123.41 parent 81 7 carp no-query name=photocache216 monitorurl=/index.htm monitorinterval=60
cache_peer 69.147.123.42 parent 81 7 carp no-query name=photocache217 monitorurl=/index.htm monitorinterval=60
cache_peer 69.147.123.43 parent 81 7 carp no-query name=photocache218 monitorurl=/index.htm monitorinterval=60
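
Since the 18 lines differ only in address and name, they could be generated
rather than maintained by hand. The following is only a sketch of that idea:
the function name carp_peer_lines and its parameters are made up for
illustration, and the addresses and options are simply the ones from the
lines above.

```python
def carp_peer_lines(addresses, first_id=201, http_port=81, icp_port=7):
    """Build one cache_peer line per address, named photocache<N> in order.

    Hypothetical helper: the option set mirrors the config shown in this post.
    """
    lines = []
    for offset, address in enumerate(addresses):
        lines.append(
            f"cache_peer {address} parent {http_port} {icp_port} carp no-query "
            f"name=photocache{first_id + offset} "
            "monitorurl=/index.htm monitorinterval=60"
        )
    return lines


# Addresses in the same order as the config above:
# .121 to .126 (photocache201-206), then .32 to .43 (photocache207-218).
ADDRESSES = [f"69.147.123.{n}" for n in list(range(121, 127)) + list(range(32, 44))]

if __name__ == "__main__":
    print("\n".join(carp_peer_lines(ADDRESSES)))
```

Reversing the peer order for a test then comes down to passing the list in
reverse while keeping each name attached to its address, rather than
re-editing 18 lines.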

The CARP instances get an equal share of traffic from the load balancer, but
we don't see an equal balance across the caching squid instances. Instead, the
distribution closely follows the order in which the hosts are listed as
cache_peers. For example, photocache201 gets the most requests, the share
decreases down the line, and photocache218 gets among the *fewest* requests.

Is this expected? How can we get a real balance?

The cache manager confirms what we're seeing:

$ squidclient -p 80 cache_object://127.0.0.1/carp
HTTP/1.0 200 OK
Server: squid/2.7.STABLE5
Date: Fri, 19 Dec 2008 18:52:21 GMT
Content-Type: text/plain
Expires: Fri, 19 Dec 2008 18:52:21 GMT
X-Cache: MISS from photocache201.flickr
X-Cache-Lookup: MISS from photocache201.flickr
Via: 1.0 photocache201.flickr (squid/2.7.STABLE5)
Connection: close

                Hostname       Hash Multiplier     Factor     Actual
             apache_peer          0   0.000000   0.000000   0.007863
           photocache201   b7d71c0d   1.000000   0.055556   0.162550
           photocache202   e4836670   1.000000   0.055556   0.133271
           photocache203   114fb0d4   1.000000   0.055556   0.072396
           photocache204   3e1bfb37   1.000000   0.055556   0.076387
           photocache205   6ac8459a   1.000000   0.055556   0.064102
           photocache206   97948ffd   1.000000   0.055556   0.045340
           photocache207   c440da60   1.000000   0.055556   0.058199
           photocache208   f10d24c3   1.000000   0.055556   0.039018
           photocache209   1dd96f27   1.000000   0.055556   0.050036
           photocache210   b7d0820d   1.000000   0.055556   0.043762
           photocache211   e49ccc70   1.000000   0.055556   0.038504
           photocache212   114916d4   1.000000   0.055556   0.032422
           photocache213   3e156137   1.000000   0.055556   0.031728
           photocache214   6ac1ab9a   1.000000   0.055556   0.026673
           photocache215   978df5fd   1.000000   0.055556   0.028562
           photocache216   c45a4060   1.000000   0.055556   0.030378
           photocache217   f1068ac3   1.000000   0.055556   0.024259
           photocache218   1dd2d527   1.000000   0.055556   0.034549

(The apache_peer is a local peer on each box for origin server health checks;
it can be ignored.)
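
To put a number on the imbalance, the Actual column can be divided by the
Factor column for each peer. The sketch below does that for the table above;
the function names are made up for illustration, and the only data used is the
report as pasted in this post.

```python
# Rows copied verbatim from the cache manager "carp" report in this post.
CARP_REPORT = """\
                Hostname       Hash Multiplier     Factor     Actual
             apache_peer          0   0.000000   0.000000   0.007863
           photocache201   b7d71c0d   1.000000   0.055556   0.162550
           photocache202   e4836670   1.000000   0.055556   0.133271
           photocache203   114fb0d4   1.000000   0.055556   0.072396
           photocache204   3e1bfb37   1.000000   0.055556   0.076387
           photocache205   6ac8459a   1.000000   0.055556   0.064102
           photocache206   97948ffd   1.000000   0.055556   0.045340
           photocache207   c440da60   1.000000   0.055556   0.058199
           photocache208   f10d24c3   1.000000   0.055556   0.039018
           photocache209   1dd96f27   1.000000   0.055556   0.050036
           photocache210   b7d0820d   1.000000   0.055556   0.043762
           photocache211   e49ccc70   1.000000   0.055556   0.038504
           photocache212   114916d4   1.000000   0.055556   0.032422
           photocache213   3e156137   1.000000   0.055556   0.031728
           photocache214   6ac1ab9a   1.000000   0.055556   0.026673
           photocache215   978df5fd   1.000000   0.055556   0.028562
           photocache216   c45a4060   1.000000   0.055556   0.030378
           photocache217   f1068ac3   1.000000   0.055556   0.024259
           photocache218   1dd2d527   1.000000   0.055556   0.034549
"""


def parse_carp_report(text):
    """Return (hostname, factor, actual) for every peer with a non-zero factor."""
    rows = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) != 5:
            continue
        name, _hash, _multiplier, factor, actual = fields
        try:
            factor, actual = float(factor), float(actual)
        except ValueError:
            continue  # the column header line is not numeric
        if factor > 0:  # skips apache_peer, which takes no CARP share
            rows.append((name, factor, actual))
    return rows


def skew_by_peer(rows):
    """Observed share divided by configured share; 1.0 means perfectly balanced."""
    return [(name, actual / factor) for name, factor, actual in rows]


if __name__ == "__main__":
    for name, ratio in skew_by_peer(parse_carp_report(CARP_REPORT)):
        print(f"{name}  {ratio:.2f}x of expected share")
```

On these numbers photocache201 takes roughly 2.9 times its configured share,
while photocache217 takes roughly 0.44 times its share.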

To confirm this, we even reversed the order of the cache_peer lines; that
results in photocache218 getting the most requests and photocache201 getting
the fewest. :)

What gives?
--john



      
