Background Information...

My company currently runs several "clusters" of application servers, each behind a load balancer, which in turn sits behind a "cluster" of squid machines configured as accelerators. Each squid cluster is then behind a load balancer that our clients hit.

To elaborate: The hostname appAA resolves to a load balancer which proxies to appAA-squid1, appAA-squid2, appAA-squid3, etc... Each of the appAA-squidX machines is configured as a standalone accelerator (using cache_peer ... parent no-query originserver) for appAA-backend. appAA-backend resolves to a load balancer which proxies to appAA-backend1, appAA-backend2, appAA-backend3, etc... Likewise for appBB, appCC, appDD, etc...
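For concreteness, the squid.conf on each appAA-squidX box looks roughly like this (the hostnames and ports here are just placeholders for illustration, not our real values):

    # accelerator listening for traffic from the appAA load balancer
    http_port 80 accel defaultsite=appAA.example.com
    # forward every miss to the backend load balancer, no ICP queries
    cache_peer appAA-backend parent 80 0 no-query originserver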

None of these squid instances know anything about each other. In the case of appAA-squidX vs appBB-squidX this is a good thing: the entire point of isolating these apps is to meet QoS commitments and to ensure that heavy load or a catastrophic failure on one app doesn't affect another app.

In the case of appAA-squidX vs appAA-squidY, though, cache peering definitely seems like it would be advantageous.


The Problem(s)...

Our operations team is pretty adamant that software/configs deployed to boxes in a cluster be the same for every box in the cluster. The goal is understandable: they don't want custom install steps for every individual machine. So while my dev setup of a 5-machine squid cluster, each machine with 4 distinct "cache_peer ... sibling" lines, works great so far, I can't deploy a unique squid.conf to each machine in a cluster.
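To illustrate, the working (but per-machine unique) dev config on appAA-squid1 has sibling lines like these, and each of the other four machines has an equivalent set that omits itself (3128/3130 are just the default HTTP/ICP ports, used here for illustration):

    # appAA-squid1's squid.conf -- siblings are everyone but me
    cache_peer appAA-squid2 sibling 3128 3130
    cache_peer appAA-squid3 sibling 3128 3130
    cache_peer appAA-squid4 sibling 3128 3130
    cache_peer appAA-squid5 sibling 3128 3130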

I could probably put a hack into our build system to check the current hostname at installation time and remove any cache_peer lines referring to that hostname -- but before I jump through those hoops I wanted to sanity-check that there wasn't an easier way to do this in squid.
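The idea would be to ship a single uniform template naming every box and let the install hook strip the self-referencing line -- something like this, where whichever line matches the local hostname gets deleted at install time:

    # uniform fragment pushed to all five boxes; the install-time hack
    # would remove the one line that names the local machine
    cache_peer appAA-squid1 sibling 3128 3130
    cache_peer appAA-squid2 sibling 3128 3130
    cache_peer appAA-squid3 sibling 3128 3130
    cache_peer appAA-squid4 sibling 3128 3130
    cache_peer appAA-squid5 sibling 3128 3130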

Is there any easy way to reuse the same cache_peer config options on multiple instances while keeping squid smart enough that it doesn't bother trying to peer with itself?

(I had a glimmer of an idea about using ACL rules for this, but didn't work it through all the way, because at best it seemed like that would cause squid to deny requests from itself, not prevent it from attempting them in the first place.)

I have a hard time imagining that I'm the first person to have this problem, but I couldn't find any obvious solutions in the mail archives.


A slightly bigger problem is what to do when the cluster changes, either because a machine is removed for maintenance or because a machine is added to handle increased load. In our current setup this is a no-brainer: tell the load balancer when you add/remove a machine and everything just works -- none of the boxes know anything about each other, and they all run identical configs.

To set up sibling peering, it seems like I would need to deploy/reload configs (with an updated list of cache_peer directives) on every machine in the cluster anytime a box is added or removed, so that all of the siblings know about each other.
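Concretely, adding (say) an appAA-squid6 would mean pushing a new line like the one below to every existing box and then reloading each of them (e.g. with "squid -k reconfigure"):

    # new line needed in every sibling's squid.conf when appAA-squid6 joins
    cache_peer appAA-squid6 sibling 3128 3130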

Is there an easier way to coordinate the "discovery" of new siblings automatically?

An ideal situation would be to use a DNS hostname in the cache_peer line that resolves to multiple IPs, and have squid re-resolve the hostname periodically and update its list of peers based on *all* the addresses associated with that name -- but from what I can tell squid just picks a single address (DNS round-robin style).
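In other words, what I'd love to be able to write is a single line like this (with a made-up "appAA-squid-siblings" name that resolves to every sibling's address), and have squid track all of the A records behind it:

    # wishful thinking: one name covering all siblings
    cache_peer appAA-squid-siblings sibling 3128 3130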


Any advice or suggestions for managing peers like this would be appreciated.


-Hoss
