On 17.10.2012 04:47, Alan Dawson wrote:
Hi,

I'm at an educational establishment, with approx 2500 desktops.
We have had a restrictive web access policy implemented with
a web cache/filtering proxy appliance. User browsers are configured
by a PAC file and web proxy auto discovery. They authenticate against
the appliance with NTLM

Okay.


We plan on changing that policy to something much less restrictive,
but one of the technical issues we are expecting is an increase in
web traffic usage.

Currently we use 60Mb/s at peak times ( with 97% of that being http
traffic ), with our network connection being rated at 100Mb/s


Traffic speed in HTTP is best calculated and measured in requests/second. You can flood a 10Gbps link with two or three requests, or serve >100K req/sec over a 10Mbps NIC. Equally, you can have 2500 desktops producing 1-2 req/sec combined, or each of them producing 100K req/sec.

Your existing appliance should be able to give you a measurement of how many req/sec it currently handles, at both average and peak traffic rates. That measure, along with connections/second, will be useful in estimating the Squid requirements...
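For a rough sanity check (not a substitute for actually measuring on the appliance), a bandwidth figure can be converted into a ballpark request rate by assuming a mean HTTP response size. The 30 KB mean used below is purely an illustrative assumption, not anything measured on this network:

```javascript
// Rough back-of-envelope only: convert a peak bandwidth figure into an
// approximate request rate, given an ASSUMED mean HTTP response size.
function estimateReqPerSec(peakMbps, meanResponseKB) {
  var bytesPerSec = (peakMbps * 1000 * 1000) / 8; // Mbit/s -> bytes/s
  return bytesPerSec / (meanResponseKB * 1024);   // bytes/s -> responses/s
}

// With the quoted 60 Mbit/s peak and an assumed 30 KB mean response,
// this works out to roughly 244 req/sec.
```

Real traffic mixes small and very large objects, so treat this only as an order-of-magnitude check against the appliance's own counters.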


We'd like to manage the amount of bandwidth that we use at our site
connecting to high traffic sites like youtube/vimeo/bbc, so that there is
always capacity for critical web applications, for example online
examinations.

The filter/proxy appliance does not have any options for limiting bandwidth

One of the ways we are investigating would be to use a squid web cache and delay_pools. We would try to identify high bandwidth/popular sites, and either use a PAC file so clients choose the bandwidth-restricting cache, or use a cache chaining rule on the filter/proxy appliance to pass requests
for particular sites to the bandwidth-restricting cache.
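The PAC-file half of that idea can be sketched as below. The host names, ports, and site list are placeholders invented for illustration, not real infrastructure; a production PAC would normally use the built-in shExpMatch()/dnsDomainIs() helpers, but the suffix check is written out here so the logic is self-contained:

```javascript
// Sketch of a PAC file routing known high-bandwidth sites through a
// hypothetical rate-limiting Squid, and everything else through the
// existing appliance. All names/ports below are placeholders.

function endsWithDomain(host, domain) {
  // true for "youtube.com" itself and any subdomain of it
  return host === domain || host.slice(-(domain.length + 1)) === "." + domain;
}

function FindProxyForURL(url, host) {
  var throttledSites = ["youtube.com", "vimeo.com", "bbc.co.uk"];
  for (var i = 0; i < throttledSites.length; i++) {
    if (endsWithDomain(host, throttledSites[i])) {
      return "PROXY squid-limiter.example.edu:3128";
    }
  }
  return "PROXY appliance.example.edu:8080";
}
```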

If users connect to the squid cache directly we would authenticate using
Kerberos/NTLM for windows clients and Basic for others.
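That mixed-scheme setup might look something like the squid.conf fragment below. Helper paths vary by distribution, and the principal, realm, and password-file path are placeholders; Squid offers the schemes in the order listed, so Windows clients take Negotiate/Kerberos while others fall back to Basic:

```
# Illustrative sketch only - paths, principal and realm are placeholders.
auth_param negotiate program /usr/lib/squid/negotiate_kerberos_auth -s HTTP/squid-limiter.example.edu@EXAMPLE.EDU
auth_param negotiate children 20

auth_param basic program /usr/lib/squid/basic_ncsa_auth /etc/squid/passwd
auth_param basic realm Web proxy
auth_param basic children 5

acl authed proxy_auth REQUIRED
http_access allow authed
http_access deny all
```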

Does this approach seem valid ?

Sort of.

Definitely get away from NTLM. If you can start that migration now, before even moving away from the appliance, it will reduce the number of changes happening at once and simplify problem solving.

The ratio of the appliance's connection/sec rate to its req/sec rate will tell you how efficiently the NTLM connections are currently being utilized (2 requests for the handshake, 1...N for transaction data). So take the connection/sec count, double it, and subtract that from the req/sec rate to see roughly what request rate Kerberos would face. It is a rough guess, since not all handshakes take exactly 2 requests, and the req/sec profile *will* change.
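The estimate above can be written out as a couple of lines; the sample numbers in the comment are invented for illustration, not taken from this network:

```javascript
// Rough estimate described above: with NTLM, each connection spends
// ~2 requests on its handshake, so removing those approximates the
// request rate a Kerberos setup would see for the same traffic.
function estimateKerberosReqPerSec(reqPerSec, connPerSec) {
  var ntlmHandshakeOverhead = 2 * connPerSec; // ~2 handshake requests per connection
  return reqPerSec - ntlmHandshakeOverhead;
}

// e.g. an invented 500 req/sec at 100 conn/sec leaves ~300 req/sec
// of actual transaction requests.
```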

Squid delay pools are an old feature. Where possible I advocate proper TOS marking for bandwidth limiting instead, since it can better account for traffic outside of Squid and for local network conditions as well.
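For concreteness, both options could be sketched in squid.conf roughly as below. The ACL name, domain list, and rate figures are all placeholders chosen for illustration; a class-2 delay pool caps an aggregate rate plus a per-client rate, while tcp_outgoing_tos only marks the packets and leaves enforcement to the router/shaper:

```
# Illustrative fragment only - ACL name, domains and limits are placeholders.
acl highbw dstdomain .youtube.com .vimeo.com .bbc.co.uk

# Option 1: delay pools. Class 2 = aggregate bucket + per-client buckets.
delay_pools 1
delay_class 1 2
delay_access 1 allow highbw
delay_access 1 deny all
# e.g. 20 Mbit/s aggregate (2500000 bytes/s), 512 KB/s per client
delay_parameters 1 2500000/2500000 524288/524288

# Option 2: mark the traffic and let network equipment do the shaping.
tcp_outgoing_tos 0x20 highbw
```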


What kind of resource would the squid cache require ( RAM/CPU ... )

Whatever you can afford. These days average consumer-grade hardware can run Squid on the order of 200 req/sec, with some networks on good hardware measuring up to ~2000 req/sec (including authentication and some large ACLs).

Amos
