RE: [squid-users] Is my Squid heavily loaded?

2011-03-14 Thread Amos Jeffries

On Mon, 14 Mar 2011 18:12:27 +0530, Saurabh Agarwal wrote:

Thanks Amos. I will try doing those different sizes tests.

Some more observations on my machine. If I don't transfer those 200
HTTP files in parallel the first time, but sequentially one by one
using wget, and only afterwards use my other script to get those 200
files in parallel from Squid, then memory usage is all right: Squid
memory usage remains under 100MB. I think the first-time transfer
involves even more disk usage, since the files are saved to disk and
then all read back from disk in parallel. I also think Squid must be
using a lot of socket buffer space for each client and server socket.

Regarding cache_dir usage, what do you mean by "one cache_dir entry
per spindle"? I have only one disk and one device-mapped partition
with an ext3 file system.


The config file you showed had 3 cache_dir on that 1 disk. This is bad. 
Each cache_dir has N AIO threads (16, 32 or 64 by default, IIRC), all 
trying to read/write random portions of the disk. Squid and AIO 
scheduling do some optimisation towards serialising access to the 
underlying disk, but that does not work well when there are multiple 
independent cache_dir state handlers.
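
As a rough sketch of what that means in practice (the path and numbers here 
are illustrative only, not a tuned recommendation), the three entries could be 
collapsed into a single cache_dir sized a little below the 10GB partition so 
swap.state and the logs still have room:

  # hypothetical single cache_dir for the one-disk, ext3 setup described
  # 8000 MB out of ~10GB leaves headroom for swap.state, logs and FS overhead
  cache_dir aufs /squid/var/cache 8000 16 256

With one cache_dir there is only one set of AIO threads, so the disk sees one 
serialised request stream instead of three competing ones.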


Amos



RE: [squid-users] Is my Squid heavily loaded?

2011-03-14 Thread Saurabh Agarwal
Thanks Amos. I will try doing those different sizes tests.

Some more observations on my machine. If I don't transfer those 200 HTTP files 
in parallel the first time, but sequentially one by one using wget, and only 
afterwards use my other script to get those 200 files in parallel from Squid, 
then memory usage is all right: Squid memory usage remains under 100MB. I think 
the first-time transfer involves even more disk usage, since the files are 
saved to disk and then all read back from disk in parallel. I also think Squid 
must be using a lot of socket buffer space for each client and server socket.

Regarding cache_dir usage, what do you mean by "one cache_dir entry per 
spindle"? I have only one disk and one device-mapped partition with an ext3 
file system.

Regards,
Saurabh

-Original Message-
From: Amos Jeffries [mailto:squ...@treenet.co.nz] 
Sent: Monday, March 14, 2011 5:26 PM
To: squid-users@squid-cache.org
Subject: Re: [squid-users] Is my Squid heavily loaded?

On 15/03/11 00:02, Saurabh Agarwal wrote:
> Hi All
>
> I am trying to load test squid using this simple test. From a single
> client machine I want to simultaneously download 200 different HTTP
> files of 10MB each in a loop over and over again. I see that within 5
> minutes squid process size goes beyond 250MB. These 10MB files are
> all cachable and return a TCP_HIT for the second time onwards. There
> are other processes running and I want to limit squid memory usage to
> 120MB. Hard disk partition allocated to Squid is of 10GB and is made
> using device-mapper. I am using 3 cache_dir as mentioned below. How
> can I control Squid memory usage in this case? Below is the relevant
> portion of my squid.conf.

200 files @ 10MB -> up to 2GB of data possibly in memory simultaneously.

It is easy to see why the "squid process size goes beyond 250MB".


You have cache_mem of 8 MB, which means Squid will push these objects to 
disk after the first use. From then on, what you are testing is the rate 
at which Squid can load them from disk onto the network. It is quite 
literally a read from disk into a buffer, then a function call which 
immediately writes directly from that buffer to the network, done in 
"small" chunks of whatever the system disk I/O page size is (4KB by 
default, but it could be more).

  The real speed bottleneck in Squid is the HTTP processing, which does 
a lot of small, CPU-intensive steps of parsing and data copying. When 
there are a lot of new requests arriving, they suck CPU time away from 
that speedy read->write byte-pumping loop.

Your test is a classic check for Disk speed limits in Squid.

The other tests you need in order to check performance are:
  * numerous requests for a few medium-sized objects (which can all fit in 
memory together, with headers ~10% or less of the total object size). 
Testing the best-case memory-hit speed.
  * numerous requests for very small objects (responses around one packet 
in size). Testing the worst-case HTTP parser limits.
  * parallel requests for numerous varied objects (too many to fit in 
memory). Testing somewhat normal traffic speed expectations.

There is a tool called WebPolygraph which does some good traffic 
measurements.

>
> access_log /squid/logs/access.log  squid
> cache_log /squid/logs/cache.log
>
> cache_mem 8 MB
> cache_dir aufs /squid/var/cache/small 1500 9 256 max-size=1
> cache_dir aufs /squid/var/cache/medium 2500 6 256 max-size=2000
> cache_dir aufs /squid/var/cache/large 6000 3 256 max-size=1
> maximum_object_size 100 MB
> log_mime_hdrs off
> max_open_disk_fds 400
> maximum_object_size_in_memory 8 KB
>
> cache_store_log none
> pid_filename /squid/logs/squid.pid
> debug_options ALL,1
> ---
>
> Regards, Saurabh

Um, your use of cache_dir is a bit odd.
  Use *one* ufs/aufs/diskd cache_dir entry per disk spindle. Otherwise your 
speed is lower due to disk I/O collisions between the cache_dir entries 
(your test objects are all the same size and so will not reveal this 
behaviour).
  Also, leave some disk space for the cache log and journal overheads. 
Otherwise your Squid will crash with "unable to write to file" errors 
when the cache starts to get nearly full.

Amos
-- 
Please be using
   Current Stable Squid 2.7.STABLE9 or 3.1.11
   Beta testers wanted for 3.2.0.5


Re: [squid-users] Is my Squid heavily loaded?

2011-03-14 Thread Amos Jeffries

On 15/03/11 00:02, Saurabh Agarwal wrote:

Hi All

I am trying to load test squid using this simple test. From a single
client machine I want to simultaneously download 200 different HTTP
files of 10MB each in a loop over and over again. I see that within 5
minutes squid process size goes beyond 250MB. These 10MB files are
all cachable and return a TCP_HIT for the second time onwards. There
are other processes running and I want to limit squid memory usage to
120MB. Hard disk partition allocated to Squid is of 10GB and is made
using device-mapper. I am using 3 cache_dir as mentioned below. How
can I control Squid memory usage in this case? Below is the relevant
portion of my squid.conf.


200 files @ 10MB -> up to 2GB of data possibly in memory simultaneously.

It is easy to see why the "squid process size goes beyond 250MB".


You have cache_mem of 8 MB, which means Squid will push these objects to 
disk after the first use. From then on, what you are testing is the rate 
at which Squid can load them from disk onto the network. It is quite 
literally a read from disk into a buffer, then a function call which 
immediately writes directly from that buffer to the network, done in 
"small" chunks of whatever the system disk I/O page size is (4KB by 
default, but it could be more).
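
If the aim is to keep the whole process near the 120MB target, the directives 
that bound the cache-related memory are cache_mem, the in-memory object cutoff, 
and the memory pools limit. A minimal sketch, with guessed values purely to 
illustrate the knobs (they would need tuning against your real workload):

  # hypothetical values aimed at a small memory footprint
  # RAM reserved for hot objects held in the memory cache
  cache_mem 32 MB
  # objects larger than this are never kept in the memory cache
  maximum_object_size_in_memory 64 KB
  # cap on idle memory Squid keeps pooled for later reuse
  memory_pools_limit 16 MB

Note these do not cap everything: the in-transit buffers for 200 parallel 10MB 
responses, plus the cache index, sit on top of whatever cache_mem is set to.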


 The real speed bottleneck in Squid is the HTTP processing, which does 
a lot of small, CPU-intensive steps of parsing and data copying. When 
there are a lot of new requests arriving, they suck CPU time away from 
that speedy read->write byte-pumping loop.


Your test is a classic check for Disk speed limits in Squid.

The other tests you need in order to check performance are:
 * numerous requests for a few medium-sized objects (which can all fit in 
memory together, with headers ~10% or less of the total object size). 
Testing the best-case memory-hit speed.
 * numerous requests for very small objects (responses around one packet 
in size). Testing the worst-case HTTP parser limits.
 * parallel requests for numerous varied objects (too many to fit in 
memory). Testing somewhat normal traffic speed expectations.


There is a tool called WebPolygraph which does some good traffic 
measurements.




access_log /squid/logs/access.log  squid
cache_log /squid/logs/cache.log

cache_mem 8 MB
cache_dir aufs /squid/var/cache/small 1500 9 256 max-size=1
cache_dir aufs /squid/var/cache/medium 2500 6 256 max-size=2000
cache_dir aufs /squid/var/cache/large 6000 3 256 max-size=1
maximum_object_size 100 MB
log_mime_hdrs off
max_open_disk_fds 400
maximum_object_size_in_memory 8 KB

cache_store_log none
pid_filename /squid/logs/squid.pid
debug_options ALL,1
---

Regards, Saurabh


Um, your use of cache_dir is a bit odd.
 Use *one* ufs/aufs/diskd cache_dir entry per disk spindle. Otherwise your 
speed is lower due to disk I/O collisions between the cache_dir entries 
(your test objects are all the same size and so will not reveal this 
behaviour).
 Also, leave some disk space for the cache log and journal overheads. 
Otherwise your Squid will crash with "unable to write to file" errors 
when the cache starts to get nearly full.
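
For reference, the three cache_dir sizes in the config above already add up to 
the whole partition: 1500 + 2500 + 6000 = 10000 MB, i.e. the full 10GB, before 
swap.state and the logs under /squid/logs (assuming they sit on the same 
partition) take their share.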


Amos
--
Please be using
  Current Stable Squid 2.7.STABLE9 or 3.1.11
  Beta testers wanted for 3.2.0.5


[squid-users] Is my Squid heavily loaded?

2011-03-14 Thread Saurabh Agarwal
Hi All

I am trying to load test squid using this simple test. From a single client 
machine I want to simultaneously download 200 different HTTP files of 10MB each 
in a loop over and over again. I see that within 5 minutes squid process size 
goes beyond 250MB. These 10MB files are all cachable and return a TCP_HIT for 
the second time onwards. There are other processes running and I want to limit 
squid memory usage to 120MB. Hard disk partition allocated to Squid is of 10GB 
and is made using device-mapper. I am using 3 cache_dir as mentioned below. How 
can I control Squid memory usage in this case? Below is the relevant portion 
of my squid.conf.


access_log /squid/logs/access.log  squid
cache_log /squid/logs/cache.log

cache_mem 8 MB
cache_dir aufs /squid/var/cache/small 1500 9 256 max-size=1
cache_dir aufs /squid/var/cache/medium 2500 6 256 max-size=2000
cache_dir aufs /squid/var/cache/large 6000 3 256 max-size=1
maximum_object_size 100 MB
log_mime_hdrs off
max_open_disk_fds 400
maximum_object_size_in_memory 8 KB

cache_store_log none
pid_filename /squid/logs/squid.pid
debug_options ALL,1
---

Regards,
Saurabh