[squid-users] SMP vs Single Process Performance

2013-03-18 Thread Sokvantha YOUK
Dear All,

I am going to build Squid to support 1 Gbps of traffic on CentOS 6 x86_64.
I have done some studying on getting Squid to handle this high traffic
load, but I am still not clear on the points below:

1. Should I use the SMP feature with 6 workers and the AUFS file system?
2. Or does a single Squid process provide better performance and reliability than SMP?

Please find my Squid server configuration below, and kindly advise me on
the idea of one cache_dir per Squid process: is it a good choice?
Besides, with this configuration I am having issues opening
http://microsoft.com; sometimes it gives me the error message:
"Your current User-Agent string appears to be from an automated
process. If this is incorrect, please click this link: United States
English Microsoft Homepage"

My squid compiled option:
/usr/local/squid/sbin/squid -v
Squid Cache: Version 3.3.3
configure options:  '--sysconfdir=/etc/squid'
'--enable-follow-x-forwarded-for' '--enable-snmp'
'--enable-linux-netfilter' '--enable-http-violations'
'--enable-delay-pools' '--enable-storeio=diskd,aufs,ufs'
'--with-large-files' '--enable-removal-policies=lru,heap'
'--enable-ltdl-convenience' '--with-logdir=/var/log/squid'
'--enable-wccpv2' '--with-default-user=squid'
'--enable-log-daemon-helpers' '--enable-build-info'
'--enable-url-rewrite-helpers'

My Squid.conf

###
# Server Characteristics
###
## We don't use this port in this fully transparent mode
http_port 3128
http_port 3129  tproxy disable-pmtu-discovery=always
#icp_port 3130
cache_mgr ca...@example.com
visible_hostname cache2.example.com
#--Leave coredumps in the first cache dir
coredump_dir /usr/local/squid/var/cache
cache_effective_user squid
cache_effective_group squid


###
## Performance related config
###
max_filedescriptors 65536
max_open_disk_fds 65536
dns_v4_first on
negative_ttl 3 minutes
positive_dns_ttl 15 hours
negative_dns_ttl 30 seconds

## Setting the size of DNS cache
ipcache_size 15000
ipcache_low 95
ipcache_high 97
fqdncache_size 1
cache_mem 2000 MB
memory_pools on
## Cache Object Size limits
minimum_object_size 0 KB
maximum_object_size  700 MB
maximum_object_size_in_memory 1024 KB

#--Cache Replacement Policies
memory_replacement_policy lru
cache_replacement_policy heap LFUDA

#--Setting Limits on object replacement
cache_swap_low 96
cache_swap_high 97

## Spoofing enable
via off
forwarded_for delete
follow_x_forwarded_for deny all
## Remove X-Cache replied from Server
reply_header_access X-Cache deny all

##
## Squid Logging
##
## let access_log_daemon handle IO
access_log none
cache_log /var/log/squid/cache.log
cache_store_log none
logfile_rotate 30

##
## Refresh pattern
##

## Microsoft
refresh_pattern -i microsoft.com/.*\.(cab|exe|ms[iuf]|asf|wm[va]|dat|zip) 4320 80% 43200 reload-into-ims
refresh_pattern -i windowsupdate.com/.*\.(cab|exe|ms[iuf]|asf|wm[va]|dat|zip) 4320 80% 43200 reload-into-ims
refresh_pattern -i my.windowsupdate.website.com/.*\.(cab|exe|ms[iuf]|asf|wm[va]|dat|zip) 4320 80% 43200 reload-into-ims

##images facebook
refresh_pattern (facebook\.com|173\.252\.110\.27).*\.(jpg|png|gif) 10800 80% 10800 ignore-reload override-expire ignore-no-store
refresh_pattern -i \.fbcdn\.net.*\.(jpg|gif|png|swf|mp3) 10800 80% 10800 ignore-reload override-expire ignore-no-store
refresh_pattern static\.ak\.fbcdn\.net/.*\.(jpg|gif|png) 10800 80% 10800 ignore-reload override-expire ignore-no-store
refresh_pattern ^http:\/\/profile\.ak\.fbcdn\.net/.*\.(jpg|gif|png) 10800 80% 10800 ignore-reload override-expire ignore-no-store

## ANTI VIRUS
refresh_pattern guru.avg.com/.*\.(bin) 10800 80% 10800 ignore-no-store ignore-reload reload-into-ims
refresh_pattern (avgate|avira).*(idx|gz)$ 10800 80% 10800 ignore-no-store ignore-reload reload-into-ims
refresh_pattern kaspersky.*\.avc$ 10800 80% 10800 ignore-no-store ignore-reload reload-into-ims
refresh_pattern kaspersky 10800 80% 10800 ignore-no-store ignore-reload reload-into-ims
refresh_pattern update.nai.com/.*\.(gem|zip|mcs) 10800 80% 10800 ignore-no-store ignore-reload reload-into-ims
refresh_pattern ^http:\/\/liveupdate.symantecliveupdate.com.*\.(zip) 10800 80% 10800 ignore-no-store ignore-reload reload-into-ims

##Musics
refresh_pattern -i \.(mp2|mp3|mid|midi|mp[234]|wav|ram|ra|rm|au|3gp|m4r|m4a)(\?.*|$) 10080 90% 43200 override-expire ignore-reload reload-into-ims ignore-private
## Videos
refresh_pattern -i \.(mpg|mpeg|mp4|m4v|mov|avi|asf|wmv|wma|dat|flv|swf)(\?.*|$) 10080 90% 43200 override-expire ignore-reload reload-into-ims

[squid-users] Re: SMP vs Single Process Performance

2013-03-18 Thread babajaga
Some aspects:
- You are only using aufs. I consider it good for larger objects, but for
small ones (<= 32 KB) rock should be faster. So I suggest splitting the
cache_dirs based on object size.
- Be careful when setting up the filesystem for your cache_dirs on disk. In
my experience this has a huge impact on performance. I consider HDDs
reliable, so I accept the risk of losing some cache content in case of a
disk failure (which has happened very seldom to me) and use an ext4
filesystem stripped down to the very basics (no journal, no timestamps, etc.).
- AFAIK, SMP does not do shared aufs. That means with your config you risk
having the same file cached multiple times, in different cache_dirs. So you
might consider having multiple workers for the rock dir, but only one for
the larger stuff, stored in one single HUGE aufs dir. However, that will
need the configure option
'--enable-async-io=128'
- To smooth access for clients, you might consider using delay pools to
limit the risk of some bad guys sucking up your bandwidth, by putting an
upper limit on download speed.
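In config terms, that last suggestion might look something like this minimal class-2 delay pool sketch (the byte limits are placeholders, not recommendations):

```
# One class-2 pool: no aggregate limit, per-client-IP cap
delay_pools 1
delay_class 1 2
delay_access 1 allow all
# -1/-1 = unlimited aggregate; per IP: 256 KB/s sustained, 1 MB burst bucket
delay_parameters 1 -1/-1 262144/1048576
```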








--
View this message in context: 
http://squid-web-proxy-cache.1019090.n4.nabble.com/SMP-vs-Single-Process-Performance-tp4659034p4659035.html
Sent from the Squid - Users mailing list archive at Nabble.com.


[squid-users] authenticate access to reverse proxy

2013-03-18 Thread James Harper
Say I have a squid reverse proxy with https enabled on it at 
https://apps.example.com. This serves a number of apps including:

/owa - outlook web access
/rpc - ms terminal server gateway
/intranet
/bugtracker
/svn - svn anon browser access
/procedures

These are spread across a bunch of completely different servers (some Linux, 
some Windows) and it works really, really well. It has been decided that some of 
the individual applications are not secure enough: /owa, /rpc, and /bugtracker 
are fine, while /intranet, /procedures, and /svn are not. I have set up ACLs 
to deny external access to the insecure apps, but now want to put some front-end 
security on them such that when a user first tries to access one with a browser, 
they are redirected and required to sign in via a web-forms-based page. The idea 
I have for this is:

. create an sqlite database in /var/run or some other throwaway location
. redirect users using deny_info to the sign in page (php)
. on successful authentication, set a cookie (some random string eg md5 hash of 
username, password, and time) and create a corresponding entry in the database 
then redirect user to original page (only possible with squid 3.2.x I 
believe...)
. create an external acl helper that is passed in the request header 
corresponding to the cookie, decodes the cookie value from the header, and 
looks up the entry in the database (and maybe timestamp last access). If 
present, report OK
. create a cron job nightly (or hourly or whatever) to delete stale records 
from the database to keep the size reasonable

The cookie here only serves as a lookup into the database, and I believe will 
be supplied by the browser on any user request.

The number of users is under 100 and of those the number actually using 
external access is likely to only be around 10-20 at this time, so I'm not too 
worried about scalability but I guess if I'm making any mistakes now is the 
time to correct them.

Any comments before I write too much code would be greatly appreciated!
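For concreteness, the external ACL helper step in the plan above might look something like this minimal sketch. The table schema, `proxysession` cookie name, and DB path are assumptions for illustration, not part of the plan:

```python
#!/usr/bin/env python
# Minimal sketch of the external ACL helper described above.
# Squid's external_acl_type protocol: one lookup key per input line,
# reply "OK" or "ERR" per line.
import sqlite3
import sys
import time

DB_PATH = "/var/run/proxy_sessions.db"  # throwaway location, as proposed
SESSION_TTL = 3600                      # seconds a session stays valid

def extract_token(cookie_header):
    """Pull our session cookie out of a possibly multi-cookie header."""
    for part in cookie_header.split(";"):
        name, _, value = part.strip().partition("=")
        if name == "proxysession":       # hypothetical cookie name
            return value
    return None

def check_session(conn, token, now=None):
    """Return True if the token is in the DB and fresh; refresh last_seen."""
    now = int(time.time()) if now is None else now
    row = conn.execute(
        "SELECT last_seen FROM sessions WHERE token = ?", (token,)
    ).fetchone()
    if row is None or now - row[0] > SESSION_TTL:
        return False
    conn.execute(
        "UPDATE sessions SET last_seen = ? WHERE token = ?", (now, token)
    )
    return True

def main():
    conn = sqlite3.connect(DB_PATH)
    for line in sys.stdin:               # Squid sends one request per line
        token = extract_token(line.strip())
        ok = token is not None and check_session(conn, token)
        sys.stdout.write("OK\n" if ok else "ERR\n")
        sys.stdout.flush()               # Squid waits for each reply
```

When installed as the helper binary, `main()` would run in a loop reading lookup keys from Squid; the stale-entry sweep could then be either the cron job above or an inline `DELETE` in the miss path.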

Thanks

James



Re: [squid-users] Re: SMP vs Single Process Performance

2013-03-18 Thread Giorgi Tepnadze

On 18.03.2013 14:06, babajaga wrote:


So you might consider having multiple workers for rock-dir, but only one for
the larger stuff, stored using one single HUGE aufs.


Hello, these are very interesting recommendations for me as well. One question: how 
do I configure Squid for the situation you described above?
Let me describe the situation as I see it: requests for big objects (> 32 KB) go 
to one worker with aufs, and requests for smaller objects go to another 
worker with rock.
So can Squid select different workers by looking at the HTTP Content-Length response 
header?



[squid-users] RE: ACL wildcard?

2013-03-18 Thread Sébastien WENSKE
Hey,

It would be great if this feature became available!

acl aclname_1 type_1
acl aclname_2 type_2
acl aclname_3 type_3
acl aclname_4 type_4
[...]
http_access allow|deny aclname_*

Cheers!

-Original Message-
From: Nick Cairncross [mailto:nick.cairncr...@condenast.co.uk] 
Sent: Thursday, 11 March 2010 18:41
To: squid-users@squid-cache.org
Subject: [squid-users] ACL wildcard?

Hi all,

Just a quick question today: in a bid to keep to some standards, my ACLs
all follow similar naming conventions:

FILETYPE_EXE_[object] - e.g. FILE_TYPE_EXE_Users, FILE_TYPE_EXE_Hosts, FILE_TYPE_EXE_IPAddresses
FILETYPE_MP3_[object] - e.g. FILE_TYPE_MP3_Users, FILE_TYPE_MP3_Hosts, FILE_TYPE_MP3_IPAddresses
FILETYPE_ZIP_[object] - e.g. FILE_TYPE_ZIP_Users, FILE_TYPE_ZIP_Hosts, FILE_TYPE_ZIP_IPAddresses

Instead of repeating the deny_info entry three times for each of these, is
it possible to use a wildcard for one? If so, what is it?

deny_info CUSTOM_FILEBLOCKED FILETYPE_{wildcard}

Thanks,

Nick






[squid-users] rock squid -k reconfigure

2013-03-18 Thread Alexandre Chappaz
Hi,

I am using squid 3.2.8-20130304-r11795 with SMP and a rock dir configured.
After a fresh start, cachemanager:storedir reports :

by kid5 {
Store Directory Statistics:
Store Entries  : 52
Maximum Swap Size  : 8388608 KB
Current Store Swap Size: 28176.00 KB
Current Capacity   : 0.34% used, 99.66% free

Store Directory #0 (rock): /var/cache/squid/mem/
FS Block Size 1024 Bytes

Maximum Size: 8388608 KB
Current Size: 28176.00 KB 0.34%
Maximum entries:262143
Current entries:   880 0.34%
Pending operations: 1 out of 0
Flags:
} by kid5



for the rock cache_dir.


After a squid -k reconfigure, without any change in the squid.conf,
the cachemanager is reporting this :

by kid5 {
Store Directory Statistics:
Store Entries  : 52
Maximum Swap Size  : 0 KB
Current Store Swap Size: 0.00 KB
Current Capacity   : 0.00% used, 0.00% free

} by kid5



Is this only a problem with the reporting? Is the rock cache_dir still
in use after the reconfigure, and is there a way to check whether it is
still in use?

Thank you
Alex


[squid-users] Re: SMP vs Single Process Performance

2013-03-18 Thread babajaga
Use different, non-overlapping minimum/maximum object sizes for the
rock/aufs cache_dirs. 
That forces usage of one dir or the other. But note: this needs the newest
sources, because there were some bugs in that area!
It should imply different workers, then. 
The gurus will correct me if I'm wrong :-)
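As a sketch only (the paths and sizes are made up; check your version's squid.conf.documented before copying), the non-overlapping split could look like:

```
workers 4
# Small objects (<= 32 KB): rock store, shared by all workers
cache_dir rock /var/cache/squid/rock 16384 max-size=32768
# Larger objects: one aufs dir, restricted to a single worker so the
# SMP workers do not each create their own duplicate copy
if ${process_number} = 1
cache_dir aufs /var/cache/squid/aufs 200000 64 256 min-size=32769
endif
```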








[squid-users] Re: rock squid -k reconfigure

2013-03-18 Thread babajaga
Any strange "Could not open FD ..." messages in your cache.log?

Your report somehow sounds familiar to me, and seems to be the consequence
of another bug. Not sure, though.





[squid-users] Re: rock squid -k reconfigure

2013-03-18 Thread babajaga
Just activated full debug, and there is no message about Squid not being
able to open an FD.

Sorry, I mixed it up with another bug I found.
Look at my corrected response above.





Re: [squid-users] RE: ACL wildcard?

2013-03-18 Thread Amos Jeffries

On 19/03/2013 3:03 a.m., Sébastien WENSKE wrote:

Hey,

It would be great if this feature became available!


Then please submit a Feature Request bug.



acl aclname_1 type_1
acl aclname_2 type_2
acl aclname_3 type_3
acl aclname_4 type_4
[...]
http_access allow|deny aclname_*

Cheers!

-Message d'origine-
De : Nick Cairncross

Hi all,

Just a quick question today: in a bid to keep to some standards, my ACLs
all follow similar naming conventions:

FILETYPE_EXE_[object] - e.g. FILE_TYPE_EXE_Users, FILE_TYPE_EXE_Hosts, FILE_TYPE_EXE_IPAddresses
FILETYPE_MP3_[object] - e.g. FILE_TYPE_MP3_Users, FILE_TYPE_MP3_Hosts, FILE_TYPE_MP3_IPAddresses
FILETYPE_ZIP_[object] - e.g. FILE_TYPE_ZIP_Users, FILE_TYPE_ZIP_Hosts, FILE_TYPE_ZIP_IPAddresses

Instead of repeating the deny_info entry three times for each of these, is
it possible to use a wildcard for one? If so, what is it?

deny_info CUSTOM_FILEBLOCKED FILETYPE_{wildcard}


Have you considered making this a dynamic external_acl_type helper lookup?
The helper can return a message=blah parameter to be embedded in a 
single error page which contains your variable explanation part.


Amos


Re: [squid-users] Re: SMP vs Single Process Performance

2013-03-18 Thread Amos Jeffries

On 19/03/2013 2:17 a.m., Giorgi Tepnadze wrote:

On 18.03.2013 14:06, babajaga wrote:

So you might consider having multiple workers for rock-dir, but only 
one for

the larger stuff, stored using one single HUGE aufs.


Hello, these are very interesting recommendations for me as well. One 
question: how do I configure Squid for the situation you described above?
Let me describe the situation as I see it: requests for big objects 
(> 32 KB) go to one worker with aufs, and requests for smaller 
objects go to another worker with rock.
So can Squid select different workers by looking at the 
HTTP Content-Length response header?


No. The worker is selected at TCP SYN time, before anything (such as 
whether the arriving traffic is even HTTP) is known.


What you gain from multiple workers is the shared rock cache for <= 32 KB 
objects; AUFS currently still operates as if each worker were a separate 
sibling, and will duplicate objects between them. That said, under-32KB 
objects form the majority of Internet traffic, so the gains are not 
exactly small.


Amos


Re: [squid-users] Re: SMP vs Single Process Performance

2013-03-18 Thread Amos Jeffries

On 18/03/2013 11:06 p.m., babajaga wrote:

Some aspects:
- You are only using aufs. I consider it good for larger objects, but for
small ones (<= 32 KB) rock should be faster. So I suggest splitting the
cache_dirs based on object size.
- Be careful when setting up the filesystem for your cache_dirs on disk. In
my experience this has a huge impact on performance. I consider HDDs
reliable, so I accept the risk of losing some cache content in case of a
disk failure (which has happened very seldom to me) and use an ext4
filesystem stripped down to the very basics (no journal, no timestamps, etc.).
- AFAIK, SMP does not do shared aufs. That means with your config you risk
having the same file cached multiple times, in different cache_dirs. So you
might consider having multiple workers for the rock dir, but only one for
the larger stuff, stored in one single HUGE aufs dir. However, that will
need the configure option
'--enable-async-io=128'


Maybe yes, maybe no. Your mileage using it *will* vary a lot.

* Querying just one cache_dir is no faster or slower than querying 
multiple, since they all use a memory index.
* Remember that the Squid UFS filesystem has a maximum of roughly 2^27 
objects per cache_dir; a single huge TB-sized dir cannot hold a larger 
object count than a tiny MB-sized one. You *will* need to set a high 
minimum object size limit on the cache_dir line to fill a very big cache, 
with another cache_dir for the smaller objects.
* If you are using RAID to achieve the large disk size, it is not worth 
it. Squid performs an equivalent of RAID by spreading objects across 
directories on its own, and the extra RAID operations are just a drag on 
performance, no matter how small. See the wiki FAQ page on RAID for more 
details on that.
* And finally, you also may not need such a huge disk cache, and may not be 
able to use one due to RAM limits on the worker: the memory index uses 
~1 MB of RAM per GB of _total_ disk space across all cache_dirs on the worker.
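To put that last rule of thumb in numbers (the 4 TB figure is an assumed example, not from the thread):

```python
# Memory index rule of thumb: ~1 MB of worker RAM per GB of total
# cache_dir disk space on that worker.
total_disk_gb = 4000                # e.g. a 4 TB disk cache (assumed example)
index_ram_mb = total_disk_gb * 1    # ~1 MB of index RAM per GB of disk
print(index_ram_mb)                 # 4000 -> a 4 TB cache needs ~4 GB of RAM
```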




- To smooth access for clients, you might consider using delay pools to
limit the risk of some bad guys sucking up your bandwidth, by putting an
upper limit on download speed.


Yes and no. This caps the usage through Squid, but operating-system QoS 
controls work a lot better than Squid delay pools and can account for 
non-HTTP traffic in their calculation of who is hogging bandwidth. They 
can also re-adjust the allowances far more dynamically for changing 
traffic flows.


/2c
Amos


Re: [squid-users] rock squid -k reconfigure

2013-03-18 Thread Alex Rousskov
On 03/18/2013 09:18 AM, Alexandre Chappaz wrote:
 Hi,
 
 I am using squid 3.2.8-20130304-r11795 with SMP and a rock dir configured.
 After a fresh start, cachemanager:storedir reports :
 
 by kid5 {
 Store Directory Statistics:
 Store Entries  : 52
 Maximum Swap Size  : 8388608 KB
 Current Store Swap Size: 28176.00 KB
 Current Capacity   : 0.34% used, 99.66% free
 
 Store Directory #0 (rock): /var/cache/squid/mem/
 FS Block Size 1024 Bytes
 
 Maximum Size: 8388608 KB
 Current Size: 28176.00 KB 0.34%
 Maximum entries:262143
 Current entries:   880 0.34%
 Pending operations: 1 out of 0
 Flags:
 } by kid5
 
 
 
 for the rock cache_dir.
 
 
 After a squid -k reconfigure, without any change in the squid.conf,
 the cachemanager is reporting this :
 
 by kid5 {
 Store Directory Statistics:
 Store Entries  : 52
 Maximum Swap Size  : 0 KB
 Current Store Swap Size: 0.00 KB
 Current Capacity   : 0.00% used, 0.00% free
 
 } by kid5
 
 
 
 Is this only a problem with the reporting? Is the rock cachedir still
 in use after the reconfigure / is there a way to check if it is still
 in use?


Please see Bug 3774. It may be related to your problem.

   http://bugs.squid-cache.org/show_bug.cgi?id=3774

Alex.



Re: [squid-users] SMP vs Single Process Performance

2013-03-18 Thread Alex Rousskov
On 03/18/2013 02:46 AM, Sokvantha YOUK wrote:

 I am going to build Squid to support 1 Gbps of traffic on CentOS 6 x86_64.
 I have done some studying on getting Squid to handle this high traffic
 load, but I am still not clear on the points below


 1. Should I use the SMP feature with 6 workers and the AUFS file system?

You must use SMP IMO. The exact number of workers will depend on many
factors, but 4-6 workers is a good starting point, provided you have at
least 6-8 true CPU cores (not hyperthreaded ones). Start by making sure
everything works at Gbit speeds without caching (if needed, you may use
smaller artificial delays on the server side of your workload to
approximate caching effects during these initial non-caching tests).


As for caching, your options include:

0) Memory caching only: very fast but small cache size and limited to
32KB response sizes if shared among workers.

1) Rock store only: fast but limited to 32KB.

2) Large Rock: unlimited file sizes on disk and in RAM but requires
using an unofficial Launchpad branch that needs recent trunk updates.

3) Aufs: Use it if you have to cache objects larger than 32KB on disk
and cannot use #2. Aufs is not SMP-aware, so you will be creating a lot
of duplication and waste. Those overheads may or may not offset caching
gains. You can try aufs in combination with small rock (#1 above). Some
report success with this option -- see other responses on this thread.

In all cases except #0, please note that you will need your total disk
speed to match that of cachable traffic. For example, if 50% of content
is cachable, you will need enough disk spindles to support about
500Mbit/s worth of disk traffic not counting hits. That is about 5000
responses per second swapping out to disk (assuming 13KByte mean
response size). Depending on your mean disk I/O response time, you may
need dozens of disk spindles to sustain that!
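Alex's arithmetic can be checked quickly; the 10 ms mean disk I/O time used in the spindle estimate is an assumed figure, not from his post:

```python
# Sanity-check the swap-out rate: 1 Gbit/s link, 50% cachable,
# 13 KByte mean response size.
link_bps = 1_000_000_000            # 1 Gbit/s in bits per second
cachable_fraction = 0.5
mean_response_bytes = 13 * 1024

disk_bytes_per_s = link_bps * cachable_fraction / 8
swapouts_per_s = disk_bytes_per_s / mean_response_bytes
print(round(swapouts_per_s))        # ~4700, i.e. "about 5000" writes per second

# Spindle estimate, assuming a 10 ms mean disk I/O response time
ops_per_spindle = 1 / 0.010         # ~100 I/Os per second per spindle
spindles_needed = swapouts_per_s / ops_per_spindle
print(round(spindles_needed))       # ~47 spindles: "dozens of disks"
```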

Most likely, you will end up caching less than 1Gbit/sec, which means
you need to limit disk I/O. Rock has options to control disk I/O rate so
that you do not overwhelm your disks (see squid.conf.documented). AUFS
may have something like that too.

Finally, when doing performance testing, do not be fooled by excellent
results during the first two hours of the test. Wait until your disks
become full to see what disk seeks will do to your overall disk throughput.


 2. Or does a single Squid process provide better performance and 
 reliability than SMP?

I doubt it is possible to get Gbit/s performance with a single Squid3
worker while handling typical web traffic, but YMMV.


HTH,

Alex.



Re: [squid-users] authenticate access to reverse proxy

2013-03-18 Thread Amos Jeffries

On 19/03/2013 12:57 a.m., James Harper wrote:

Say I have a squid reverse proxy with https enabled on it at 
https://apps.example.com. This serves a number of apps including:

/owa - outlook web access
/rpc - ms terminal server gateway
/intranet
/bugtracker
/svn - svn anon browser access
/procedures

These are spread across a bunch of completely different servers (some linux, 
some windows) and works really really well. It has been decided that some of 
the individual applications are not secure enough. /owa, /rpc, and /bugtracker 
are fine, while /intranet,  /procedures, and /svn are not. I have set up acls 
to deny external access to the insecure apps but now want to put some front end 
security on them such that when a user first tries to access one with a browser 
they are redirected and required to sign in to a web forms based page. The idea 
I have for this is:

. create an sqlite database in /var/run or some other throwaway location


NP: sqlite is known to be terribly slow for this type of thing. You may 
want to reconsider the exact DB type there.



. redirect users using deny_info to the sign in page (php)
. on successful authentication, set a cookie (some random string eg md5 hash of 
username, password, and time) and create a corresponding entry in the database 
then redirect user to original page (only possible with squid 3.2.x I 
believe...)


No. It is possible with older Squid as well. Pass the original URL to the 
splash page as a query-string parameter using %s.
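A hedged sketch of how that fits together in squid.conf; the ACL names, helper path, and login URL are all hypothetical:

```
# Requests to protected paths without a valid session get bounced to the
# login page, with the original URL passed along via %s
acl protected_paths urlpath_regex ^/(intranet|procedures|svn)
external_acl_type session_db ttl=60 %>{Cookie} /usr/local/bin/check_session
acl logged_in external session_db
deny_info https://apps.example.com/login.php?url=%s logged_in
http_access deny protected_paths !logged_in
```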



. create an external acl helper that is passed in the request header 
corresponding to the cookie, decodes the cookie value from the header, and 
looks up the entry in the database (and maybe timestamp last access). If 
present, report OK
. create a cron job nightly (or hourly or whatever) to delete stale records 
from the database to keep the size reasonable


Why not delete stale entries immediately, as the helper locates them as 
stale in the DB? That speeds up all later fetches which would have 
found them and had to re-test. The number of DB entries is then also never 
more than your current user load at any point, as opposed to the total 
unique load across the entire day so far.



The cookie here only serves as a lookup into the database, and I believe will 
be supplied by the browser on any user request.


Squid has a bundled session helper which supports both passive and 
active (login) sessions. I suggest taking a good look through its 
documentation and considering whether you can use it instead. Doing so 
will keep all the session criteria private to the server, instead of 
using a Cookie to send out details an attacker can capture and break in with.

 http://wiki.squid-cache.org/ConfigExamples/Portal/Splash

Amos


RE: [squid-users] authenticate access to reverse proxy

2013-03-18 Thread James Harper


 -Original Message-
 From: Amos Jeffries [mailto:squ...@treenet.co.nz]
 Sent: Tuesday, 19 March 2013 10:35 AM
 To: squid-users@squid-cache.org
 Subject: Re: [squid-users] authenticate access to reverse proxy
 
 On 19/03/2013 12:57 a.m., James Harper wrote:
  Say I have a squid reverse proxy with https enabled on it at
 https://apps.example.com. This serves a number of apps including:
 
  /owa - outlook web access
  /rpc - ms terminal server gateway
  /intranet
  /bugtracker
  /svn - svn anon browser access
  /procedures
 
  These are spread across a bunch of completely different servers (some
 linux, some windows) and works really really well. It has been decided that
 some of the individual applications are not secure enough. /owa, /rpc, and
 /bugtracker are fine, while /intranet,  /procedures, and /svn are not. I have
 set up acls to deny external access to the insecure apps but now want to put
 some front end security on them such that when a user first tries to access
 one with a browser they are redirected and required to sign in to a web
 forms based page. The idea I have for this is:
 
  . create an sqlite database in /var/run or some other throwaway location
 
 NP: sqlite is known to be terribly slow for this type of thing. You may
 want to reconsider the exact DB type there.
 

Noted. I've used sqlite3 for lightweight tasks but I'll look around. Any 
suggestions?

  . redirect users using deny_info to the sign in page (php)
  . on successful authentication, set a cookie (some random string eg md5
 hash of username, password, and time) and create a corresponding entry in
 the database then redirect user to original page (only possible with squid
 3.2.x I believe...)
 
 No. Possible with older Squid as well. Pass the original URL to the
 splash page as a query-string parameter using %s.

Good to know!

  . create an external acl helper that is passed in the request header
 corresponding to the cookie, decodes the cookie value from the header, and
 looks up the entry in the database (and maybe timestamp last access). If
 present, report OK
  . create a cron job nightly (or hourly or whatever) to delete stale records
 from the database to keep the size reasonable
 
 Why not delete stale entries immediately as the helper locates them as
 being stale in the DB? that speeds up all later fetches which would have
 found it and had to re-test. The number of DB entries is then also never
 more than your current user load at any point - as opposed to the total
 unique loading across the entire day so far.

I'd need to benchmark this. Doing a 'DELETE FROM sometable WHERE timestamp < 
@cutoff' frequently may hurt more than the extra entries hurt a select. I can 
add an index, but that hurts inserts...

 
  The cookie here only serves as a lookup into the database, and I believe 
  will
 be supplied by the browser on any user request.
 
 Squid has a bundled session helper which supports both passive and
 active (login) sessions. I suggest taking a good look through its
 documentation and considering whether you can use it instead. Doing so
 will keep all the session criteria private to the server instead of
 using Cookie to send out details an attacker can capture and break in with.
   http://wiki.squid-cache.org/ConfigExamples/Portal/Splash
 

I had studied that page before posting this and came to the conclusion that I 
couldn't use it, but maybe that's incorrect. I can't use regular HTTP 
authentication because the underlying apps use it, which I thought precluded 
the use of the LOGIN flag. My setup is effectively that the reverse proxy is a 
transparent proxy server. I can't use IP address because there is no guarantee 
that a single user will retain the same IP address across a session (users are 
mobile, and a 3G session cannot be guaranteed to stay up and keep the same IP 
address), and I can't guarantee that there is only one user behind a single IP 
address.

Also, I couldn't see how to engage the session helper only once the user 
had successfully authenticated to my forms page, but maybe more study is 
required?

Thanks

James



Re: [squid-users] authenticate access to reverse proxy

2013-03-18 Thread Amos Jeffries

On 19/03/2013 4:10 p.m., James Harper wrote:



-Original Message-
From: Amos Jeffries [mailto:squ...@treenet.co.nz]
Sent: Tuesday, 19 March 2013 10:35 AM
To: squid-users@squid-cache.org
Subject: Re: [squid-users] authenticate access to reverse proxy

On 19/03/2013 12:57 a.m., James Harper wrote:

Say I have a squid reverse proxy with https enabled on it at

https://apps.example.com. This serves a number of apps including:

/owa - outlook web access
/rpc - ms terminal server gateway
/intranet
/bugtracker
/svn - svn anon browser access
/procedures

These are spread across a bunch of completely different servers (some

linux, some windows) and works really really well. It has been decided that
some of the individual applications are not secure enough. /owa, /rpc, and
/bugtracker are fine, while /intranet,  /procedures, and /svn are not. I have
set up acls to deny external access to the insecure apps but now want to put
some front end security on them such that when a user first tries to access
one with a browser they are redirected and required to sign in to a web
forms based page. The idea I have for this is:

. create an sqlite database in /var/run or some other throwaway location

NP: sqlite is known to be terribly slow for this type of thing. You may
want to reconsider the exact DB type there.


Noted. I've used sqlite3 for lightweight tasks but I'll look around. Any 
suggestions?


. redirect users using deny_info to the sign in page (php)
. on successful authentication, set a cookie (some random string eg md5

hash of username, password, and time) and create a corresponding entry in
the database then redirect user to original page (only possible with squid
3.2.x I believe...)

No. Possible with older Squid as well. Pass the original URL to the
splash page as a query-string parameter using %s.

Good to know!


. create an external acl helper that is passed in the request header

corresponding to the cookie, decodes the cookie value from the header, and
looks up the entry in the database (and maybe timestamp last access). If
present, report OK

. create a cron job nightly (or hourly or whatever) to delete stale records

from the database to keep the size reasonable

Why not delete stale entries immediately as the helper locates them as
being stale in the DB? that speeds up all later fetches which would have
found it and had to re-test. The number of DB entries is then also never
more than your current user load at any point - as opposed to the total
unique loading across the entire day so far.

I'd need to benchmark this. Doing a 'DELETE FROM sometable WHERE timestamp < 
@cutoff' frequently may hurt more than the extra entries hurt a select. I can add 
an index, but that hurts inserts...


The cookie here only serves as a lookup into the database, and I believe will

be supplied by the browser on any user request.

Squid has a bundled session helper which supports both passive and
active (login) sessions. I suggest taking a good look through its
documentation and considering whether you can use it instead. Doing so
will keep all the session criteria private to the server instead of
using Cookie to send out details an attacker can capture and break in with.
   http://wiki.squid-cache.org/ConfigExamples/Portal/Splash


I had studied that page before posting this and came to the conclusion that I 
couldn't use it, but maybe that's incorrect. I can't use regular HTTP 
authentication because the underlying apps use it, which I thought precluded 
the use of the LOGIN flag. My setup is effectively that the reverse proxy is a 
transparent proxy server. I can't use IP address because there is no guarantee 
that a single user will retain the same IP address across a session (users are 
mobile, and a 3G session cannot be guaranteed to stay up and keep the same IP 
address), and I can't guarantee that there is only one user behind a single IP 
address.

Also, I couldn't see how to engage the session helper only once the user 
had successfully authenticated to my forms page, but maybe more study is 
required?


The LOGIN flag is just an internal token Squid sends to the helper to 
generate a session for the format key. The session itself needs to be 
generated from details normally sent by the client, so that other 
requests can look it up without anything special existing. Although, if 
you wanted to, you could have the client add a special header.
 NP: Cookie is a bad choice to use here because the header gets 
'polluted' with other cookies unrelated to your system (i.e. the actual 
website cookies or advert cookies), which will break the lookup if you 
use it.
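A sketch of that header-keyed variant; the X-Session-Token header name, paths, and login URL are hypothetical, and the helper's flags should be verified against your version's documentation:

```
# Key the session on a dedicated header instead of the polluted Cookie header
external_acl_type session ttl=60 %>{X-Session-Token} /usr/lib/squid/ext_session_acl -t 7200 -b /var/lib/squid/session.db
acl existing_session external session
deny_info https://apps.example.com/login.php?url=%s existing_session
http_access deny !existing_session
```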


Amos