I realise this is probably an often-posted topic, and I have read a lot about blocking file downloads with Squid, but my research has raised more questions than answers. I'm hoping someone can help.
My goal is to block downloads of executables and other 'undesirable' file types such as audio and video files. However, 'normal' browsing should not be interrupted, so files such as MS Word, Excel and PDF documents should be unaffected.

At this point I know this:

1. How to block FTP downloads completely. This is relatively easy: block the FTP protocol, the FTP port and URLs with FTP in them. (It is HTTP downloads I'm having problems with.)

2. It is possible to limit file sizes using reply_body_max_size. However, this will also affect viewing of PDFs and similar documents larger than the given size.

3. It is possible to filter on MIME type. However, there are so many MIME types that it would be impractical to devise a list of acceptable or unacceptable ones. Allowing only text/html or text/plain will not work.

4. It is possible to filter on file extensions using regular expressions, e.g.:

    acl rejected_extensions url_regex -i (https?://)(\w*.+/*)(\.wav|\.mov|\.mpeg.|\.mp.|\.avi|\.rm.|\.rar|\.wm.|\.divx|\.cda|\.midi|\.iso)(.*)
    acl reject_extensions urlpath_regex -i (\.wav|\.mov|\.mpeg.|\.mp.|\.avi|\.rm.|\.rar|\.wm.|\.divx|\.cda|\.midi|\.iso|\.pls)

   The problem I've found with this is, again, that the list is too vast to be practical. How many executable file types are there?

5. One suggestion for the above problem is to filter on acceptable file types rather than unacceptable ones, e.g.:

    acl accepted_extensions url_regex -i (https?://)(\w*.+/*)(\..htm.|\.htm|\.css|\.srf|\.xml|\.nsf|\.asp|\.asp.|\.pl|\.cgi|\.php.|\.jsp|\.gif|\.jpg|\.jpeg|\.swf|\.png|\.bmp|\.pdf|\.txt|\.doc|\.xls|\.ppt)(.*)
    acl accept_extensions urlpath_regex -i (\..htm.|\.htm|\.xml|\.nsf|\.asp|\.asp.|\.pl|\.cgi|\.php.|\.jsp|\.gif|\.jpg|\.jpeg|\.swf|\.png|\.bmp|\.pdf|\.txt|\.doc|\.xls|\.ppt)

However, both of these ACLs assume that the URL will contain a file extension, and obviously not all URLs do. Take a URL which refers to a directory with a default file, such as http://www.microsoft.com/security/.
That file does not appear in the URL and so will not pass this rule. So I came up with the following:

    acl accept_directory_url url_regex -i (https?://)(\w*.+/$)

This works in conjunction with the accepted_extensions url_regex ACL, but together they still block URLs with no file type and no terminating /, e.g.:

    http://www.google.com/search?hl=en&ie=UTF-8&oe=UTF-8&q=test

Any regex rule I come up with to deal with these URLs seems to be a catch-all rule which will allow everything through. Doh!

So is it actually possible to do what I want using Squid? Am I asking too much? Should I be looking at a commercial solution rather than sticking with Squid, or is there an add-on I can use with Squid to achieve my aims (I'm already using squidGuard as well)?

Ash Anderson MCP, MCSA, A+.
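
For anyone who wants to reproduce the behaviour described above outside of Squid, here is a minimal sketch in Python. It is only an approximation: Squid uses POSIX regular expressions, and the sketch assumes that Python's re module treats these simple patterns the same way. The two patterns are the accepted_extensions and accept_directory_url expressions from the post (with line-wrap spaces removed); the function name is_allowed and the two example.com URLs are hypothetical and exist only for illustration.

```python
import re

# Patterns taken from the ACLs in the post. Squid matches with POSIX regex,
# so Python's re module is only assumed to behave the same for these cases.
ACCEPTED_EXTENSIONS = re.compile(
    r"(https?://)(\w*.+/*)"
    r"(\..htm.|\.htm|\.css|\.srf|\.xml|\.nsf|\.asp|\.asp.|\.pl|\.cgi|\.php."
    r"|\.jsp|\.gif|\.jpg|\.jpeg|\.swf|\.png|\.bmp|\.pdf|\.txt|\.doc|\.xls|\.ppt)"
    r"(.*)",
    re.IGNORECASE,
)
ACCEPT_DIRECTORY_URL = re.compile(r"(https?://)(\w*.+/$)", re.IGNORECASE)


def is_allowed(url):
    """Return True if either whitelist pattern matches somewhere in the URL.

    Hypothetical helper: it mimics allowing a request when at least one of
    the two ACLs matches, using an unanchored search as url_regex does.
    """
    return bool(
        ACCEPTED_EXTENSIONS.search(url) or ACCEPT_DIRECTORY_URL.search(url)
    )


URLS = [
    "http://www.microsoft.com/security/",
    "http://www.google.com/search?hl=en&ie=UTF-8&oe=UTF-8&q=test",
    "http://example.com/report.pdf",  # hypothetical URL, allowed extension
    "http://example.com/setup.exe",   # hypothetical URL, unwanted extension
]

if __name__ == "__main__":
    for url in URLS:
        verdict = "allow" if is_allowed(url) else "deny"
        print(f"{verdict:5} {url}")
```

Under those assumptions, the directory URL and the .pdf URL are allowed, while the .exe URL and the Google search URL are both denied, which is exactly the false positive described above: a URL with no extension and no trailing / matches neither whitelist pattern.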