RE: [squid-users] How to not cache a site?
It took me some time to find this email until a friend pointed it out.

You brought up the point that we have been trying to solve by not caching the site. Here is the full story.

There is a site behind our reverse proxy that keeps getting funky: icons go missing and some pages do not follow the formatting. Looking into this issue, we realized that even though we configured our HTTP headers not to cache this site, I still find instances of that site in the cache during purging. This is what started this post.

We have compiled our Apache to use mod_auth_session to assist in security of the site. What you have provided below seems to have pointed us to resolving this issue.

The redirection just has a Cache-Control: max-age=0, which allows the cache to store the response, and just requires that it be revalidated (which is done as evidenced by the TCP_REFRESH_HIT in the Squid log).

It seems that when the responses get stored, the authentication gets funky and, in effect, some objects referenced in the page cannot be accessed. We will modify our mod_auth_session code to avoid this condition.

If neither the recoding of mod_auth_session nor adding the cache deny directives works, I will update this post.

Thank you so much, Chris.

Regards,
Jerome

-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]
Sent: Tuesday, June 10, 2008 1:12 PM
To: squid-users@squid-cache.org
Subject: Re: [squid-users] How to not cache a site?

Jerome Yanga wrote:

Resending as I had received a failure notice message.

I do not think that the refresh_pattern is even set up, as they are all commented out.

# grep refresh_pattern /etc/squid/squid.conf
# refresh_pattern regex min percent max
#refresh_pattern -i \.js$ 0 0% 1
#refresh_pattern -i \.css$ 0 10% 30
#refresh_pattern . 0 20% 4320

Attached is a zipped http header log captured using Live HTTP Headers.
Regards,
Jerome

Sample squid log entry from the zip file (without cookies) for reference:

TCP_REFRESH_HIT:FIRST_UP_PARENT 10.11.12.13 10.10.10.10 - - [06/Jun/2008:21:42:52 +] GET http://site_address.com/help/chr_ind_on.gif HTTP/1.1 302 830 http://site_address.com/help/whskin_tbars.htm; Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.8.1.14) Gecko/20080404 Firefox/2.0.0.14

There were no associated HTTP headers for this object (http://site_address.com/help/chr_ind_on.gif)*, but here is another request that also resulted in a 302 (Moved Temporarily):

GET /help/chr_back.gif HTTP/1.1
Host: site_address.com
User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.8.1.14) Gecko/20080404 Firefox/2.0.0.14
Accept: image/png,*/*;q=0.5
Accept-Language: en-us,en;q=0.5
Accept-Encoding: gzip,deflate
Accept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.7
Keep-Alive: 300
Connection: keep-alive
Referer: http://site_address.com/help/whskin_tbars.htm
Cookie: [removed]

HTTP/1.x 302 Moved Temporarily
Date: Thu, 05 Jun 2008 23:40:54 GMT
Location: http://site_address.com/gateway/index.cfm?fa=loginreturnURL=http%3A%2F%2Fsite_address%2Ecom%2Fhelp%2FFchr%5Fback%2Egif
Cache-Control: max-age=0
Expires: Thu, 05 Jun 2008 23:40:54 GMT
Content-Length: 422
Content-Type: text/html; charset=iso-8859-1
Connection: keep-alive

The redirection just has a Cache-Control: max-age=0, which allows the cache to store the response, and just requires that it be revalidated (which is done as evidenced by the TCP_REFRESH_HIT in the Squid log).

So, I'm still not seeing anything being cached against the server's request. Try tailing the access log and grepping for 200 and HIT** (note the spaces on either end of the 200). That should show any objects (as opposed to redirects or errors) that are served from cache.

Chris

* The other URL (http://site_address.com/help/whskin_tbars.htm) is the referrer.
** tail -f /cache/logs/access.log | egrep "10.10.10.10.* 200 .*HIT"
Re: [squid-users] How to not cache a site?
Jerome Yanga wrote:

Resending as I had received a failure notice message.

I do not think that the refresh_pattern is even set up, as they are all commented out.

# grep refresh_pattern /etc/squid/squid.conf
# refresh_pattern regex min percent max
#refresh_pattern -i \.js$ 0 0% 1
#refresh_pattern -i \.css$ 0 10% 30
#refresh_pattern . 0 20% 4320

Attached is a zipped http header log captured using Live HTTP Headers.

Regards,
Jerome

Sample squid log entry from the zip file (without cookies) for reference:

TCP_REFRESH_HIT:FIRST_UP_PARENT 10.11.12.13 10.10.10.10 - - [06/Jun/2008:21:42:52 +] GET http://site_address.com/help/chr_ind_on.gif HTTP/1.1 302 830 http://site_address.com/help/whskin_tbars.htm; Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.8.1.14) Gecko/20080404 Firefox/2.0.0.14

There were no associated HTTP headers for this object (http://site_address.com/help/chr_ind_on.gif)*, but here is another request that also resulted in a 302 (Moved Temporarily):

GET /help/chr_back.gif HTTP/1.1
Host: site_address.com
User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.8.1.14) Gecko/20080404 Firefox/2.0.0.14
Accept: image/png,*/*;q=0.5
Accept-Language: en-us,en;q=0.5
Accept-Encoding: gzip,deflate
Accept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.7
Keep-Alive: 300
Connection: keep-alive
Referer: http://site_address.com/help/whskin_tbars.htm
Cookie: [removed]

HTTP/1.x 302 Moved Temporarily
Date: Thu, 05 Jun 2008 23:40:54 GMT
Location: http://site_address.com/gateway/index.cfm?fa=loginreturnURL=http%3A%2F%2Fsite_address%2Ecom%2Fhelp%2FFchr%5Fback%2Egif
Cache-Control: max-age=0
Expires: Thu, 05 Jun 2008 23:40:54 GMT
Content-Length: 422
Content-Type: text/html; charset=iso-8859-1
Connection: keep-alive

The redirection just has a Cache-Control: max-age=0, which allows the cache to store the response, and just requires that it be revalidated (which is done as evidenced by the TCP_REFRESH_HIT in the Squid log).

So, I'm still not seeing anything being cached against the server's request. Try tailing the access log and grepping for 200 and HIT** (note the spaces on either end of the 200). That should show any objects (as opposed to redirects or errors) that are served from cache.

Chris

* The other URL (http://site_address.com/help/whskin_tbars.htm) is the referrer.
** tail -f /cache/logs/access.log | egrep "10.10.10.10.* 200 .*HIT"
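For anyone who would rather check a saved copy of the log than tail it live, here is a minimal Python sketch of the same filter. It assumes the log layout of the one sample line quoted above (result code and HTTP status as whitespace-separated fields); the function name served_from_cache and the 200 sample line are made up for illustration, and a different logformat would need a different check.

```python
import re

# Assumption: the result code (e.g. TCP_MEM_HIT:NONE) and the HTTP status are
# whitespace-separated fields, as in the single sample line quoted in this thread.
HIT_CODE = re.compile(r"\bTCP_[A-Z_]*HIT\b")


def served_from_cache(line, address=None):
    """Return True for a 200 response that Squid logged as some kind of HIT."""
    fields = line.split()
    if address is not None and address not in fields:
        return False
    # "200" as a standalone field mirrors the spaces around 200 in the egrep
    # pattern. A response size of exactly 200 bytes would also match.
    return "200" in fields and HIT_CODE.search(line) is not None


# The 302 line is taken from the thread (shortened); the 200 line is hypothetical.
redirect_line = (
    "TCP_REFRESH_HIT:FIRST_UP_PARENT 10.11.12.13 10.10.10.10 - - "
    "[06/Jun/2008:21:42:52 +] GET http://site_address.com/help/chr_ind_on.gif "
    "HTTP/1.1 302 830"
)
object_line = (
    "TCP_MEM_HIT:NONE 10.11.12.13 10.10.10.10 - - "
    "[06/Jun/2008:21:42:53 +] GET http://site_address.com/help/example.htm "
    "HTTP/1.1 200 811"
)

for line in (redirect_line, object_line):
    print(served_from_cache(line, address="10.10.10.10"), line.split()[0])
```

Because the sketch looks at the fields independently, it does not care whether the result code is logged before or after the status in a custom log format.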
RE: [squid-users] How to not cache a site?
Henrik,

Yes. They are objects that I found were cached by Squid. However, they were not supposed to be cached.

Regards,
Jerome

-----Original Message-----
From: Henrik Nordstrom [mailto:[EMAIL PROTECTED]
Sent: Friday, June 06, 2008 10:49 PM
To: Jerome Yanga
Cc: squid-users@squid-cache.org
Subject: RE: [squid-users] How to not cache a site?

On Fri, 2008-06-06 at 15:48 -0700, Jerome Yanga wrote:

I believe some do but others don't. I just responded to Chris with the http headers. The captured log is a mere mouse over of an icon in the site.

Yes, but are those headers from an object which you found was cached by Squid?

Regards
Henrik
Re: [squid-users] How to not cache a site?
Jerome Yanga wrote:

Thanks for the quick response, Chris. Here are my attempts to answer your questions. :)

Using the Live HTTP Headers plugin for Firefox. It seems to show the Cache-Control and Pragma settings.

http://site_address.com/help/jssamples_start.htm

GET /help/jssamples_start.htm HTTP/1.1
Host: site_address.com
User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.8.1.14) Gecko/20080404 Firefox/2.0.0.14
Accept: text/xml,application/xml,application/xhtml+xml,text/html;q=0.9,text/plain;q=0.8,image/png,*/*;q=0.5
Accept-Language: en-us,en;q=0.5
Accept-Encoding: gzip,deflate
Accept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.7
Keep-Alive: 300
Connection: keep-alive
Cookie: CFID=1234567890; CFTOKEN=1234567890; SESSIONID=1234567890; __utma=.1.1.1.1.3; __utmc=1; __utmz=1.1.1.1.utmcsr=(direct)|utmccn=(direct)|utmcmd=(none); __utmb=1.4.10. 1

HTTP/1.x 200 OK
Date: Thu, 05 Jun 2008 23:41:00 GMT
Server: Apache
Last-Modified: Thu, 05 Jun 2008 09:03:27 GMT
Etag: 1-1-1
Accept-Ranges: bytes
Content-Type: text/html; charset=UTF-8
Cache-Control: no-store, no-cache, must-revalidate, max-age=0
Expires: Thu, 05 Jun 2008 23:41:00 GMT

These two lines (Cache-Control: no-store, and an Expires with the same time as the request) should stop any (compliant) shared cache from caching the content. Have you modified the refresh_pattern in your squid.conf?

Vary: Accept-Encoding,User-Agent
Content-Encoding: gzip
Pragma: no-cache
Content-Length: 811
Connection: keep-alive

I purge the cache using a purge command.

#file /cache/usr/bin/purge
/cache/usr/bin/purge: ELF 32-bit LSB executable, Intel 80386, version 1 (SYSV), for GNU/Linux 2.2.5, dynamically linked (uses shared libs), not stripped

...and the syntax I use is below.

#/cache/usr/bin/purge -n -v -c /etc/squid/cachepurge.conf -p 127.0.0.1:80 -P 1 -e site_address\.com /var/log/site_address.com_purge.log

I grep'ed the log created from the command above and I can find instances of site_address.com being deleted. Hence, it is being cached.

Have you checked the headers returned with requests for those objects that are being cached?

I have also reviewed the access.log and I found some TCP_MEM_HIT:NONE, TCP_REFRESH_HIT, TCP_IMS_HIT, TCP_HIT, TCP_REFRESH_MISS.

Same story here: have you verified the headers on these objects? Especially the objects that result in TCP_REFRESH_HIT and TCP_IMS_HIT, as (I think) those are requests that are being validated with the origin server.

I cannot review the store.log as it is disabled.

I shall try the syntax you have provided on the next available downtime.

acl cacheDenyAclName dstdomain .site_address.com
acl otherCacheDenyAclName urlpath_regex ^/help/
cache deny cacheDenyAclName otherCacheDenyAclName

Thanks again, Chris.

Regards,
Jerome

Chris
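To make the header check above concrete, here is a rough Python sketch of how a shared cache could read a Cache-Control value. It is a deliberate simplification of the HTTP caching rules and not a description of Squid's actual code: it ignores Expires, Pragma, Vary and refresh_pattern overrides entirely, and the names parse_cache_control and shared_cache_view are made up for this example.

```python
def parse_cache_control(value):
    """Split a Cache-Control header value into a {directive: argument} dict."""
    directives = {}
    for part in value.split(","):
        part = part.strip()
        if not part:
            continue
        name, _, arg = part.partition("=")
        directives[name.strip().lower()] = arg.strip().strip('"') or None
    return directives


def shared_cache_view(value):
    """Very simplified reading of Cache-Control from a shared cache's side."""
    d = parse_cache_control(value)
    # no-store forbids storing at all; private forbids storing in a shared cache.
    if "no-store" in d or "private" in d:
        return "must not store"
    # For a shared cache, s-maxage takes precedence over max-age when present.
    lifetime = d["s-maxage"] if "s-maxage" in d else d.get("max-age")
    if "no-cache" in d or lifetime == "0":
        return "may store, must revalidate before reuse"
    return "may store"


# The two Cache-Control values seen in this thread.
print(shared_cache_view("no-store, no-cache, must-revalidate, max-age=0"))
print(shared_cache_view("max-age=0"))
```

The first value is the one on the 200 response for the help page, the second is the one on the 302 redirect, which is why only the redirect could end up stored and later show up as a refresh hit.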
RE: [squid-users] How to not cache a site?
On Thu, 2008-06-05 at 17:22 -0700, Jerome Yanga wrote:

#/cache/usr/bin/purge -n -v -c /etc/squid/cachepurge.conf -p 127.0.0.1:80 -P 1 -e site_address\.com /var/log/site_address.com_purge.log

I grep'ed the log created from the command above and I can find instances of site_address.com being deleted. Hence, it is being cached.

And you are positively sure those objects do have the mentioned Cache-Control headers? Quite often there are different cache requirements for different kinds of objects.

Regards
Henrik
RE: [squid-users] How to not cache a site?
Henrik,

I believe some do but others don't. I just responded to Chris with the http headers. The captured log is a mere mouse over of an icon in the site.

I apologize for my noobness.

Regards,
Jerome

-----Original Message-----
From: Henrik Nordstrom [mailto:[EMAIL PROTECTED]
Sent: Friday, June 06, 2008 2:41 PM
To: Jerome Yanga
Cc: squid-users@squid-cache.org
Subject: RE: [squid-users] How to not cache a site?

On Thu, 2008-06-05 at 17:22 -0700, Jerome Yanga wrote:

#/cache/usr/bin/purge -n -v -c /etc/squid/cachepurge.conf -p 127.0.0.1:80 -P 1 -e site_address\.com /var/log/site_address.com_purge.log

I grep'ed the log created from the command above and I can find instances of site_address.com being deleted. Hence, it is being cached.

And you are positively sure those objects do have the mentioned Cache-Control headers? Quite often there are different cache requirements for different kinds of objects.

Regards
Henrik
RE: [squid-users] How to not cache a site?
On Fri, 2008-06-06 at 15:48 -0700, Jerome Yanga wrote:

I believe some do but others don't. I just responded to Chris with the http headers. The captured log is a mere mouse over of an icon in the site.

Yes, but are those headers from an object which you found was cached by Squid?

Regards
Henrik
Re: [squid-users] How to not cache a site?
Jerome Yanga wrote:

I have the following in my site_address.conf file.

ExpiresDefault A0
Header set Cache-Control "no-store, no-cache, must-revalidate, max-age=0"
Header set Pragma no-cache

If these headers are indeed being set, Squid will not cache this content (without some effort). Have you verified that these headers are being sent out? If your site is internet accessible, you can use a hosted version of the cacheability engine (such as http://www.ircache.net/cgi-bin/cacheability.py), or you can download and set up a locally hosted version (http://www.mnot.net/cacheability/download.html), or you can look into the Live HTTP Headers plugin for Firefox (http://livehttpheaders.mozdev.org/).

However, this does not seem to work as whenever I perform a purge of the cache, I still see stuff being deleted.

How (and why) are you purging the cache? Are you sure the objects you are purging are ones that you have specified not be cached? Have you checked your access.log to see if the requested objects are being served from cache, or the store.log to see if the objects are being cached at all?

I have been searching the web and I found the no_cache directive.

The no_cache directive was deprecated with the release of Squid 2.6. It has been renamed cache in currently supported versions of Squid.

I also found out that this directive is added into the squid.conf. I cannot seem to find a proper syntax definition for this directive. I can only find examples which may not work for me. Hence, I am posting a message for the first time. Yes. I am a noob. Please be nice to me. ☺

Nevertheless, given the following information, how do I use this directive?

Site: www.site_address.com

Assuming you want to go this route (which will only affect your cache, and not ISP caches, or browser caches) to deny caching of the whole site you'd use something like...

acl cacheDenyAclName dstdomain .site_address.com

Note the leading dot on the domain name. That will match the domain and all subdomains.

If I wanted to not cache just a folder in this site, how would the syntax look?

Site Folder: www.site_address.com/help/

acl cacheDenyAclName dstdomain .site_address.com
acl otherCacheDenyAclName urlpath_regex ^/help/
cache deny cacheDenyAclName otherCacheDenyAclName

Here, I am using a combination of ACLs to reduce the load of using regular expressions. If the host matches, then (and only then) the path is checked. If both match, caching is denied. The following (non-exhaustive) list of URLs would be excluded from caching with this set up...

www.site_address.com/help/index.html
site_address.com/help/image.gif
webmail.site_address.com/help/me/figure/this/out.php

Please provide a syntax for each question. By the way, if I am going about “no cache” the wrong way, please also indicate. ☺

Indicated. Make sure your server is sending out the headers you think it is.

Thank you in advance.

Regards,
Jyanga

Chris
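The way the two ACLs combine on one cache deny line (host first, path only if the host matched) can be illustrated with a small Python sketch. This only imitates the matching described above and is not how Squid implements it; the helper names dstdomain_matches and cache_denied are invented for the example.

```python
import re
from urllib.parse import urlsplit

DOMAIN_ACL = ".site_address.com"   # acl cacheDenyAclName dstdomain .site_address.com
PATH_ACL = re.compile(r"^/help/")  # acl otherCacheDenyAclName urlpath_regex ^/help/


def dstdomain_matches(host, pattern):
    """A leading dot matches the domain itself and every subdomain."""
    host = host.lower().rstrip(".")
    pattern = pattern.lower()
    if pattern.startswith("."):
        return host == pattern[1:] or host.endswith(pattern)
    return host == pattern


def cache_denied(url):
    """Both ACLs on one 'cache deny' line must match (logical AND)."""
    if "://" not in url:
        url = "http://" + url
    parts = urlsplit(url)
    # 'and' short-circuits, so the regex only runs when the host already matched.
    return (dstdomain_matches(parts.hostname or "", DOMAIN_ACL)
            and PATH_ACL.search(parts.path) is not None)


for url in (
    "www.site_address.com/help/index.html",
    "site_address.com/help/image.gif",
    "webmail.site_address.com/help/me/figure/this/out.php",
    "www.site_address.com/index.html",
):
    print(cache_denied(url), url)
```

The first three URLs are the ones listed above and come out as denied; the last one is on the same host but outside /help/, so it would still be cacheable under this rule.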
RE: [squid-users] How to not cache a site?
Thanks for the quick response, Chris. Here are my attempts to answer your questions. :)

Using the Live HTTP Headers plugin for Firefox. It seems to show the Cache-Control and Pragma settings.

http://site_address.com/help/jssamples_start.htm

GET /help/jssamples_start.htm HTTP/1.1
Host: site_address.com
User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.8.1.14) Gecko/20080404 Firefox/2.0.0.14
Accept: text/xml,application/xml,application/xhtml+xml,text/html;q=0.9,text/plain;q=0.8,image/png,*/*;q=0.5
Accept-Language: en-us,en;q=0.5
Accept-Encoding: gzip,deflate
Accept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.7
Keep-Alive: 300
Connection: keep-alive
Cookie: CFID=1234567890; CFTOKEN=1234567890; SESSIONID=1234567890; __utma=.1.1.1.1.3; __utmc=1; __utmz=1.1.1.1.utmcsr=(direct)|utmccn=(direct)|utmcmd=(none); __utmb=1.4.10. 1

HTTP/1.x 200 OK
Date: Thu, 05 Jun 2008 23:41:00 GMT
Server: Apache
Last-Modified: Thu, 05 Jun 2008 09:03:27 GMT
Etag: 1-1-1
Accept-Ranges: bytes
Content-Type: text/html; charset=UTF-8
Cache-Control: no-store, no-cache, must-revalidate, max-age=0
Expires: Thu, 05 Jun 2008 23:41:00 GMT
Vary: Accept-Encoding,User-Agent
Content-Encoding: gzip
Pragma: no-cache
Content-Length: 811
Connection: keep-alive

I purge the cache using a purge command.

#file /cache/usr/bin/purge
/cache/usr/bin/purge: ELF 32-bit LSB executable, Intel 80386, version 1 (SYSV), for GNU/Linux 2.2.5, dynamically linked (uses shared libs), not stripped

...and the syntax I use is below.

#/cache/usr/bin/purge -n -v -c /etc/squid/cachepurge.conf -p 127.0.0.1:80 -P 1 -e site_address\.com /var/log/site_address.com_purge.log

I grep'ed the log created from the command above and I can find instances of site_address.com being deleted. Hence, it is being cached.

I have also reviewed the access.log and I found some TCP_MEM_HIT:NONE, TCP_REFRESH_HIT, TCP_IMS_HIT, TCP_HIT, TCP_REFRESH_MISS.

I cannot review the store.log as it is disabled.
I shall try the syntax you have provided on the next available downtime.

acl cacheDenyAclName dstdomain .site_address.com
acl otherCacheDenyAclName urlpath_regex ^/help/
cache deny cacheDenyAclName otherCacheDenyAclName

Thanks again, Chris.

Regards,
Jerome

-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]
Sent: Thursday, June 05, 2008 1:45 PM
To: squid-users@squid-cache.org
Subject: Re: [squid-users] How to not cache a site?

Jerome Yanga wrote:

I have the following in my site_address.conf file.

ExpiresDefault A0
Header set Cache-Control "no-store, no-cache, must-revalidate, max-age=0"
Header set Pragma no-cache

If these headers are indeed being set, Squid will not cache this content (without some effort). Have you verified that these headers are being sent out? If your site is internet accessible, you can use a hosted version of the cacheability engine (such as http://www.ircache.net/cgi-bin/cacheability.py), or you can download and set up a locally hosted version (http://www.mnot.net/cacheability/download.html), or you can look into the Live HTTP Headers plugin for Firefox (http://livehttpheaders.mozdev.org/).

However, this does not seem to work as whenever I perform a purge of the cache, I still see stuff being deleted.

How (and why) are you purging the cache? Are you sure the objects you are purging are ones that you have specified not be cached? Have you checked your access.log to see if the requested objects are being served from cache, or the store.log to see if the objects are being cached at all?

I have been searching the web and I found the no_cache directive.

The no_cache directive was deprecated with the release of Squid 2.6. It has been renamed cache in currently supported versions of Squid.

I also found out that this directive is added into the squid.conf. I cannot seem to find a proper syntax definition for this directive. I can only find examples which may not work for me. Hence, I am posting a message for the first time. Yes. I am a noob. Please be nice to me. ☺

Nevertheless, given the following information, how do I use this directive?

Site: www.site_address.com

Assuming you want to go this route (which will only affect your cache, and not ISP caches, or browser caches) to deny caching of the whole site you'd use something like...

acl cacheDenyAclName dstdomain .site_address.com

Note the leading dot on the domain name. That will match the domain and all subdomains.

If I wanted to not cache just a folder in this site, how would the syntax look?

Site Folder: www.site_address.com/help/

acl cacheDenyAclName dstdomain .site_address.com
acl otherCacheDenyAclName urlpath_regex ^/help/
cache deny cacheDenyAclName otherCacheDenyAclName

Here, I am using a combination of ACLs to reduce the load of using regular expressions